bestNormalize: Flexibly calculate the best normalizing transformation for a vector

The bestNormalize R package was designed to help find a normalizing transformation for a vector. There are many techniques that have been developed in this aim, however each has been subject to their own strengths/weaknesses, and it is unclear on how to decide which will work best until the data is oberved. This package will look at a range of possible transformations and return the best one, i.e. the one that makes it look the most normal.

This package also introduces a new normalization technique, Ordered Quantile normalization (orderNorm()), which transforms the data based off of a rank mapping to the normal distribution, which allows us to guarantee normally distributed transformed data (if ties are not present).

Installation

You can install bestNormalize from github with:

# install.packages("devtools")
devtools::install_github("petersonR/bestNormalize")

Example

In this example, we generate 1000 draws from a gamma distribution, and normalize them:

library(bestNormalize)
set.seed(100)
x <- rgamma(1000, 1, 1)

# Estimate best transformation
BN_obj <- bestNormalize(x)
BN_obj
#> Best Normalizing transformation with 1000 Observations
#>  Estimated Normality Statistics (Pearson P / df, lower => more normal):
#>  - Box-Cox: 0.8188 
#>  - Lambert's W: 1.28 
#>  - Yeo-Johnson: 5.8284 
#>  - orderNorm: 0.0066 
#>  
#> Based off these, bestNormalize chose:
#> OrderNorm Transformation with 1000 nonmissing obs and no ties 
#>  - Original quantiles:
#>    0%   25%   50%   75%  100% 
#> 0.000 0.253 0.693 1.437 7.431

# Perform transformation
gx <- predict(BN_obj)

# Perform reverse transformation
x2 <- predict(BN_obj, newdata = gx, inverse = TRUE)

# Prove the transformation is 1:1
all.equal(x2, x)
#> [1] TRUE

bestNormalize: Flexibly calculate the best normalizing transformation for a vector

Installation

Example

Copy Link

Version

Install

Monthly Downloads

Version

License

Issues

Pull Requests

Stars

Forks

Repository

Maintainer

Last Published

Functions in bestNormalize (0.2.2)