bestNormalize v0.2.2
Normalizing Transformation Functions
Estimate a suite of normalizing transformations, including
a new technique based on ranks which can guarantee normally distributed
transformed data if there are no ties: Ordered Quantile Normalization.
The package is built to estimate the best normalizing transformation for
a vector consistently and accurately. It implements the Box-Cox
transformation, the Yeo-Johnson transformation, three types of Lambert
W x F transformations, and the Ordered Quantile normalization
transformation.
Readme
bestNormalize: Flexibly calculate the best normalizing transformation for a vector
The bestNormalize R package was designed to help find a normalizing transformation for a vector. Many techniques have been developed for this purpose, but each has its own strengths and weaknesses, and it is unclear which will work best until the data are observed. This package evaluates a range of candidate transformations and returns the best one, i.e. the one that makes the transformed data look the most normal.
This package also introduces a new normalization technique, Ordered Quantile normalization (orderNorm()), which transforms the data via a rank mapping to the normal distribution. This guarantees normally distributed transformed data when no ties are present.
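To make the rank-mapping idea concrete, here is a minimal base-R sketch. It is not the package's orderNorm() implementation: the helper name rank_to_normal and the (rank - 0.5) / n offset are assumptions chosen for illustration. Each observation's rank is converted to a probability strictly between 0 and 1, and that probability is passed through the standard normal quantile function.

```r
# Hypothetical helper for illustration only; not part of bestNormalize.
rank_to_normal <- function(x) {
  n <- length(x)
  # Convert ranks to probabilities strictly inside (0, 1).
  # The 0.5 offset is an assumption that keeps qnorm() finite.
  p <- (rank(x, ties.method = "average") - 0.5) / n
  # Map each probability to the matching standard normal quantile.
  qnorm(p)
}

set.seed(1)
x <- rgamma(1000, 1, 1)   # right-skewed input
z <- rank_to_normal(x)    # rank-based normal scores
```

Because the mapping depends only on ranks, the ordering of the data is preserved, and without ties the transformed values are an evenly spaced grid of normal quantiles. Tied observations share a rank and therefore a transformed value, which is why the guarantee requires no ties.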
Installation
You can install bestNormalize from GitHub with:
# install.packages("devtools")
devtools::install_github("petersonR/bestNormalize")
Example
In this example, we generate 1000 draws from a gamma distribution, and normalize them:
library(bestNormalize)
set.seed(100)
x <- rgamma(1000, 1, 1)
# Estimate best transformation
BN_obj <- bestNormalize(x)
BN_obj
#> Best Normalizing transformation with 1000 Observations
#> Estimated Normality Statistics (Pearson P / df, lower => more normal):
#> - Box-Cox: 0.8188
#> - Lambert's W: 1.28
#> - Yeo-Johnson: 5.8284
#> - orderNorm: 0.0066
#>
#> Based off these, bestNormalize chose:
#> OrderNorm Transformation with 1000 nonmissing obs and no ties
#> - Original quantiles:
#> 0% 25% 50% 75% 100%
#> 0.000 0.253 0.693 1.437 7.431
# Perform transformation
gx <- predict(BN_obj)
# Perform reverse transformation
x2 <- predict(BN_obj, newdata = gx, inverse = TRUE)
# Check that the inverse transformation recovers the original data
all.equal(x2, x)
#> [1] TRUE
Functions in bestNormalize
Name | Description
autotrader | Prices of 6,283 cars listed on Autotrader
bestNormalize-package | bestNormalize: Flexibly calculate the best normalizing transformation for a vector
orderNorm | Calculate and perform Ordered Quantile normalizing transformation
yeojohnson | Yeo-Johnson Normalization
boxcox | Box-Cox Normalization
lambert | Lambert W x F Normalization
bestNormalize | Calculate and perform best normalizing transformation
binarize | Binarize
Vignettes of bestNormalize
Name
bestNormalize.Rmd
Details
Type | Package |
Date | 2017-11-13 |
URL | https://github.com/petersonR/bestNormalize |
License | GPL-3 |
VignetteBuilder | knitr |
LazyData | true |
RoxygenNote | 6.0.1 |
NeedsCompilation | no |
Packaged | 2017-11-14 16:19:48 UTC; rpterson |
Repository | CRAN |
Date/Publication | 2017-11-14 16:26:23 UTC |