bestNormalize v0.2.2

Normalizing Transformation Functions

Estimate a suite of normalizing transformations, including a new technique based on ranks which can guarantee normally distributed transformed data if there are no ties: Ordered Quantile Normalization. The package is built to estimate the best normalizing transformation for a vector consistently and accurately. It implements the Box-Cox transformation, the Yeo-Johnson transformation, three types of Lambert WxF transformations, and the Ordered Quantile normalization transformation.

Readme

bestNormalize: Flexibly calculate the best normalizing transformation for a vector

The bestNormalize R package was designed to help find a normalizing transformation for a vector. Many techniques have been developed for this purpose, but each has its own strengths and weaknesses, and it is unclear which will work best until the data are observed. This package estimates a range of candidate transformations and returns the best one, i.e. the one whose transformed data look the most normal according to an estimated normality statistic.

This package also introduces a new normalization technique, Ordered Quantile normalization (orderNorm()), which transforms the data through a rank mapping to the normal distribution. This mapping guarantees normally distributed transformed data when no ties are present.
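To make the rank-mapping idea concrete, here is a minimal base-R sketch. It is not the package's orderNorm() code: the helper name rank_to_normal and the (rank - 0.5) / n plotting position are assumptions chosen for illustration.

```r
# Illustrative sketch of a rank-based normalization in base R.
# rank_to_normal() is a hypothetical helper, not part of bestNormalize.
set.seed(100)
x <- rgamma(1000, 1, 1)

rank_to_normal <- function(x) {
  n <- sum(!is.na(x))
  # Map each rank to a probability strictly inside (0, 1) ...
  p <- (rank(x, na.last = "keep") - 0.5) / n
  # ... then to the matching standard normal quantile
  qnorm(p)
}

gx <- rank_to_normal(x)

# The mapping is monotone, so the ordering of the data is preserved
identical(order(gx), order(x))
```

Because each distinct rank is sent to its own normal quantile, the transformed values follow the normal distribution by construction when there are no ties. Tied observations share a rank and therefore a transformed value, which is why the guarantee is conditional on the absence of ties.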

Installation

You can install bestNormalize from GitHub with:

# install.packages("devtools")
devtools::install_github("petersonR/bestNormalize")

Example

In this example, we generate 1000 draws from a gamma distribution, and normalize them:

library(bestNormalize)
set.seed(100)
x <- rgamma(1000, 1, 1)

# Estimate best transformation
BN_obj <- bestNormalize(x)
BN_obj
#> Best Normalizing transformation with 1000 Observations
#>  Estimated Normality Statistics (Pearson P / df, lower => more normal):
#>  - Box-Cox: 0.8188 
#>  - Lambert's W: 1.28 
#>  - Yeo-Johnson: 5.8284 
#>  - orderNorm: 0.0066 
#>  
#> Based off these, bestNormalize chose:
#> OrderNorm Transformation with 1000 nonmissing obs and no ties 
#>  - Original quantiles:
#>    0%   25%   50%   75%  100% 
#> 0.000 0.253 0.693 1.437 7.431

# Perform transformation
gx <- predict(BN_obj)

# Perform reverse transformation
x2 <- predict(BN_obj, newdata = gx, inverse = TRUE)

# Check that the round trip recovers the original data
all.equal(x2, x)
#> [1] TRUE
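
The output above ranks candidates by a "Pearson P / df" statistic, where lower means more normal. The following base-R sketch shows one way such a statistic could be computed and used to pick among candidates. The helper pearson_p_df, its class-count rule, and the degrees-of-freedom adjustment are assumptions made for illustration and may differ from what the package does internally.

```r
# Illustrative sketch: score candidate transformations with a Pearson
# chi-square normality statistic divided by its degrees of freedom.
# pearson_p_df() is a hypothetical helper, not part of bestNormalize.
set.seed(100)
x <- rgamma(1000, 1, 1)

pearson_p_df <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  k <- ceiling(2 * n^(2 / 5))  # assumed number of equiprobable classes
  # Class index of each value under a normal fitted by mean and sd
  cls <- pmin(k, floor(1 + k * pnorm(x, mean(x), sd(x))))
  observed <- tabulate(cls, nbins = k)
  expected <- n / k
  stat <- sum((observed - expected)^2 / expected)
  # Assumed df: k classes, minus 1, minus 2 estimated parameters
  stat / (k - 3)
}

candidates <- list(
  raw        = x,
  log        = log(x),
  rank_based = qnorm((rank(x) - 0.5) / length(x))
)

scores <- sapply(candidates, pearson_p_df)
scores

# Keep the candidate with the lowest score
names(which.min(scores))
```

Dividing the statistic by its degrees of freedom puts the candidates on a comparable scale, so the selection step reduces to taking the minimum.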

Functions in bestNormalize

Name                   Description
autotrader             Prices of 6,283 cars listed on Autotrader
bestNormalize-package  bestNormalize: Flexibly calculate the best normalizing transformation for a vector
orderNorm              Calculate and perform Ordered Quantile normalizing transformation
yeojohnson             Yeo-Johnson Normalization
boxcox                 Box-Cox Normalization
lambert                Lambert W x F Normalization
bestNormalize          Calculate and perform best normalizing transformation
binarize               Binarize

Vignettes of bestNormalize

Name
bestNormalize.Rmd

Details

Type Package
Date 2017-11-13
URL https://github.com/petersonR/bestNormalize
License GPL-3
VignetteBuilder knitr
LazyData true
RoxygenNote 6.0.1
NeedsCompilation no
Packaged 2017-11-14 16:19:48 UTC; rpterson
Repository CRAN
Date/Publication 2017-11-14 16:26:23 UTC
