Transform (version 1.0)

yjTransform: Yeo-Johnson Transformation for Normality

Description

yjTransform performs a Yeo-Johnson transformation of a variable toward normality and provides graphical analysis.

Usage

yjTransform(data, lambda = seq(-3,3,0.01), plot = TRUE, alpha = 0.05, 
  verbose = TRUE)

Value

A list with class "yj" containing the following elements:

method

method used to estimate the Yeo-Johnson transformation parameter

lambda.hat

estimate of Yeo-Johnson transformation parameter

statistic

Shapiro-Wilk test statistic for transformed data

p.value

Shapiro-Wilk test p-value for transformed data

alpha

level of significance to assess normality

tf.data

transformed data set

var.name

variable name

Arguments

data

a numeric vector of data values.

lambda

a vector of candidate lambda values. Default is seq(-3,3,0.01), i.e. from -3 to 3 in increments of 0.01.

plot

a logical indicating whether to plot histograms with density lines and Q-Q plots of the raw and transformed data. Default is plot = TRUE.

alpha

the level of significance to check the normality after transformation. Default is set to alpha = 0.05.

verbose

a logical indicating whether to print output to the R console. Default is verbose = TRUE.
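
How the arguments interact can be illustrated with a small base R sketch. This page does not spell out the estimation step, so the sketch assumes that each candidate in lambda is scored by the Shapiro-Wilk W statistic of the transformed data, that the candidate with the largest W is kept, and that the resulting p-value is then compared with alpha. The helper yj and the names w, lambda_hat and sw are hypothetical and are not part of the Transform package.

```r
# Hypothetical sketch, NOT the Transform package's implementation.
# Assumption: lambda.hat is the grid value that maximizes the Shapiro-Wilk W statistic.

# Compact Yeo-Johnson transform for a single lambda (tolerance guards lambda near 0 or 2).
yj <- function(y, lambda, eps = 1e-8) {
  pos <- y >= 0
  out <- numeric(length(y))
  out[pos] <- if (abs(lambda) > eps) {
    ((y[pos] + 1)^lambda - 1) / lambda
  } else {
    log1p(y[pos])
  }
  out[!pos] <- if (abs(lambda - 2) > eps) {
    ((1 - y[!pos])^(2 - lambda) - 1) / (lambda - 2)
  } else {
    -log1p(-y[!pos])
  }
  out
}

y      <- cars$dist              # same data as in the Examples section
lambda <- seq(-3, 3, 0.01)       # default candidate grid
alpha  <- 0.05                   # default significance level

# Shapiro-Wilk W statistic of the transformed data for every candidate lambda
w <- vapply(lambda,
            function(l) unname(shapiro.test(yj(y, l))$statistic),
            numeric(1))

lambda_hat <- lambda[which.max(w)]          # candidate with the largest W
sw <- shapiro.test(yj(y, lambda_hat))       # normality test at the chosen lambda

lambda_hat
sw$p.value > alpha   # TRUE means normality is not rejected at level alpha
```

A finer or wider grid in lambda trades run time for resolution; the grid search evaluates shapiro.test once per candidate, so the default grid costs 601 test evaluations.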

Author

Muge Coskun Yildirim, Osman Dag

Details

Denote by \(y\) the variable on the original scale and by \(y'\) the transformed variable. The Yeo-Johnson power transformation is defined by:

$$y' = \begin{cases} \dfrac{(y+1)^\lambda-1}{\lambda}, & \text{if } \lambda \neq 0,\ y \geq 0 \\ \log(y+1), & \text{if } \lambda = 0,\ y \geq 0 \\ \dfrac{(1-y)^{2-\lambda}-1}{\lambda-2}, & \text{if } \lambda \neq 2,\ y < 0 \\ -\log(1-y), & \text{if } \lambda = 2,\ y < 0 \end{cases}$$
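
The piecewise definition translates directly into base R. The function below is a minimal sketch for a single lambda value, assuming a numeric vector without missing values; the name yeo_johnson and the tolerance eps are hypothetical and are not part of the Transform package.

```r
# Hypothetical helper, not part of the Transform package.
# Applies the four-branch Yeo-Johnson definition for one lambda value.
yeo_johnson <- function(y, lambda, eps = 1e-8) {
  stopifnot(is.numeric(y), is.numeric(lambda), length(lambda) == 1)
  out <- numeric(length(y))
  pos <- y >= 0

  # y >= 0: Box-Cox-type power transform of (y + 1), log limit at lambda = 0
  if (abs(lambda) > eps) {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  } else {
    out[pos] <- log1p(y[pos])
  }

  # y < 0: power 2 - lambda applied to (1 - y), log limit at lambda = 2
  if (abs(lambda - 2) > eps) {
    out[!pos] <- ((1 - y[!pos])^(2 - lambda) - 1) / (lambda - 2)
  } else {
    out[!pos] <- -log1p(-y[!pos])
  }
  out
}

y <- c(-2, -0.5, 0, 1, 3)
yeo_johnson(y, lambda = 1)   # lambda = 1 returns the data unchanged
yeo_johnson(y, lambda = 0)   # log(y + 1) on the non-negative values
```

Comparing lambda against a small tolerance rather than testing exact equality avoids unstable divisions when a grid such as seq(-3,3,0.01) produces a value that is only approximately 0 or 2.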

References

Asar, O., Ilk, O., Dag, O. (2017). Estimating Box-Cox Power Transformation Parameter via Goodness of Fit Tests. Communications in Statistics - Simulation and Computation, 46:1, 91--105.

Yeo, I.K., Johnson, R.A. (2000). A New Family of Power Transformations to Improve Normality or Symmetry. Biometrika, 87:4, 954--959.

Examples


data <- cars$dist

library(Transform)
out <- yjTransform(data)
out$lambda.hat # estimate of the Yeo-Johnson parameter based on the Shapiro-Wilk test statistic
out$p.value # p-value of the Shapiro-Wilk test for transformed data
out$tf.data # transformed data set

