powerTransform uses the maximum likelihood-like approach of Box and Cox (1964) to select a transformation of a univariate or multivariate response for normality, linearity and/or constant variance. Available families are the default Box-Cox power family, and the Yeo-Johnson and skew power families that may be useful when a response is not strictly positive. powerTransform passes information to estimateTransform, and only the former will be of interest for most users.
powerTransform(object, ...)
## S3 method for class 'default':
powerTransform(object, family="bcPower", ...)
## S3 method for class 'lm':
powerTransform(object, family="bcPower", ...)
## S3 method for class 'formula':
powerTransform(object, data, subset, weights, na.action, family="bcPower",
...)
estimateTransform(X, Y, weights=NULL, family="bcPower", start=NULL,
method="L-BFGS-B", ...)
## S3 method for class 'default':
estimateTransform(X, Y, weights=NULL, family="bcPower", start=NULL,
method="L-BFGS-B", ...)
## S3 method for class 'skewPower':
estimateTransform(X, Y, weights=NULL, ...)
object: an object of class lm, a formula, or a matrix or vector; see below.
data, subset, weights, na.action: as in lm.
family: "bcPower", the default, for the Box-Cox power family; "yjPower" for the Yeo-Johnson family; or "skewPower" for the two-parameter skew power family.
...: additional arguments passed to estimateTransform, which does the actual computing, or to the optim function, which does the maximization.
method: the method used by optim for the maximization. The default "L-BFGS-B" appears to work well.
An object of class powerTransform is returned or, if family="skewPower", an object of class skewpowerTransform that inherits from powerTransform, including the components listed below.
Several methods are available for use with powerTransform objects. The coef method returns the estimated transformation parameters, while coef(object, round=TRUE) returns the transformations rounded to nearby convenient values within 1.96 standard errors of the mle, if any exist. The vcov function returns the estimated covariance matrix of the estimated transformation parameters. A print method prints the estimates, and a summary method provides more information, including likelihood ratio type tests that all transformation parameters equal one, that all transformation parameters equal zero (log transformations), and for a convenient rounded value not far from the mle. In the case of the skew power family, these tests are based on the profile log-likelihood obtained by maximizing over the start parameter, thus treating the start as a nuisance parameter of lesser interest than the power parameter. testTransform can be called directly to test any other value for $\lambda$ or, for the skew power family, for $\lambda$ and $\gamma$. There is a plot.powerTransform method for plotting the transformed values, and also a contour.skewpowerTransform method to obtain a contour plot of the two-dimensional log-likelihood for the skew power parameters when the response is univariate. Finally, the boxCox method can be used to plot the univariate log-likelihood for the Box-Cox or Yeo-Johnson power families, or the profile log-likelihood of each of the parameters in the skew power family.
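The rounding rule behind coef(object, round=TRUE) can be illustrated with a small base R sketch. The helper name round_lambda and the list of "convenient" candidate values are assumptions made for illustration; the source only says that the estimate is replaced by a nearby convenient value within 1.96 standard errors of the mle, if one exists.

```r
# Hypothetical helper (not part of car): return the first candidate value
# lying inside est +/- 1.96 * se, or the estimate itself if none does.
# The candidate list below is an assumed set of "convenient" powers.
round_lambda <- function(est, se,
                         candidates = c(1, 0, -1, 0.5, 0.33, -0.5, -0.33, 2, -2)) {
  lower <- est - 1.96 * se
  upper <- est + 1.96 * se
  inside <- candidates >= lower & candidates <= upper
  if (any(inside)) candidates[which(inside)[1]] else est
}

round_lambda(0.45, 0.10)  # 0.5 lies inside the interval, so 0.5 is returned
round_lambda(0.05, 0.10)  # 0 lies inside the interval, so 0 (log) is returned
round_lambda(0.70, 0.05)  # no candidate inside, so the estimate 0.7 is returned
```

Because candidates are checked in order, the ordering of the list decides which value wins when several fall inside the interval.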
The components of the returned object include several that are passed through from optim; see optim.
powerTransform is used to estimate normalizing/linearizing/variance stabilizing transformations of a univariate or a multivariate response in a linear regression. For a univariate response, a formula like z~x1+x2+x3 will estimate a transformation for the response z from a family of transformations indexed by one parameter for Box-Cox and Yeo-Johnson transformations, or two parameters for the skew power family, that makes the residuals from the regression of the transformed z on the predictors as close to normally distributed as possible.
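The univariate estimation described above can be sketched in base R for the Box-Cox family. This is a minimal illustration under stated assumptions, not the package's implementation: the data are simulated, the names bc_scaled, profile_loglik and lr_stat are hypothetical, and the one-dimensional optimize function is swapped in for optim because there is a single parameter.

```r
set.seed(1)
n <- 200
x <- runif(n, 0, 4)
z <- exp(1 + 0.5 * x + rnorm(n, sd = 0.3))  # simulated: log(z) is linear in x

# Box-Cox transformation scaled by a power of the geometric mean,
# so that the Jacobian of the transformation equals 1
bc_scaled <- function(u, lambda) {
  gm <- exp(mean(log(u)))
  if (abs(lambda) < 1e-8) gm * log(u) else gm^(1 - lambda) * (u^lambda - 1) / lambda
}

# Profile log-likelihood for lambda, up to an additive constant:
# regress the transformed response on x and use the residual sum of squares
profile_loglik <- function(lambda) {
  rss <- sum(resid(lm(bc_scaled(z, lambda) ~ x))^2)
  -n / 2 * log(rss / n)
}

fit <- optimize(profile_loglik, interval = c(-3, 3), maximum = TRUE)
lambda_hat <- fit$maximum

# Likelihood ratio statistic and p-value for a hypothesized value lambda0
lr_stat <- function(lambda0) 2 * (fit$objective - profile_loglik(lambda0))
p_value <- function(lambda0) pchisq(lr_stat(lambda0), df = 1, lower.tail = FALSE)

lambda_hat  # expected to be near 0 for these simulated data
p_value(1)  # test of "no transformation needed"
p_value(0)  # test of the log transformation
```

With real data, powerTransform(z ~ x) followed by summary and testTransform reports the corresponding estimates and tests directly.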
For a formula like cbind(y1,y2,y3)~x1+x2+x3, the three variables on the left side are all transformed, generally with different transformations, to make all the residuals as close to normally distributed as possible. This is not the same as three univariate transformations because the variables transformed are allowed to be correlated. The formula cbind(y1,y2,y3)~1 would specify transformations to multivariate normality with no predictors. This generalizes the multivariate power transformations suggested by Velilla (1993) by allowing for different families of transformations, and by allowing for predictors. Cook and Weisberg (1999) and Weisberg (2014) suggest the usefulness of transforming a set of predictors z1, z2, z3 for multivariate normality and of transforming for multivariate normality conditional on levels of a factor, which is equivalent to setting the predictors to be indicator variables for that factor.
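The multivariate case can be sketched in the same spirit. The following base R illustration assumes the usual multivariate normal criterion, in which the log-likelihood for the vector of powers is, up to a constant, minus n/2 times the log determinant of the residual covariance matrix of the scaled transformed responses. The simulated data and the names bc_scaled and neg_loglik are hypothetical; only the use of optim with method "L-BFGS-B" is taken from the documentation above.

```r
set.seed(2)
n <- 300
x <- runif(n, 0, 4)
# Correlated normal errors on the log scale (correlation 0.6, sd 0.3)
e <- 0.3 * matrix(rnorm(2 * n), n, 2) %*% chol(matrix(c(1, 0.6, 0.6, 1), 2))
Y <- exp(cbind(1 + 0.5 * x, 3 - 0.5 * x) + e)  # two positive, correlated responses

# Box-Cox transformation scaled to have unit Jacobian
bc_scaled <- function(u, lambda) {
  gm <- exp(mean(log(u)))
  if (abs(lambda) < 1e-8) gm * log(u) else gm^(1 - lambda) * (u^lambda - 1) / lambda
}

# Negative log-likelihood (up to a constant) as a function of both powers;
# the determinant involves the residual covariances, which is why the result
# differs from two separate univariate fits
neg_loglik <- function(lambda) {
  Z <- sapply(seq_len(ncol(Y)), function(j) bc_scaled(Y[, j], lambda[j]))
  R <- resid(lm(Z ~ x))
  (n / 2) * log(det(crossprod(R) / n))
}

fit <- optim(c(1, 1), neg_loglik, method = "L-BFGS-B", lower = -3, upper = 3)
lambda_hat <- fit$par
lambda_hat  # both estimates are expected to be near 0 for these data
```

Dropping x from the model, lm(Z ~ 1), gives the no-predictor case that corresponds to cbind(y1,y2)~1.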
Specifying the first argument as a vector, for example powerTransform(ais$LBM), is equivalent to powerTransform(LBM ~ 1, ais). Similarly, powerTransform(cbind(ais$LBM, ais$SSF)), where the first argument is a matrix rather than a formula, is equivalent to specification of a multivariate linear model, powerTransform(cbind(LBM, SSF) ~ 1, ais).
Three families of power transformations are available. The Box-Cox power family, family="bcPower", equals $(U^{\lambda}-1)/\lambda$ for $\lambda \neq 0$, and $\log(U)$ if $\lambda = 0$. A scaled version of this transformation is used in computing with all the families to make the Jacobian of the transformation equal to 1.
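As a concrete illustration of the Box-Cox formula and its scaled version, here is a minimal base R sketch. The function names box_cox and box_cox_scaled are hypothetical, and the scaling by a power of the geometric mean is the standard normalization assumed here; the package's own function for this family is bcPower.

```r
# Box-Cox power transformation for strictly positive u
box_cox <- function(u, lambda, eps = 1e-8) {
  stopifnot(all(u > 0))
  if (abs(lambda) < eps) log(u) else (u^lambda - 1) / lambda
}

# Scaled version: multiplying by gm^(1 - lambda), where gm is the geometric
# mean of u, makes the Jacobian of the transformation equal to 1
box_cox_scaled <- function(u, lambda) {
  gm <- exp(mean(log(u)))
  gm^(1 - lambda) * box_cox(u, lambda)
}

u <- c(0.5, 1, 2, 4, 8)
box_cox(u, 1)     # equals u - 1
box_cox(u, 0)     # equals log(u)
box_cox(u, 1e-6)  # close to log(u): the family is continuous at lambda = 0

# Jacobian check for lambda = 0.3: the product of the derivatives of the
# scaled transformation with respect to each u is 1
lambda <- 0.3
gm <- exp(mean(log(u)))
jacobian <- prod(gm^(1 - lambda) * u^(lambda - 1))
jacobian
```

The unit Jacobian is what allows log-likelihoods computed at different values of $\lambda$ to be compared directly through the residual sums of squares.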
If family="yjPower" then the Yeo-Johnson transformations are used. This is the Box-Cox transformation of $U+1$ for nonnegative values, and the negative of the Box-Cox transformation of $|U|+1$ with parameter $2-\lambda$ for negative $U$.
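The Yeo-Johnson definition can be written out as a short base R sketch. The function name yeo_johnson is hypothetical (the package provides yjPower). Following Yeo and Johnson's definition, the branch for negative values carries a minus sign, which keeps the transformation increasing in $U$.

```r
# Yeo-Johnson transformation built from the Box-Cox transformation
yeo_johnson <- function(u, lambda) {
  bc <- function(v, lam) if (abs(lam) < 1e-8) log(v) else (v^lam - 1) / lam
  out <- numeric(length(u))
  nonneg <- u >= 0
  # Box-Cox of U + 1 with parameter lambda for nonnegative values
  out[nonneg] <- bc(u[nonneg] + 1, lambda)
  # minus Box-Cox of |U| + 1 with parameter 2 - lambda for negative values
  out[!nonneg] <- -bc(abs(u[!nonneg]) + 1, 2 - lambda)
  out
}

u <- c(-2, 0, 3)
yeo_johnson(u, 1)  # lambda = 1 leaves the data unchanged: -2, 0, 3
yeo_johnson(u, 0)  # -4, 0, log(4)
yeo_johnson(u, 2)  # -log(3), 0, 7.5
```

Unlike the Box-Cox family, this family accepts zero and negative values without a preliminary shift of the data.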
If family="skewPower" then the skew power family of transformations suggested by Hawkins and Weisberg (2015) is used. This is a two-parameter family that would generally be applied to a response with occasional negative values; see skewPower for the details and examples. This family has a power parameter $\lambda$ and a non-negative start parameter $\gamma$, with $\gamma = 0$ giving the Box-Cox transformation.
The function testTransform is used to obtain likelihood ratio tests for any specified value for the transformation parameter(s).
See also testTransform, optim, bcPower, skewPower, transform, boxCox. Documentation for skewPower includes examples of the use of the skew power family.
# Box Cox Method, univariate
summary(p1 <- powerTransform(cycles ~ len + amp + load, Wool))
# fit linear model with transformed response:
coef(p1, round=TRUE)
summary(m1 <- lm(bcPower(cycles, p1$roundlam) ~ len + amp + load, Wool))
# Multivariate Box Cox
summary(powerTransform(cbind(len, ADT, trks, sigs1) ~ 1, Highway1))
# Multivariate transformation to normality within levels of 'hwy'
summary(a3 <- powerTransform(cbind(len, ADT, trks, sigs1) ~ hwy, Highway1))
# test lambda = (0 0 0 -1)
testTransform(a3, c(0, 0, 0, -1))
# save the rounded transformed values, plot them with a separate
# color for each highway type
transformedY <- bcPower(with(Highway1, cbind(len, ADT, trks, sigs1)),
coef(a3, round=TRUE))
pairs(transformedY, col=as.numeric(Highway1$hwy))