boxcoxCensored(x, censored, censoring.side = "left",
    lambda = {if (optimize) c(-2, 2) else seq(-2, 2, by = 0.5)}, optimize = FALSE,
    objective.name = "PPCC", eps = .Machine$double.eps,
    include.x.and.censored = TRUE, prob.method = "michael-schucany",
    plot.pos.con = 0.375)
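Arguments
x
numeric vector of observations. Missing (NA), undefined (NaN), and infinite (-Inf, Inf) values are allowed but will be removed.
censored
numeric or logical vector indicating which values of x are censored. This must be the same length as x. If the mode of censored is "logical", TRUE values correspond to elements of x that are censored.
censoring.side
character string indicating on which side the censoring occurs. The possible values are "left" (the default) and "right".
lambda
numeric vector of powers to use for the Box-Cox transformation. When optimize=FALSE, the default value is lambda=seq(-2, 2, by=0.5). When optimize=TRUE, lambda gives the lower and upper bounds for the optimization, and the default value is lambda=c(-2, 2).
optimize
logical scalar indicating whether to evaluate the objective at the given values of lambda (optimize=FALSE; the default), or to compute the optimal power transformation within the bounds specified by lambda (optimize=TRUE).
objective.name
character string indicating which objective to use. The possible values are "PPCC" (probability plot correlation coefficient; the default), "Shapiro-Wilk" (the Shapiro-Wilk goodness-of-fit statistic), and "Log-Likelihood" (the log-likelihood function).
eps
positive numeric scalar. When the absolute value of lambda is less than eps, lambda is assumed to be 0 for the Box-Cox transformation. The default value is eps=.Machine$double.eps.
include.x.and.censored
logical scalar indicating whether to include the values of x and the corresponding values of censored with the returned object. The default value is include.x.and.censored=TRUE.
prob.method
character string indicating the method used to compute the plotting positions when objective.name="PPCC". Possible values include "kaplan-meier" (product-limit method) and "michael-schucany" (the default).
plot.pos.con
numeric scalar giving the plotting position constant used when objective.name="PPCC". The default value is plot.pos.con=0.375. See the DETAILS section for more information.
Value
boxcoxCensored returns a list of class "boxcoxCensored" containing the results. See the help file for boxcoxCensored.object for details.
Details
See the help file for boxcoxTransform for more information on data transformations.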
Box and Cox (1964) proposed choosing the appropriate value of $\lambda$ based on
maximizing the likelihood function. Alternatively, an appropriate value of
$\lambda$ can be chosen based on another objective, such as maximizing the
probability plot correlation coefficient or the Shapiro-Wilk goodness-of-fit
statistic.
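For reference, the power transformation indexed by $\lambda$ is conventionally written as below. This is the standard Box-Cox form for a positive observation $x$, stated here as background rather than quoted from this help file; the symbol $y$ for the transformed value is an assumption of this sketch.

$$
y =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\
\log(x), & \lambda = 0
\end{cases}
$$

The argument eps controls when a numerically small $\lambda$ is treated as exactly 0, so that the logarithmic branch is used.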
Shumway et al. (1989) investigated extending the method of Box and Cox (1964) to
the case of Type I censored data, motivated by the desire to produce estimated
means and confidence intervals for air monitoring data that included censored
values.
When optimize=TRUE, the function boxcoxCensored calls the R function nlminb to minimize the negative value of the objective (i.e., maximize the objective) over the range of possible values of $\lambda$ specified in the argument lambda. The starting value for the optimization is always $\lambda=1$ (i.e., no transformation).
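The optimization pattern described here can be sketched in base R. This is a minimal illustration, not the internal code of boxcoxCensored: it ignores censoring entirely, and the helper names box_cox and ppcc_complete are hypothetical. The plotting positions use the constant 0.375 only to echo the default of plot.pos.con; the plotting positions the package uses for censored data may differ.

```r
# Hypothetical helpers for illustration only; not EnvStats internals.

# Box-Cox transform; |lambda| < eps is treated as lambda = 0 (log transform).
box_cox <- function(x, lambda, eps = .Machine$double.eps) {
  if (abs(lambda) < eps) log(x) else (x^lambda - 1) / lambda
}

# PPCC for complete (uncensored) data: correlation between the ordered
# transformed values and normal quantiles at the plotting positions.
ppcc_complete <- function(x, lambda, plot.pos.con = 0.375) {
  y <- sort(box_cox(x, lambda))
  cor(y, qnorm(ppoints(length(y), a = plot.pos.con)))
}

set.seed(1)
x <- rlnorm(30, meanlog = 1, sdlog = 1)

# nlminb minimizes, so pass the negative objective.
# Start at lambda = 1 (no transformation) and search within [-2, 2].
fit <- nlminb(start = 1,
              objective = function(lambda) -ppcc_complete(x, lambda),
              lower = -2, upper = 2)

lambda.hat <- fit$par        # estimated optimal power
ppcc.hat   <- -fit$objective # PPCC at the optimum
```

Negating the objective is what turns the minimizer into a maximizer; the bounds play the role of the two-element lambda argument.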
The next section explains assumptions and notation, and the section after that explains how the objective is computed for the various options for objective.name.
Assumptions and Notation
Let $\underline{x}$ denote a random sample of $N$ observations from
some continuous distribution. Assume $n$ ($0 < n < N$) of these
observations are known and $c$ ($c=N-n$) of these observations are
all censored below (left-censored) or all censored above (right-censored) at
$k$ fixed censoring levels $T_1, T_2, \ldots, T_k$ ($k \ge 1$).
Objective Based on Probability Plot Correlation Coefficient (objective.name="PPCC")
When objective.name="PPCC", the objective is computed as the value of the normal probability plot correlation coefficient based on the transformed data (see the description of the Probability Plot Correlation Coefficient (PPCC) goodness-of-fit test in the help file for gofTestCensored). That is, the objective is the correlation coefficient for the normal quantile-quantile plot for the transformed data. Large values of the PPCC tend to indicate a good fit to a normal distribution.
Objective Based on Shapiro-Wilk Goodness-of-Fit Statistic (objective.name="Shapiro-Wilk")
When objective.name="Shapiro-Wilk", the objective is computed as the value of the Shapiro-Wilk goodness-of-fit statistic based on the transformed data (see the description of the Shapiro-Wilk test in the help file for gofTestCensored). Large values of the Shapiro-Wilk statistic tend to indicate a good fit to a normal distribution.
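To illustrate the idea of scoring candidate powers with a goodness-of-fit statistic, the sketch below evaluates the ordinary Shapiro-Wilk statistic on a grid of $\lambda$ values using base R. It assumes complete (uncensored) data, so it is not the censored-data statistic described in the help file for gofTestCensored, and the helper name box_cox is hypothetical.

```r
# Hypothetical helper; |lambda| < eps is treated as lambda = 0.
box_cox <- function(x, lambda, eps = .Machine$double.eps) {
  if (abs(lambda) < eps) log(x) else (x^lambda - 1) / lambda
}

set.seed(7)
x <- rlnorm(30)                      # complete data, no censoring
lambda <- seq(-2, 2, by = 0.5)       # same grid as the optimize=FALSE default

# Shapiro-Wilk W for each candidate power (larger W suggests a better fit)
W <- sapply(lambda, function(l) unname(shapiro.test(box_cox(x, l))$statistic))

results <- data.frame(lambda = lambda, W = W)
best.lambda <- lambda[which.max(W)]  # grid value with the largest W
```

Evaluating on a fixed grid mirrors the optimize=FALSE behaviour: the function reports the objective at each supplied power rather than searching for an optimum.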
Objective Based on Log-Likelihood Function (objective.name="Log-Likelihood")
When objective.name="Log-Likelihood", the objective is computed as the value of the log-likelihood function. Assuming the transformed observations in Equation (4) above come from a normal distribution with mean $\mu$ and standard deviation $\sigma$, we can use the change of variable formula to write the log-likelihood function as follows.
For Type I left censored data, the likelihood function is given by:
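A sketch of the standard form follows, with notation assumed here rather than taken from the help file: $c_j$ is the number of observations censored at level $T_j$, $\Omega$ is the set of indices of the $n$ uncensored observations, $y_i$ and $T_j^*$ are the Box-Cox transforms of $x_i$ and $T_j$, and $\phi$ and $\Phi$ are the standard normal density and distribution functions.

$$
\log L(\lambda, \mu, \sigma) =
\log \binom{N}{c_1 \; c_2 \; \cdots \; c_k \; n}
+ \sum_{j=1}^{k} c_j \log \Phi\!\left(\frac{T_j^* - \mu}{\sigma}\right)
+ \sum_{i \in \Omega} \log\!\left[\frac{1}{\sigma}\,\phi\!\left(\frac{y_i - \mu}{\sigma}\right)\right]
+ (\lambda - 1) \sum_{i \in \Omega} \log x_i
$$

The last term is the log of the Jacobian of the transformation, $dy/dx = x^{\lambda - 1}$, and applies only to the uncensored observations. Because the Box-Cox transformation is increasing in $x$ for every $\lambda$, left-censoring at $T_j$ corresponds to left-censoring at $T_j^*$; for right-censored data the term $\Phi(\cdot)$ would be replaced by $1 - \Phi(\cdot)$.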
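For a fixed value of $\lambda$, $\mu$ and $\sigma$ are replaced by their maximum likelihood estimates (MLEs; see the help file for enormCensored).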
Thus, when optimize=TRUE, Equation (6) or (10) is maximized by iteratively solving for $\lambda$ using the MLEs for $\mu$ and $\sigma$.
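The profiling idea (plug in the MLEs of $\mu$ and $\sigma$ for each trial $\lambda$, then optimize over $\lambda$) can be sketched in base R for left-censored data. This is an illustration under stated assumptions, not the package's implementation: the helper names box_cox and profile_loglik are hypothetical, the inner MLEs are found numerically with optim, and the combinatorial constant is omitted, so values need not match the log-likelihoods printed by boxcoxCensored.

```r
# Hypothetical helpers for illustration only; not EnvStats internals.
box_cox <- function(x, lambda, eps = .Machine$double.eps) {
  if (abs(lambda) < eps) log(x) else (x^lambda - 1) / lambda
}

# Profile log-likelihood at a fixed lambda for left-censored data.
# x: observations, with censored entries set to their censoring level.
# censored: logical vector, TRUE where x is left-censored.
profile_loglik <- function(lambda, x, censored) {
  y <- box_cox(x, lambda)
  # Negative normal log-likelihood; par = c(mu, log(sigma)) keeps sigma > 0.
  nll <- function(par) {
    mu <- par[1]
    sigma <- exp(par[2])
    -(sum(pnorm(y[censored], mu, sigma, log.p = TRUE)) +   # censored part
      sum(dnorm(y[!censored], mu, sigma, log = TRUE)))     # uncensored part
  }
  fit <- optim(c(mean(y), log(sd(y))), nll, control = list(reltol = 1e-12))
  # Add the log-Jacobian term for the uncensored observations.
  -fit$value + (lambda - 1) * sum(log(x[!censored]))
}

set.seed(42)
x <- rlnorm(40, meanlog = 1.5, sdlog = 1)
censored <- x < 2
x[censored] <- 2                     # single left-censoring level at 2

# Outer optimization over lambda, started at 1 and bounded by [-2, 2].
fit <- nlminb(start = 1,
              objective = function(l) -profile_loglik(l, x, censored),
              lower = -2, upper = 2)

lambda.hat <- fit$par
loglik.hat <- -fit$objective
```

Parameterizing the inner problem by log(sigma) avoids a constrained search for the standard deviation; a dedicated censored-data estimator would normally replace the generic optim call.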
When optimize=FALSE, the value of the objective is computed by using Equation (6) or (10), using the values of $\lambda$ specified in the argument lambda, and using the MLEs of $\mu$ and $\sigma$.
See Also
boxcoxCensored.object, plot.boxcoxCensored, print.boxcoxCensored, boxcox, Data Transformations, Goodness-of-Fit Tests.
Examples
# Generate 15 observations from a lognormal distribution with
# mean=10 and cv=2 and censor the observations less than 2.
# Then generate 15 more observations from this distribution and
# censor the observations less than 4.
# Then look at some values of various objectives for various transformations.
# Note that for the PPCC objective the optimal value is about -0.3,
# whereas for the Log-Likelihood objective it is about 0.3.
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(250)
x.1 <- rlnormAlt(15, mean = 10, cv = 2)
censored.1 <- x.1 < 2
x.1[censored.1] <- 2
x.2 <- rlnormAlt(15, mean = 10, cv = 2)
censored.2 <- x.2 < 4
x.2[censored.2] <- 4
x <- c(x.1, x.2)
censored <- c(censored.1, censored.2)
#--------------------------
# Using the PPCC objective:
#--------------------------
boxcoxCensored(x, censored)
#Results of Box-Cox Transformation
#Based on Type I Censored Data
#---------------------------------
#
#Objective Name: PPCC
#
#Data: x
#
#Censoring Variable: censored
#
#Censoring Side: left
#
#Censoring Level(s): 2 4
#
#Sample Size: 30
#
#Percent Censored: 26.7%
#
# lambda PPCC
# -2.0 0.8954683
# -1.5 0.9338467
# -1.0 0.9643680
# -0.5 0.9812969
# 0.0 0.9776834
# 0.5 0.9471025
# 1.0 0.8901990
# 1.5 0.8187488
# 2.0 0.7480494
boxcoxCensored(x, censored, optimize = TRUE)
#Results of Box-Cox Transformation
#Based on Type I Censored Data
#---------------------------------
#
#Objective Name: PPCC
#
#Data: x
#
#Censoring Variable: censored
#
#Censoring Side: left
#
#Censoring Level(s): 2 4
#
#Sample Size: 30
#
#Percent Censored: 26.7%
#
#Bounds for Optimization: lower = -2
# upper = 2
#
#Optimal Value: lambda = -0.3194799
#
#Value of Objective: PPCC = 0.9827546
#-----------------------------------
# Using the Log-Likelihood objective
#-----------------------------------
boxcoxCensored(x, censored, objective.name = "Log-Likelihood")
#Results of Box-Cox Transformation
#Based on Type I Censored Data
#---------------------------------
#
#Objective Name: Log-Likelihood
#
#Data: x
#
#Censoring Variable: censored
#
#Censoring Side: left
#
#Censoring Level(s): 2 4
#
#Sample Size: 30
#
#Percent Censored: 26.7%
#
# lambda Log-Likelihood
# -2.0 -95.38785
# -1.5 -84.76697
# -1.0 -75.36204
# -0.5 -68.12058
# 0.0 -63.98902
# 0.5 -63.56701
# 1.0 -66.92599
# 1.5 -73.61638
# 2.0 -82.87970
boxcoxCensored(x, censored, objective.name = "Log-Likelihood",
    optimize = TRUE)
#Results of Box-Cox Transformation
#Based on Type I Censored Data
#---------------------------------
#
#Objective Name: Log-Likelihood
#
#Data: x
#
#Censoring Variable: censored
#
#Censoring Side: left
#
#Censoring Level(s): 2 4
#
#Sample Size: 30
#
#Percent Censored: 26.7%
#
#Bounds for Optimization: lower = -2
# upper = 2
#
#Optimal Value: lambda = 0.3049744
#
#Value of Objective: Log-Likelihood = -63.2733
#----------
# Plot the results based on the PPCC objective
#---------------------------------------------
boxcox.list <- boxcoxCensored(x, censored)
dev.new()
plot(boxcox.list)
#Look at QQ-Plots for the candidate values of lambda
#---------------------------------------------------
plot(boxcox.list, plot.type = "Q-Q Plots", same.window = FALSE)
#==========
# Clean up
#---------
rm(x.1, censored.1, x.2, censored.2, x, censored, boxcox.list)
graphics.off()