egammaCensored(x, censored, method = "mle", censoring.side = "left",
ci = FALSE, ci.method = "profile.likelihood", ci.type = "two-sided",
conf.level = 0.95, n.bootstraps = 1000, pivot.statistic = "z",
ci.sample.size = sum(!censored))
NA
), undefined (NaN
), and
infinite (Inf
, -Inf
) values are allowed but will be removed.x
are censored.
This must be the same length as x
. If the mode of censored
is
"logical"
, TRUE
values correspond to elements of method="mle"
)."left"
(the default) and "right"
.ci=FALSE
."profile.likelihood"
(profile likelihood; the default),
"normal.approx"
(normal approximation)"two-sided"
(the default), "lower"
, and
"upper"
. This argument is ignored if ci=FALSE
.conf.level=0.95
. This argument is ignored if
ci=FALSE
.ci.type="bootstrap"
. This
argument is ignored if ci=FALSE
and/or ci.method
does not
equal ci.method="normal.approx"
or
ci.method="normal.approx.w.cov"
(see the DETAILS section). The possibpivot.statistic="t"
and
ci.method="normal.approx"
. The default value is the number of
uncensored observations."estimateCensored"
containing the estimated parameters
and other information. See estimateCensored.object
for details.x
or censored
contain any missing (NA
), undefined (NaN
) or
infinite (Inf
, -Inf
) values, they will be removed prior to
performing the estimation.
Let $\underline{x}$ denote a vector of $N$ observations from a
gamma distribution with parameters
shape=
$\alpha$ and scale=
$\beta$.
The relationship between these parameters and the mean $\mu$
and coefficient of variation $\tau$ of this distribution is given by:
method="mle"
)
For Type I left censored data, the likelihood function is given by:
ci.method="profile.likelihood"
)
This method was proposed by Cox (1970, p.88), and Venzon and Moolgavkar (1988)
introduced an efficient method of computation. This method is also discussed by
Stryhn and Christensen (2003) and Royston (2007).
The idea behind this method is to invert the likelihood-ratio test to obtain a
confidence interval for the mean $\mu$ while treating the coefficient of variation
$\tau$ as a nuisance parameter. Equation (7) above
shows the form of the likelihood function $L(\mu, \tau | \underline{x})$ for
multiply left-censored data, where $\mu$ and $\tau$ are defined by
Equations (3) and (4), and Equation (10) shows the function for
multiply right-censored data.
Following Stryhn and Christensen (2003), denote the maximum likelihood estimates
of the mean and coefficient of variation by $(\mu^*, \tau^*)$. The likelihood
ratio test statistic ($G^2$) of the hypothesis $H_0: \mu = \mu_0$
(where $\mu_0$ is a fixed value) equals the drop in $2 log(L)$ between the
ci.method="normal.approx"
)
This method constructs approximate $(1-\alpha)100%$ confidence intervals for
$\mu$ based on the assumption that the estimator of $\mu$ is
approximately normally distributed. That is, a two-sided $(1-\alpha)100%$
confidence interval for $\mu$ is constructed as:
ci.sample.size
determines the value of $m$ and by
default is equal to the number of uncensored observations.
This is simply an ad-hoc method of constructing
confidence intervals and is not based on any published theoretical results.
When pivot.statistic="z"
, the $p$'th quantile from the
standard normal distribution is used in place of the
$p$'th quantile from Student's t-distribution.
The standard deviation of the mle of $\mu$ is
estimated based on the inverse of the Fisher Information matrix.
Bootstrap and Bias-Corrected Bootstrap Approximation (ci.method="bootstrap"
)
The bootstrap is a nonparametric method of estimating the distribution
(and associated distribution parameters and quantiles) of a sample statistic,
regardless of the distribution of the population from which the sample was drawn.
The bootstrap was introduced by Efron (1979) and a general reference is
Efron and Tibshirani (1993).
In the context of deriving an approximate $(1-\alpha)100%$ confidence
interval for the population mean $\mu$, the bootstrap can be broken down into the
following steps:
egammaCensored
, the number of bootstraps$B$is
determined by the argumentn.bootstraps
(see the section ARGUMENTS above).
The default value ofn.bootstraps
is1000
.ecdfPlot
), and then create a confidence interval for$\mu$based on this estimated cdf.egammaCensored
calls the Rfunction quantile
to compute the empirical quantiles used in Equations (17)-(19).
The percentile method bootstrap confidence interval is only first-order
accurate (Efron and Tibshirani, 1993, pp.187-188), meaning that the probability
that the confidence interval will contain the true value of $\mu$ can be
off by $k/\sqrt{N}$, where $k$is some constant. Efron and Tibshirani
(1993, pp.184-188) proposed a bias-corrected and accelerated interval that is
second-order accurate, meaning that the probability that the confidence interval
will contain the true value of $\mu$ may be off by $k/N$ instead of
$k/\sqrt{N}$. The two-sided bias-corrected and accelerated confidence interval is
computed as:
ci.method="bootstrap"
, the function egammaCensored
computes both
the percentile method and bias-corrected and accelerated method
bootstrap confidence intervals.egammaAltCensored
, GammaDist, egamma
,
estimateCensored.object
.# Chapter 15 of USEPA (2009) gives several examples of estimating the mean
# and standard deviation of a lognormal distribution on the log-scale using
# manganese concentrations (ppb) in groundwater at five background wells.
# In EnvStats these data are stored in the data frame
# EPA.09.Ex.15.1.manganese.df.
# Here we will estimate the shape and scale parameters using
# the data ON THE ORIGINAL SCALE, using the MLE and
# assuming a gamma distribution.
# First look at the data:
#-----------------------
EPA.09.Ex.15.1.manganese.df
# Sample Well Manganese.Orig.ppb Manganese.ppb Censored
#1 1 Well.1 <5 5.0 TRUE
#2 2 Well.1 12.1 12.1 FALSE
#3 3 Well.1 16.9 16.9 FALSE
#...
#23 3 Well.5 3.3 3.3 FALSE
#24 4 Well.5 8.4 8.4 FALSE
#25 5 Well.5 <2 2.0 TRUE
longToWide(EPA.09.Ex.15.1.manganese.df,
"Manganese.Orig.ppb", "Sample", "Well",
paste.row.name = TRUE)
# Well.1 Well.2 Well.3 Well.4 Well.5
#Sample.1 <5 <5 <5 6.3 17.9
#Sample.2 12.1 7.7 5.3 11.9 22.7
#Sample.3 16.9 53.6 12.6 10 3.3
#Sample.4 21.6 9.5 106.3 <2 8.4
#Sample.5 <2 45.9 34.5 77.2 <2
# Now estimate the shape and scale parameters
# using the MLE, and compute a confidence interval
# for the mean using the profile-likelihood method.
#---------------------------------------------------
with(EPA.09.Ex.15.1.manganese.df,
egammaCensored(Manganese.ppb, Censored, ci = TRUE))
#Results of Distribution Parameter Estimation
#Based on Type I Censored Data
#--------------------------------------------
#
#Assumed Distribution: Gamma
#
#Censoring Side: left
#
#Censoring Level(s): 2 5
#
#Estimated Parameter(s): shape = 0.6370043
# scale = 30.8707533
#
#Estimation Method: MLE
#
#Data: Manganese.ppb
#
#Censoring Variable: Censored
#
#Sample Size: 25
#
#Percent Censored: 24%
#
#Confidence Interval for: mean
#
#Confidence Interval Method: Profile Likelihood
#
#Confidence Interval Type: two-sided
#
#Confidence Level: 95%
#
#Confidence Interval: LCL = 12.25151
# UCL = 34.35332
#----------
# Compare the confidence interval for the mean
# based on assuming a lognormal distribution versus
# assuming a gamma distribution.
with(EPA.09.Ex.15.1.manganese.df,
elnormAltCensored(Manganese.ppb, Censored,
ci = TRUE))$interval$limits
# LCL UCL
#12.37629 69.87694
with(EPA.09.Ex.15.1.manganese.df,
egammaCensored(Manganese.ppb, Censored,
ci = TRUE))$interval$limits
# LCL UCL
#12.25151 34.35332
Run the code above in your browser using DataLab