The function aldvmm fits adjusted limited dependent variable mixture models
of health state utilities. Adjusted limited dependent variable mixture
models are finite mixtures of normal distributions with an accumulation of
density mass at the limits, and a gap between 100% quality of life and
the next smaller utility value. The package aldvmm uses the
likelihood and expected value functions proposed by Hernandez Alava and
Wailoo (2015) using normal component distributions and a multinomial logit
model of probabilities of component membership.
aldvmm(
  formula,
  data,
  subset = NULL,
  psi,
  ncmp = 2,
  dist = "normal",
  optim.method = NULL,
  optim.control = list(trace = FALSE),
  optim.grad = TRUE,
  init.method = "zero",
  init.est = NULL,
  init.lo = NULL,
  init.hi = NULL,
  se.fit = FALSE,
  model = TRUE,
  level = 0.95,
  na.action = "na.omit"
)
aldvmm returns an object of class "aldvmm". An object of class "aldvmm" is a list containing the following objects.
coef: a numeric vector of parameter estimates.
hessian: a numeric matrix object with second partial derivatives of the likelihood function.
cov: a numeric matrix object with covariances of parameters.
n: a scalar representing the number of observations that were used in the estimation.
k: a scalar representing the number of components that were mixed.
df.null: an integer value of the residual degrees of freedom of a null model including intercepts and standard errors.
df.residual: an integer value of the residual degrees of freedom.
iter: an integer value of the number of iterations used in optimization.
convergence: an integer value indicating convergence. "0" indicates successful completion.
gof: a list including the following elements.
  ll: a numeric value of the negative log-likelihood \(-ll\).
  aic: a numeric value of the Akaike information criterion \(AIC = 2n_{par} - 2ll\).
  bic: a numeric value of the Bayesian information criterion \(BIC = n_{par}*log(n_{obs}) - 2ll\).
  mse: a numeric value of the mean squared error \(\sum{(y - \hat{y})^2}/(n_{obs} - n_{par})\).
  mae: a numeric value of the mean absolute error \(\sum{|y - \hat{y}|}/(n_{obs} - n_{par})\).
pred: a list including the following elements.
  y: a numeric vector of observed outcomes in 'data'.
  yhat: a numeric vector of fitted values.
  res: a numeric vector of residuals.
  se.fit: a numeric vector of the standard error of fitted values.
  lower.fit: a numeric vector of 95% lower confidence limits of fitted values.
  upper.fit: a numeric vector of 95% upper confidence limits of fitted values.
  prob: a numeric matrix of expected probabilities of group membership per individual in 'data'.
init: a list including the following elements.
  est: a numeric vector of initial parameter estimates.
  lo: a numeric vector of lower limits of parameter estimates.
  hi: a numeric vector of upper limits of parameter estimates.
call: a character value including the model call captured by match.call.
formula: an object of class "formula" supplied to argument 'formula'.
terms: a list of objects of class "terms" for the model of component means ("beta"), probabilities of component membership ("delta") and the full model ("full").
contrasts: a nested list of character values showing contrasts of factors used in models of component means ("beta") and probabilities of component membership ("delta").
data: a data frame created by model.frame including estimation data with additional attributes.
psi: a numeric vector with the minimum and maximum utility below 1 in 'data'.
dist: a character value indicating the used component distributions.
label: a list including the following elements.
  lcoef: a character vector of labels for objects including results on distributions (default "beta") and the probabilities of component membership (default "delta").
  lcpar: a character vector of labels for objects including constant distribution parameters (default "sigma" for dist = "normal").
  lcmp: a character value of the label for objects including results on different components (default "Comp").
  lvar: a list including 2 character vectors of covariate names for model parameters of distributions ("beta") and the multinomial logit ("delta").
optim.method: a character value of the used optimr method.
level: a numeric value of the confidence level used for reporting.
na.action: an object of class "omit" extracted from the "na.action" attribute of the data frame created by model.frame in the preparation of model matrices.
formula: an object of class "formula" with a symbolic
description of the model to be fitted. The model formula takes the form
y ~ x1 + x2 | x1 + x4, where the | delimiter separates the
model for expected values of normal distributions (left) and the
multinomial logit model of probabilities of component membership (right).
data: a data frame, list or environment (or object coercible to a data
frame by as.data.frame) including data on outcomes and explanatory
variables in 'formula'.
subset: an optional numeric vector of row indices of the subset of the model
matrix used in the estimation. 'subset' can be longer than the
number of rows in data and include repeated values for re-sampling
purposes.
psi: a numeric vector of minimum and maximum possible utility values
smaller than or equal to 1 (e.g. c(-0.594, 0.883)). The potential
gap between the maximum value and 1 represents an area with zero density
in the value set from which utilities were obtained. The order of the
minimum and maximum limits in 'psi' does not matter.
ncmp: a numeric value of the number of components that are mixed. The
default value is 2. A value of 1 represents a tobit model with a gap
between 1 and the maximum value in 'psi'.
dist: an optional character value of the distribution used in the
components. In this release, only the normal distribution is
available, and the default value is set to "normal".
optim.method: an optional character value of one of the following optimr
methods: "Nelder-Mead", "BFGS", "CG",
"L-BFGS-B", "nlminb", "Rcgmin", "Rvmmin" and
"hjn". The default method is "BFGS". The method
"L-BFGS-B" is used when lower and/or upper constraints are set
using 'init.lo' and 'init.hi'. The method "nlm"
cannot be used in the 'aldvmm' package.
optim.control: an optional list of optimr control parameters.
optim.grad: an optional logical value indicating if an analytical
gradient should be used in optimr methods that can use this information.
The default value is TRUE. If 'optim.grad' is set to FALSE, a finite
difference approximation is used.
init.method: an optional character value indicating the method for
obtaining initial values. The following values are available:
"zero", "random", "constant" and "sann". The
default value is "zero".
init.est: an optional numeric vector of user-defined initial values.
User-defined initial values override the 'init.method' argument.
Initial values have to follow the same order as parameter estimates in the
return value 'coef'.
init.lo: an optional numeric vector of user-defined lower limits for
constrained optimization. When 'init.lo' is not NULL, the
optimization method "L-BFGS-B" is used. Lower limits of parameters
have to follow the same order as parameter estimates in the return value
'coef'.
init.hi: an optional numeric vector of user-defined upper limits for
constrained optimization. When 'init.hi' is not NULL, the
optimization method "L-BFGS-B" is used. Upper limits of parameters
have to follow the same order as parameter estimates in the return value
'coef'.
se.fit: an optional logical value indicating whether standard errors
of fitted values are calculated. The default value is FALSE.
model: an optional logical value indicating whether the estimation
data frame is returned in the output object. The default value is
TRUE.
level: a numeric value of the confidence level for confidence bands of fitted values. The default value is 0.95.
na.action: a character value passed to argument 'na.action' of the
function model.frame in the preparation of the model matrix. The default
value is "na.omit".
aldvmm fits an adjusted limited dependent variable mixture model using the
likelihood and expected value functions from Hernandez Alava and Wailoo
(2015). The model accounts for latent classes, multi-modality, minimum and
maximum utility values and potential gaps between 1 and the next smaller
utility value. Adjusted limited dependent variable mixture models combine
multiple component distributions with a multinomial logit model of the
probabilities of component membership. The standard deviations of the
normal distributions are estimated and reported on the log scale. They
enter the likelihood function as exponentiated values, which ensures that
they are positive.
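The two ingredients described above can be illustrated with a short base R sketch. It turns hypothetical linear predictors of a multinomial logit into probabilities of component membership and maps log-transformed standard deviations back to the natural scale. All numbers and object names are made up, and treating the last component as the reference category is an assumption for illustration, not a statement about the internals of aldvmm.

```r
# Hypothetical linear predictors for 3 individuals and 2 components.
# Assumption: the last component is the reference category (predictor = 0).
eta <- cbind(Comp1 = c(-0.5, 0.2, 1.3), Comp2 = 0)

# Multinomial logit: probabilities are positive and sum to 1 in each row.
prob <- exp(eta) / rowSums(exp(eta))

# Hypothetical log-transformed standard deviations, one per component.
lnsigma <- c(Comp1 = -1.2, Comp2 = -0.4)
sigma <- exp(lnsigma)  # exponentiation guarantees sigma > 0

print(round(prob, 3))
print(round(sigma, 3))
```

Because the optimizer works on the log scale, any real-valued estimate of a log standard deviation corresponds to a valid positive standard deviation.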
The minimum utility and the largest utility smaller than or equal to 1 are
supplied in the argument 'psi'. The number of
distributions/components that are mixed is set by the argument
'ncmp'. When 'ncmp' is set to 1 the procedure estimates a
tobit model with a gap between 1 and the maximum utility value in
'psi'. The current version only allows finite mixtures of normal
distributions.
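The role of 'psi' can be sketched as a mapping from a latent value to a reported utility. The rule below (latent values above the upper limit are reported as 1, latent values below the lower limit are set to the lower limit) is an assumption based on the description of the limits and the gap, and the helper censor_utility is hypothetical and not part of aldvmm.

```r
# Hypothetical helper, not part of aldvmm.
censor_utility <- function(ystar, psi) {
  lo <- min(psi)  # the order of the limits in 'psi' does not matter
  hi <- max(psi)
  # Values above the upper limit fall into the gap and are reported as 1;
  # values below the lower limit accumulate at the lower limit.
  ifelse(ystar > hi, 1, pmax(ystar, lo))
}

ystar <- c(-1, 0.5, 0.9, 1.2)
y <- censor_utility(ystar, psi = c(0.883, -0.594))
print(y)
```

In this sketch no reported value can lie strictly between the upper limit in 'psi' and 1, which is the gap the model accounts for.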
The 'formula' object can include a | delimiter to separate
formulae for expected values in components (left) and the multinomial
logit model of probabilities of group membership (right). If no |
delimiter is used, the same formula will be used for expected values in
components and the multinomial logit of the probabilities of component
membership.
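How the | delimiter splits a formula into two parts can be inspected in base R. The sketch below is an illustration only: the helper split_formula is hypothetical and does not reproduce how aldvmm processes formulae internally.

```r
# Hypothetical helper: list the covariates of the model of component means
# ("beta") and of the multinomial logit ("delta").
split_formula <- function(f) {
  rhs <- f[[3]]  # right-hand side of the two-sided formula
  has_bar <- is.call(rhs) && identical(rhs[[1]], as.name("|"))
  list(
    beta  = all.vars(if (has_bar) rhs[[2]] else rhs),
    delta = all.vars(if (has_bar) rhs[[3]] else rhs)
  )
}

with_bar <- split_formula(y ~ x1 + x2 | x1 + x4)
no_bar   <- split_formula(y ~ x1 + x2)
str(with_bar)
str(no_bar)
```

Without a delimiter, both parts contain the same covariates, which mirrors the fallback described above.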
aldvmm uses optimr for maximum likelihood estimation of model parameters.
The argument 'optim.method' accepts the following methods: "Nelder-Mead",
"BFGS", "CG", "L-BFGS-B", "nlminb",
"Rcgmin", "Rvmmin" and "hjn". The default method is
"BFGS". The method "nlm" cannot be used in aldvmm because it
requires a different implementation of the likelihood function. The
argument 'optim.control' accepts a list of optimr control parameters. If
'optim.grad' is set to TRUE the function optimr uses
analytical gradients during the optimization procedure for all methods
that allow for this approach. If 'optim.grad' is set to
FALSE or a method cannot use gradients, a finite difference
approximation is used. The Hessian matrix at maximum likelihood parameters
is approximated numerically using hessian.
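The difference between analytical gradients and finite difference approximations can be illustrated with a much simpler likelihood than the one used by aldvmm. The sketch below fits a single normal distribution by minimizing the negative log-likelihood with the "BFGS" method. It uses stats::optim as a stand-in for optimr, simulated data, and hypothetical function names; it is not the likelihood implemented in the package.

```r
set.seed(1)
y <- rnorm(200, mean = 0.7, sd = 0.2)  # simulated outcomes

# Negative log-likelihood of one normal distribution,
# parameterized as par = c(mu, log(sigma)).
nll <- function(par, y) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# Analytical gradient of nll with respect to mu and log(sigma).
nll_grad <- function(par, y) {
  s2 <- exp(2 * par[2])
  c(-sum(y - par[1]) / s2,
    length(y) - sum((y - par[1])^2) / s2)
}

# Analytical gradient; the Hessian is approximated numerically by optim.
fit_an <- optim(c(0, 0), nll, gr = nll_grad, y = y,
                method = "BFGS", hessian = TRUE)

# Without 'gr', optim falls back to a finite difference approximation.
fit_fd <- optim(c(0, 0), nll, y = y, method = "BFGS")

# Covariance matrix of parameters from the inverse Hessian of the
# negative log-likelihood.
vc <- solve(fit_an$hessian)

print(rbind(analytical = fit_an$par, finite.diff = fit_fd$par))
print(sqrt(diag(vc)))
```

Both runs start from zero and should arrive at essentially the same estimates; the analytical gradient mainly saves function evaluations and avoids approximation error in the search direction.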
'init.method' accepts four methods for generating initial
values: "zero", "random", "constant" and "sann".
The method "zero" sets initial values of all parameters to 0. The
method "random" draws random starting values from a standard normal
distribution. The method "constant" estimates a constant-only
model and uses estimates as initial values of intercepts and standard
errors and 0 for all other parameters. The method "sann" estimates
the full model using the simulated annealing optimization method in
optim and uses parameter estimates as initial values. When user-specified
initial values are supplied in 'init.est', the argument 'init.method' is
ignored.
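The two simplest methods can be sketched in a few lines of base R. The helper make_init and the number of parameters are hypothetical, and the methods "constant" and "sann" are left out because they require fitting a model first.

```r
# Hypothetical helper, not part of aldvmm: initial values for 'npar'
# parameters using the "zero" or "random" method.
make_init <- function(npar, method = c("zero", "random")) {
  method <- match.arg(method)
  switch(method,
         zero   = rep(0, npar),   # all parameters start at 0
         random = rnorm(npar))    # draws from a standard normal distribution
}

set.seed(123)
init_zero   <- make_init(7, "zero")
init_random <- make_init(7, "random")
print(init_zero)
print(init_random)
```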
By default, aldvmm performs unconstrained optimization with lower and
upper limits at -Inf and Inf. When user-defined lower and upper limits are
supplied to 'init.lo' and/or 'init.hi', these default limits
are replaced with the user-specified values, and the method
"L-BFGS-B" is used for box-constrained optimization instead of the
user-defined 'optim.method'. It is possible to set only lower or only
upper limits. When initial values supplied to
'init.est' or from default methods lie outside the limits, the
infeasible values will be set to the limits using the function
bmchk.
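Moving infeasible initial values to the limits amounts to clamping each value to its interval. The base R sketch below uses pmax and pmin as a stand-in for bmchk, with hypothetical values; it illustrates the idea only and does not call the function used by the package.

```r
init <- c(-2, 0.3, 5)        # hypothetical initial values
lo   <- c(-1, -Inf, -Inf)    # lower limit set for the first parameter only
hi   <- c(Inf, Inf, 1)       # upper limit set for the third parameter only

# Clamp each initial value to the interval [lo, hi].
init_feasible <- pmin(pmax(init, lo), hi)
print(init_feasible)
```

Parameters without a user-defined limit keep the default limits of -Inf and Inf and are therefore left unchanged.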
The function aldvmm() returns the negative log-likelihood, Akaike
information criterion and Bayesian information criterion. Smaller values
of these measures indicate better fit.
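The goodness of fit measures follow directly from the log-likelihood, the number of parameters and the number of observations. The sketch below applies the formulas documented for the 'gof' element to made-up numbers; the helper gof_measures and all inputs are hypothetical.

```r
# Hypothetical helper implementing the documented formulas.
gof_measures <- function(ll, npar, y, yhat) {
  nobs <- length(y)
  list(
    ll  = -ll,                                   # negative log-likelihood
    aic = 2 * npar - 2 * ll,
    bic = npar * log(nobs) - 2 * ll,
    mse = sum((y - yhat)^2) / (nobs - npar),
    mae = sum(abs(y - yhat)) / (nobs - npar)
  )
}

y    <- c(0.2, 0.5, 0.9, 1.0, 0.7)      # made-up observed utilities
yhat <- c(0.3, 0.4, 0.8, 0.9, 0.75)     # made-up fitted values
gof  <- gof_measures(ll = -1.5, npar = 2, y = y, yhat = yhat)
str(gof)
```

Note that the mean squared error and mean absolute error divide by the residual degrees of freedom rather than by the number of observations.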
If 'se.fit' is set to TRUE, standard errors of fitted values
are calculated using the delta method. The standard errors of fitted
values in the estimation data set are calculated as \(se_{fit} =
\sqrt{G^{t} \Sigma G}\), where \(G\)
is the gradient of a fitted value with respect to changes of parameter
estimates, and \(\Sigma\) is the estimated covariance matrix of
parameters (Dowd et al., 2014). The standard errors of predicted values
in new data sets are calculated as \(se_{pred} = \sqrt{MSE + G^{t}
\Sigma G}\), where
\(MSE\) is the mean squared error of fitted versus observed
outcomes in the original estimation data (Whitmore, 1986).
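The two formulas can be checked on a model where the gradient is known in closed form. In the sketch below a linear model fitted with lm stands in for the mixture model: the gradient of a fitted value with respect to the coefficients is the corresponding row of the model matrix. The data set cars and the use of lm are choices made for illustration and are unrelated to aldvmm.

```r
fit <- lm(dist ~ speed, data = cars)

# For a linear model, row i of the model matrix is the gradient G of
# fitted value i with respect to the coefficients.
G     <- model.matrix(fit)
Sigma <- vcov(fit)  # estimated covariance matrix of parameters

# Delta method: sqrt(G' Sigma G), evaluated for each row of G.
se_fit <- sqrt(rowSums((G %*% Sigma) * G))

# Standard error of predictions adds the mean squared error.
mse     <- sum(residuals(fit)^2) / df.residual(fit)
se_pred <- sqrt(mse + se_fit^2)

# The delta method reproduces the standard errors reported by predict().
ref <- predict(fit, se.fit = TRUE)
print(all.equal(unname(se_fit), unname(ref$se.fit)))
```

For a nonlinear model such as the mixture model, the gradient depends on the parameter estimates and has to be derived analytically or approximated numerically, but the matrix product has the same form.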
The generic function summary can be used to obtain or print a summary of
the results. The generic function predict can be used to obtain predicted
values and standard errors of predictions in new data.
Hernandez Alava, M. and Wailoo, A. (2015) Fitting adjusted limited dependent variable mixture models to EQ-5D. The Stata Journal, 15(3), 737-750. doi:10.1177/1536867X1501500307
Dowd, B. E., Greene, W. H., and Norton, E. C. (2014) Computation of standard errors. Health Services Research, 49(2), 731-750. doi:10.1111/1475-6773.12122
Whitmore, G. A. (1986) Prediction limits for a univariate normal observation. The American Statistician, 40(2), 141-143. doi:10.1080/00031305.1986.10475378
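library(aldvmm)
# Load the example data used in this help page
data(utility)
# Two-component model: age and female explain the component means,
# an intercept-only multinomial logit explains component membership
fit <- aldvmm(eq5d ~ age + female | 1,
              data = utility,
              psi = c(0.883, -0.594),
              ncmp = 2)
# Summarize the estimation results
summary(fit)
# Obtain predicted values
yhat <- predict(fit)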