cvam
The cvam function fits log-linear models to coarsened categorical variables. Its model-fitting procedures are governed by parameters in a cvamControl object created by the auxiliary function documented here. This function is intended for internal use; the only reason to invoke it directly is to display the control parameters and their default values.
cvamControl( iterMaxEM = 500L, iterMaxNR = 50L,
   iterApproxBayes = 1L, imputeApproxBayes = FALSE,
   iterMCMC = 5000L, burnMCMC = 0L, thinMCMC = 1L, imputeEvery = 0L,
   saveProbSeries = FALSE,
   typeMCMC = c("DA","RWM"), tuneDA = c(10,.8,.8), tuneRWM = c(1000,.1),
   stuckLimit = 25L,
   startValDefault = c("center", "uniform"), startValJitter = 0,
   critEM = 1e-06, critNR = 1e-06, critBoundary = 1e-08, ncolMaxMM = 1000L,
   excludeAllNA = TRUE, critModelCheck = 1e-08, confidence = .95,
   probRound = TRUE, probDigits = 4L )
a list of control parameters for internal use by the function cvam.
iterMaxEM
maximum number of iterations performed when method = "EM"; see DETAILS.
iterMaxNR
maximum number of iterations of Newton-Raphson performed during an M-step of EM; see DETAILS.
iterApproxBayes
number of simulated log-linear coefficient vectors to be drawn from their approximate posterior distribution when method="approxBayes".
imputeApproxBayes
if TRUE then, for each draw of the log-linear coefficients from their approximate posterior distribution, the true frequencies will be imputed.
iterMCMC
number of iterations of Markov chain Monte Carlo after the burn-in period when method="MCMC".
burnMCMC
number of iterations of Markov chain Monte Carlo performed as a burn-in period, for which the results are discarded. The total number of iterations performed is burnMCMC+iterMCMC.
thinMCMC
thinning interval for saving the results from MCMC as a series.
imputeEvery
imputation interval for saving imputed frequencies for the complete-data table. If 0, then no imputations are saved.
saveProbSeries
if TRUE then the simulated values of cell probabilities from MCMC will be stored as a series.
typeMCMC
either "DA" (data augmentation) or "RWM" (random-walk Metropolis); see DETAILS.
tuneDA
tuning parameters for data augmentation MCMC; see DETAILS.
tuneRWM
tuning parameters for random-walk Metropolis MCMC; see DETAILS.
stuckLimit
criterion for deciding if the MCMC algorithm has gotten stuck.
startValDefault
method used to obtain default starting values for parameters if no starting values are provided. "center" begins in the center of the parameter space, which assigns equal probability to all non-structural zero cells in the complete-data table. "uniform" draws random starting values from a uniform distribution on the cell probabilities.
startValJitter
standard deviation for Gaussian random noise added to starting values. If cvam is called with saturated=FALSE, the log-linear coefficients are perturbed by this amount; if saturated=TRUE, the log-cell probabilities are perturbed by this amount and renormalized to sum to one.
critEM
convergence criterion for EM stopping rule; see DETAILS.
critNR
convergence criterion for Newton-Raphson stopping rule in M-step of EM; see DETAILS.
critBoundary
criterion for testing whether any estimated cell means are close to zero, in which case a warning is given.
ncolMaxMM
limit on the number of columns allowed for a log-linear model matrix.
excludeAllNA
if TRUE, then cases for which all modeled variables are missing will be excluded from the model fitting procedure, because they only contribute constant terms to the observed-data loglikelihood function.
critModelCheck
criterion for checking the log-linear model matrix for linear dependencies among the columns.
confidence
confidence coefficient for interval estimates, used when estimates are requested in the call to cvam.
probRound
if TRUE, estimated probabilities will be rounded.
probDigits
number of digits for rounding estimated probabilities.
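The effect of startValJitter in the saturated case, as described above, can be sketched in base R. This is only an illustration of "perturb the log-cell probabilities and renormalize"; the six-cell table and the variable names are hypothetical, and the code is not taken from the package.

```r
set.seed(42)

startValJitter <- 0.1        # standard deviation of the Gaussian noise
prob <- rep(1 / 6, 6)        # hypothetical "center" start for a six-cell table

# Perturb on the log scale, then renormalize so the probabilities sum to one.
logp <- log(prob) + rnorm(length(prob), mean = 0, sd = startValJitter)
prob_jittered <- exp(logp) / sum(exp(logp))

print(round(prob_jittered, 4))
```

With startValJitter = 0 (the default) the noise term vanishes and the starting values are returned unchanged.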
Joe Schafer Joseph.L.Schafer@census.gov
When cvam is called with method="EM", it performs an EM algorithm. At each E-step, observations with missing or coarsened values are apportioned to cells of the complete-data table in the expected amounts determined by the current estimated parameters. At the M-step, a log-linear model is fit to the predicted complete-data frequencies from the E-step, using a Newton-Raphson procedure if saturated=FALSE. The EM algorithm is stopped after iterMaxEM iterations, or when the maximum absolute difference in cell means from one iteration to the next is no greater than critEM. The Newton-Raphson procedure in each M-step is stopped after iterMaxNR iterations or when the maximum absolute difference in cell means from one iteration to the next is no greater than critNR.
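The E-step, M-step and stopping rule described above can be illustrated with a toy example in base R. It is a minimal sketch for a saturated model on a single three-level variable, where some cases are known only to fall in "A or B"; the counts and variable names are hypothetical, no prior is used, and the code is not the package's implementation.

```r
# Hypothetical data: fully classified counts plus cases coarsened to "A or B".
observed  <- c(A = 30, B = 10, C = 20)
coarse_AB <- 40

iterMaxEM <- 500L
critEM    <- 1e-06

n_total <- sum(observed) + coarse_AB
prob <- rep(1 / 3, 3)
names(prob) <- names(observed)     # equal probabilities, as in a "center" start
mu <- n_total * prob               # current cell means
converged <- FALSE

for (iter in seq_len(iterMaxEM)) {
  # E-step: apportion the coarsened cases to A and B in expected amounts.
  share <- prob[c("A", "B")] / sum(prob[c("A", "B")])
  expected <- observed
  expected[c("A", "B")] <- expected[c("A", "B")] + coarse_AB * share

  # M-step (saturated model): fitted cell means equal the expected frequencies.
  mu_new <- expected
  prob <- mu_new / n_total

  # Stopping rule: maximum absolute change in cell means no greater than critEM.
  delta <- max(abs(mu_new - mu))
  mu <- mu_new
  if (delta <= critEM) {
    converged <- TRUE
    break
  }
}

print(round(mu, 4))
```

For a non-saturated model the M-step would itself be an inner iterative fit, stopped by the analogous rule with iterMaxNR and critNR.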
When cvam is called with method="MCMC", the algorithm that is run depends on typeMCMC and on whether the model is fit with saturated=TRUE.
If saturated=FALSE and typeMCMC="DA", then the algorithm is a data-augmentation procedure that resembles EM. At each cycle, observations with missing or coarsened values are randomly allocated to cells of the complete-data table by drawing from a multinomial distribution, and the log-linear coefficients are updated using one step of a Metropolis-Hastings algorithm that mimics Newton-Raphson and conditions on the allocated frequencies. The proposal distribution is multivariate-t and can be adjusted by tuning constants in tuneDA, a numeric vector containing the degrees of freedom, step size and scale factor.
If saturated=FALSE and typeMCMC="RWM", the observations with missing or coarsened values are not allocated, and the log-linear coefficients are updated by a step of random-walk Metropolis. The proposal is multivariate-t and can be adjusted by tuning constants in tuneRWM, a numeric vector containing the degrees of freedom and scale factor.
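A random-walk Metropolis update with a multivariate-t proposal can be sketched in base R as follows. This is a generic illustration on a small, fully observed table, not the package's code: the data, the flat prior, the use of the ML covariance to shape the proposal, and the way the scale factor multiplies the step are all assumptions made for the example.

```r
set.seed(1)

# Hypothetical 2 x 2 table of counts with a main-effects log-linear model.
X <- cbind(1, c(0, 0, 1, 1), c(0, 1, 0, 1))   # model matrix
y <- c(25, 15, 30, 10)                        # cell counts

# Poisson log-linear log-likelihood, up to a constant (flat prior assumed).
logpost <- function(beta) {
  eta <- drop(X %*% beta)
  sum(y * eta - exp(eta))
}

# Start at the ML estimate; its approximate covariance shapes the proposal.
fit  <- glm.fit(X, y, family = poisson())
beta <- unname(fit$coefficients)
V    <- solve(crossprod(X, fit$fitted.values * X))
L    <- t(chol(V))

tuneRWM <- c(1000, 0.1)          # degrees of freedom, scale factor
df      <- tuneRWM[1]
scale   <- tuneRWM[2]

n_iter   <- 2000L
draws    <- matrix(NA_real_, n_iter, length(beta))
n_accept <- 0L

for (i in seq_len(n_iter)) {
  # Multivariate-t step: Gaussian vector divided by sqrt(chi-square / df).
  step <- drop(L %*% rnorm(length(beta))) / sqrt(rchisq(1, df) / df)
  prop <- beta + scale * step
  # Symmetric proposal, so the acceptance ratio is the ratio of posteriors.
  if (log(runif(1)) < logpost(prop) - logpost(beta)) {
    beta <- prop
    n_accept <- n_accept + 1L
  }
  draws[i, ] <- beta
}

accept_rate <- n_accept / n_iter
print(accept_rate)
```

In a sketch like this, a smaller scale factor gives a higher acceptance rate but slower movement through the parameter space, which is the usual trade-off when tuning a random-walk sampler.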
If saturated=TRUE, then the algorithm is a data-augmentation procedure that requires no tuning.
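To show why a saturated model needs no tuning, here is a minimal base R sketch of data augmentation for a single three-level variable, together with the burn-in and thinning bookkeeping implied by burnMCMC, iterMCMC and thinMCMC. The data, the Dirichlet prior with one prior count per cell, and the rule of keeping every thinMCMC-th post-burn-in draw are assumptions for illustration, not a description of the package's internals.

```r
set.seed(123)

# Hypothetical data: fully classified counts plus cases coarsened to "A or B".
observed  <- c(A = 30, B = 10, C = 20)
coarse_AB <- 40

burnMCMC <- 100L
iterMCMC <- 1000L
thinMCMC <- 10L
prior_count <- 1                 # assumed Dirichlet prior count per cell

prob <- rep(1 / 3, 3)
names(prob) <- names(observed)

n_saved <- iterMCMC %/% thinMCMC
series  <- matrix(NA_real_, n_saved, 3, dimnames = list(NULL, names(observed)))
row     <- 0L

for (iter in seq_len(burnMCMC + iterMCMC)) {
  # Allocate coarsened cases to A and B by a multinomial draw.
  share <- prob[c("A", "B")] / sum(prob[c("A", "B")])
  alloc <- drop(rmultinom(1, size = coarse_AB, prob = share))
  complete <- observed
  complete[c("A", "B")] <- complete[c("A", "B")] + alloc

  # Draw cell probabilities from their Dirichlet posterior (normalized gammas).
  g <- rgamma(3, shape = complete + prior_count)
  prob <- g / sum(g)
  names(prob) <- names(observed)

  # Discard the burn-in, then keep every thinMCMC-th draw.
  if (iter > burnMCMC && (iter - burnMCMC) %% thinMCMC == 0L) {
    row <- row + 1L
    series[row, ] <- prob
  }
}

print(round(colMeans(series), 3))
```

Both steps are exact draws from standard distributions, so there is no proposal to tune and no accept/reject step.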
Full details on the EM and MCMC procedures are given in the Appendix of the vignette Log-Linear Modeling with Missing and Coarsened Values Using the cvam Package.
cvam
# display all control parameters and their default values
cvamControl()
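In practice, control parameters are normally changed by passing a named list through the control argument of cvam rather than by calling cvamControl directly. The sketch below assumes that the cvam package is installed, that cvam accepts control as a named list, and that the crime data set (with variables V1, V2 and frequency n) is available as in the package's own examples; it is guarded so that it does nothing if the package is missing.

```r
# Hypothetical overrides: more EM iterations and a tighter convergence criterion.
custom <- list(iterMaxEM = 1000L, critEM = 1e-08)

if (requireNamespace("cvam", quietly = TRUE)) {
  library(cvam)

  # Look at the defaults for the parameters being overridden.
  defaults <- cvamControl()
  print(defaults[names(custom)])

  # Fit by EM with the overrides; parameters not listed keep their defaults.
  fit <- cvam(~ V1 * V2, data = crime, freq = n,
              method = "EM", control = custom)
}
```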