This function controls various estimation options for vglmer.
vglmer_control(
iterations = 1000,
prior_variance = "hw",
factorization_method = c("strong", "partial", "weak"),
parameter_expansion = "translation",
do_SQUAREM = TRUE,
tolerance_elbo = 1e-08,
tolerance_parameters = 1e-05,
force_whole = TRUE,
print_prog = NULL,
do_timing = FALSE,
verbose_time = FALSE,
return_data = FALSE,
linpred_method = "joint",
vi_r_method = "VEM",
verify_columns = FALSE,
debug_param = FALSE,
debug_ELBO = FALSE,
debug_px = FALSE,
quiet = TRUE,
quiet_rho = TRUE,
px_method = "dynamic",
px_numerical_it = 10,
hw_inner = 10,
init = "EM_FE"
)
This function returns a named list with class vglmer_control.
It is passed to vglmer in the argument control. This argument
only accepts objects created using vglmer_control.
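As a minimal sketch of this workflow, the snippet below builds a control object and passes it to vglmer. The simulated data, the formula, and the choice of family = "binomial" are illustrative assumptions, and the calls are skipped if the package is not installed.

```r
# Simulated data for illustration only (not taken from the package).
set.seed(1)
sim_data <- data.frame(
  x = rnorm(100),
  y = rbinom(100, 1, 0.5),
  g = sample(letters, 100, replace = TRUE)
)

if (requireNamespace("vglmer", quietly = TRUE)) {
  # Override two defaults; every other option keeps its default value.
  ctrl <- vglmer::vglmer_control(iterations = 500, tolerance_elbo = 1e-6)
  stopifnot(inherits(ctrl, "vglmer_control"))

  # The control object is supplied through the `control` argument.
  fit <- vglmer::vglmer(
    y ~ x + (1 | g),
    data = sim_data, family = "binomial", control = ctrl
  )
}
```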
Default of 1000; this sets the maximum number of iterations used in estimation.
Prior distribution on the random effect variance
\(\Sigma_j\). Options are hw, jeffreys, mean_exists,
uniform, limit, and gamma. The default (hw) is the Huang-Wand
(2013) prior with hyper-parameters \(\nu_j = 2\) and \(A_{j,k} = 5\).
Otherwise, the prior is an Inverse Wishart with the following parameters,
where \(d_j\) is the dimensionality of random effect \(j\).
mean_exists: \(IW(d_j + 1, I)\)
jeffreys: \(IW(0, 0)\)
uniform: \(IW(-[d_j+1], 0)\)
limit: \(IW(d_j - 1, 0)\)
Estimation may fail if an improper prior (jeffreys, uniform,
limit) is used.
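To see why jeffreys, uniform, and limit are improper, it helps to write out the density. The parameterization is not stated on this page, so the following is a sketch under the assumption that \(IW(\nu, \Phi)\) uses the common form

\[
p(\Sigma_j) \propto |\Sigma_j|^{-(\nu + d_j + 1)/2} \exp\left(-\tfrac{1}{2} \mathrm{tr}\left(\Phi \Sigma_j^{-1}\right)\right).
\]

With \(\Phi = 0\) the exponential term equals one and only a power of the determinant remains:

\[
\text{jeffreys: } p(\Sigma_j) \propto |\Sigma_j|^{-(d_j + 1)/2}, \qquad
\text{uniform: } p(\Sigma_j) \propto 1, \qquad
\text{limit: } p(\Sigma_j) \propto |\Sigma_j|^{-d_j}.
\]

Under this assumption, none of the three integrates to a finite value over the positive definite matrices, which is consistent with the warning that estimation may fail.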
Factorization assumption for the variational
approximation. Default of "strong", i.e. a fully factorized model.
Described in detail in Goplerud (2022a). "strong", "partial",
and "weak" correspond to Schemes I, II, and III respectively in that
paper.
Default of "translation" (see Goplerud
2022b). Valid options are "translation", "mean", or
"none". "mean" should be employed if "translation" is
not enabled or is too computationally expensive. For negative binomial
estimation or any estimation where factorization_method != "strong",
only "mean" and "none" are available.
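The restriction above can be written as a small lookup. The helper allowed_parameter_expansion is hypothetical (it is not part of vglmer) and only encodes the rule stated in this section; the family label "negbin" follows its use elsewhere on this page.

```r
# Hypothetical helper, not part of vglmer: returns the parameter_expansion
# values that this page lists as available for a given configuration.
allowed_parameter_expansion <- function(factorization_method = "strong",
                                        family = "binomial") {
  restricted <- family == "negbin" || factorization_method != "strong"
  if (restricted) c("mean", "none") else c("translation", "mean", "none")
}

allowed_parameter_expansion("strong", "binomial")
allowed_parameter_expansion("weak", "binomial")
allowed_parameter_expansion("strong", "negbin")
```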
Default (TRUE) accelerates estimation using SQUAREM
(Varadhan and Roland 2008).
Default (1e-8) sets the convergence threshold for the ELBO; the
criterion is met once the change in the ELBO falls below the tolerance.
Default (1e-5) sets a convergence
threshold that is achieved if no parameter changes by more than the
tolerance from its value at the previous iteration.
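The two tolerances can be illustrated with a short sketch. The function check_tolerances and its inputs are hypothetical and do not reproduce vglmer's internal code; the use of absolute differences is an assumption, and each criterion is reported separately because this page does not state how they are combined.

```r
# Hypothetical illustration, not vglmer's internal code.
check_tolerances <- function(elbo_old, elbo_new, par_old, par_new,
                             tolerance_elbo = 1e-08,
                             tolerance_parameters = 1e-05) {
  c(
    # Change in the ELBO between consecutive iterations is below the tolerance.
    elbo = abs(elbo_new - elbo_old) < tolerance_elbo,
    # The largest change in any single parameter is below the tolerance.
    parameters = max(abs(par_new - par_old)) < tolerance_parameters
  )
}

# The ELBO has stabilized, but one parameter still moved by 1e-4.
res <- check_tolerances(
  elbo_old = -120.5, elbo_new = -120.5 + 1e-10,
  par_old = c(0.30, -1.20), par_new = c(0.30 + 1e-4, -1.20)
)
res
```

In this example the ELBO criterion is satisfied while the parameter criterion is not.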
Default (TRUE) requires the observed outcome to be integer-valued
for binomial or count models. FALSE allows for fractional
responses.
Default (NULL) prints a "." each time
5% of the total iterations have elapsed. Set to a positive integer
int to print a "." every int iterations.
Default (FALSE) does not estimate timing of each
variational update; TRUE requires the package tictoc.
Default (FALSE) does not print the time elapsed
for each parameter update. Set to TRUE, in conjunction with
do_timing=TRUE, to see the time taken for each parameter update.
Default (FALSE) does not return the original
design. Set to TRUE to debug convergence issues.
Default ("joint") updates the mean parameters
for the fixed and random effects simultaneously. This can improve the speed
of estimation but may be costly for large datasets; use "cyclical"
to update each parameter block separately.
Default ("VEM") uses a variational EM algorithm for
updating \(r\) if family="negbin". This assumes a point mass
distribution on \(r\). A number can be provided to fix \(r\). These are
the only available options.
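A sketch of the two choices for vi_r_method follows. The value 2 for the fixed \(r\) is an arbitrary illustration, and parameter_expansion is set to "mean" because "translation" is described above as unavailable for negative binomial estimation. The calls are skipped if the package is not installed.

```r
# Arguments for the two documented choices of vi_r_method.
args_vem   <- list(vi_r_method = "VEM", parameter_expansion = "mean")
args_fixed <- list(vi_r_method = 2, parameter_expansion = "mean")  # r fixed at 2 (arbitrary)

if (requireNamespace("vglmer", quietly = TRUE)) {
  ctrl_vem   <- do.call(vglmer::vglmer_control, args_vem)
  ctrl_fixed <- do.call(vglmer::vglmer_control, args_fixed)
  # Either object would then be passed as control = ... with family = "negbin".
}
```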
Default (FALSE) does not verify that all
columns are drawn from the data.frame itself versus the environment. Set to
TRUE to debug potential issues.
Default (FALSE) does not store parameters before
the final iteration. Set to TRUE to debug convergence issues.
Default (FALSE) does not store the ELBO after each
parameter update. Set to TRUE to debug convergence issues.
Default (FALSE) does not store information about
whether parameter expansion worked. Set to TRUE to debug convergence
issues.
Default (TRUE) does not print intermediate output about
convergence. Set to FALSE to debug.
Default (TRUE) does not print information about
parameter expansions. Set to FALSE to debug convergence issues.
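When diagnosing convergence problems, the storage and printing options are often switched on together. A minimal sketch, assuming the package is installed (the call is skipped otherwise); the values for quiet and quiet_rho follow the usage block above, where both default to TRUE.

```r
# Switch on the stored diagnostics and the printed output.
debug_args <- list(
  return_data = TRUE,   # return the original design
  debug_param = TRUE,   # store parameters before the final iteration
  debug_ELBO  = TRUE,   # store the ELBO after each parameter update
  debug_px    = TRUE,   # store whether parameter expansion worked
  quiet       = FALSE,  # usage default is TRUE (no convergence output)
  quiet_rho   = FALSE   # usage default is TRUE (no expansion output)
)

if (requireNamespace("vglmer", quietly = TRUE)) {
  ctrl_debug <- do.call(vglmer::vglmer_control, debug_args)
}
```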
When parameter_expansion="translation", the default
("dynamic") tries a one-step-late update and, if this fails, a
numerical improvement by L-BFGS-B. For an Inverse-Wishart prior on
\(\Sigma_j\), this is set to "osl", which only attempts a
one-step-late update.
Default of 10; if L-BFGS-B is needed for a parameter expansion, this sets the number of steps used.
If prior_variance="hw", this sets the number of
inner iterations that alternate between updating the variational
distributions of \(\Sigma_j\) and \(a_{j,k}\) at each iteration. A larger number approximates
jointly updating both parameters. Default (10) typically performs well.
Default ("EM_FE") initializes the mean variational
parameters for \(q(\beta, \alpha)\) by setting the random effects to zero
and estimating the fixed effects using a short-running EM algorithm.
"EM" initializes the model with a ridge regression with a guess as
to the random effect variance. "random" initializes the means
randomly. "zero" initializes them at zero.
Goplerud, Max. 2022a. "Fast and Accurate Estimation of Non-Nested Binomial Hierarchical Models Using Variational Inference." Bayesian Analysis. 17(2): 623-650.
Goplerud, Max. 2022b. "Re-Evaluating Machine Learning for MRP Given the Comparable Performance of (Deep) Hierarchical Models." Working Paper.
Huang, Alan, and Matthew P. Wand. 2013. "Simple Marginally Noninformative Prior Distributions for Covariance Matrices." Bayesian Analysis. 8(2): 439-452.
Varadhan, Ravi, and Christophe Roland. 2008. "Simple and Globally Convergent Methods for Accelerating the Convergence of any EM Algorithm." Scandinavian Journal of Statistics. 35(2): 335-353.