bestAIC, bestBIC, bestEBIC and bestIC perform
full model enumeration when possible, and otherwise resort to MCMC to
explore the model space, as discussed in function modelSelection.
bestAIC_fast, bestBIC_fast, bestEBIC_fast and
bestIC_fast use a faster algorithm: first a subset of
promising models is identified, and then the specified criterion is
computed for each of them to find the best model within that subset.
For Gaussian and binary outcomes the candidate models are obtained with
function L0Learn.fit from package L0Learn (Hazimeh et al., 2023),
which combines coordinate descent with local combinatorial search to
find good models of each size.
L1 returns all the models found in the LASSO regularization path.
CDA returns a single model found by coordinate descent,
i.e. adding/dropping one covariate at a time to improve the
specified criterion (BIC, AIC, ...).
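For intuition, the following sketch (illustrative only, not the package's internal code) mimics the fast strategy for a Gaussian outcome: L0Learn.fit proposes candidate variable subsets along its regularization path, and the BIC is then computed for each candidate to pick the best one. The data and helper steps below are made up for the example.

library(L0Learn)

set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] + 0.5 * x[, 2] + rnorm(n)

## CDPSI = coordinate descent plus local combinatorial search
fit <- L0Learn.fit(x, y, penalty = "L0", algorithm = "CDPSI", maxSuppSize = 10)

## Candidate models: the non-zero coefficients at each point of the path
## (the first row of coef() is the intercept, hence it is dropped)
beta <- as.matrix(coef(fit))[-1, , drop = FALSE]
candidates <- unique(lapply(seq_len(ncol(beta)), function(j) which(beta[, j] != 0)))

## Compute the BIC of each candidate model and keep the best (lowest) one
bic <- sapply(candidates, function(vars) {
  m <- if (length(vars) == 0) lm(y ~ 1) else lm(y ~ x[, vars, drop = FALSE])
  -2 * as.numeric(logLik(m)) + attr(logLik(m), "df") * log(n)
})
candidates[[which.min(bic)]]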
bestBIC and the other functions documented here take arguments
similar to those of modelSelection, but no priors
on models or parameters need to be specified.
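A minimal usage sketch follows (it assumes the formula/data interface of modelSelection; the data and variable names are made up):

library(mombf)

set.seed(123)
df <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
df$y <- df$x1 + 0.5 * df$x2 + rnorm(100)

fit <- bestBIC(y ~ x1 + x2 + x3, data = df)   ## no priors need to be specified
fit                                           ## print the fitted object
## coef(fit); summary(fit)                    ## inspect the top model, if these methods are available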
Let p be the total number of parameters, n the sample size, and L_k the
maximized log-likelihood of a model k with p_k parameters. The BIC of model k is
- 2 L_k + p_k log(n)
the AIC is
- 2 L_k + 2 p_k
the EBIC is
- 2 L_k + p_k log(n) + 2 log(p choose p_k)
and a general information criterion with a given model size penalty is
- 2 L_k + p_k penalty
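As a concrete illustration of these formulas (using base R only, not mombf), the criteria can be computed by hand for a single fitted model. Here p_k follows the degrees-of-freedom convention of logLik, so the first two values agree with R's BIC() and AIC(); the total number of parameters p is a made-up value for the example.

fit <- lm(mpg ~ wt + hp, data = mtcars)
n  <- nobs(fit)
Lk <- as.numeric(logLik(fit))       ## maximized log-likelihood L_k
pk <- attr(logLik(fit), "df")       ## p_k (this count includes the error variance)
p  <- 10                            ## hypothetical total number of parameters

bic  <- -2 * Lk + pk * log(n)                       ## equals BIC(fit)
aic  <- -2 * Lk + 2 * pk                            ## equals AIC(fit)
ebic <- -2 * Lk + pk * log(n) + 2 * lchoose(p, pk)  ## lchoose gives log(choose(p, p_k))
c(BIC = bic, AIC = aic, EBIC = ebic)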
The MCMC model search is based on assigning a probability to each
model, and then using MCMC to sample models from this
distribution. The probability of model k is
exp(- IC_k / 2) / sum_l exp(- IC_l / 2)
where IC_k is the value of the information criterion (BIC, EBIC, ...) for
model k. Hence the model with the best (lowest) IC_k has the highest
probability, and is therefore the one most likely to be sampled by the MCMC algorithm.
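As a small illustration of this mapping, the probabilities can be computed directly from a vector of IC values (the values below are made up; subtracting the minimum before exponentiating is a standard trick to avoid numerical underflow and cancels out after normalization):

IC <- c(model1 = 210.3, model2 = 208.1, model3 = 215.7)   ## hypothetical IC values
w  <- exp(-(IC - min(IC)) / 2)
probs <- w / sum(w)
probs   ## the lowest-IC model (model2) receives the highest probability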