fb_select() applies the free order model selection procedure, using forward–backward selection
(Voncken et al., 2019).
For a given GAMLSS distribution and model selection criterion, it selects the optimal
polynomial degrees for all distribution parameters.
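The idea of a forward–backward search over polynomial degrees can be sketched in base R. This is an illustrative simplification and not the code used by fb_select(): the helper `fb_search()` and the toy criterion `toy_crit()` are hypothetical names, and the exact order in which the published procedure visits candidate models may differ. The sketch raises or lowers one degree at a time and stops when no single step lowers the criterion.

```r
# Hypothetical helper, not part of normref: a greedy forward-backward search
# over a vector of polynomial degrees (one entry per distribution parameter).
fb_search <- function(crit_fn, start, min_deg, max_deg) {
  current <- start
  best <- crit_fn(current)
  repeat {
    # Candidate moves: raise (forward) or lower (backward) one degree by 1,
    # staying inside the allowed range for that parameter.
    candidates <- list()
    for (i in seq_along(current)) {
      for (step in c(1, -1)) {
        cand <- current
        cand[i] <- cand[i] + step
        if (cand[i] >= min_deg[i] && cand[i] <= max_deg[i]) {
          candidates[[length(candidates) + 1]] <- cand
        }
      }
    }
    if (length(candidates) == 0) break
    values <- vapply(candidates, crit_fn, numeric(1))
    # Stop when no single-step change lowers the criterion.
    if (min(values) >= best) break
    best <- min(values)
    current <- candidates[[which.min(values)]]
  }
  list(degrees = current, criterion = best)
}

# Toy criterion with a known minimum at degrees c(3, 1, 0, 0). In a real
# search this function would refit the model and return, for example, its BIC.
toy_crit <- function(deg) sum((deg - c(3, 1, 0, 0))^2)

result <- fb_search(
  toy_crit,
  start   = c(2, 1, 0, 0),
  min_deg = c(0, 0, 0, 0),
  max_deg = c(5, 5, 2, 2)
)
result$degrees
```

The start, minimum, and maximum vectors in the call mirror the defaults of start_poly, min_poly, and max_poly.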
fb_select(
  data,
  age_name,
  score_name,
  family,
  selcrit = "BIC",
  spline = FALSE,
  method = "RS(10000)",
  max_poly = c(5, 5, 2, 2),
  min_poly = c(0, 0, 0, 0),
  start_poly = c(2, 1, 0, 0),
  trace = TRUE,
  seed = 123,
  parallel = FALSE
)

Returns a selected GAMLSS model with the chosen polynomial degrees and the final criterion value.

data: data.frame. Sample on which to fit the distribution; contains the scores and ages.
age_name: string. Name of the age variable.
score_name: string. Name of the score variable.
family: string. For example, "BB", "BCPE", "NO", etc.
  See gamlss.dist::gamlss.family for more information.
selcrit: string. Model selection criterion: "AIC", "BIC" (default), "GAIC(3)", or "CV"
  (cross-validation with 10 folds).
spline: logical. If FALSE (default), estimate polynomial(s) for \(\mu\);
  if TRUE, estimate a P-spline for \(\mu\).
method: string. Estimation method for gamlss::gamlss(). Either "RS()", "CG()", or "mixed()",
  with iteration count. Default is "RS(10000)".
max_poly: vector. Maximum polynomial degrees for each parameter.
min_poly: vector. Minimum polynomial degrees for each parameter.
start_poly: vector. Starting polynomial degrees for each parameter.
trace: logical. If TRUE, prints progress during selection.
seed: integer. Random seed for cross-validation folds.
parallel: logical. If TRUE, candidate models are evaluated in
  parallel using future.apply. This can reduce elapsed time
  for computationally heavy settings (e.g., large datasets, distributions
  with many parameters, or when using cross-validation as the selection
  criterion). For light models or small datasets, the overhead of
  parallelization may make it slower than sequential evaluation.
  Parallelization is not supported for user-defined distribution families;
  use built-in gamlss.dist families instead. Default is FALSE.

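To make the selection criteria concrete, the following base R sketch computes AIC, GAIC(3), and BIC as one penalized-likelihood formula with different penalties, and builds reproducible 10-fold cross-validation folds from a seed. It uses simulated data and a plain lm() fit as a stand-in for a GAMLSS model. The helper `gaic()` is a hypothetical name, and the squared-error fold score is an assumption for illustration: this page does not state how fb_select() scores held-out folds.

```r
set.seed(123)  # same value as the default seed argument; the data are simulated
n <- 200
age <- runif(n, 5, 20)
score <- 10 + 2 * age - 0.05 * age^2 + rnorm(n)
dat <- data.frame(age = age, score = score)

fit <- lm(score ~ poly(age, 2), data = dat)

# Generalized AIC: -2 * logLik + k * (number of estimated parameters).
gaic <- function(model, k) AIC(model, k = k)

aic_value   <- gaic(fit, k = 2)       # AIC uses penalty k = 2
gaic3_value <- gaic(fit, k = 3)       # GAIC(3) uses penalty k = 3
bic_value   <- gaic(fit, k = log(n))  # BIC uses penalty k = log(n)

# 10-fold cross-validation: shuffle balanced fold labels, then sum the
# out-of-fold squared prediction errors (an illustrative score only).
folds <- sample(rep(1:10, length.out = n))
cv_error <- sum(vapply(1:10, function(f) {
  train <- dat[folds != f, ]
  test  <- dat[folds == f, ]
  m <- lm(score ~ poly(age, 2), data = train)
  sum((test$score - predict(m, newdata = test))^2)
}, numeric(1)))
```

Because the fold labels come from sample(), fixing the seed is what makes a cross-validated selection repeatable across runs.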
If parallel = TRUE, candidate models are evaluated in parallel using the
future and future.apply packages. If these packages are not installed,
a message is printed and the function continues with sequential evaluation.
Parallelization can reduce elapsed time for large datasets, complex models and cross-validation,
but may be slower than sequential evaluation for smaller problems.
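The fallback behaviour described above can be sketched as follows. This is an assumption-laden illustration rather than the package's code: `eval_candidates()` and `toy_fit()` are hypothetical names, and whether fb_select() configures a future plan itself is not stated on this page. The sketch checks for the two packages, prints a message and falls back to lapply() if either is missing, and otherwise dispatches to future.apply::future_lapply().

```r
# Hypothetical helper, not part of normref.
eval_candidates <- function(candidates, fit_fn, parallel = FALSE) {
  have_pkgs <- requireNamespace("future", quietly = TRUE) &&
    requireNamespace("future.apply", quietly = TRUE)
  if (parallel && !have_pkgs) {
    message("future/future.apply not installed; evaluating sequentially.")
    parallel <- FALSE
  }
  if (parallel) {
    # The number of workers is decided by the active future plan, which the
    # caller would set beforehand, e.g. future::plan(future::multisession).
    future.apply::future_lapply(candidates, fit_fn, future.seed = TRUE)
  } else {
    lapply(candidates, fit_fn)
  }
}

candidates <- list(c(2, 1, 0, 0), c(3, 1, 0, 0), c(2, 2, 0, 0))
toy_fit <- function(deg) sum(deg)  # stand-in for fitting and scoring a model

seq_res <- eval_candidates(candidates, toy_fit, parallel = FALSE)
par_res <- eval_candidates(candidates, toy_fit, parallel = TRUE)
identical(seq_res, par_res)
```

Both paths return results in the order of the candidate list, so the caller can pick the best model the same way in either mode.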
Voncken, L., Albers, C. J., & Timmerman, M. E. (2019). Model selection in continuous test norming with GAMLSS. Assessment.
shape_data(), fb_select(), normtable_create()
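# Load the example data shipped with the package
invisible(data("ids_data"))

# Prepare the scores for the beta-binomial ("BB") family
mydata <- shape_data(ids_data, age_name = "age", score_name = "y14", family = "BB")

# Select polynomial degrees using BIC
mod <- fb_select(mydata, age_name = "age", score_name = "shaped_score",
                 family = "BB", selcrit = "BIC")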