run_lasso_mc_lambda is called from within run_lasso. It tunes the Lasso
penalty parameter lambda using multiple cores.
run_lasso_mc_lambda(
y,
L1.x,
L2.x,
L2.unit,
L2.reg,
loss.unit,
loss.fun,
data,
cores,
L2.fe.form,
L1.re,
lambda
)
The cross-validation errors for all models. A list.
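As a rough illustration of tuning over several cores, the sketch below evaluates a few candidate penalty values in parallel and returns one cross-validation error per value as a list. It uses only the base `parallel` package. The helper `cv_error_for_lambda` and its toy "shrunken mean" predictor are hypothetical stand-ins, not the package's internal fitting code.

```r
library(parallel)

# Hypothetical scoring function: NOT the internals of run_lasso_mc_lambda.
# It returns the mean squared error across folds for one penalty value.
cv_error_for_lambda <- function(lambda, folds) {
  errs <- vapply(seq_along(folds), function(k) {
    train <- do.call(rbind, folds[-k])  # all folds except the k-th
    test <- folds[[k]]                  # held-out fold
    # Toy predictor: a larger lambda shrinks the prediction toward 0.
    pred <- mean(train$y) / (1 + lambda)
    mean((test$y - pred)^2)
  }, numeric(1))
  mean(errs)
}

set.seed(1)
df <- data.frame(y = rbinom(100, 1, 0.4))
folds <- split(df, rep(1:5, length.out = nrow(df)))  # five folds
lambdas <- c(0.1, 1, 10)

cl <- makeCluster(2)  # two worker processes
cv_errors <- parLapply(cl, lambdas, cv_error_for_lambda, folds = folds)
stopCluster(cl)
```

Each element of `cv_errors` corresponds to the value at the same position in `lambdas`.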
Outcome variable. A character scalar containing the column name of
the outcome variable in survey.
Individual-level covariates. A character vector containing the
column names of the individual-level variables in survey and census
used to predict outcome y. Note that geographic unit is specified in
argument L2.unit.
Context-level covariates. A character vector containing the
column names of the context-level variables in survey and census
used to predict outcome y. To exclude context-level variables, set
L2.x = NULL.
Geographic unit. A character scalar containing the column
name of the geographic unit in survey and census at which
outcomes should be aggregated.
Geographic region. A character scalar containing the column
name of the geographic region in survey and census by which
geographic units are grouped (L2.unit must be nested within
L2.reg). Default is NULL.
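The nesting requirement can be checked before tuning. The following is a minimal sketch in base R; the helper `is_nested` and the column names `state` and `region` are illustrative assumptions, not part of the package.

```r
# Hypothetical helper: TRUE if every geographic unit maps to exactly one region.
is_nested <- function(unit, region) {
  regions_per_unit <- tapply(region, unit, function(r) length(unique(r)))
  all(regions_per_unit == 1)
}

# Toy survey data with illustrative column names.
survey <- data.frame(
  state  = c("A", "A", "B", "C", "C"),
  region = c("north", "north", "north", "south", "south")
)
nested_ok <- is_nested(survey$state, survey$region)

# Counterexample: state "A" appears in two regions.
nested_bad <- is_nested(c("A", "A", "B"), c("north", "south", "south"))
```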
Loss function unit. A character-valued scalar indicating
whether performance loss should be evaluated at the level of individual
respondents (individuals), geographic units (L2 units) or at
both levels. Default is c("individuals", "L2 units"). With multiple
loss units, parameters are ranked for each loss unit and the parameter
combination with the lowest rank sum is chosen. Ties are broken according
to the order in the search grid.
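The rank-sum selection described above can be sketched as follows. The loss values are invented for illustration, lower values are assumed to be better, and the use of `ties.method = "min"` within a column is an assumption rather than the package's documented behaviour.

```r
# Hypothetical CV losses for four candidate lambda values (lower is better).
cv_loss <- data.frame(
  lambda      = c(0.1, 1, 10, 100),
  individuals = c(0.30, 0.25, 0.27, 0.40),
  L2_units    = c(0.070, 0.062, 0.060, 0.090)
)

# Rank the candidates separately within each loss unit.
ranks <- sapply(cv_loss[, c("individuals", "L2_units")], rank,
                ties.method = "min")

# Sum the ranks across loss units; which.min() returns the first minimum,
# so a tie goes to the candidate that comes first in the grid.
rank_sum <- rowSums(ranks)
best_lambda <- cv_loss$lambda[which.min(rank_sum)]
```

Here the second and third candidates tie on rank sum, and the earlier one in the grid is selected.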
Loss function. A character-valued scalar indicating whether
prediction loss should be measured by the mean squared error (MSE),
the mean absolute error (MAE), binary cross-entropy (cross-entropy),
mean squared false error (msfe), the f1 score (f1), or a combination
thereof. Default is c("MSE", "cross-entropy", "msfe", "f1"). With
multiple loss functions, parameters are ranked for each loss function
and the parameter combination with the lowest rank sum is chosen. Ties
are broken according to the order in the search grid.
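For orientation, four of these measures can be computed by hand from their standard textbook definitions, as in the sketch below. The data are invented, the 0.5 classification threshold for the F1 score is an assumption, and the package's own implementations may differ in detail. The mean squared false error is omitted here.

```r
# Toy binary outcomes and predicted probabilities (illustrative values).
y_true <- c(1, 0, 1, 1, 0)
y_prob <- c(0.9, 0.2, 0.6, 0.4, 0.1)

# Mean squared error and mean absolute error.
mse <- mean((y_true - y_prob)^2)
mae <- mean(abs(y_true - y_prob))

# Binary cross-entropy (probabilities must lie strictly between 0 and 1).
cross_entropy <- -mean(y_true * log(y_prob) + (1 - y_true) * log(1 - y_prob))

# F1 score after thresholding at 0.5 (assumed threshold).
y_hat <- as.integer(y_prob >= 0.5)
tp <- sum(y_hat == 1 & y_true == 1)
fp <- sum(y_hat == 1 & y_true == 0)
fn <- sum(y_hat == 0 & y_true == 1)
f1 <- 2 * tp / (2 * tp + fp + fn)
```

Note that the F1 score is a measure where higher is better, unlike the three loss measures above.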
Data for cross-validation. A list of \(k\) data.frames, one for each
fold to be used in \(k\)-fold cross-validation.
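A list of this shape can be built in base R as sketched below. This is a plain random split for illustration only; the toy data and the fold count are assumptions, and the package may assign observations to folds differently (for example, by geographic unit).

```r
set.seed(42)

# Toy survey data (illustrative columns).
survey <- data.frame(y = rbinom(20, 1, 0.5), x = rnorm(20))

k <- 5
# Assign each row to one of k folds at random, with balanced fold sizes.
fold_id <- sample(rep(seq_len(k), length.out = nrow(survey)))

# A list of k data.frames, one per fold.
folds <- split(survey, fold_id)
```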
The number of cores to be used. An integer indicating the number of processor cores used for parallel computing. Default is 1.
The fixed effects part of the Lasso classifier formula. The
formula is inherited from run_lasso.
A list of random effects for the Lasso classifier formula. The
formula is inherited from run_lasso.
Lasso penalty parameter. A numeric vector of non-negative values. The
penalty parameter controls the shrinkage of the context-level variables
in the Lasso model. Default is a sequence with minimum 0.1 and maximum
250 that is equally spaced on the log-scale. The number of values is
controlled by the lasso.n.iter parameter.
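A grid with these properties can be generated as in the sketch below. The variable `n_iter` is a stand-in for lasso.n.iter, and the value 10 is chosen only for illustration.

```r
# Stand-in for lasso.n.iter; the value 10 is illustrative only.
n_iter <- 10

# Equally spaced on the log scale between 0.1 and 250.
lambda <- exp(seq(log(0.1), log(250), length.out = n_iter))
```

Consecutive values in `lambda` differ by a constant ratio, so the grid is denser near the minimum than near the maximum.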