Fit a mixed model with a lasso, group lasso, or sparse-group lasso penalty via proximal gradient descent. Because the algorithm is iterative, the step size for each iteration is determined via backtracking line search. A grid search for the regularization parameter \(\lambda\) is performed using warm starts. The mixed model has the form: $$y = X b + Z u + residual.$$ The penalty of the sparse-group lasso (without additional weights for features) is then: $$\alpha \lambda ||u||_1 + (1 - \alpha) \lambda \sum_l \omega^G_l ||u^{(l)}||_2.$$ If \(\alpha = 1\), this leads to the lasso. If \(\alpha = 0\), this leads to the group lasso. Furthermore, if the \(l_2\)-norm is applied to the fitted values \(Z^{(l)} u^{(l)}\) instead of to \(u^{(l)}\), two more algorithms may be called: either the fitted group lasso or the fitted sparse-group lasso.
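The penalty above can be evaluated directly for a small example. The following is a minimal sketch, not the package's implementation: the helper name `sgl_penalty` and its arguments are hypothetical, and the default group weight (square root of the group size) is an assumption made only for illustration.

```r
# Hypothetical helper (not part of the package): evaluates
# alpha * lambda * ||u||_1 + (1 - alpha) * lambda * sum_l w_l * ||u^(l)||_2.
sgl_penalty <- function(u, groups, lambda, alpha, group_weights = NULL) {
  group_ids <- sort(unique(groups))
  if (is.null(group_weights)) {
    # Assumption: weight each group by the square root of its size.
    group_weights <- vapply(group_ids, function(g) sqrt(sum(groups == g)),
                            numeric(1))
  }
  # Euclidean norm of the coefficients within each group.
  group_norms <- vapply(group_ids, function(g) sqrt(sum(u[groups == g]^2)),
                        numeric(1))
  alpha * lambda * sum(abs(u)) +
    (1 - alpha) * lambda * sum(group_weights * group_norms)
}

u <- c(3, -4, 0, 2)
groups <- c(1, 1, 2, 2)
lasso_value <- sgl_penalty(u, groups, lambda = 1, alpha = 1)
group_lasso_value <- sgl_penalty(u, groups, lambda = 1, alpha = 0,
                                 group_weights = c(1, 1))
```

With \(\alpha = 1\) only the \(l_1\) term remains, and with \(\alpha = 0\) only the weighted sum of group norms remains, which mirrors the two special cases named in the description.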
seagull_fitted_group_lasso(
VECTOR_Yc,
Y_MEAN,
MATRIX_Xc,
VECTOR_Xc_MEANS,
VECTOR_Xc_STANDARD_DEVIATIONS,
VECTOR_WEIGHTS_FEATURESc,
VECTOR_WEIGHTS_GROUPSc,
VECTOR_FULL_COLUMN_RANK,
VECTOR_GROUPS,
VECTOR_BETAc,
VECTOR_INDEX_PERMUTATION,
VECTOR_INDEX_EXCLUDE,
EPSILON_CONVERGENCE,
ITERATION_MAX,
GAMMA,
LAMBDA_MAX,
PROPORTION_XI,
DELTA,
NUMBER_INTERVALS,
NUMBER_FIXED_EFFECTS,
NUMBER_VARIABLES,
INTERNAL_STANDARDIZATION,
TRACE_PROGRESS
)
seagull_fitted_sparse_group_lasso(
VECTOR_Yc,
Y_MEAN,
MATRIX_Xc,
VECTOR_Xc_MEANS,
VECTOR_Xc_STANDARD_DEVIATIONS,
VECTOR_WEIGHTS_FEATURESc,
VECTOR_WEIGHTS_GROUPSc,
VECTOR_FULL_COLUMN_RANK,
VECTOR_GROUPS,
VECTOR_BETAc,
VECTOR_INDEX_PERMUTATION,
VECTOR_INDEX_EXCLUDE,
ALPHA,
EPSILON_CONVERGENCE,
ITERATION_MAX,
LAMBDA_MAX,
PROPORTION_XI,
DELTA,
STEP_SIZE,
NUMBER_INTERVALS,
NUMBER_FIXED_EFFECTS,
NUMBER_VARIABLES,
INTERNAL_STANDARDIZATION,
TRACE_PROGRESS
)
seagull_group_lasso(
VECTOR_Yc,
Y_MEAN,
MATRIX_Xc,
VECTOR_Xc_MEANS,
VECTOR_Xc_STANDARD_DEVIATIONS,
VECTOR_WEIGHTS_FEATURESc,
VECTOR_GROUPS,
VECTOR_BETAc,
VECTOR_INDEX_PERMUTATION,
VECTOR_INDEX_EXCLUDE,
EPSILON_CONVERGENCE,
ITERATION_MAX,
GAMMA,
LAMBDA_MAX,
PROPORTION_XI,
NUMBER_INTERVALS,
NUMBER_FIXED_EFFECTS,
NUMBER_VARIABLES,
INTERNAL_STANDARDIZATION,
TRACE_PROGRESS
)
seagull_lasso(
VECTOR_Yc,
Y_MEAN,
MATRIX_Xc,
VECTOR_Xc_MEANS,
VECTOR_Xc_STANDARD_DEVIATIONS,
VECTOR_WEIGHTS_FEATURESc,
VECTOR_BETAc,
VECTOR_INDEX_EXCLUDE,
EPSILON_CONVERGENCE,
ITERATION_MAX,
GAMMA,
LAMBDA_MAX,
PROPORTION_XI,
NUMBER_INTERVALS,
NUMBER_FIXED_EFFECTS,
NUMBER_VARIABLES,
INTERNAL_STANDARDIZATION,
TRACE_PROGRESS
)
seagull_sparse_group_lasso(
VECTOR_Yc,
Y_MEAN,
MATRIX_Xc,
VECTOR_Xc_MEANS,
VECTOR_Xc_STANDARD_DEVIATIONS,
VECTOR_WEIGHTS_FEATURESc,
VECTOR_GROUPS,
VECTOR_BETAc,
VECTOR_INDEX_PERMUTATION,
VECTOR_INDEX_EXCLUDE,
ALPHA,
EPSILON_CONVERGENCE,
ITERATION_MAX,
GAMMA,
LAMBDA_MAX,
PROPORTION_XI,
NUMBER_INTERVALS,
NUMBER_FIXED_EFFECTS,
NUMBER_VARIABLES,
INTERNAL_STANDARDIZATION,
TRACE_PROGRESS
)
numeric vector of observations.
arithmetic mean of VECTOR_Yc.
numeric design matrix relating y to fixed and random effects \([X Z]\). The columns may be permuted corresponding to their group assignments.
numeric vector of arithmetic means of each column of MATRIX_Xc.
numeric vector of estimates of standard deviations of each column of MATRIX_Xc. Values are calculated via the function colSds from the R-package matrixStats.
numeric vector of weights for the vectors of fixed and random effects \([b^T, u^T]^T\). The entries may be permuted corresponding to their group assignments.
numeric vector of pre-calculated weights for each group.
Boolean vector indicating whether the group-wise parts of the filtered matrix Z, i.e., \(Z^{(l)}\) for each group l, have full column rank.
integer vector specifying which effect (fixed and random) belongs to which group.
numeric vector whose partitions will be returned (partition 1: estimates of fixed effects, partition 2: predictions of random effects). During the computation the entries may be in permuted order, but they will be returned according to the order of the user's input.
integer vector that contains information about the original order of the user's input.
integer vector that contains the indices of every column that was filtered due to low standard deviation. This vector only has an effect if standardize = TRUE is used.
value for relative accuracy of the solution to stop the algorithm for the current value of \(\lambda\). The algorithm stops after iteration m, if: $$||sol^{(m)} - sol^{(m-1)}||_\infty < \epsilon_c * ||sol^{(m-1)}||_2.$$
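The stopping rule compares the largest absolute change between two consecutive solutions with a fraction of the Euclidean norm of the previous solution. A minimal sketch of that check follows; the function name `has_converged` is hypothetical and is not taken from the package.

```r
# Hypothetical helper: TRUE if the infinity norm of the update is below
# epsilon_c times the Euclidean norm of the previous solution.
has_converged <- function(sol_new, sol_old, epsilon_c) {
  max(abs(sol_new - sol_old)) < epsilon_c * sqrt(sum(sol_old^2))
}

sol_old <- c(1, 2, 2)  # Euclidean norm is 3
small_change <- has_converged(c(1, 2, 2.0001), sol_old, epsilon_c = 1e-4)
large_change <- has_converged(c(1, 2, 2.1), sol_old, epsilon_c = 1e-4)
```

In the first call the change is about 1e-4, which is below the threshold 3e-4; in the second call the change of 0.1 is not.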
maximum number of iterations for each value of the penalty parameter \(\lambda\). Determines the end of the calculation if the algorithm didn't converge according to EPSILON_CONVERGENCE before.
multiplicative parameter to decrease the step size during backtracking line search. Has to satisfy: \(0 < \gamma < 1\).
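To illustrate the role of \(\gamma\), here is a sketch of a single proximal gradient step with backtracking line search for the plain lasso. It is not the package's code: the function names, the unscaled least-squares loss, the initial step size of 1, and the quadratic-upper-bound acceptance test are all assumptions chosen for illustration.

```r
# Soft-thresholding: the proximal operator of the l1-norm.
soft_threshold <- function(x, threshold) sign(x) * pmax(abs(x) - threshold, 0)

# Hypothetical single step: shrink `step` by the factor gamma until the
# smooth loss at the candidate lies below its quadratic upper bound.
prox_step_backtracking <- function(X, y, beta, lambda, gamma, step = 1) {
  loss <- function(b) 0.5 * sum((y - X %*% b)^2)  # assumed loss scaling
  gradient <- as.numeric(crossprod(X, X %*% beta - y))
  repeat {
    beta_new <- soft_threshold(beta - step * gradient, step * lambda)
    change <- beta_new - beta
    bound <- loss(beta) + sum(gradient * change) + sum(change^2) / (2 * step)
    if (loss(beta_new) <= bound) break
    step <- gamma * step  # 0 < gamma < 1 decreases the step size
  }
  list(beta = beta_new, step = step)
}

set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)
y <- rnorm(10)
result <- prox_step_backtracking(X, y, beta = rep(0, 4), lambda = 0.1,
                                 gamma = 0.5)
```

A value of \(\gamma\) close to 1 tries many step sizes before accepting one, while a small \(\gamma\) shrinks quickly but may accept an unnecessarily short step.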
maximum value for the penalty parameter. This is the start value for the grid search of the penalty parameter \(\lambda\).
multiplicative parameter to determine the minimum value of \(\lambda\) for the grid search, i.e. \(\lambda_{min} = \xi * \lambda_{max}\). Has to satisfy: \(0 < \xi \le 1\). If xi=1, only a single solution for \(\lambda = \lambda_{max}\) is calculated.
numeric value, which is squared and added to the main diagonal of \(Z^{(l)T} Z^{(l)}\) for group l, if this matrix is not invertible.
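The effect of this parameter can be sketched in a few lines. The helper name `regularize_gram` is hypothetical, and using the QR rank of \(Z^{(l)}\) to detect a non-invertible matrix is an assumption made for this example.

```r
# Hypothetical helper: add delta^2 to the diagonal of t(Z_l) %*% Z_l
# only if Z_l does not have full column rank.
regularize_gram <- function(Z_l, delta) {
  gram <- crossprod(Z_l)  # t(Z_l) %*% Z_l
  full_rank <- qr(Z_l)$rank == ncol(Z_l)
  if (!full_rank) gram <- gram + delta^2 * diag(ncol(Z_l))
  gram
}

Z_l <- cbind(c(1, 2, 3), c(2, 4, 6))  # second column is twice the first
gram <- regularize_gram(Z_l, delta = 0.1)
gram_inverse <- solve(gram)  # succeeds because of the added diagonal
```

Without the added term the matrix in this example is singular, so `solve` would fail.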
number of lambdas for the grid search between \(\lambda_{max}\) and \(\xi * \lambda_{max}\). Loops are performed on a logarithmic grid.
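A logarithmic grid between \(\lambda_{max}\) and \(\xi * \lambda_{max}\) can be sketched as follows. The function name `lambda_grid` is hypothetical, and treating the count as the number of grid points including both end points is an assumption.

```r
# Hypothetical helper: values equally spaced on the log scale, starting at
# lambda_max and ending at xi * lambda_max.
lambda_grid <- function(lambda_max, xi, number_lambdas) {
  if (xi == 1 || number_lambdas == 1) return(lambda_max)
  exp(seq(log(lambda_max), log(xi * lambda_max), length.out = number_lambdas))
}

grid <- lambda_grid(lambda_max = 100, xi = 0.01, number_lambdas = 5)
```

On such a grid the ratio between consecutive values is constant, so small values of \(\lambda\) are sampled as densely, in relative terms, as large ones.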
non-negative integer to determine the number of fixed effects present in the mixed model.
non-negative integer which corresponds to the total number of columns of the initial model matrices X and Z.
if TRUE, the input vector y is centered, and each column of the input matrices X and Z is centered and scaled with an internal process. Additionally, a filter is applied to X and Z, which filters columns with standard deviation less than 1.e-7.
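The centering, scaling, and filtering described here can be sketched in base R. This is an illustration only: the function name `standardize_design` and the returned list elements are hypothetical, `apply(X, 2, sd)` stands in for `matrixStats::colSds`, and dropping the filtered columns from the matrix is an assumption.

```r
# Hypothetical helper: center y, center and scale the columns of X, and
# record the columns whose standard deviation is below the threshold.
standardize_design <- function(y, X, sd_threshold = 1e-7) {
  col_means <- colMeans(X)
  col_sds <- apply(X, 2, sd)  # base-R stand-in for matrixStats::colSds
  keep <- col_sds >= sd_threshold
  X_c <- sweep(X[, keep, drop = FALSE], 2, col_means[keep], "-")
  X_c <- sweep(X_c, 2, col_sds[keep], "/")
  list(y_c = y - mean(y), y_mean = mean(y), X_c = X_c,
       index_exclude = which(!keep))
}

X <- cbind(1, c(1, 2, 3, 4), c(2, 0, 1, 5))  # first column is constant
out <- standardize_design(y = c(1, 3, 5, 7), X = X)
```

The constant first column has standard deviation 0 and is therefore reported in `index_exclude`; scaling it would otherwise require a division by zero.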
if TRUE, a message is printed to the screen after each finished loop of the \(\lambda\) grid. This is particularly useful for larger data sets.
mixing parameter of the penalty terms. Satisfies: \(0 < \alpha < 1\). The penalty term looks as follows: $$\alpha * "lasso penalty" + (1-\alpha) * "group lasso penalty".$$
numeric value which represents the size of the step between consecutive iterations.
A list of estimates and parameters relevant for the computation:
estimate for the intercept, if present in the model.
estimates for the fixed effects b, if present in the model. Each row corresponds to a particular value of \(\lambda\).
predictions for the random effects u. Each row corresponds to a particular value of \(\lambda\).
all values for \(\lambda\) which were used during the grid search.
the number of iterations actually performed for each value of \(\lambda\). If one of these numbers is equal to max_iter, then the algorithm most likely did not converge to rel_acc during the corresponding run of the grid search.
The following parameters are also returned, primarily for the purpose of comparison and repetition: alpha = ALPHA (only for the sparse-group lasso), max_iter = ITERATION_MAX, gamma_bls = GAMMA, xi = PROPORTION_XI, and loops_lambda = NUMBER_INTERVALS.