
msgl (version 2.0.125.0)

msgl: Fit a multinomial sparse group lasso regularization path.

Description

Fit a sequence of multinomial logistic regression models using sparse group lasso, group lasso or lasso. In addition to the standard parameter grouping the algorithm supports further grouping of the features.

Usage

msgl(x, classes,
    sampleWeights = rep(1/length(classes), length(classes)),
    grouping = NULL, groupWeights = NULL,
    parameterWeights = NULL, alpha = 0.5,
    standardize = TRUE, lambda, return = 1:length(lambda),
    intercept = TRUE, sparse.data = is(x, "sparseMatrix"),
    algorithm.config = msgl.standard.config)

Arguments

x
design matrix, matrix of size $N \times p$.
classes
classes, factor of length $N$.
sampleWeights
sample weights, a vector of length $N$.
grouping
grouping of the features, a vector of length $p$. Each element of the vector specifies the group of the corresponding feature.
groupWeights
the group weights, a vector of length $m$ (the number of groups). If groupWeights = NULL default weights will be used. Default weights are 0 for the intercept and $$\sqrt{K\cdot\textrm{number of features in the group}}$$ for all other groups.
parameterWeights
a matrix of size $K \times p$. If parameterWeights = NULL default weights will be used. Default weights are 0 for the intercept weights and 1 for all other weights.
alpha
the $\alpha$ value: 0 gives group lasso, 1 gives lasso, and values between 0 and 1 give a sparse group lasso penalty.
standardize
if TRUE the features are standardized before fitting the model. The model parameters are returned on the original scale.
lambda
the lambda sequence for the regularization path.
return
the indices of the lambda values for which to return the fitted parameters.
intercept
should the model include intercept parameters.
sparse.data
if TRUE x will be treated as sparse; by default x is treated as sparse if it is a sparseMatrix.
algorithm.config
the algorithm configuration to be used.
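
The default group weights described under groupWeights can be computed directly. The following base-R sketch (the variable names are illustrative, not part of the msgl API) evaluates $\sqrt{K \cdot \textrm{number of features in the group}}$ for a small grouping:

```r
# Sketch: the default non-intercept group weights described above.
# Assumptions (illustrative, not package code): `grouping` assigns each of the
# p features to a group, and K is the number of classes.
grouping <- c(1, 1, 2, 2, 2, 3)  # p = 6 features in m = 3 groups
K <- 4                           # number of classes

group.sizes <- as.numeric(table(grouping))
# Default weight for each group: sqrt(K * number of features in the group)
groupWeights <- sqrt(K * group.sizes)
groupWeights
```

Passing a vector built this way (with a leading 0 for the intercept group, when present) should reproduce the default behaviour.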

Value

  • beta: the fitted parameters, a list of length length(lambda) with each entry a matrix of size $K\times (p+1)$ holding the fitted parameters
  • loss: the values of the loss function
  • objective: the values of the objective function (i.e. loss + penalty)
  • lambda: the lambda values used
  • classes.true: the true classes used for estimation; equal to the classes argument

Details

Consider a classification problem with $K$ classes and $p$ features (covariates) divided into $m$ groups. This function computes a sequence of minimizers (one for each lambda given in the lambda argument) of $$\hat R(\beta) + \lambda \left( (1-\alpha) \sum_{J=1}^m \gamma_J \|\beta^{(J)}\|_2 + \alpha \sum_{i=1}^{n} \xi_i |\beta_i| \right)$$ where $\hat R$ is the weighted empirical log-likelihood risk of the multinomial regression model. The vector $\beta^{(J)}$ denotes the parameters associated with the $J$'th group of features (the default is one covariate per group, hence the default dimension of $\beta^{(J)}$ is $K$). The group weights $\gamma \in [0,\infty)^m$ and parameter weights $\xi \in [0,\infty)^n$ may be explicitly specified.
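
The penalty term above can be evaluated directly for a given parameter matrix. The following base-R sketch (the function name and arguments are illustrative, not part of the msgl package) computes $\lambda \left( (1-\alpha) \sum_J \gamma_J \|\beta^{(J)}\|_2 + \alpha \sum_i \xi_i |\beta_i| \right)$:

```r
# Sketch: evaluating the sparse group lasso penalty from the formula above.
# All names here are illustrative; this is not code from the msgl package.
sgl.penalty <- function(beta, grouping, groupWeights, parameterWeights,
                        lambda, alpha) {
  # Group term: sum over groups J of gamma_J * ||beta^(J)||_2
  group.term <- sum(sapply(unique(grouping), function(J) {
    cols <- grouping == J
    groupWeights[J] * sqrt(sum(beta[, cols, drop = FALSE]^2))
  }))
  # Lasso term: sum over all parameters i of xi_i * |beta_i|
  lasso.term <- sum(parameterWeights * abs(beta))
  lambda * ((1 - alpha) * group.term + alpha * lasso.term)
}

K <- 2; p <- 4
beta <- matrix(c(1, -1, 0, 2, 0, 0, 3, 0), nrow = K)  # K x p parameter matrix
grouping <- c(1, 1, 2, 2)                             # two groups of two features
pen <- sgl.penalty(beta, grouping,
                   groupWeights = sqrt(K * c(2, 2)),  # default group weights
                   parameterWeights = matrix(1, K, p),
                   lambda = 0.1, alpha = 0.5)
pen
```

Setting alpha = 0 or alpha = 1 in this sketch recovers the pure group lasso and pure lasso penalties, respectively.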

Examples

# Load the simulated data set shipped with the package
data(SimData)
x <- sim.data$x
classes <- sim.data$classes

# Compute a lambda sequence for the regularization path
lambda <- msgl.lambda.seq(x, classes, alpha = .5, d = 50, lambda.min = 0.05)

# Fit the multinomial sparse group lasso path
fit <- msgl(x, classes, alpha = .5, lambda = lambda)

# Model 10, i.e. the model corresponding to lambda[10]
models(fit)[[10]]

# The nonzero features of model 10
features(fit)[[10]]

# The nonzero parameters of model 10
parameters(fit)[[10]]

# The training errors of the models
Err(fit, x)
# Note: for high-dimensional models the training error is almost always
# over-optimistic; use msgl.cv instead to estimate the expected error by
# cross validation
