msgl.cv: Multinomial sparse group lasso cross validation

Description

Multinomial sparse group lasso cross validation using multiple processors.

Usage

msgl.cv(x, classes, sampleWeights = NULL,
    grouping = NULL, groupWeights = NULL,
    parameterWeights = NULL, alpha = 0.5,
    standardize = TRUE, lambda, fold = 10L,
    cv.indices = list(), intercept = TRUE,
    sparse.data = is(x, "sparseMatrix"), max.threads = 2L,
    seed = NULL, algorithm.config = msgl.standard.config)

Arguments

x

design matrix, matrix of size $N \times p$.

classes

classes, factor of length $N$.

sampleWeights

sample weights, a vector of length $N$.

grouping

grouping of features (covariates), a vector of length $p$. Each element of the vector specifies the group of the corresponding feature.

groupWeights

the group weights, a vector of length $m$ (the number of groups). If groupWeights = NULL, default weights will be used. Default weights are 0 for the intercept and $\sqrt{K\cdot\textrm{number of features in the group}}$ for all other weights.
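
As a rough illustration of the default group weight formula, the following base R sketch computes $\sqrt{K\cdot\textrm{number of features in the group}}$ for a toy grouping. The grouping vector and the value of K are made up for the example, K is read here as the number of classes, and the intercept weight of 0 is not included. This is not the package's internal code.

```r
# Hypothetical toy setup: 6 features in 3 groups, K = 4 classes.
grouping <- factor(c("a", "a", "b", "b", "b", "c"))
K <- 4

# Number of features in each group.
group_sizes <- table(grouping)

# Default weight as described above: sqrt(K * number of features in the group).
default_group_weights <- sqrt(K * as.numeric(group_sizes))
names(default_group_weights) <- names(group_sizes)

print(default_group_weights)
```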

parameterWeights

a matrix of size $K \times p$. If parameterWeights = NULL, default weights will be used. Default weights are 0 for the intercept weights and 1 for all other weights.

alpha

the $\alpha$ value: 0 for group lasso, 1 for lasso; a value between 0 and 1 gives a sparse group lasso penalty.
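
To make the role of $\alpha$ concrete, here is a minimal base R sketch of a sparse group lasso penalty that mixes a weighted group lasso term and a weighted lasso term. The function name `sgl_penalty` and its arguments are hypothetical, the intercept is left out, and the exact form used inside the package is assumed rather than taken from this page.

```r
# Sketch of a sparse group lasso penalty for a single lambda value.
# beta: K x p parameter matrix (no intercept column in this sketch).
# grouping: length-p vector of group labels.
sgl_penalty <- function(beta, grouping, lambda, alpha,
                        group_weights, parameter_weights) {
  # Lasso part: weighted sum of absolute parameter values.
  l1 <- sum(parameter_weights * abs(beta))
  # Group lasso part: weighted sum of Euclidean norms of each group's block.
  groups <- unique(grouping)
  l2 <- sum(vapply(seq_along(groups), function(j) {
    block <- beta[, grouping == groups[j], drop = FALSE]
    group_weights[j] * sqrt(sum(block^2))
  }, numeric(1)))
  lambda * ((1 - alpha) * l2 + alpha * l1)
}

# Hypothetical toy values: K = 2 classes, p = 3 features, 2 groups.
beta <- matrix(c(1, -2, 0, 0, 3, 4), nrow = 2)
grouping <- c(1, 1, 2)
gw <- c(1, 1)
pw <- matrix(1, nrow = 2, ncol = 3)

pen_lasso <- sgl_penalty(beta, grouping, lambda = 1, alpha = 1, gw, pw)
pen_group <- sgl_penalty(beta, grouping, lambda = 1, alpha = 0, gw, pw)
pen_mixed <- sgl_penalty(beta, grouping, lambda = 1, alpha = 0.5, gw, pw)
```

With alpha = 1 only the lasso term remains, with alpha = 0 only the group term remains, and intermediate values weight the two.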

standardize

if TRUE the features are standardized before fitting the model. The model parameters are returned in the original scale.

lambda

the lambda sequence for the regularization path.

fold

the fold of the cross validation, an integer larger than $1$ and less than $N+1$. Ignored if cv.indices != NULL. If fold $\le$ max(table(classes)) then the data will be split into fold disjoint subsets.

cv.indices

a list of indices of a cross validation splitting. If cv.indices = NULL then a random splitting will be generated using the fold argument.
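
A custom splitting could be built along the following lines. This base R sketch creates a list with one element per fold while spreading each class roughly evenly over the folds. The toy labels are made up, and it is an assumption that each list element should hold the sample indices belonging to one fold.

```r
set.seed(1)  # only to make the sketch reproducible

# Hypothetical class labels: 3 classes, 30 samples.
classes <- factor(rep(c("a", "b", "c"), times = c(12, 9, 9)))
fold <- 3L

# Assign a fold number to each sample, class by class, so that every
# class is spread roughly evenly over the folds.
fold_id <- integer(length(classes))
for (cl in levels(classes)) {
  idx <- which(classes == cl)
  fold_id[idx] <- sample(rep_len(seq_len(fold), length(idx)))
}

# One list element per fold, holding the sample indices in that fold.
cv_indices <- unname(split(seq_along(classes), fold_id))
```

A list like `cv_indices` would then be a candidate value for the cv.indices argument, in which case fold is ignored.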

intercept

should the model include intercept parameters

sparse.data

if TRUE x will be treated as sparse, if x is a sparse matrix it will be treated as sparse by default.

max.threads

the maximal number of threads to be used

seed

deprecated, use set.seed.

algorithm.config

the algorithm configuration to be used.

Value

link

the linear predictors: a list of length length(lambda), one item for each lambda value, with each item a matrix of size $K \times N$ containing the linear predictors.

response

the estimated probabilities: a list of length length(lambda), one item for each lambda value, with each item a matrix of size $K \times N$ containing the probabilities.

classes

the estimated classes: a matrix of size $N \times d$ with $d=$length(lambda).

cv.indices

the cross validation splitting used.

features

number of features used in the models.

parameters

number of parameters used in the models.

classes.true

the true classes used for estimation; this is equal to the classes argument.
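
The classes and classes.true components can be compared directly to get a misclassification rate per lambda value. The base R sketch below uses small made-up stand-ins for the two components rather than a real fit, and it ignores sample weights, so it is not guaranteed to reproduce what Err computes.

```r
# Hypothetical stand-ins for fit.cv$classes (N x d) and fit.cv$classes.true.
classes_true <- c("a", "b", "a", "b")
classes_hat <- cbind(
  c("a", "b", "b", "b"),  # predictions for the first lambda value
  c("a", "b", "a", "b")   # predictions for the second lambda value
)

# Fraction of samples misclassified, one value per lambda value.
# The length-N vector is recycled down each column of the N x d matrix.
misclassification <- colMeans(classes_hat != classes_true)
print(misclassification)
```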

Examples


data(SimData)
x <- sim.data$x
classes <- sim.data$classes

lambda <- msgl.lambda.seq(x, classes, alpha = .5, d = 50, lambda.min = 0.05)
fit.cv <- msgl.cv(x, classes, alpha = .5, lambda = lambda)

# Cross validation errors (estimated expected generalization error)

# Misclassification rate
Err(fit.cv)

# Negative log likelihood error
Err(fit.cv, type="loglike")
