
dglars (version 1.0.2)

cvdglars: Cross-validation deviance for dgLARS

Description

Uses the $k$-fold cross-validation deviance to estimate the solution point of the dgLARS solution curve.

Usage

cvdglars(formula, family = c("binomial", "poisson"), data, 
subset, contrast = NULL, control = list())

cvdglars.fit(X, y, family = c("binomial", "poisson"), control = list())

Arguments

formula
an object of class "formula": a symbolic description of the model to be fitted.
family
a description of the error distribution used in the model (see below for more details).
data
an optional data frame, list or environment (or object coercible by 'as.data.frame' to a data frame) containing the variables in the model. If not found in 'data', the variables are taken from 'environment(formula)'.
subset
an optional vector specifying a subset of observations to be used in the fitting process.
contrast
an optional list. See the 'contrasts.arg' argument of 'model.matrix.default'.
control
a list of control parameters. See 'Details'.
X
design matrix of dimension $n\times p$.
y
response vector.

Value

cvdglars returns an object with S3 class "cvdglars", i.e. a list containing the following components:

  • call: the call that produced this object;
  • family: a description of the error distribution used in the model;
  • beta: the vector of the coefficients estimated by cross-validation;
  • dev_m: a vector of length ng used to store the mean cross-validation deviance;
  • dev_v: a vector of length ng used to store the variance of the mean cross-validation deviance;
  • g0: the smallest value for the tuning parameter;
  • g_hat: the value of the tuning parameter corresponding to the minimum of the cross-validation deviance;
  • g_max: the value of the tuning parameter corresponding to the starting point of the dgLARS solution curve;
  • X: the design matrix used;
  • y: the response vector used;
  • conv: an integer value used to encode the warnings and the errors related to the algorithm used to compute the dgLARS solution curve;
  • control: the list of control parameters used to compute the cross-validation deviance.
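
As an illustration of the list above, the following sketch fits a small simulated logistic model and reads a few components from the returned object. The component names are the ones documented here for version 1.0.2; other versions of the package may name them differently, in which case the corresponding element is simply NULL. The simulated data are an assumption made for the example, and the code only runs if the dglars package is installed.

```r
# Sketch: inspecting the components of a "cvdglars" object.
# Assumes the dglars package is installed; data are simulated for illustration.
if (requireNamespace("dglars", quietly = TRUE)) {
  library(dglars)
  set.seed(123)
  X <- matrix(rnorm(100 * 10), 100, 10)
  y <- rbinom(100, 1, plogis(1 + 2 * X[, 1]))

  fit_cv <- cvdglars.fit(X, y, family = "binomial")

  fit_cv$g_hat          # tuning parameter minimising the CV deviance
  fit_cv$beta           # coefficients estimated by cross-validation
  length(fit_cv$dev_m)  # one mean CV deviance per tuning-parameter value
  fit_cv$conv           # integer code for warnings and errors
  coef(fit_cv)          # coef method listed under 'See Also'
}
```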

Details

The cvdglars function runs dglars nfold+1 times. The deviance of each fold is stored, and its mean and standard deviation over the folds are computed.
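
The idea of averaging a held-out deviance over folds can be sketched in base R. This is not the package's implementation: it uses glm on a single fixed model instead of the dgLARS solution curve, and the names foldid, binom_dev and cv_dev are hypothetical, introduced only for this illustration.

```r
# Sketch: k-fold cross-validation deviance for one logistic model (base R only).
set.seed(123)
n <- 100
x <- rnorm(n)
y <- rbinom(n, 1, plogis(1 + 2 * x))
dat <- data.frame(x = x, y = y)

nfold <- 5
# Randomly assign each observation to one of the folds
foldid <- sample(rep(seq_len(nfold), length.out = n))

# Binomial deviance of held-out responses given predicted probabilities
binom_dev <- function(y, mu) {
  mu <- pmin(pmax(mu, 1e-10), 1 - 1e-10)  # guard against log(0)
  -2 * sum(y * log(mu) + (1 - y) * log(1 - mu))
}

# Fit on all folds but one, evaluate the deviance on the held-out fold
cv_dev <- vapply(seq_len(nfold), function(k) {
  train <- dat[foldid != k, ]
  test  <- dat[foldid == k, ]
  fit <- glm(y ~ x, family = binomial(), data = train)
  mu_hat <- predict(fit, newdata = test, type = "response")
  binom_dev(test$y, mu_hat)
}, numeric(1))

dev_mean <- mean(cv_dev)  # average deviance over the folds
dev_sd   <- sd(cv_dev)    # its standard deviation over the folds
```

In cvdglars the same kind of summary is computed for each value of the tuning parameter, which is why dev_m and dev_v are vectors of length ng.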

cvdglars.fit is the workhorse function: it is more efficient when the design matrix has already been calculated. For this reason we suggest using this function when the dgLARS method is applied in a high-dimensional setting, i.e. when p > n.
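
The two entry points can be compared with a short sketch. It follows the signatures shown under Usage; the simulated data and the object names fit_formula and fit_matrix are assumptions made for the example, and the code only runs if the dglars package is installed. Because the folds are presumably drawn at random, the two fits need not select exactly the same tuning parameter.

```r
# Sketch: formula interface versus matrix interface.
# Assumes the dglars package is installed; data are simulated for illustration.
if (requireNamespace("dglars", quietly = TRUE)) {
  library(dglars)
  set.seed(123)
  n <- 100
  p <- 5
  X <- matrix(rnorm(n * p), n, p)
  colnames(X) <- paste0("x", seq_len(p))
  y <- rbinom(n, 1, plogis(1 + 2 * X[, 1]))
  dat <- data.frame(y = y, X)

  # Formula interface: the design matrix is built from 'data'
  fit_formula <- cvdglars(y ~ ., family = "binomial", data = dat)

  # Matrix interface: pass a design matrix that already exists
  fit_matrix <- cvdglars.fit(X, y, family = "binomial")
}
```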

The control argument is a list of control parameters used to compute the cross-validation deviance, for example nfold, the number of folds.

References

Augugliaro L., Mineo A.M. and Wit E.C. (2013) dgLARS: a differential geometric approach to sparse generalized linear models, Journal of the Royal Statistical Society. Series B., Vol 75(3), 471-498.

Augugliaro L., Mineo A.M. and Wit E.C. (2012) Differential geometric LARS via cyclic coordinate descent method, in Proceedings of COMPSTAT 2012, pp. 67-79. Limassol, Cyprus.

See Also

coef.cvdglars, print.cvdglars, plot.cvdglars methods

Examples

###########################
# Logistic regression model

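library(dglars)
set.seed(123)

# Simulate n observations on p predictors; only the first predictor is active
n <- 100
p <- 10
X <- matrix(rnorm(n*p), n, p)
b <- 1:2
eta <- b[1] + X[,1] * b[2]
mu <- binomial()$linkinv(eta)
y <- rbinom(n, 1, mu)

# Select the tuning parameter by cross-validation
fit_cv <- cvdglars.fit(X, y, family = "binomial")
# Refit the solution curve, using the selected value g_hat as the smallest
# value of the tuning parameter
fit <- dglars.fit(X, y, family = "binomial", control = list(g0=fit_cv$g_hat))
fit_cv
# Coefficients stored in column fit$np of the coefficient matrix
fit$beta[,fit$np]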
