cv.glmnet_tidiers: Tidiers for glmnet cross-validation objects

Description

Tidying methods for cross-validation performed by cv.glmnet, summarizing the mean-squared error (or other criterion) across choices of the penalty parameter lambda.

Usage

## S3 method for class 'cv.glmnet':
tidy(x, ...)
## S3 method for class 'cv.glmnet':
glance(x, ...)

Arguments

x

a "cv.glmnet" object

...

extra arguments (not used)

Value

All tidying methods return a data.frame without rownames, whose structure depends on the method chosen.
tidy produces a data.frame with one row per choice of lambda, with columns
lambda      penalty parameter lambda
estimate    estimate (median) of mean-squared error or other criterion
std.error   standard error of criterion
conf.high   high end of confidence interval on criterion
conf.low    low end of confidence interval on criterion
nzero       number of parameters that are non-zero at this choice of lambda
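To make the column layout concrete, here is a minimal base-R sketch that builds a data.frame of the same shape from a plain list. The list `mock_cv` and the helper `tidy_like` are hypothetical and the numbers are made up. The component names (`lambda`, `cvm`, `cvsd`, `cvup`, `cvlo`, `nzero`) are the ones a cv.glmnet object stores, but the mapping onto the tidy columns is an assumption for illustration, not the package's actual source.

```r
# Hypothetical stand-in for a cv.glmnet fit: a plain list carrying the
# components such an object stores (all values are made up).
mock_cv <- list(
  lambda = c(1.0, 0.5, 0.25, 0.125),
  cvm    = c(12.0, 10.5, 10.1, 10.4),  # cross-validated error per lambda
  cvsd   = c(0.9, 0.8, 0.8, 0.9),      # standard error of that error
  nzero  = c(0L, 2L, 5L, 9L)           # non-zero coefficients per lambda
)
mock_cv$cvup <- mock_cv$cvm + mock_cv$cvsd  # upper curve
mock_cv$cvlo <- mock_cv$cvm - mock_cv$cvsd  # lower curve

# Assumed mapping from object components to the documented columns.
tidy_like <- function(fit) {
  data.frame(
    lambda    = fit$lambda,
    estimate  = fit$cvm,
    std.error = fit$cvsd,
    conf.high = fit$cvup,
    conf.low  = fit$cvlo,
    nzero     = fit$nzero,
    row.names = NULL
  )
}

tidy_like(mock_cv)
```

Under this reading, conf.low and conf.high sit one standard error on either side of the estimate, which is what the ribbon in the examples would be drawing.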
glance returns a one-row data.frame with the values
lambda.min  value of lambda that gives the minimum cross-validated error
lambda.1se  largest value of lambda whose error is within one standard error of the minimum
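The examples read `lambda.min` and `lambda.1se` from the result of glance. As a rough illustration of how those two values relate to the per-lambda rows from tidy, the sketch below recomputes them from a made-up data.frame, following glmnet's documented definitions (the lambda with minimum error, and the largest lambda whose error is within one standard error of that minimum). The data and the helper `pick_lambdas` are hypothetical, and this is not how the package itself obtains the values.

```r
# Made-up values shaped like the output of tidy() on a cv.glmnet fit.
tidied_mock <- data.frame(
  lambda    = c(1.0, 0.5, 0.25, 0.125),
  estimate  = c(12.0, 10.5, 10.1, 10.4),
  std.error = c(0.9, 0.8, 0.8, 0.9)
)

# Hypothetical helper: apply the minimum and one-standard-error rules.
pick_lambdas <- function(tidied) {
  i_min <- which.min(tidied$estimate)
  # Error threshold: minimum error plus its own standard error.
  threshold <- tidied$estimate[i_min] + tidied$std.error[i_min]
  data.frame(
    lambda.min = tidied$lambda[i_min],
    lambda.1se = max(tidied$lambda[tidied$estimate <= threshold])
  )
}

pick_lambdas(tidied_mock)
```

The one-standard-error choice trades a small increase in estimated error for a larger penalty, and so usually a sparser model.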

Details

No augment method exists for this class.

Examples


if (require("glmnet", quietly = TRUE)) {
    library(broom)  # provides tidy() and glance()
    set.seed(2014)

    nobs <- 100
    nvar <- 50
    real <- 5

    x <- matrix(rnorm(nobs * nvar), nobs, nvar)
    beta <- c(rnorm(real, 0, 1), rep(0, nvar - real))
    # one noise term per observation (nobs, not nvar)
    y <- c(x %*% beta) + rnorm(nobs, sd = 3)

    cvfit1 <- cv.glmnet(x, y)

    head(tidy(cvfit1))
    glance(cvfit1)

    library(ggplot2)
    tidied_cv <- tidy(cvfit1)
    glance_cv <- glance(cvfit1)

    # plot of MSE as a function of lambda
    g <- ggplot(tidied_cv, aes(lambda, estimate)) + geom_line() + scale_x_log10()
    g

    # plot of MSE as a function of lambda with confidence ribbon
    g <- g + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .25)
    g

    # plot of MSE as a function of lambda with confidence ribbon and choices
    # of minimum lambda marked
    g <- g + geom_vline(xintercept = glance_cv$lambda.min) +
        geom_vline(xintercept = glance_cv$lambda.1se, lty = 2)
    g

    # plot of number of non-zero coefficients for each choice of lambda
    ggplot(tidied_cv, aes(lambda, nzero)) + geom_line() + scale_x_log10()

    # coefficient plot with min lambda shown
    tidied <- tidy(cvfit1$glmnet.fit)
    ggplot(tidied, aes(lambda, estimate, group = term)) + scale_x_log10() +
        geom_line() +
        geom_vline(xintercept = glance_cv$lambda.min) +
        geom_vline(xintercept = glance_cv$lambda.1se, lty = 2)
}
