broom (version 0.4.4)

glmnet_tidiers: Tidiers for LASSO or elasticnet regularized fits

Description

Tidying methods for regularized fits produced by glmnet, summarizing the estimates across values of the penalty parameter lambda.

Usage

# S3 method for glmnet
tidy(x, ...)

# S3 method for glmnet
glance(x, ...)

Arguments

x

a "glmnet" object

...

extra arguments (not used)

Value

All tidying methods return a data.frame without rownames, whose structure depends on the method chosen.

tidy produces a data.frame with one row per combination of coefficient (including the intercept) and value of lambda for which the estimate is nonzero, with the columns:

term

coefficient name (V1...VN by default, along with "(Intercept)")

step

which step of lambda choices was used

estimate

estimate of coefficient

lambda

value of penalty parameter lambda

dev.ratio

fraction of null deviance explained at each value of lambda
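
As a minimal sketch of working with these columns, the example below tidies a small simulated fit and keeps only the rows for one step of the lambda path. The simulated data mirror the Examples section; the object names `td` and `last_step` are illustrative, and the choice of the final step is arbitrary.

```r
library(glmnet)
library(broom)

# Simulated data, as in the Examples section
set.seed(2014)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)
fit <- glmnet(x, y)

# One row per (term, step) pair with a nonzero estimate
td <- tidy(fit)

# Keep the nonzero coefficients at a single step of the path
# (the last step is used here purely for illustration)
last_step <- td[td$step == max(td$step), c("term", "estimate", "lambda")]
last_step

# Number of nonzero terms (including the intercept) at each step
table(td$step)
```

Because rows with zero estimates are dropped, the number of rows per step grows as lambda shrinks and more coefficients enter the model.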

glance returns a one-row data.frame with the values:

nulldev

null deviance

npasses

total passes over the data across all lambda values

Details

Note that while this long-format representation of the fit is much easier to plot and combine than glmnet's default structure, it is also much more memory-intensive. Avoid it for fits on extremely large, sparse matrices.

No augment method is provided yet, even though the model produces predictions. The input data is not tidy (it is a matrix that may be very wide), so combining predictions with it would not make sense. Furthermore, predictions are meaningful only for a specific choice of lambda.
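
If predictions are needed, one possible workaround is to call glmnet's own predict method with a single penalty value and combine the result with the response by hand. The sketch below assumes simulated data like that in the Examples section. The value `lam` is an arbitrary pick from the middle of the fitted path, and the `.fitted` column name only imitates broom's usual augment convention; neither is something this package provides for glmnet fits.

```r
library(glmnet)

# Simulated data, as in the Examples section
set.seed(2014)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)
fit <- glmnet(x, y)

# Pick one penalty value; here an arbitrary lambda from the middle of the path
lam <- fit$lambda[ceiling(length(fit$lambda) / 2)]

# predict() returns a one-column matrix when s is a single value
pred <- predict(fit, newx = x, s = lam)

# Combine the response and the fitted values by hand
# (".fitted" is a hypothetical column name chosen for this sketch)
manual <- data.frame(y = y, .fitted = as.numeric(pred))
head(manual)
```

In practice lambda would usually be chosen by cross-validation rather than picked from the path by position.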

Examples

# NOT RUN {
if (require("glmnet", quietly = TRUE)) {
    library(broom)

    # simulate 100 observations of 20 predictors with an unrelated response
    set.seed(2014)
    x <- matrix(rnorm(100 * 20), 100, 20)
    y <- rnorm(100)
    fit1 <- glmnet(x, y)

    head(tidy(fit1))
    glance(fit1)

    library(dplyr)
    library(ggplot2)

    # drop the intercept so the plots show only the predictors' coefficients
    tidied <- tidy(fit1) %>% filter(term != "(Intercept)")

    # coefficient paths by step, then by lambda on a log scale
    ggplot(tidied, aes(step, estimate, group = term)) + geom_line()
    ggplot(tidied, aes(lambda, estimate, group = term)) +
        geom_line() + scale_x_log10()

    # fraction of null deviance explained as a function of lambda
    ggplot(tidied, aes(lambda, dev.ratio)) + geom_line()

    # works for other types of regressions as well, such as logistic
    g2 <- sample(1:2, 100, replace = TRUE)
    fit2 <- glmnet(x, g2, family = "binomial")
    head(tidy(fit2))
}

# }