
Fit an additive model where each component is estimated to be piecewise constant with a small number of adaptively-chosen knots. Tuning parameter selection is done using K-fold cross-validation. In particular, this function implements the "fused lasso additive model", as proposed in Petersen, A., Witten, D., and Simon, N. (2014). Fused Lasso Additive Model. arXiv preprint arXiv:1409.5391.

Usage
flamCV(x, y, lambda.min.ratio = 0.01, n.lambda = 50, lambda.seq = NULL,
alpha = 1, family = "gaussian", method = "BCD", fold = NULL,
n.fold = NULL, seed = NULL, within1SE = T, tolerance = 10e-6)
Arguments

x: n x p covariate matrix. May have p > n.

y: n-vector containing the outcomes for the n observations in x.

lambda.min.ratio: smallest value for lambda.seq, as a fraction of the maximum lambda value, which is the data-derived smallest value for which all estimated functions are zero. The default is 0.01.

n.lambda: the number of lambda values to consider - the default is 50.

lambda.seq: a user-supplied sequence of positive lambda values to consider. The typical usage is to calculate lambda.seq using lambda.min.ratio and n.lambda, but providing lambda.seq overrides this. If provided, lambda.seq should be a decreasing sequence of values, since flamCV relies on warm starts for speed. Thus fitting the model for a whole sequence of lambda values is often faster than fitting for a single lambda value. (A sketch of supplying lambda.seq by hand follows this argument list.)

alpha: the value of the tuning parameter alpha to consider - default is 1. Value must be in [0,1] with values near 0 prioritizing sparsity of functions and values near 1 prioritizing limiting the number of knots. Empirical evidence suggests using alpha of 1 when p < n and alpha of 0.75 when p > n.

family: specifies the loss function to use. Currently supports squared error loss (default; family="gaussian") and logistic loss (family="binomial").

method: specifies the optimization algorithm to use. Options are block-coordinate descent (default; method="BCD"), generalized gradient descent (method="GGD"), or generalized gradient descent with backtracking (method="GGD.backtrack"). This argument is ignored if family="binomial".

fold: user-supplied fold numbers for cross-validation. If supplied, fold should be an n-vector with entries in 1,...,K when doing K-fold cross-validation. The default is to choose fold using n.fold. (A sketch of supplying fold by hand follows this argument list.)

n.fold: the number of folds, K, to use for the K-fold cross-validation selection of tuning parameters. The default is 10 - specification of fold overrides use of n.fold.

seed: an optional number used with set.seed() at the beginning of the function. This is only relevant if fold is not specified by the user.

within1SE: logical (TRUE or FALSE) for how cross-validated tuning parameters should be chosen. If within1SE=TRUE, lambda is chosen to be the value corresponding to the most sparse model with cross-validation error within one standard error of the minimum cross-validation error. If within1SE=FALSE, lambda is chosen to be the value corresponding to the minimum cross-validation error.

tolerance: specifies the convergence criterion for the objective (default is 10e-6).
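As a minimal sketch of supplying lambda.seq and fold by hand (the fold construction and lambda grid below are illustrative choices, not package defaults; sim.data is used as in the Examples section):

set.seed(1)
data <- sim.data(n = 50, scenario = 1, zerof = 10, noise = 1)
#fold must be an n-vector with entries in 1,...,K (here K = 5)
my.fold <- sample(rep(1:5, length.out = 50))
#lambda.seq must be positive and decreasing (flamCV relies on warm starts)
my.lambda <- exp(seq(log(1), log(0.01), length.out = 25))
fit <- flamCV(x = data$x, y = data$y, lambda.seq = my.lambda, fold = my.fold)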
Value

An object with S3 class "flamCV".

mean.cv.error: m-vector containing cross-validation error where m is the length of lambda.seq. Note that mean.cv.error[i] contains the cross-validation error for tuning parameters alpha and flam.out$all.lambda[i].

se.cv.error: m-vector containing cross-validation standard error where m is the length of lambda.seq. Note that se.cv.error[i] contains the standard error of the cross-validation error for tuning parameters alpha and flam.out$all.lambda[i].

lambda.cv: optimal lambda value chosen by cross-validation.

alpha: as specified by user (or default).

index.cv: index of the model corresponding to 'lambda.cv'.

flam.out: object of class 'flam' returned by flam.

fold: as specified by user (or default).

n.folds: as specified by user (or default).

within1SE: as specified by user (or default).

tolerance: as specified by user (or default).

call: matched call.
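The role of index.cv can be illustrated from these fields alone. Below is a conceptual sketch of the one-standard-error rule, not flamCV's internal code, where flamCV.out is a fitted 'flamCV' object:

#conceptual sketch of the one-standard-error rule (not flamCV's internal code)
i.min <- which.min(flamCV.out$mean.cv.error)
cutoff <- flamCV.out$mean.cv.error[i.min] + flamCV.out$se.cv.error[i.min]
#lambda.seq is decreasing, so the smallest qualifying index is the sparsest model
i.1se <- min(which(flamCV.out$mean.cv.error <= cutoff))
#with within1SE=TRUE, index.cv follows this rule; with within1SE=FALSE, it is i.min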
Details

Note that flamCV does not cross-validate over alpha - just a single value should be provided. However, if the user would like to cross-validate over alpha, then flamCV should be called multiple times for different values of alpha and the same seed. This ensures that the cross-validation folds (fold) remain the same for the different values of alpha. See the example below for details.
References

Petersen, A., Witten, D., and Simon, N. (2014). Fused Lasso Additive Model. arXiv preprint arXiv:1409.5391.
Examples
#See ?'flam-package' for a full example of how to use this package
#generate data
set.seed(1)
data <- sim.data(n = 50, scenario = 1, zerof = 10, noise = 1)
#fit model for a range of lambda chosen by default
#pick lambda using 2-fold cross-validation
#note: use larger 'n.fold' (e.g., 10) in practice
flamCV.out <- flamCV(x = data$x, y = data$y, alpha = 0.75, n.fold = 2)
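#the returned mean.cv.error and se.cv.error can be inspected directly; the
#plotting code below is an illustrative sketch using only the documented
#fields of 'flamCV' (base graphics, not a package method)
lam <- flamCV.out$flam.out$all.lambda
plot(log(lam), flamCV.out$mean.cv.error, type = "o", pch = 20,
     xlab = "log(lambda)", ylab = "mean CV error")
abline(v = log(flamCV.out$lambda.cv), lty = 2)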
#note that cross-validation is only done to choose lambda for specified alpha
#to cross-validate over alpha also, call 'flamCV' for several alpha and set seed
#note: use larger 'n.fold' (e.g., 10) in practice
flamCV.out1 <- flamCV(x = data$x, y = data$y, alpha = 0.65, seed = 100,
within1SE = FALSE, n.fold = 2)
flamCV.out2 <- flamCV(x = data$x, y = data$y, alpha = 0.75, seed = 100,
within1SE = FALSE, n.fold = 2)
flamCV.out3 <- flamCV(x = data$x, y = data$y, alpha = 0.85, seed = 100,
within1SE = FALSE, n.fold = 2)
#this ensures that the folds used are the same
flamCV.out1$fold; flamCV.out2$fold; flamCV.out3$fold
#compare the CV error for the optimum lambda of each alpha to choose alpha
CVerrors <- c(flamCV.out1$mean.cv.error[flamCV.out1$index.cv],
flamCV.out2$mean.cv.error[flamCV.out2$index.cv],
flamCV.out3$mean.cv.error[flamCV.out3$index.cv])
best.alpha <- c(flamCV.out1$alpha, flamCV.out2$alpha,
flamCV.out3$alpha)[which(CVerrors==min(CVerrors))]
#also can generate data for logistic FLAM model
data2 <- sim.data(n = 50, scenario = 1, zerof = 10, family = "binomial")
#fit the FLAM model with cross-validation using logistic loss
#note: use larger 'n.fold' (e.g., 10) in practice
flamCV.logistic.out <- flamCV(x = data2$x, y = data2$y, family = "binomial",
n.fold = 2)
#'flamCV' returns an object of the class 'flamCV' that includes an object
#of class 'flam' (flam.out); see ?'flam-package' for an example using S3
#methods for the classes of 'flam' and 'flamCV'