Fits regularization paths for coupled sparse asymmetric least squares regression at a sequence of regularization parameters.
cpernet(
  x,
  y,
  w = 1,
  nlambda = 100L,
  method = "cper",
  lambda.factor = ifelse(2 * nobs < nvars, 0.01, 1e-04),
  lambda = NULL,
  lambda2 = 0,
  pf.mean = rep(1, nvars),
  pf2.mean = rep(1, nvars),
  pf.scale = rep(1, nvars),
  pf2.scale = rep(1, nvars),
  exclude,
  dfmax = nvars + 1,
  pmax = min(dfmax * 1.2, nvars),
  standardize = TRUE,
  intercept = TRUE,
  eps = 1e-08,
  maxit = 1000000L,
  tau = 0.8
)
An object with S3 class cpernet containing the following components:
- the call that produced this object.
- intercept sequences, both of length length(lambda), for the mean and scale parts respectively.
- p * length(lambda) matrices of coefficients for the mean and scale parts respectively, stored as sparse matrices (dgCMatrix class, the standard class for sparse numeric matrices in the Matrix package). To convert them into ordinary R matrices, use as.matrix().
- the actual sequence of lambda values used.
- the number of nonzero mean and scale coefficients, respectively, for each value of lambda.
- dimensions of the coefficient matrices.
- total number of iterations summed over all lambda values.
- error flag for warnings and errors; 0 if no error.
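For illustration, the sketch below shows how the sparse coefficient matrices can be converted to ordinary dense matrices with as.matrix(). It assumes the mean and scale coefficient matrices are stored in components named beta and theta (names suggested by the notation in the objective function further down, not stated here); check the fitted object with str() if in doubt.

set.seed(1)
x <- matrix(rnorm(50 * 10), 50, 10)
y <- rnorm(50)
fit <- cpernet(x = x, y = y, tau = 0.8)
beta.dense  <- as.matrix(fit$beta)   # mean coefficients as a dense nvars x length(lambda) matrix
theta.dense <- as.matrix(fit$theta)  # scale coefficients, same dimensions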
x: matrix of predictors, of dimension (nobs * nvars); each row is an observation.
y: response variable.
w: weight applied to the asymmetric squared error loss of the mean part. See Details. Default is 1.0.
nlambda: the number of lambda values (default is 100).
method: a character string specifying the loss function to use. Only "cper" is currently available.
lambda.factor: the factor for getting the minimal lambda in the lambda sequence, where min(lambda) = lambda.factor * max(lambda) and max(lambda) is the smallest value of lambda for which all penalized coefficients are zero. The default depends on the relationship between \(N\) (the number of observations) and \(p\) (the number of predictors): if \(2N < p\), the default is 0.01; otherwise it is 0.0001, closer to zero. A very small value of lambda.factor will lead to a saturated fit. The argument has no effect if a lambda sequence is supplied by the user.
lambda: a user-supplied lambda sequence. Typically, by leaving this option unspecified, users let the program compute its own lambda sequence based on nlambda and lambda.factor. If necessary, it is better to supply a decreasing sequence of lambda values than a single (small) value. The program will ensure that the user-supplied lambda sequence is sorted in decreasing order.
lambda2: regularization parameter lambda2 for the quadratic penalty on the coefficients. Default is 0, meaning no L2 penalization.
pf.mean, pf.scale: L1 penalty factors of length \(p\) used for adaptive LASSO or adaptive elastic net. Separate L1 penalty weights can be applied to each mean or scale coefficient to allow different L1 shrinkage. Can be 0 for some variables, which imposes no shrinkage and results in those variables always being included in the model. Default is 1 for all variables (and implicitly infinity for variables listed in exclude).
pf2.mean, pf2.scale: L2 penalty factors of length \(p\) used for adaptive elastic net. Separate L2 penalty weights can be applied to each mean or scale coefficient to allow different L2 shrinkage. Can be 0 for some variables, which imposes no shrinkage. Default is 1 for all variables.
exclude: indices of variables to be excluded from the model. Default is none. Equivalent to an infinite penalty factor.
dfmax: limit the maximum number of variables in the model. Useful for very large \(p\) when a partial path is desired. Default is \(p+1\).
pmax: limit the maximum number of variables ever to be nonzero. For example, once \(\beta\) enters the model, no matter how many times it exits or re-enters the model along the path, it is counted only once. Default is min(dfmax*1.2, p).
standardize: logical flag for variable standardization prior to fitting the model sequence. The coefficients are always returned on the original scale. Default is TRUE.
intercept: should intercept(s) be fitted (default = TRUE) or set to zero (FALSE)?
eps: convergence threshold for coordinate descent. Each inner coordinate descent loop continues until the maximum change in any coefficient is less than eps. Default value is 1e-8.
maxit: maximum number of outer-loop iterations allowed at fixed lambda values. Default is 1e6. If the algorithm does not converge, consider increasing maxit.
tau: the parameter tau in the coupled ALS regression model. The value must be in (0, 1) and cannot be 0.5. Default is 0.8.
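As an illustration of these arguments (a sketch with arbitrary values, not a recommended analysis), a user-supplied decreasing lambda sequence, per-variable penalty factors, and excluded variables might be passed as follows.

set.seed(2)
x <- matrix(rnorm(80 * 20), 80, 20)
y <- rnorm(80)
user.lambda <- exp(seq(log(1), log(0.001), length.out = 50))  # decreasing lambda sequence
pf <- rep(1, 20)
pf[1:2] <- 0  # no L1 shrinkage on the first two variables
fit <- cpernet(x, y, tau = 0.7, lambda = user.lambda,
               pf.mean = pf, pf.scale = pf, exclude = c(19, 20))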
Yuwen Gu and Hui Zou
Maintainer: Yuwen Gu <yuwen.gu@uconn.edu>
Note that the objective function of cpernet is
$$w 1'\Psi(y-X\beta, 0.5)/N + 1'\Psi(y-X\beta-X\theta, \tau)/N + \lambda_1\Vert\beta\Vert_1 + 0.5\lambda_2\Vert\beta\Vert_2^2 + \mu_1\Vert\theta\Vert_1 + 0.5\mu_2\Vert\theta\Vert_2^2,$$
where \(\Psi(u,\tau)=|\tau-I(u<0)|u^2\) denotes the asymmetric squared error loss, and the penalty is a combination of L1 and L2 terms for both the mean and scale coefficients.
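To make the notation concrete, the helper below (illustrative only, not part of the package; intercepts are omitted for simplicity) evaluates the asymmetric squared error loss and the penalized objective above for given coefficient vectors.

psi <- function(u, tau) abs(tau - (u < 0)) * u^2  # asymmetric squared error loss Psi(u, tau)

cper.obj <- function(beta, theta, x, y, w, tau, lambda1, lambda2, mu1, mu2) {
  n <- length(y)
  r.mean  <- y - x %*% beta                # residuals of the mean part
  r.scale <- y - x %*% beta - x %*% theta  # residuals of the mean plus scale part
  w * sum(psi(r.mean, 0.5)) / n + sum(psi(r.scale, tau)) / n +
    lambda1 * sum(abs(beta)) + 0.5 * lambda2 * sum(beta^2) +
    mu1 * sum(abs(theta)) + 0.5 * mu2 * sum(theta^2)
}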
For faster computation, if the algorithm is not converging or is running slowly, consider increasing eps, decreasing nlambda, or increasing lambda.factor before increasing maxit.
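For example (illustrative values only), such adjustments might look like the following.

set.seed(3)
x <- matrix(rnorm(60 * 30), 60, 30)
y <- rnorm(60)
fit <- cpernet(x, y, tau = 0.8,
               eps = 1e-6,            # looser tolerance than the default 1e-8
               nlambda = 50,          # fewer lambda values than the default 100
               lambda.factor = 0.01)  # larger min(lambda), stopping the path earlier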
Gu, Y., and Zou, H. (2016).
"High-dimensional generalizations of asymmetric least squares regression and their applications."
The Annals of Statistics, 44(6), 2661–2694.
plot.cpernet, coef.cpernet, predict.cpernet, print.cpernet
set.seed(1)
n <- 100
p <- 400
x <- matrix(rnorm(n * p), n, p)  # predictor matrix
y <- rnorm(n)                    # response
tau <- 0.30                      # asymmetry parameter
pf <- abs(rnorm(p))              # L1 penalty factors for the mean coefficients
pf2 <- abs(rnorm(p))             # L1 penalty factors for the scale coefficients
w <- 2.0                         # weight on the mean-part loss
lambda2 <- 1                     # L2 regularization parameter
m2 <- cpernet(y = y, x = x, w = w, tau = tau, eps = 1e-8,
              pf.mean = pf, pf.scale = pf2,
              standardize = FALSE, lambda2 = lambda2)
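A possible follow-up is sketched below using the companion methods listed just above (plot.cpernet, coef.cpernet, predict.cpernet). The glmnet-style argument names newx and s and the returned component m2$lambda are assumptions here; check those help pages for the exact interfaces.

plot(m2)                                  # solution paths for the mean and scale coefficients
cf <- coef(m2, s = m2$lambda[10])         # coefficients at one lambda value
pred <- predict(m2, newx = x[1:5, ], s = m2$lambda[10])  # predictions for new observations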