cosso (version 2.1-0)

cosso: Estimate the mean regression function for Gaussian response using Smoothing Splines with COSSO penalty

Description

Fit COSSO and adaptive COSSO models for Gaussian response. COSSO is a regularization method for variable selection and function estimation in multivariate nonparametric regression models. By imposing a soft-thresholding type penalty onto function components, the COSSO solution is sparse and hence able to identify important variables. The method is developed in the framework of smoothing spline ANOVA.

Usage

cosso(x,y,wt=rep(1,ncol(x)),scale=FALSE,nbasis,basis.id,n.step=2*ncol(x))

Arguments

x
input matrix; the number of rows is the sample size, the number of columns is the data dimension. The range of each input variable should lie within [0,1]; see scale.
y
response vector
wt
weights for predictors. Default is rep(1,ncol(x))
scale
if TRUE, each predictor variable is rescaled to the [0,1] interval. Default is FALSE.
nbasis
number of "knots" to be selected. Ignored when basis.id is provided.
basis.id
indices of the observations designated as "knots".
n.step
maximum number of iterations used in finding the solution path.

Value

  • An object with S3 class "cosso", with components:
  • family: type of regression model.
  • x: the input matrix
  • y: the response vector
  • Kmat: an array containing the kernel matrix for each input variable
  • basis.id: indices of the observations used as "knots"
  • wt: weights for the predictors
  • tune: a list containing preliminary tuning results
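The components listed above can be accessed with the usual `$` operator. A minimal sketch (the ozone data and nbasis value mirror the Examples section below; the printed values depend on the fit):

```r
library(cosso)
data(ozone)

## Fit on the ozone data with 50 "knots", then inspect the returned object.
obj <- cosso(x = ozone[, 2:5], y = ozone[, 1], nbasis = 50)

class(obj)            # "cosso"
obj$family            # type of regression model
length(obj$basis.id)  # number of observations used as "knots" (here 50)
obj$wt                # predictor weights; defaults to rep(1, ncol(x))
str(obj$tune)         # preliminary tuning results
```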

Details

The mean regression function is first assumed to have an additive form $$\eta(x)=\sum_{j=1}^p\eta_j(x_j),$$ then estimated by minimizing the objective function: $$RSS/nobs+\lambda_0\sum_{j=1}^p\theta^{-1}_jw_j^2\|\eta_j\|^2,\quad \text{s.t.}~\sum_{j=1}^p\theta_j\leq M.$$ For large data sets, the computational load can be reduced by selecting a subset of nbasis observations as "knots", which reduces the dimension of the kernel matrices from nobs to nbasis. Unless specified via basis.id or nbasis, the default number of "knots" is the sample size (nobs). The weights can be specified at the user's own discretion or computed adaptively from initial function estimates. See Storlie et al. (2011) for more discussion. One possible choice is to specify the weights as the inverse $L_2$ norm of an initial function estimator; see SSANOVAwt.
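As a sketch of the knot-selection options described above, the two calls below (using the ozone data shipped with the package) show the two ways to work with 50 "knots": letting cosso() sample them via nbasis, or supplying explicit row indices via basis.id:

```r
library(cosso)
data(ozone)

## Option 1: let cosso() select 50 "knots" internally.
fit1 <- cosso(x = ozone[, 2:5], y = ozone[, 1], nbasis = 50)

## Option 2: supply explicit knot indices; nbasis is then ignored.
set.seed(1)                                # hypothetical seed, for reproducibility
my.knots <- sort(sample(nrow(ozone), 50))
fit2 <- cosso(x = ozone[, 2:5], y = ozone[, 1], basis.id = my.knots)
```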

References

Lin, Y. and Zhang, H. H. (2006) "Component Selection and Smoothing in Smoothing Spline Analysis of Variance Models", Annals of Statistics, 34, 2272--2297.

Storlie, C. B., Bondell, H. D., Reich, B. J. and Zhang, H. H. (2011) "Surface Estimation, Variable Selection, and the Nonparametric Oracle Property", Statistica Sinica, 21, 679--705.

See Also

plot.cosso, predict.cosso, tune.cosso

Examples

data(ozone)
## Fit cosso
## Use 50 observations as knots
t0=proc.time()
## Use half of the observations for demonstration
set.seed(27695)
train.id <- sort(sample(1:nrow(ozone),ceiling(nrow(ozone)/2)))
cossoObj <- cosso(x=ozone[train.id,2:5],y=ozone[train.id,1],nbasis=50)
print((proc.time()-t0)[3])

## Use all observations as knots
t0=proc.time()
## Use half of the observations for demonstration
set.seed(27695)
train.id <- sort(sample(1:nrow(ozone),ceiling(nrow(ozone)/2)))
cossoObj <- cosso(x=ozone[train.id,2:5],y=ozone[train.id,1])
print((proc.time()-t0)[3])


## Fit adaptive cosso
adaptive.wt <- SSANOVAwt(ozone[,-1],ozone[,1])
acossoObj <- cosso(x=ozone[,-1],y=ozone[,1],wt=adaptive.wt,nbasis=ceiling(nrow(ozone)/5))
