Apply POCRE with a pre-specified tuning parameter to build a linear regression model with orthogonal components \(X\varpi_1, X\varpi_2, \dots\), $$Y=\mu+\sum_j (X\varpi_j)\vartheta_j+\epsilon=\mu+X\beta+\epsilon,$$ where \(var[\epsilon]=\sigma^2\) and \(\beta=\sum_j \varpi_j\vartheta_j\). These orthogonal components are constructed sequentially via supervised dimension reduction, under a penalty set by the pre-specified tuning parameter.
While the orthogonal components are constructed using the centralized covariates, the intercept \(\mu\) and the regression coefficients in \(\beta\) are estimated for the original covariates. The sequential construction stops either when no new component can be constructed (returning bSparse=1) or when the newly constructed component involves more than maxvar covariates (returning bSparse=0).
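Since the constructed components are mutually orthogonal, the coefficient of each component can be read as a simple projection of the centered response onto that component. This is the standard identity for orthogonal regressors, written here with \(X\) denoting the centralized covariate matrix; it sketches the fitted form of the model, not the package's internal estimation algorithm: $$\hat{\vartheta}_j=\frac{(X\varpi_j)^\top (Y-\hat{\mu})}{\|X\varpi_j\|^2}.$$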
pocre(y, x, lambda=1, x.nop=NA, maxvar=dim(x)[1]/2,
      maxcmp=10, ptype=c('ebtz','ebt','l1','scad','mcp'),
      maxit=100, tol=1e-6, gamma=3.7, pval=FALSE)
n*q matrix, values of q response variables (allowing for multiple response variables).
n*p matrix, values of p predicting variables (excluding the intercept).
the tuning parameter (=1 by default).
a vector of indices of covariates to be excluded only when evaluating the significance of components.
maximum number of selected variables.
maximum number of components to be constructed.
a character string indicating the type of penalty: 'ebtz' (empirical Bayes thresholding after Fisher's z-transformation, the default), 'ebt' (empirical Bayes thresholding by Johnstone & Silverman (2004)), 'l1' (\(L_1\) penalty), 'scad' (SCAD by Fan & Li (2001)), or 'mcp' (MCP by Zhang (2010)); the SCAD and MCP penalty forms are sketched after this argument list.
maximum number of iterations to be allowed.
tolerance of precision in iterations.
a parameter used by SCAD and MCP (=3.7 by default).
a logical value indicating whether to calculate the p-values of components.
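For reference, the SCAD and MCP penalties named above take the following standard forms from the cited papers (the package's internal parameterization may differ in detail); the argument gamma plays the role of \(a\) in SCAD and \(\gamma\) in MCP, with the default 3.7 matching the value recommended by Fan & Li (2001): $$p_\lambda^{SCAD}(t)=\begin{cases}\lambda t, & 0\le t\le\lambda,\\ \dfrac{2a\lambda t-t^2-\lambda^2}{2(a-1)}, & \lambda<t\le a\lambda,\\ \dfrac{(a+1)\lambda^2}{2}, & t>a\lambda,\end{cases}\qquad p_\lambda^{MCP}(t)=\begin{cases}\lambda t-\dfrac{t^2}{2\gamma}, & 0\le t\le\gamma\lambda,\\ \dfrac{\gamma\lambda^2}{2}, & t>\gamma\lambda.\end{cases}$$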
estimated intercept of the linear regression.
estimated coefficients of the linear regression.
loadings of the constructed components.
the regression coefficients of the constructed components.
a logical value indicating whether the estimated \(\beta\) has fewer than maxvar nonzero values.
value of the tuning parameter.
number of constructed components.
sample size.
number of covariates.
the column means of x.
the column means of y.
estimated error variance \(\sigma^2\).
\(R^2\) value of the fitted regression model.
number of non-zero regression coefficients in \(\beta\).
internal matrix.
internal matrix.
p-values of constructed components, available when pval=TRUE.
Type I p-values of components when sequentially including them into the model, available when pval=TRUE.
p-values of components when marginally testing each component, available when pval=TRUE.
the loglikelihood function, available when pval=TRUE.
the effective number of predictors, excluding redundant ones, available when pval=TRUE.
Fan J and Li R (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96: 1348-1360.
Johnstone IM and Silverman BW (2004). Needles and straw in haystacks: empirical Bayes estimates of possibly sparse sequences. Annals of Statistics, 32: 1594-1649.
Zhang C-H (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38: 894-942.
Zhang D, Lin Y, and Zhang M (2009). Penalized orthogonal-components regression for large p small n data. Electronic Journal of Statistics, 3: 781-796.
data(simdata)
xx <- simdata[,-1]
yy <- simdata[,1]
#pres <- pocre(yy,xx,lambda=0.9)
pres <- pocre(yy,xx) # lambda=1 by default
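# The fitted intercept and coefficients apply to the original (uncentered)
# covariates. The element names mu and beta below are assumptions, not
# confirmed by this page; check str(pres) for the actual names.
#yhat <- pres$mu + as.matrix(xx) %*% pres$beta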