superpc.predict.red: Feature selection for supervised principal components

Description

Forms reduced models to approximate the supervised principal component predictor.

Usage

superpc.predict.red(fit, data, data.test, threshold, n.components = 3,
                    n.shrinkage = 20, shrinkages = NULL, compute.lrtest = TRUE,
                    sign.wt = "both", prediction.type = c("continuous", "discrete"),
                    n.class = 2)

Arguments

fit

Object returned by superpc.train

data

Training data object, of the form described in the superpc.train documentation

data.test

Test data object; same form as the training data

threshold

Feature score threshold; usually estimated from superpc.cv

n.components

Number of principal components to examine; an integer from 1 up to the number of components used in training

n.shrinkage

Number of shrinkage values to consider. Default 20.

shrinkages

Shrinkage values to consider. Default NULL.

compute.lrtest

Should the likelihood ratio test be computed? Default TRUE

sign.wt

Signs of feature weights allowed: "both", "pos", or "neg"

prediction.type

Type of prediction: "continuous" (default) or "discrete". In the latter case, the superpc score is divided into n.class groups

n.class

Number of groups for discrete predictor. Default 2.
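
To illustrate what dividing a continuous score into n.class groups can look like, here is a minimal base R sketch. It assumes equal-sized groups cut at quantiles of the score; this page does not say how superpc chooses the cut points, so treat the breaks below as an assumption. The names score, n_class and groups are hypothetical.

```r
# Hypothetical continuous scores for 12 test observations
score <- c(-1.2, 0.3, 2.1, -0.7, 0.9, 1.5, -0.1, 0.4, -2.0, 1.1, 0.0, 0.6)
n_class <- 3

# Assumption: cut at equally spaced quantiles so the groups are about equal in size
breaks <- quantile(score, probs = seq(0, 1, length.out = n_class + 1))

# Integer group label (1..n_class) for each observation
groups <- cut(score, breaks = breaks, include.lowest = TRUE, labels = FALSE)

table(groups)
```

With n.class = 2, the same idea amounts to a split at the median of the score.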

Value

shrinkages

Shrinkage values used

lrtest.reduced

Likelihood ratio tests for reduced models

num.features

Number of features used in each reduced model

feature.list

List of features used in each reduced model

coef

Least squares coefficients for each reduced model

import

Importance scores for features

wt

Weight for each feature, in constructing the reduced predictor

v.test

Outcome predictor from reduced models. Array of n.shrinkage by (number of test observations)

v.test.1df

Outcome combined predictor from reduced models. Array of n.shrinkage by (number of test observations)

n.components

Number of principal components used

type

Type of outcome

call

Calling sequence

Details

Soft-thresholding by each of the "shrinkages" values is applied to the PC loadings. This reduces the number of features used in the model, because loadings that are shrunk to zero drop out. The reduced predictor is then used in place of the supervised PC predictor.
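
The effect of soft-thresholding on the number of retained features can be sketched in base R. This is an illustration of the standard soft-thresholding operator only, not the package's internal code. The function soft_threshold and the toy loadings are hypothetical, and the exact quantity and scaling that superpc.predict.red thresholds may differ.

```r
# Standard soft-thresholding: shrink each value toward zero by `shrinkage`;
# values with magnitude at or below `shrinkage` become exactly zero.
soft_threshold <- function(w, shrinkage) {
  sign(w) * pmax(abs(w) - shrinkage, 0)
}

# Hypothetical loadings for five features
loadings <- c(0.9, -0.4, 0.15, -0.05, 0.6)

# Candidate shrinkage values, from no shrinkage to heavy shrinkage
shrinkages <- c(0, 0.2, 0.5)

# One column of shrunken loadings per shrinkage value
reduced <- sapply(shrinkages, function(s) soft_threshold(loadings, s))

# Number of features with a non-zero loading in each reduced model
num_features <- colSums(reduced != 0)
num_features
```

Larger shrinkage values zero out more loadings, so each successive reduced model uses fewer features.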

Examples


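library(superpc)

set.seed(332)

# Generate some data: 1000 features (rows) by 40 samples (columns).
# The outcome depends on the first 60 features only.
x <- matrix(rnorm(1000 * 40), ncol = 40)
y <- 10 + svd(x[1:60, ])$v[, 1] + 0.1 * rnorm(40)
ytest <- 10 + svd(x[1:60, ])$v[, 1] + 0.1 * rnorm(40)
censoring.status <- sample(c(rep(1, 30), rep(0, 10)))
censoring.status.test <- sample(c(rep(1, 30), rep(0, 10)))

featurenames <- paste("feature", as.character(1:1000), sep = "")
data <- list(x = x, y = y, censoring.status = censoring.status,
             featurenames = featurenames)
data.test <- list(x = x, y = ytest, censoring.status = censoring.status.test,
                  featurenames = featurenames)

# Train the supervised principal components model
a <- superpc.train(data, type = "survival")

# Full supervised PC predictor, for comparison
fit <- superpc.predict(a, data, data.test, threshold = 1.0, n.components = 1,
                       prediction.type = "continuous")

# Reduced models over a range of shrinkage values, with a plot of their
# likelihood ratio test statistics
fit.red <- superpc.predict.red(a, data, data.test, threshold = 0.6)
superpc.plotred.lrtest(fit.red)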
