superpc (version 1.09)

superpc.predict.red: Feature selection for supervised principal components

Description

Forms reduced models to approximate the supervised principal component predictor.

Usage

superpc.predict.red(fit, data, data.test, threshold, n.components = 3,
                    n.shrinkage = 20, shrinkages = NULL, compute.lrtest = TRUE,
                    sign.wt = "both", prediction.type = c("continuous", "discrete"),
                    n.class = 2)

Arguments

fit
Object returned by superpc.train
data
Training data object, of the form described in the superpc.train documentation
data.test
Test data object; same form as the training data
threshold
Feature score threshold; usually estimated from superpc.cv
n.components
Number of principal components to examine; an integer from 1 up to the number of components used in training
n.shrinkage
Number of shrinkage values to consider. Default 20.
shrinkages
Shrinkage values to consider. Default NULL.
compute.lrtest
Should the likelihood ratio test be computed? Default TRUE
sign.wt
Signs of feature weights allowed: "both", "pos", or "neg"
prediction.type
Type of prediction: "continuous" (default) or "discrete". In the latter case, the superpc score is divided into n.class groups
n.class
Number of groups for discrete predictor. Default 2.
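
As an illustration of what a "discrete" prediction with n.class groups can look like, the sketch below splits a continuous score into equal-sized groups at its quantiles. This is an assumption made for illustration only: the exact rule superpc uses to form the groups is not stated on this page, and the `score` vector here is simulated rather than produced by the package.

```r
# Hypothetical continuous predictor values (simulated, not from superpc).
set.seed(1)
score <- rnorm(20)

# Number of groups, playing the role of the n.class argument.
n.class <- 2

# Assumed rule: cut the score at its quantiles so that each group
# holds roughly the same number of observations.
breaks <- quantile(score, probs = seq(0, 1, length.out = n.class + 1))
groups <- cut(score, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Group sizes.
table(groups)
```

With n.class = 2 this amounts to a split at the median of the score.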

Value

  • shrinkages: Shrinkage values used
  • lrtest.reduced: Likelihood ratio tests for reduced models
  • num.features: Number of features used in each reduced model
  • feature.list: List of features used in each reduced model
  • coef: Least squares coefficients for each reduced model
  • import: Importance scores for features
  • wt: Weight for each feature, in constructing the reduced predictor
  • v.test: Outcome predictor from reduced models. Array of n.shrinkage by (number of test observations)
  • v.test.1df: Combined outcome predictor from reduced models. Array of n.shrinkage by (number of test observations)
  • n.components: Number of principal components used
  • type: Type of outcome
  • call: Calling sequence

Details

Soft-thresholding by each of the "shrinkages" values is applied to the PC loadings. This reduces the number of features used in the model, because loadings shrunk to zero drop out. The reduced predictor is then used in place of the supervised PC predictor.
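
To make the effect of soft-thresholding concrete, here is a minimal sketch in base R. It is not the package's internal code: the helper `soft_threshold`, the loadings, and the shrinkage grid are all hypothetical and chosen only to show how larger shrinkage values leave fewer features with non-zero weight.

```r
# Soft-thresholding operator: move each value toward zero by s and
# set anything with absolute value <= s exactly to zero.
# (Hypothetical helper, not a superpc function.)
soft_threshold <- function(w, s) {
  sign(w) * pmax(abs(w) - s, 0)
}

# Hypothetical loadings for six features (illustrative values only).
loadings <- c(f1 = 0.80, f2 = -0.45, f3 = 0.10,
              f4 = -0.05, f5 = 0.30, f6 = 0.00)

# Hypothetical grid of shrinkage values.
shrinkages <- c(0, 0.2, 0.5)

# One row per shrinkage value, one column per feature.
shrunk <- t(sapply(shrinkages, function(s) soft_threshold(loadings, s)))
rownames(shrunk) <- paste0("s=", shrinkages)

# Number of features with non-zero weight at each shrinkage value.
n_features <- rowSums(shrunk != 0)
n_features
```

In this toy example the count of non-zero weights falls from 5 to 3 to 1 as the shrinkage grows, which mirrors the role of num.features in the returned object.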

Examples

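library(superpc)

set.seed(332)

# Generate some data: 1000 features (rows) measured on 40 samples (columns)
x <- matrix(rnorm(1000 * 40), ncol = 40)

# Outcomes driven by the first right singular vector of the first 60 features
y <- 10 + svd(x[1:60, ])$v[, 1] + .1 * rnorm(40)
ytest <- 10 + svd(x[1:60, ])$v[, 1] + .1 * rnorm(40)

# Censoring indicators: 30 observations coded 1 and 10 coded 0, in random order
censoring.status <- sample(c(rep(1, 30), rep(0, 10)))
censoring.status.test <- sample(c(rep(1, 30), rep(0, 10)))

featurenames <- paste("feature", as.character(1:1000), sep = "")
data <- list(x = x, y = y, censoring.status = censoring.status,
             featurenames = featurenames)
data.test <- list(x = x, y = ytest, censoring.status = censoring.status.test,
                  featurenames = featurenames)

# Train the supervised principal components model
a <- superpc.train(data, type = "survival")

# Full supervised PC predictor on the test data
fit <- superpc.predict(a, data, data.test, threshold = 1.0, n.components = 1,
                       prediction.type = "continuous")

# Reduced models over the default grid of shrinkage values
fit.red <- superpc.predict.red(a, data, data.test, threshold = .6)

# Plot the likelihood ratio test statistics of the reduced models
superpc.plotred.lrtest(fit.red)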