superpc (version 1.09)

superpc.cv: Cross-validation for supervised principal components

Description

This function uses a form of cross-validation to estimate the optimal feature threshold in supervised principal components.

Usage

superpc.cv(fit, data, n.threshold = 20, n.fold = NULL, folds = NULL,
           n.components = 3, min.features = 5,
           max.features = nrow(data$x), compute.fullcv = TRUE,
           compute.preval = TRUE, xl.mode = c("regular",
           "firsttime", "onetime", "lasttime"), xl.time = NULL,
           xl.prevfit = NULL)

Arguments

fit
Object returned by superpc.train
data
Data object of form described in superpc.train documentation
n.threshold
Number of thresholds to consider. Default 20.
n.fold
Number of cross-validation folds. Default is around 10 (the program picks a convenient value based on the sample size).
folds
List of indices of cross-validation folds (optional)
n.components
Number of principal components to use in cross-validation: 1, 2 or 3.
min.features
Minimum number of features to include, in determining range for threshold. Default 5.
max.features
Maximum number of features to include, in determining range for threshold. Default is the total number of features in the dataset.
compute.fullcv
Should full cross-validation be done?
compute.preval
Should full pre-validation be done?
xl.mode
Used by Excel interface only
xl.time
Used by Excel interface only
xl.prevfit
Used by Excel interface only
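
If you want reproducible or custom folds, the folds argument can be built by hand. The following is a minimal sketch in base R, assuming that folds is expected as a list with one integer vector of sample (column) indices per fold; that layout is an assumption based on the description above, so compare it with the folds component returned by a default run before relying on it.

```r
set.seed(332)
n <- 40        # number of samples, i.e. columns of data$x
n.fold <- 5    # number of folds (illustrative choice)

# Shuffle the sample indices, then deal them into n.fold groups of near-equal size
folds <- split(sample(seq_len(n)), rep(seq_len(n.fold), length.out = n))
folds <- unname(folds)

# Every sample should appear in exactly one fold
stopifnot(identical(sort(unlist(folds)), seq_len(n)))

# Assumed usage: aa <- superpc.cv(a, data, folds = folds)
```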

Value

A list, list(threshold = th, nonzero = nonzero, scor = out, scor.preval = out.preval, folds = folds, featurescores.folds = featurescores.folds, v.preval = cur2, type = type, call = this.call), with components:

  • threshold: Vector of thresholds considered
  • nonzero: Number of features exceeding each value of the threshold
  • scor.preval: Likelihood ratio scores from pre-validation
  • scor: Full CV scores
  • folds: Indices of CV folds used
  • featurescores.folds: Feature scores for each fold
  • v.preval: The pre-validated predictors
  • type: Problem type
  • call: Calling sequence
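
One common use of the returned object is to read off the threshold with the best cross-validated score. The sketch below uses a hypothetical stand-in list with made-up values rather than a real superpc.cv result, and it assumes that scor is a matrix with one row per number of components and one column per threshold. Check names(aa) and dim(aa$scor) on a real result, since component names and layout may differ between package versions.

```r
# Hypothetical stand-in for the object returned by superpc.cv(); values are made up
cv <- list(
  threshold = c(0.5, 1.0, 1.5, 2.0),
  scor = rbind(c(1.2, 3.4, 2.8, 0.9),   # scores using 1 component
               c(1.5, 3.1, 4.0, 1.1))   # scores using 2 components
)

# Locate the largest score; rows index components, columns index thresholds
best <- which(cv$scor == max(cv$scor, na.rm = TRUE), arr.ind = TRUE)[1, ]
best.n.components <- best[["row"]]
best.threshold <- cv$threshold[best[["col"]]]

best.n.components   # 2
best.threshold      # 1.5
```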

Details

This function uses a form of cross-validation to estimate the optimal feature threshold in supervised principal components. To avoid problems with fitting Cox models to small validation datasets, it uses the "pre-validation" approach of Tibshirani and Efron (2002).
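
The idea behind pre-validation can be illustrated in base R. This is a simplified sketch, not the package's implementation: it swaps in absolute correlation for the feature scores, a fixed count of 10 features for the threshold, and a linear model for the Cox model, and all variable names are illustrative. The point it shows is that each sample's predictor value is computed from a fit that never saw that sample, and a single model is then fitted to all samples at once instead of one model per small validation fold.

```r
set.seed(1)
n <- 40; p <- 200
x <- matrix(rnorm(p * n), nrow = p)            # features in rows, samples in columns
z <- rnorm(n)                                  # latent factor (toy signal)
x[1:20, ] <- sweep(x[1:20, ], 2, z, "+")       # first 20 features carry the signal
y <- z + 0.5 * rnorm(n)                        # continuous outcome

folds <- split(sample(seq_len(n)), rep(1:5, length.out = n))
v.preval <- rep(NA_real_, n)

for (idx in folds) {
  x.train <- x[, -idx, drop = FALSE]
  y.train <- y[-idx]

  # Score and select features using the training samples only
  scores <- abs(cor(t(x.train), y.train))
  keep <- order(scores, decreasing = TRUE)[1:10]

  # First principal component of the selected features (training samples only)
  ctr <- rowMeans(x.train[keep, , drop = FALSE])
  u1 <- svd(x.train[keep, , drop = FALSE] - ctr)$u[, 1]

  # The sign of a principal component is arbitrary; align it with the outcome
  train.pc <- drop(crossprod(x.train[keep, , drop = FALSE] - ctr, u1))
  if (cor(train.pc, y.train) < 0) u1 <- -u1

  # Project the held-out samples onto the component to get their predictor values
  v.preval[idx] <- drop(crossprod(x[keep, idx, drop = FALSE] - ctr, u1))
}

# One model fitted on all n pre-validated values
fit <- lm(y ~ v.preval)
```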

Examples

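library(superpc)

set.seed(332)
# 1000 features (rows) by 40 samples (columns)
x <- matrix(rnorm(1000 * 40), ncol = 40)
# Survival times driven by the first right singular vector of the first 60 features
y <- 10 + svd(x[1:60, ])$v[, 1] + 0.1 * rnorm(40)
# 30 observed events (1) and 10 censored observations (0), in random order
censoring.status <- sample(c(rep(1, 30), rep(0, 10)))

featurenames <- paste("feature", as.character(1:1000), sep = "")
data <- list(x = x, y = y, censoring.status = censoring.status,
             featurenames = featurenames)

# Train the model, then cross-validate over the default grid of thresholds
a <- superpc.train(data, type = "survival")
aa <- superpc.cv(a, data)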
