pamr.train(data, gene.subset=NULL, sample.subset=NULL,
threshold = NULL, n.threshold = 30,
scale.sd = TRUE, threshold.scale = NULL, se.scale = NULL, offset.percent = 50,
hetero=NULL, prior = NULL, remove.zeros = TRUE, sign.contrast="both",
ngroup.survival = 2)
pamr.train
fits a nearest shrunken centroid classifier to gene
expression data. Details may be found in the PNAS paper referenced
below. One feature not described there is "heterogeneity analysis".
Suppose there are two classes labelled "A" and "B".
Class "A" is considered a normal class, and "B" an abnormal class.
Setting hetero="A" transforms expression values x[i,j] to
|x[i,j] - mean(x[i,j])|, where the mean for gene i is taken only over
the samples j in class "A". The transformed feature values are then used in Pam.
This is useful when the abnormal class "B" is heterogeneous, i.e.
a given gene might have higher expression than normal for some
class "B" samples, and lower for others.
With more than 2 classes, each class is centered on the class specified
by hetero.
# generate some data
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=factor(y))
# train classifier
results <- pamr.train(mydata)
# train classifier on all data except class 4
results2 <- pamr.train(mydata,sample.subset=(mydata$y!=4))
# train classifier on only the first 500 genes
results3 <- pamr.train(mydata,gene.subset=1:500)
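
The heterogeneity transform described above can be sketched in base R. This is an illustration only, not pamr's internal code: the toy matrix, the class labels, and the names `mean_a` and `x_hetero` are assumptions made for the example, and the mean is assumed to be computed per gene (row) over the class "A" samples.

```r
# Sketch of the hetero="A" transform in base R (illustrative, not pamr internals).
set.seed(1)
x <- matrix(rnorm(5 * 6), nrow = 5)            # 5 genes (rows) x 6 samples (columns)
y <- factor(c("A", "A", "A", "B", "B", "B"))   # "A" = normal, "B" = abnormal

# Per-gene mean over the class "A" samples only (assumed reading of the formula).
mean_a <- rowMeans(x[, y == "A", drop = FALSE])

# Absolute deviation of every sample from its gene's class-"A" mean.
# mean_a has one entry per row, so it is recycled down each column.
x_hetero <- abs(x - mean_a)
```

After the transform, a gene that is unusually high in some class "B" samples and unusually low in others yields large values in both cases, so the two directions no longer cancel. Within pamr itself the transform is requested through the hetero argument of pamr.train rather than applied by hand.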