powclasspred: Expected probability that a future sample is correctly classified.

Description

Estimates the posterior expected probability that a future sample is correctly classified when performing class prediction. The estimate is obtained via Monte Carlo simulation from the posterior predictive distribution.

Usage

powclasspred(gg.fit, x, groups, prgroups, v0thre=1, ngene=100, B=100)

Arguments

gg.fit

GaGa or MiGaGa fit (object of type gagafit, as returned by fitGG).

x

ExpressionSet, exprSet, data frame or matrix containing the gene expression measurements used to fit the model.

groups

If x is of type ExpressionSet or exprSet, groups should be the name of the column in pData(x) with the groups that one wishes to compare. If x is a matrix or a data frame, groups should be a vector indicating the group to which each column in x corresponds.

prgroups

Vector specifying prior probabilities for each group. Defaults to equally probable groups.

v0thre

Only genes with posterior probability of being equally expressed below v0thre are used.

ngene

Number of genes to use to build the classifier. Genes with smaller probability of being equally expressed are selected first.

B

Number of Monte Carlo samples to be used.
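
The way v0thre and ngene interact can be illustrated with a minimal base R sketch. The probabilities in v0 are made up for illustration, the variable names are hypothetical, and the strict comparison with v0thre is an assumption; this is not the package's internal code.

```r
# Hypothetical posterior probabilities of equal expression, one per gene
# (made-up values, not output from fitGG)
v0 <- c(g1 = 0.02, g2 = 0.90, g3 = 0.10, g4 = 0.40, g5 = 0.01, g6 = 0.75)
v0thre <- 0.5  # keep only genes with probability below this threshold
ngene <- 3     # maximum number of genes in the classifier

# Step 1: drop genes whose probability of equal expression is not below v0thre
eligible <- which(v0 < v0thre)

# Step 2: rank the remaining genes, smallest probability first
ranked <- eligible[order(v0[eligible])]

# Step 3: keep at most ngene genes
sel <- names(ranked)[seq_len(min(ngene, length(ranked)))]
sel
```

With the default v0thre=1, the threshold excludes essentially no genes, so the selection is driven by ngene alone.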

Value

ccall: Estimated expected probability of correctly classifying a future sample.
seccall: Estimated standard error of ccall.
ccgroup: Vector with the estimated probability of correctly classifying a sample from each group.
segroup: Estimated standard error of ccgroup.

Details

The routine simulates future samples (microarrays) from the posterior predictive distribution of a given group (e.g. control/cancer). It then computes the posterior probability that the new sample belongs to each of the groups and classifies the sample into the group with the highest probability. This process is repeated B times, and the proportion of correctly classified samples is reported for each group. The standard error is obtained via the usual normal approximation (i.e. SD/sqrt(B)). The overall probability of correct classification (i.e. for all groups together) is also provided, but it is computed with a more efficient variant of the algorithm: instead of reporting the observed proportion of correctly classified samples, it reports the expected proportion of correctly classified samples (i.e. the average posterior probability of the class that the sample is assigned to).
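
The simulation scheme described above can be sketched in base R. This is only an illustration under strong assumptions: a toy model with independent normal genes and known group means stands in for the GaGa posterior predictive, all names other than prgroups and B are hypothetical, and the standard error is taken as SD/sqrt(B). It is not the implementation used by powclasspred.

```r
set.seed(1)

# Toy stand-in for the fitted model (assumption): 2 groups, 5 genes,
# independent normal expression with known group means and unit variance
mu <- rbind(control = c(0, 0, 0, 0, 0),
            cancer  = c(1, 1, 0.5, 0, 0))
prgroups <- c(control = 0.5, cancer = 0.5)  # prior group probabilities
B <- 2000                                   # Monte Carlo samples per group

# Posterior probability of each group for one simulated sample
postprob <- function(xnew, mu, prgroups) {
  loglik <- apply(mu, 1, function(m) sum(dnorm(xnew, mean = m, sd = 1, log = TRUE)))
  w <- exp(loglik - max(loglik)) * prgroups
  w / sum(w)
}

# Simulate B future samples from group g and classify each one
simgroup <- function(g, B) {
  res <- replicate(B, {
    xnew <- rnorm(ncol(mu), mean = mu[g, ], sd = 1)
    pp <- postprob(xnew, mu, prgroups)
    c(correct = as.numeric(which.max(pp) == g), maxpp = max(pp))
  })
  t(res)  # B rows, columns "correct" and "maxpp"
}

sims <- lapply(1:2, simgroup, B = B)

# Per-group observed proportion of correct classifications and its standard error
ccgroup <- sapply(sims, function(s) mean(s[, "correct"]))
segroup <- sapply(sims, function(s) sd(s[, "correct"]) / sqrt(B))

# Overall estimate: average posterior probability of the assigned class,
# weighting the groups by their prior probabilities
ccall <- sum(prgroups * sapply(sims, function(s) mean(s[, "maxpp"])))
seccall <- sqrt(sum(prgroups^2 * sapply(sims, function(s) var(s[, "maxpp"])) / B))
```

In this sketch the posterior probability of the assigned class varies less from sample to sample than the 0/1 indicator of a correct classification, which is why averaging it tends to give a smaller standard error for the same B.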

References

Rossell D. GaGa: a simple and flexible hierarchical model for microarray data analysis. http://rosselldavid.googlepages.com.