ClassifyR (version 1.2.4)

KullbackLeiblerSelection: Selection of Differential Distributions with Kullback Leibler Distance

Description

Ranks features by the Kullback-Leibler distance between classes, largest first, and selects the number of top-ranked features that gives the best resubstitution performance.

Usage

## S4 method for signature 'matrix'
KullbackLeiblerSelection(expression, classes, ...)
## S4 method for signature 'ExpressionSet'
KullbackLeiblerSelection(expression, trainParams, predictParams, resubstituteParams, ..., verbose = 3)

Arguments

expression
Either a matrix or ExpressionSet containing the training data. For a matrix, the rows are features, and the columns are samples.
classes
A vector of class labels.
trainParams
A container of class TrainParams describing the classifier to use for training.
predictParams
A container of class PredictParams describing how prediction is to be done.
resubstituteParams
An object of class ResubstituteParams describing the performance measure to consider and the numbers of top features to try for resubstitution classification.
...
Variables passed to getLocationsAndScales.
verbose
A number between 0 and 3 controlling how many progress messages are printed. This function prints progress messages only if the value is 3.

Value

A list of length 2. The first element has the features ranked from most important to least important. The second element has the features that were selected to be used for classification.
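
The two-element result can be illustrated with a minimal base R sketch of resubstitution-based selection. This is not ClassifyR's implementation: `selectByResubstitution` is a hypothetical helper, the ranking is assumed to have been computed already, and a nearest-centroid rule stands in for whatever classifier `trainParams` describes.

```r
# Hypothetical sketch, not ClassifyR's implementation.
# rankedFeatures: feature row indices, best first (assumed to be computed already).
# nFeatures: the numbers of top features to try.
selectByResubstitution <- function(expression, classes, rankedFeatures, nFeatures)
{
  balancedErrors <- sapply(nFeatures, function(topN)
  {
    chosen <- rankedFeatures[1:topN]
    trainingData <- expression[chosen, , drop = FALSE]
    # One centroid per class, stored as a features-by-classes matrix.
    centroids <- sapply(levels(classes), function(className)
      rowMeans(trainingData[, classes == className, drop = FALSE]))
    centroids <- matrix(centroids, nrow = topN, dimnames = list(NULL, levels(classes)))
    # Resubstitution: predict the same samples that were used for training.
    predicted <- apply(trainingData, 2, function(sampleValues)
      levels(classes)[which.min(colSums((centroids - sampleValues)^2))])
    # Balanced error rate: the mean of the per-class error rates.
    mean(sapply(levels(classes), function(className)
      mean(predicted[classes == className] != className)))
  })
  bestN <- nFeatures[which.min(balancedErrors)]
  # First element: all ranked features. Second element: the selected features.
  list(rankedFeatures, rankedFeatures[1:bestN])
}

set.seed(1)
# 20 features by 40 samples; only features 1 to 5 differ between the classes.
expression <- matrix(rnorm(20 * 40), nrow = 20)
classes <- factor(rep(c("Poor", "Good"), each = 20))
expression[1:5, classes == "Good"] <- expression[1:5, classes == "Good"] + 3
result <- selectByResubstitution(expression, classes, rankedFeatures = 1:20,
                                 nFeatures = c(5, 10, 20))
```

Here `result[[1]]` holds the full ranking and `result[[2]]` holds the top features at whichever candidate size had the lowest balanced error on the training samples.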

Details

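The distance between the two classes is defined as $0.5 \times \left( \frac{(location_1 - location_2)^2}{scale_1^2} + \frac{(location_1 - location_2)^2}{scale_2^2} + \frac{scale_1^2}{scale_2^2} + \frac{scale_2^2}{scale_1^2} \right)$. The subscripts denote the group for which the parameter is calculated. The factor 0.5 applies to the whole sum. When the locations and scales are the means and standard deviations of two normal distributions, this equals their symmetric Kullback-Leibler divergence plus 1.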
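
As a rough illustration, the distance can be computed per feature in base R. The sketch below assumes that the factor 0.5 applies to the whole sum, that location is the mean, and that scale is the standard deviation; getLocationsAndScales may use other estimators. `klDistance` and `rankByDistance` are hypothetical names, not ClassifyR functions.

```r
# Hypothetical helper, not part of ClassifyR: distance for a single feature.
# Assumes location = mean and scale = standard deviation.
klDistance <- function(values1, values2)
{
  location1 <- mean(values1)
  location2 <- mean(values2)
  scale1 <- sd(values1)
  scale2 <- sd(values2)
  squaredDifference <- (location1 - location2)^2
  0.5 * (squaredDifference / scale1^2 + squaredDifference / scale2^2 +
         scale1^2 / scale2^2 + scale2^2 / scale1^2)
}

# Hypothetical helper: order feature rows by distance, largest first.
# Rows are features and columns are samples; classes must have two levels.
rankByDistance <- function(expression, classes)
{
  distances <- apply(expression, 1, function(featureValues)
    klDistance(featureValues[classes == levels(classes)[1]],
               featureValues[classes == levels(classes)[2]]))
  order(distances, decreasing = TRUE)
}

set.seed(1)
# 10 features by 40 samples; only feature 3 differs between the classes.
expression <- matrix(rnorm(10 * 40), nrow = 10)
classes <- factor(rep(c("Poor", "Good"), each = 20))
expression[3, classes == "Good"] <- expression[3, classes == "Good"] + 5
ranking <- rankByDistance(expression, classes)
```

With identical locations and scales the distance is 1, its minimum, so larger values indicate features whose class distributions differ more in location, in scale, or in both.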

Examples

  library(ClassifyR)
  if(require(sparsediscrim))
  {
    # Rows are 100 features and columns are 50 samples.
    # First 25 samples: features 1 to 50 are drawn from N(5, 1) and features 51 to 100 from N(15, 1).
    genesMatrix <- sapply(1:25, function(geneColumn) c(rnorm(50, 5, 1), rnorm(50, 15, 1)))
    # Last 25 samples: every feature is drawn from N(9, 3).
    genesMatrix <- cbind(genesMatrix, sapply(1:25, function(geneColumn) rnorm(100, 9, 3)))
    classes <- factor(rep(c("Poor", "Good"), each = 25))
    # Try the top 10, 20, ..., 100 features and keep the size with the lowest balanced error.
    KullbackLeiblerSelection(genesMatrix, classes,
                             trainParams = TrainParams(), predictParams = PredictParams(),
                             resubstituteParams = ResubstituteParams(nFeatures = seq(10, 100, 10), performanceType = "balanced", better = "lower"))
  }