predict.LPS: Predict method for LPS objects

Description

This function allows predictions to be made from a fitted LPS model and a new dataset. It can also plot a gene expression heatmap to visualize the prediction results.

Usage

## S3 method for class 'LPS':
predict(object, newdata, type=c("class", "probability", "score"),
    method = c("Wright", "Radmacher", "exact"), threshold = 0.9, na.rm = TRUE,
    subset = NULL, col.lines = "#FFFFFF", col.classes = c("#FFCC00", "#1144CC"),
    customLayout = FALSE, cex.col = NA, cex.row = NA, mai.left = NA, mai.bottom = NA,
    mai.right = 1, side = NULL, side.height = 1, side.col = NULL, col.heatmap = heat(),
    zlim = "0 centered", norm = c("rows", "columns", "none"),
    norm.robust = FALSE, plot = FALSE, ...)

Arguments

object

An object of class "LPS", as returned by LPS.

newdata

Continuous data used to retrieve classes, as a data.frame or matrix, with samples in rows and features (genes) in columns. Rows and columns should be named. It can also be a named numeric vector of already computed scores.

type

Single character value, return type of the predictions to be made ("class", "probability" or "score"). See 'Value' section.

method

Single character value, the method to use to make predictions ("Wright", "Radmacher" or "exact"). See 'Details' section.

threshold

Threshold to use for class prediction. The "Wright" method was designed with 0.9; the "Radmacher" method makes no use of the threshold.

na.rm

Single logical value. If TRUE, samples with one or more NA features are scored too (each missing feature is dropped for the affected sample, which is debatable).

subset

A subsetting vector to apply on newdata rows. See the "[" operator for handled values.

col.lines

If plot is TRUE, a single character value to be used for line drawing on the heatmap.

col.classes

If plot is TRUE, a character vector of two values giving each class a distinct color.

customLayout

Single logical value. As layout does not allow nested calls, set this to TRUE to make your own call to layout and embed this plot in a larger one.

cex.col

To be passed to heat.map.

cex.row

To be passed to heat.map.

mai.left

To be passed to heat.map.

mai.bottom

To be passed to heat.map.

mai.right

To be passed to heat.map (used to plot score coefficients).

side

To be passed to heat.map.

side.height

To be passed to heat.map.

side.col

To be passed to heat.map.

col.heatmap

To be passed to heat.map.

zlim

To be passed to heat.map.

norm

To be passed to heat.map.

norm.robust

To be passed to heat.map.

plot

To be passed to heat.map.

...

Ignored, just there to match the predict generic function.

Value

For a "class" type, returns a character vector with the group assignment for each new sample (possibly NA), named according to data row names.

For a "probability" type, returns a numeric matrix with two columns (probabilities of being in each group) and one row for each new sample, with rows named according to data row names and columns named according to the group labels.

For a "score" type, returns a numeric vector with the LPS score for each new sample, named according to data row names. Note that the score is the same for all methods.

If plot is TRUE, returns the list returned by heat.map, with the data described above in the first unnamed element.

Details

The "Compound covariate predictor" from Radmacher et al. (method = "Radmacher") simply assigns each sample to the closest group (comparing the sample score to the mean score of each group in the training dataset). The "Linear Predictor Score" from Wright et al. (method = "Wright") models the scores in each training sub-group with a distinct Gaussian distribution, and computes the probability for a sample to belong to one or the other using Bayes' rule. The "exact" mode is still under development and should not be used.
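
The two decision rules described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the training scores, the group labels "A" and "B", and the helper names radmacher_class, wright_prob and wright_class are hypothetical, and equal prior probabilities are assumed for the two groups.

```r
# Hypothetical training scores for two groups (illustration only,
# not produced by the LPS package)
train_scores <- c(-2.5, -2, -2, -1.5, 1.5, 2, 2, 2.5)
train_group  <- rep(c("A", "B"), each = 4)

# Mean and standard deviation of the training scores in each group
by_group <- split(train_scores, train_group)
mu    <- vapply(by_group, mean, numeric(1))
sigma <- vapply(by_group, sd, numeric(1))

# Radmacher-style rule: assign the sample to the group whose
# mean training score is closest to the sample score
radmacher_class <- function(score) {
  names(mu)[which.min(abs(score - mu))]
}

# Wright-style rule: one Gaussian density per group, combined with
# Bayes' rule (assumption: equal priors for both groups)
wright_prob <- function(score) {
  dens <- dnorm(score, mean = mu, sd = sigma)
  setNames(dens / sum(dens), names(mu))
}

# Class call with a probability threshold; NA when no group reaches it
wright_class <- function(score, threshold = 0.9) {
  p <- wright_prob(score)
  if (max(p) >= threshold) names(p)[which.max(p)] else NA_character_
}

radmacher_class(0.05)  # always returns a group
wright_prob(0.05)      # probabilities for "A" and "B"
wright_class(0.05)     # NA when the sample is too ambiguous
wright_class(2)        # confident call
```

In this sketch a score close to the midpoint between the two group means is still assigned by the Radmacher-style rule, whereas the Wright-style rule leaves it unclassified, which is why class predictions can contain NA.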

References

Radmacher MD, McShane LM, Simon R. A paradigm for class prediction using gene expression profiles. J Comput Biol. 2002;9(3):505-11.

Wright G, Tan B, Rosenwald A, Hurt EH, Wiestner A, Staudt LM. A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proc Natl Acad Sci U S A. 2003 Aug 19;100(17):9991-6.

Examples

# Data with features in columns
  data(rosenwald)
  group <- rosenwald.cli$group
  expr <- t(rosenwald.expr)
  
  # NA imputation (feature's mean to minimize impact)
  f <- function(x) { x[ is.na(x) ] <- round(mean(x, na.rm=TRUE), 3); x }
  expr <- apply(expr, 2, f)
  
  # Coefficients
  coeff <- LPS.coeff(data=expr, response=group)
  
  # 10 best features model
  m <- LPS(data=expr, coeff=coeff, response=group, k=10)
  
  
  # Class prediction plot
  predict(m, expr, plot=TRUE)
  
  # Wright et al. class prediction
  table(
    group,
    prediction = predict(m, expr),
    exclude = NULL
  )
  
  # More stringent threshold
  table(
    group,
    prediction = predict(m, expr, threshold=0.99),
    exclude = NULL
  )
  
  # Radmacher et al. class prediction
  table(
    group,
    prediction = predict(m, expr, method="Radmacher"),
    exclude = NULL
  )
  
  # Probabilities
  predict(m, expr, type="probability", method="Wright")
  predict(m, expr, type="probability", method="Radmacher")
  # Note: the "exact" mode is still under development (see 'Details')
  predict(m, expr, type="probability", method="exact")
  
  # Probability plot
  predict(m, expr, type="probability", plot=TRUE)
  
  # Annotated probability plot
  side <- data.frame(group, row.names=rownames(expr))
  predict(m, expr, side=side, type="probability", plot=TRUE)
  
  # Score plot
  predict(m, expr, type="score", plot=TRUE)
