Learn R Programming

LPS (version 1.0.4)

predict.LPS: Predict method for LPS objects

Description

This function allow predictions to be made from a fitted LPS model and a new dataset. It can also plot a gene expression heatmap to visualize results of the prediction.

Usage

## S3 method for class 'LPS':
predict(object, newdata, type=c("class", "probability", "score"),
    method = c("Wright", "Radmacher", "exact"), threshold = 0.9, na.rm = TRUE,
    subset = NULL, col.lines = "#FFFFFF", col.classes = c("#FFCC00", "#1144CC"),
    customLayout = FALSE, cex.col = NA, cex.row = NA, mai.left = NA, mai.bottom = NA,
    mai.right = 1, side = NULL, side.height = 1, side.col = NULL, col.heatmap = heat(),
    zlim = "0 centered", norm = c("rows", "columns", "none"),
    norm.robust = FALSE, plot = FALSE, ...)

Arguments

object
An object of class "LPS", as returned by LPS.
newdata
Continuous data used to retrieve classes, as a data.frame or matrix, with samples in rows and features (genes) in columns. Rows and columns should be named. It can also be a named numeric vector of already computed scores.
type
Single character value, return type of the predictions to be made ("class", "probability" or "score"). See 'Value' section.
method
Single character value, the method to use to make predictions ("Wright", "Radmacher" or "exact"). See 'Details' section.
threshold
Threshold to use for class prediction. "Wright" method was designed with 0.9, "Radmacher" method makes no use of the threshold.
na.rm
Single logical value, if TRUE samples with one or many NA features will be scored too (concerned feature is removed for the concerned sample, which might be discutable).
subset
A subsetting vector to apply on newdata rows. See [ for handled values.
col.lines
If graph is TRUE, a single character value to be used for line drawing on the heatmap.
col.classes
If graph is TRUE, a character vector of two values giving to each class a distinct color.
customLayout
Single logical value, as layout does not allow nested calls, set this to TRUE to make your own call to layout and embed this plot in a wider one.
cex.col
To be passed to heat.map.
cex.row
To be passed to heat.map.
mai.left
To be passed to heat.map.
mai.bottom
To be passed to heat.map.
mai.right
To be passed to heat.map (used to plot score coefficients).
side
To be passed to heat.map.
side.height
To be passed to heat.map.
side.col
To be passed to heat.map.
col.heatmap
To be passed to heat.map.
zlim
To be passed to heat.map.
norm
To be passed to heat.map.
norm.robust
To be passed to heat.map.
plot
To be passed to heat.map.
...
Ignored, just there to match the predict generic function.

Value

  • For a "class" type, returns a character vector with group assignment for each new sample (possibly NA), named according to data row names. For a "probability" type, returns a numeric matrix with two columns (probabilities to be in each group) and a row for each new sample, row named according to data row names and column named according to the group labels. For a "score" type, returns a numeric vector with LPS score for each new sample, named according to data row names. Notice the score is the same for all methods. If plot is TRUE, returns the list returned by heat.map, with data described above in the first unammed element.

Details

The "Compound covariate predictor" from Radmacher et al. (method = "Radmacher") simply assign each sample to the closest group (comparing the sample score to the mean scores of each group in the training dataset). The "Linear Predictor Score" from Wright et al. (method = "Wright") modelizes scores in each training sub-group with a distinct gaussian distribution, and computes the probability for a sample to be in one of them or the other using a bayesian rule. The "exact" mode is still under development and should not be used.

References

Radmacher MD, McShane LM, Simon R. A paradigm for class prediction using gene expression profiles. J Comput Biol. 2002;9(3):505-11. Wright G, Tan B, Rosenwald A, Hurt EH, Wiestner A, Staudt LM. A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proc Natl Acad Sci U S A. 2003 Aug 19;100(17):9991-6.

See Also

LPS

Examples

Run this code
# Data with features in columns
  data(rosenwald)
  group <- rosenwald.cli$group
  expr <- t(rosenwald.expr)
  
  # NA imputation (feature's mean to minimize impact)
  f <- function(x) { x[ is.na(x) ] <- round(mean(x, na.rm=TRUE), 3); x }
  expr <- apply(expr, 2, f)
  
  # Coefficients
  coeff <- LPS.coeff(data=expr, response=group)
  
  # 10 best features model
  m <- LPS(data=expr, coeff=coeff, response=group, k=10)
  
  
  # Class prediction plot
  predict(m, expr, plot=TRUE)
  
  # Wright et al. class prediction
  table(
    group,
    prediction = predict(m, expr),
    exclude = NULL
  )
  
  # More stringent threshold
  table(
    group,
    prediction = predict(m, expr, threshold=0.99),
    exclude = NULL
  )
  
  # Radmacher et al. class prediction
  table(
    group,
    prediction = predict(m, expr, method="Radmacher"),
    exclude = NULL
  )
  
  # Probabilities
  predict(m, expr, type="probability", method="Wright")
  predict(m, expr, type="probability", method="Radmacher")
  predict(m, expr, type="probability", method="exact")
  
  # Probability plot
  predict(m, expr, type="probability", plot=TRUE)
  
  # Annotated probability plot
  side <- data.frame(group, row.names=rownames(expr))
  predict(m, expr, side=side, type="probability", plot=TRUE)
  
  # Score plot
  predict(m, expr, type="score", plot=TRUE)

Run the code above in your browser using DataLab