plgem.resampledStn: Computation of Resampled PLGEM-STN Statistics

Description

This function computes resampled signal-to-noise ratio (STN) values using PLGEM fitting parameters (obtained via a call to function plgem.fit) to detect differential expression in an ExpressionSet, containing either microarray or proteomics data.

Usage

plgem.resampledStn(data, plgemFit, covariate=1, baselineCondition=1, iterations="automatic", verbose=FALSE)

Arguments

data

an object of class ExpressionSet; see Details for important information on how the phenoData slot of this object will be interpreted by the function.

plgemFit

list; the output of function plgem.fit.

covariate

integer, numeric or character; specifies the covariate to be used to distinguish the various experimental conditions from one another. See Details for how to specify the covariate.

baselineCondition

integer, numeric or character; specifies the condition to be treated as the baseline. See Details for how to specify the baselineCondition.

verbose

logical; if TRUE, comments are printed out while running.

iterations

number of iterations for the resampling step; if "automatic" (the default), the number of iterations is determined automatically.

Value

A list with two elements:

RESAMPLED.STN: matrix of resampled PLGEM-STN values, with rownames identical to those in data, and colnames representing the different numbers of replicates found in the different comparisons; see References for details.

REPL.NUMBER: the number of replicates found for each experimental condition; see References for details.

Details

The phenoData slot of the ExpressionSet given as input is expected to contain the information needed to distinguish the various experimental conditions from one another. The columns of the pData are referred to as ‘covariates’. At least one covariate must be defined in the input ExpressionSet. The sample attributes according to this covariate must be distinct for samples that are to be treated as distinct experimental conditions and identical for samples that are to be treated as replicates.

There are two ways to specify the covariate. If an integer or a numeric is given, it is taken as the covariate number (in the order in which the covariates appear in the colnames of the pData). If a character is given, it is taken as the covariate name itself (as it appears in the colnames of the pData). By default, the first covariate appearing in the colnames of the pData is used.

Similarly, there are two ways to specify which experimental condition to treat as the baseline. The available ‘condition names’ are taken from unique(as.character(pData(data)[, covariate])). If baselineCondition is given as a character, it is taken as the condition name itself. If baselineCondition is given as an integer or a numeric value, it is taken as the condition number (in the order of appearance in the ‘condition names’). By default, the first condition name is used.

PLGEM-STN values are a measure of the degree of differential expression between a condition and the baseline:

$$\mathrm{STN} = \frac{\mathrm{mean}_{condition} - \mathrm{mean}_{baseline}}{\mathrm{modeledSpread}_{condition} + \mathrm{modeledSpread}_{baseline}},$$

where:

$$\log(\mathrm{modeledSpread}) = \mathrm{PLGEMslope} \cdot \log(\mathrm{mean}) + \mathrm{PLGEMintercept}$$
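The rules above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's own implementation: the helper names (resolveCovariate, resolveBaseline, modeledSpread, stn), the toy pheno data frame standing in for pData(data), and the slope and intercept values are all hypothetical, and the natural logarithm is assumed for the power-law relation.

```r
# Hypothetical stand-in for pData(data): one covariate, two conditions.
pheno <- data.frame(conditionName = c("C", "C", "C", "LPS", "LPS", "LPS"))

# Resolve a covariate given either as a number or as a name.
resolveCovariate <- function(pheno, covariate = 1) {
  if (is.numeric(covariate)) colnames(pheno)[covariate] else covariate
}

# Resolve a baseline condition given either as a number or as a name.
resolveBaseline <- function(pheno, covariate = 1, baselineCondition = 1) {
  covariateName <- resolveCovariate(pheno, covariate)
  conditionNames <- unique(as.character(pheno[, covariateName]))
  if (is.numeric(baselineCondition)) {
    conditionNames[baselineCondition]
  } else {
    baselineCondition
  }
}

# Modeled spread from the power law (natural log assumed):
# log(modeledSpread) = slope * log(mean) + intercept
modeledSpread <- function(m, slope, intercept) {
  exp(slope * log(m) + intercept)
}

# STN of a condition against the baseline, following the formula above.
stn <- function(meanCondition, meanBaseline, slope, intercept) {
  (meanCondition - meanBaseline) /
    (modeledSpread(meanCondition, slope, intercept) +
       modeledSpread(meanBaseline, slope, intercept))
}

# Illustrative values only; real slope and intercept come from plgem.fit.
baseline <- resolveBaseline(pheno)
exampleStn <- stn(meanCondition = 300, meanBaseline = 100, slope = 1, intercept = 0)
```

With a slope of 1 and an intercept of 0 the modeled spread equals the mean, so the example reduces to (300 - 100) / (300 + 100), which makes the formula easy to check by hand.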

plgem.resampledStn determines the resampled PLGEM-STN values for each gene or protein in data using a resampling approach; see References for details. The number of iterations should be chosen depending on the number of available replicates of the condition used for fitting the model.

References

Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.

Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.

Examples

  library(plgem)

  # Load the example ExpressionSet and fit the PLGEM model
  data(LPSeset)
  LPSfit <- plgem.fit(data=LPSeset)

  # Observed PLGEM-STN values
  LPSobsStn <- plgem.obsStn(data=LPSeset, plgemFit=LPSfit)

  # Resampled PLGEM-STN values; the seed makes the resampling reproducible
  set.seed(123)
  LPSresampledStn <- plgem.resampledStn(data=LPSeset, plgemFit=LPSfit)

  # Compare the distributions of resampled (black) and observed (red) values
  plot(density(LPSresampledStn[["RESAMPLED.STN"]], bw=0.01), col="black", lwd=2,
    xlab="PLGEM STN values",
    main="Distribution of observed\nand resampled PLGEM-STN values")
  lines(density(LPSobsStn[["PLGEM.STN"]], bw=0.01), col="red")
  legend("topright", legend=c("resampled", "observed"), col=c("black", "red"),
    lwd=2:1)
