missMethyl (version 1.6.2)

RUVfit: Remove unwanted variation when testing for differential methylation

Description

Provides an interface similar to lmFit from limma to the RUV2, RUV4, RUVinv and RUVrinv functions from the ruv package, which facilitates the removal of unwanted variation in a differential methylation analysis. A set of negative control variables, as described in the references, must be specified.

Usage

RUVfit(data, design, coef, ctl, method=c("inv", "rinv", "ruv4", "ruv2"), 
k = NULL, ...)

Arguments

data
numeric matrix with rows corresponding to the features of interest such as CpG sites and columns corresponding to samples or arrays.
design
the design matrix of the experiment, with rows corresponding to arrays/samples and columns to coefficients to be estimated.
coef
integer, column of the design matrix containing the comparison to test for differential methylation. Default is the last column of the design matrix.
ctl
logical vector, length == nrow(data). Features that are to be used as negative control variables are indicated as TRUE, all other features are FALSE.
method
character string, indicates which RUV method should be used. Default method is RUVinv.
k
integer, required if method is "ruv2" or "ruv4". Indicates the number of unwanted factors to use. Can be 0.
...
additional arguments that can be passed to RUV2, RUV4, RUVinv and RUVrinv. See linked function documentation for details.
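
As a rough illustration of how the arguments above fit together, the following sketch builds inputs of the expected shapes from simulated data using base R only. The object names, the dimensions, and the randomly chosen control features are assumptions made for illustration; they are not part of missMethyl. In a real analysis the negative controls would be chosen as described in the references.

```r
set.seed(1)

n_features <- 1000   # hypothetical number of CpG sites
n_samples  <- 6      # hypothetical number of arrays

# data: numeric matrix, features in rows and samples in columns
Mval <- matrix(rnorm(n_features * n_samples), nrow = n_features,
               dimnames = list(paste0("cg", seq_len(n_features)),
                               paste0("S", seq_len(n_samples))))

# design: one row per sample, one column per coefficient
group  <- factor(rep(c("A", "B"), each = n_samples / 2))
design <- model.matrix(~ group)

# ctl: logical vector with one entry per row of data
ctl <- rep(FALSE, n_features)
ctl[sample(n_features, 200)] <- TRUE   # arbitrary choice, for illustration only

# coef: column of design to test; the documented default is the last column
coef <- ncol(design)

# Check the shapes described in the Arguments section
stopifnot(
  is.numeric(Mval),
  nrow(design) == ncol(Mval),
  is.logical(ctl), length(ctl) == nrow(Mval),
  coef >= 1, coef <= ncol(design)
)

# With this version of missMethyl loaded, the call would follow the Usage section:
# fit <- RUVfit(data = Mval, design = design, coef = coef, ctl = ctl, method = "inv")
```

The call to RUVfit is left commented out so that the sketch runs without Bioconductor packages installed; "ruv2" and "ruv4" would additionally need a value for k.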

Value

An object of class MArrayLM (see MArrayLM-class) containing:

  • coefficients: The estimated coefficients of the factor(s) of interest.
  • sigma2: Estimates of the features' variances.
  • t: t statistics for the factor(s) of interest.
  • p: P-values for the factor(s) of interest.
  • multiplier: The constant by which sigma2 must be multiplied in order to get an estimate of the variance of coefficients.
  • df: The number of residual degrees of freedom.
  • W: The estimated unwanted factors.
  • alpha: The estimated coefficients of W.
  • byx: The coefficients in a regression of Y on X (after both Y and X have been "adjusted" for Z). Useful for projection plots.
  • bwx: The coefficients in a regression of W on X (after X has been "adjusted" for Z). Useful for projection plots.
  • X: X. Included for reference.
  • k: k. Included for reference.
  • ctl: ctl. Included for reference.
  • Z: Z. Included for reference.
  • fullW0: Can be used to speed up future calls of RUVfit.

Details

This function depends on the ruv and limma packages and is used to estimate and adjust for unwanted variation in a differential methylation analysis. Briefly, the unwanted factors W are estimated using negative control variables. The data Y are then regressed on the factor(s) of interest X, any additional covariates Z, and the estimated unwanted factors W. For methylation data, the analysis is performed on the M-values, defined as the log base 2 ratio of the methylated signal to the unmethylated signal.
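
For orientation, the regression described above can be written out in the notation of the references. This is a sketch of the general RUV model, not a statement of how RUVfit maps its arguments internally. Here m is the number of samples, n the number of features, p, q and k the numbers of columns of X, Z and W, and C the set of negative control features. Note that Y has samples in rows here, whereas the data argument has features in rows.

```latex
% Linear model underlying the RUV methods (samples in rows, features in columns)
Y_{m \times n} = X_{m \times p}\,\beta_{p \times n}
               + Z_{m \times q}\,\gamma_{q \times n}
               + W_{m \times k}\,\alpha_{k \times n}
               + \varepsilon_{m \times n}

% Negative control assumption: control features are unaffected by X
\beta_{c} = 0 \quad \text{for all } c \in C

% M-value for one feature in one sample
M = \log_2\!\left(\frac{\text{Meth}}{\text{Unmeth}}\right)
```

The example on this page adds an offset of 100 to both the methylated and unmethylated signals before taking the ratio.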

References

Gagnon-Bartsch JA, Speed TP. (2012). Using control genes to correct for unwanted variation in microarray data. Biostatistics. 13(3), 539-52. Available at: http://biostatistics.oxfordjournals.org/content/13/3/539.full.

Gagnon-Bartsch JA, Jacob L, Speed TP. (2013). Removing Unwanted Variation from High Dimensional Data with Negative Controls. Available at: http://statistics.berkeley.edu/tech-reports/820.

See Also

RUV2, RUV4, RUVinv, RUVrinv, topRUV

Examples

if(require(minfi) && require(minfiData) && require(limma)) {

# Get methylation data for a 2 group comparison
meth <- getMeth(MsetEx)
unmeth <- getUnmeth(MsetEx)
Mval <- log2((meth + 100)/(unmeth + 100))

group<-factor(pData(MsetEx)$Sample_Group)
design<-model.matrix(~group)

# Perform initial analysis to empirically identify negative control features 
# when not known a priori
lFit = lmFit(Mval,design)
lFit2 = eBayes(lFit)
lTop = topTable(lFit2,coef=2,number=Inf)

# The negative control features should *not* be associated with factor of interest
# but *should* be affected by unwanted variation 
ctl = rownames(Mval) %in% rownames(lTop[lTop$adj.P.Val > 0.5,])

# Perform RUV adjustment and fit
fit = RUVfit(data=Mval, design=design, coef=2, ctl=ctl)
fit2 = RUVadj(fit)

# Look at table of top results
top = topRUV(fit2)
}
