rCGH (version 1.2.2)

EMnormalize: Genomic Profile Centralization

Description

This function models the Log2Ratios as a mixture of several Gaussian populations, fitted with an Expectation-Maximization (EM) algorithm. The peakThresh argument sets the minimum height, expressed as a proportion of the main density peak, that a peak must exceed to be eligible as the neutral two-copy population. The mean of the chosen population is used for centralizing the profile. See Mclust.

Usage

## S3 method for class 'rCGH':
EMnormalize(object, cut = c(0.01, 0.99), G = 2:6, useN = 25e3,
    peakThresh = 0.5, ksmooth = NA, mergeVal = 0.1, Title = NA, verbose = TRUE)

Arguments

object
: An object of class "rCGH"
cut
: numeric. A vector of 2 alpha values (between 0 and 1). Log2Ratios outside the corresponding quantiles are excluded from the Gaussian mixture estimation. Default quantiles are $q_{0.01}$ and $q_{0.99}$.
G
: numeric. The numbers of groups to test during the Gaussian mixture estimation. Default is from 2 to 6.
useN
: numeric. The number of probes to use for estimating the mixture parameters. Default is 25e3.
peakThresh
: numeric. The proportion of the highest peak to consider as a peak selection threshold. Default is 0.5.
ksmooth
: numeric. A smoothing value applied to the Log2Ratios before modeling the Gaussian mixture. When NA (default), ksmooth is estimated from the median absolute deviation of the Log2Ratios.
mergeVal
: numeric. Populations with means closer than mergeVal will be pooled together. Default is 0.1. Set mergeVal to zero to avoid pooling close sub-populations.
Title
: character string. A title for the density plot. If NA (default), the sample name (when present in the object info) will be used.
verbose
: logical. When TRUE (default), progress is printed.
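
The effect of cut and useN can be illustrated with base R alone. The following is only a minimal sketch on simulated values, not the package's internal code: the variable names (lrr, trimmed, sub) are hypothetical, and the order of operations (trimming first, then subsampling) is an assumption.

```r
# Illustrative sketch only: simulated Log2Ratios, not rCGH internals.
set.seed(1)
lrr <- rnorm(1e5, mean = 0.1, sd = 0.3)

cut  <- c(0.01, 0.99)   # same defaults as EMnormalize()
useN <- 25e3

# Keep only the values lying between the two quantiles given by 'cut'.
q <- quantile(lrr, probs = cut)
trimmed <- lrr[lrr >= q[1] & lrr <= q[2]]

# Use at most 'useN' values for estimating the mixture parameters
# (assumption: a simple random subsample without replacement).
sub <- sample(trimmed, size = min(useN, length(trimmed)))

length(sub)
range(sub)
```

Trimming the tails keeps extreme Log2Ratios from driving the mixture fit, and subsampling bounds the cost of the EM estimation on dense arrays.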

Value

  • An object of the same class as the input.

Details

Depending on peakThresh, the Log2Ratios are centered, before the segmentation, on either the mean of the highest density peak or a lower value. The heights of the density peaks are compared, and among the peaks satisfying the criterion peakHeight > max(peaks)*peakThresh, the one with the lowest mean is chosen for centralizing the data. See References.
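
The selection rule can be written in a few lines of R. This is a hedged sketch of the criterion as stated here, not the package's implementation: choosePeak is a hypothetical helper, and the peak means and heights are made-up values.

```r
# Hypothetical helper (not part of rCGH): applies the rule
# peakHeight > max(peaks) * peakThresh, then takes the lowest mean.
choosePeak <- function(peakMeans, peakHeights, peakThresh = 0.5) {
  stopifnot(length(peakMeans) == length(peakHeights))
  eligible <- peakHeights > max(peakHeights) * peakThresh
  min(peakMeans[eligible])
}

# Made-up example: three populations with their density peak heights.
means   <- c(-0.45, 0.02, 0.38)
heights <- c(0.9, 1.6, 0.4)

# Threshold is 0.8: the peaks at -0.45 and 0.02 are eligible,
# so the lowest mean is chosen.
choosePeak(means, heights, peakThresh = 0.5)

# Threshold is 1.52: only the highest peak is eligible.
choosePeak(means, heights, peakThresh = 0.95)
```

With a permissive threshold, a smaller peak on the left of the main one is selected, which shifts the centered profile upward. With a threshold close to 1, only the main peak remains eligible.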

References

Commo et al. Impact of centralization on aCGH-based genomic profiles for precision medicine in oncology. Ann Oncol. 2014. http://www.ncbi.nlm.nih.gov/pubmed/25538175

See Also

plotDensity, mclust

Examples

filePath <- system.file("extdata", "Affy_cytoScan.cyhd.CN5.CNCHP.txt.bz2",
    package = "rCGH")
cgh <- readAffyCytoScan(filePath, sampleName = "AffyScHD")
cgh <- adjustSignal(cgh, nCores=1)
cgh <- EMnormalize(cgh)
getParam(cgh)
