Learn R Programming

lumi (version 2.24.0)

gammaFitEM: Estimate the methylation status by fitting a Gamma mixture model using EM algorithm

Description

Estimate the methylation status by fitting a two component Gamma mixture model using EM algorithm based on the all M-values of a particular sample

Usage

gammaFitEM(M, initialFit = NULL, fix.k = NULL, weighted = TRUE, maxIteration = 50, tol = 1e-04, plotMode = FALSE, truncate=FALSE, verbose = FALSE)

Arguments

M
a vector of M-values covering the whole genome
initialFit
the initial estimation of the gamma parameters returned by .initialGammaEstimation function
fix.k
the k parameter of the gamma function which is fixed during estimation
weighted
determine whether to down-weight the long tails of two component densities beyond their modes
maxIteration
maximum iterations allowed before converging
tol
the difference threshold used to determine convergence
plotMode
determine whether plot the histogram and density plot estimation
truncate
determine whether to truncate the tails beyond the modes during parameter estimation
verbose
determine whether plot intermediate messages during iterations

Value

The return is a list with "gammaFit" class attribute, which includes the following items:
logLikelihood
the log-likelihood of the fitting model
k
parameter k of gamma distribution
theta
parameter theta of gamma distribution
shift
parameter shift of gamma distribution
proportion
the proportion of two components (gamma distributions)
mode
the mode positions of the gamma distributions
probability
the estimated methylation status posterior probability of each CpG site

Details

The assumption of this function is that the M-value distribution is composed of the mixture of two shifted gamma distributions, which are defined as: dgamma(x-s[1], shape=k[1], scale=theta[1]) and dgamma(s[2]-x, shape=k[2], scale=theta[2]). Here s represents the shift.

NOTE: the methylation status modeling algorithm was developed based on 27K methylation array. It has not been tested for 450K array. Considering 450K array covers both promoter and gene body, the two component Gamma mixture model assumption may not be valid any more.

See Also

methylationCall and plotGammaFit

Examples

Run this code

data(example.lumiMethy)
M <- exprs(example.lumiMethy)
fittedGamma <- gammaFitEM(M[,1], initialFit=NULL, maxIteration=50, tol=0.0001, plotMode=TRUE, verbose=FALSE)

Run the code above in your browser using DataLab