SIBERG (version 2.0.3)

fitGP: Fit Generalized Poisson Mixture Model

Description

The function fits a two-component Generalized Poisson mixture model.

Usage

fitGP(y, d=NULL, inits=NULL, model='V', zeroPercentThr=0.2)

Value

A vector consisting of the parameter estimates mu1, mu2, phi1, phi2 and pi1, followed by logLik and BIC. For the 0-inflated model, mu1=phi1=0.

Arguments

y

A vector of raw RNAseq counts.

d

A vector of the same length as y representing the normalization constant to be applied to the data.

inits

Initial values for fitting the mixture model. A vector with elements mu1, mu2, phi1, phi2 and pi1.

model

Character specifying the E or V model. The E model fits the mixture with an equal dispersion phi in both components, while the V model places no constraint on the dispersions.

zeroPercentThr

A scalar specifying the minimum percent of zero counts needed when fitting a zero-inflated Generalized Poisson model. This parameter is used to deal with zero-inflation in RNAseq count data. When the percent of zero exceeds this threshold, rather than fitting a 2-component Generalized Poisson mixture, a mixture of point mass at 0 and Generalized Poisson is fitted.
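The switching rule described for zeroPercentThr can be sketched as follows. This is only an illustration: chooseModel is a hypothetical helper that is not part of SIBERG, and it assumes the threshold is a proportion (the default 0.2 meaning 20%) compared with a strict "greater than".

```r
# Hypothetical helper, not part of SIBERG: reports which model family
# the documented rule would select for a count vector y.
# ASSUMPTION: zeroPercentThr is a proportion and the comparison is strict.
chooseModel <- function(y, model = "V", zeroPercentThr = 0.2) {
  zeroFrac <- mean(y == 0)  # observed proportion of zero counts
  if (zeroFrac > zeroPercentThr) "zero-inflated" else model
}

set.seed(1)
sparse <- c(rep(0, 30), rpois(70, lambda = 50))  # 30% zeros by construction
dense  <- rpois(100, lambda = 50)                # zeros are practically impossible

chooseModel(sparse)              # zero fraction exceeds the threshold
chooseModel(dense, model = "E")  # falls back to the requested model
```

With these inputs the first call selects the zero-inflated fit regardless of the model argument, while the second keeps the requested E model.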

Author

Pan Tong (nickytong@gmail.com), Kevin R Coombes (krc@silicovore.com)

Details

This function directly maximizes the log likelihood function through numerical optimization. Three models can be fitted: (1) Generalized Poisson mixture with equal dispersion (E model); (2) Generalized Poisson mixture with unequal dispersion (V model); (3) 0-inflated Generalized Poisson model. The 0-inflated Generalized Poisson has the following density function:

\(P(Y=y)=\pi D(y) + (1-\pi)GP(\mu, \phi)\) where D is the point mass at 0 while \(GP(\mu, \phi)\) is the density of Generalized Poisson distribution with mean \(\mu\) and dispersion \(\phi\). The variance is \(\phi \mu\).
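The density above can be written out in R as a minimal sketch. The function names dgp and dzigp are hypothetical and not exported by SIBERG. The sketch assumes Consul's form of the Generalized Poisson, reparameterized so that the mean is mu and the variance is phi * mu, which requires phi >= 1; the package may use a different internal form.

```r
# Generalized Poisson density.
# ASSUMPTION: Consul's form P(y) = theta * (theta + lambda*y)^(y-1) *
# exp(-theta - lambda*y) / y!, with theta and lambda chosen so that
# the mean is mu and the variance is phi * mu.
dgp <- function(y, mu, phi) {
  lambda <- 1 - 1 / sqrt(phi)
  theta  <- mu / sqrt(phi)
  logp <- log(theta) + (y - 1) * log(theta + lambda * y) -
    theta - lambda * y - lgamma(y + 1)
  exp(logp)
}

# Zero-inflated density: point mass at 0 with weight pi0,
# Generalized Poisson with weight 1 - pi0.
dzigp <- function(y, pi0, mu, phi) {
  pi0 * (y == 0) + (1 - pi0) * dgp(y, mu, phi)
}

# Numerical check of the stated moments on a wide grid
grid <- 0:2000
p <- dgp(grid, mu = 10, phi = 2)
total <- sum(p)                        # should be close to 1
m1    <- sum(grid * p)                 # should be close to mu = 10
v     <- sum(grid^2 * p) - m1^2        # should be close to phi * mu = 20
```

Under this parameterization, phi = 1 reduces to the ordinary Poisson distribution, which gives a quick sanity check against dpois.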

The 0-inflated model is fitted whenever the observed percentage of zero counts exceeds the user-specified threshold zeroPercentThr. In that case the rule overrides the model argument.
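The direct maximization of the log likelihood mentioned above can be illustrated for the unconstrained V model. This is a sketch under stated assumptions, not the package's implementation: ldgp and negLogLik are hypothetical names, the Generalized Poisson is taken in Consul's form with mean mu and variance phi * mu, the optimizer and starting values are arbitrary choices, and the BIC uses the common convention -2 logLik + k log(n).

```r
# Log-density of the Generalized Poisson.
# ASSUMPTION: Consul's form with mean mu and variance phi * mu (phi >= 1).
ldgp <- function(y, mu, phi) {
  lambda <- 1 - 1 / sqrt(phi)
  theta  <- mu / sqrt(phi)
  log(theta) + (y - 1) * log(theta + lambda * y) -
    theta - lambda * y - lgamma(y + 1)
}

# Negative log likelihood of the two-component mixture (V model).
# par is on an unconstrained scale so that optim() needs no bounds.
negLogLik <- function(par, y) {
  mu1  <- exp(par[1]);     mu2  <- exp(par[2])
  phi1 <- 1 + exp(par[3]); phi2 <- 1 + exp(par[4])
  pi1  <- plogis(par[5])
  l1 <- log(pi1)    + ldgp(y, mu1, phi1)
  l2 <- log1p(-pi1) + ldgp(y, mu2, phi2)
  m  <- pmax(l1, l2)  # log-sum-exp trick for numerical stability
  -sum(m + log(exp(l1 - m) + exp(l2 - m)))
}

# Artificial bimodal counts, in the spirit of the example on this page
set.seed(1000)
y <- c(rnbinom(60, mu = 20, size = 5), rnbinom(40, mu = 500, size = 5))

start <- c(log(20), log(500), 0, 0, 0)  # hypothetical starting values
fit <- optim(start, negLogLik, y = y, method = "Nelder-Mead",
             control = list(maxit = 5000))

ll  <- -fit$value
bic <- -2 * ll + 5 * log(length(y))     # 5 free parameters in the V model
est <- c(mu1 = exp(fit$par[1]), mu2 = exp(fit$par[2]),
         phi1 = 1 + exp(fit$par[3]), phi2 = 1 + exp(fit$par[4]),
         pi1 = plogis(fit$par[5]), logLik = ll, BIC = bic)
```

An E model variant would share a single dispersion parameter between the two components, leaving four free parameters instead of five.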

References

Tong, P., Chen, Y., Su, X. and Coombes, K. R. (2013). Systematic Identification of Bimodally Expressed Genes Using RNAseq Data. Bioinformatics, 29(5):605-613.

See Also

SIBER, fitLN, fitNB, fitNL

Examples

library(SIBERG)
# artificial RNAseq data from a negative binomial distribution
set.seed(1000)
dat <- rnbinom(100, mu=1000, size=1/0.2)
# fit the default V model (unequal dispersion)
fitGP(y=dat)