HTSCluster (version 1.1)

HTSCluster-package: Clustering high throughput sequencing (HTS) data

Description

This package implements two parameterizations of a Poisson mixture model to cluster observations (e.g., genes) in high throughput sequencing data. Parameter estimation is performed using either the EM or CEM algorithm, and the BIC or ICL criteria are used for model selection (i.e., to choose the number of clusters).
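To make the estimation and model-selection steps concrete, here is a minimal base-R sketch of EM for a plain univariate Poisson mixture. It is not HTSCluster's implementation: it ignores experimental conditions and library sizes, it does not cover the CEM variant, and the function names (log_joint, row_lse, pois_mix_em) are hypothetical. BIC and ICL are written in the "larger is better" form, which is an assumption made for this illustration.

```r
## Minimal sketch: EM for a univariate Poisson mixture in base R.
## NOT HTSCluster's implementation; all names here are hypothetical.

## log(pi_k * Poisson(y_i; lambda_k)) as an n x g matrix
log_joint <- function(y, pi_k, lambda) {
  g <- length(lambda)
  out <- vapply(seq_len(g),
                function(k) log(pi_k[k]) + dpois(y, lambda[k], log = TRUE),
                numeric(length(y)))
  matrix(out, nrow = length(y), ncol = g)
}

## Row-wise log-sum-exp, used for the log-likelihood and the posteriors
row_lse <- function(m) {
  mx <- apply(m, 1, max)
  mx + log(rowSums(exp(m - mx)))
}

pois_mix_em <- function(y, g, iter = 500, tol = 1e-8) {
  n <- length(y)
  ## Crude start: equal proportions, means spread over the data quantiles
  pi_k <- rep(1 / g, g)
  lambda <- as.numeric(quantile(y, (seq_len(g) - 0.5) / g)) + 0.1
  loglik_old <- -Inf
  for (it in seq_len(iter)) {
    ## E-step: posterior probability that observation i belongs to cluster k
    lj <- log_joint(y, pi_k, lambda)
    t_ik <- exp(lj - row_lse(lj))
    ## M-step: weighted updates of the proportions and the means
    pi_k <- colMeans(t_ik)
    lambda <- colSums(t_ik * y) / colSums(t_ik)
    loglik <- sum(row_lse(log_joint(y, pi_k, lambda)))
    if (loglik - loglik_old < tol) break
    loglik_old <- loglik
  }
  ## Posteriors under the final parameter values
  lj <- log_joint(y, pi_k, lambda)
  t_ik <- exp(lj - row_lse(lj))
  nu <- 2 * g - 1                      # (g - 1) proportions + g means
  bic <- loglik - nu / 2 * log(n)      # assumed "larger is better" form
  ## ICL: BIC plus the log posterior of each observation's most likely cluster
  icl <- bic + sum(log(apply(t_ik, 1, max)))
  list(pi = pi_k, lambda = lambda, loglik = loglik, bic = bic, icl = icl)
}

## Toy data: two well-separated Poisson clusters (values are simulated)
set.seed(1)
y_toy <- c(rpois(150, lambda = 2), rpois(150, lambda = 30))

## Fit g = 1, 2, 3 and compare the criteria across the candidate models
fits <- lapply(1:3, function(g) pois_mix_em(y_toy, g))
bic <- sapply(fits, function(f) f$bic)
icl <- sapply(fits, function(f) f$icl)
g_bic <- which.max(bic)
g_icl <- which.max(icl)
```

The number of clusters is then the value of g that maximizes the chosen criterion. The ICL term is never positive, so ICL penalizes models whose clusters overlap more heavily than BIC does.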

Details

Package: HTSCluster
Type: Package
Version: 1.1
Date: 2012-06-11
License: GPL (>=2)
LazyLoad: yes

References

Rau, A., Celeux, G., Martin-Magniette, M.-L., and Maugis-Rabusseau, C. (2011). Clustering high-throughput sequencing data with Poisson mixture models. Inria Research Report 7786. Available at http://hal.inria.fr/inria-00638082.

Examples

library(HTSCluster)
set.seed(12345)

## Simulate data as shown in Rau et al. (2011)
## Library size setting "A", high cluster separation
## n = 200 observations

simulate <- PoisMixSim(n = 200, libsize = "A", separation = "high")
y <- simulate$y
conds <- simulate$conditions

## Run the PMM-II model for g = {3, 4, 5}
## "TC" library size estimate, EM algorithm
## Model selection via the ICL

run <- PoisMixClus(y, gmin = 3, gmax = 5, lib.size = TRUE, 
    lib.type = "TC", conds = conds, init.type = "small-em") 

## Estimates of pi and lambda for the selected model
pi.est <- run$pi.ICL
lambda.est <- run$lambda.ICL
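
The "TC" setting in the example above refers to a library size estimate. As a rough illustration, assuming "TC" denotes total-count scaling (each sample's share of all reads, as described in Rau et al., 2011), the estimate can be sketched in base R as follows. The matrix values and the names counts and s_tc are made up for this example and are not part of the package.

```r
## Toy count matrix: 4 genes (rows) x 3 samples (columns); values are made up
counts <- matrix(c(10, 20, 30,
                    0,  5, 15,
                    4,  8, 12,
                    6,  7, 43), nrow = 4, byrow = TRUE)

## Assumed total-count scaling: per-sample read total divided by the grand total
s_tc <- colSums(counts) / sum(counts)
print(s_tc)
```

The resulting values sum to one, so they act as relative library sizes rather than raw read totals.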
