Learn R Programming

HTSCluster (version 1.1)

logLikePoisMix: Calculate the log likelihood of a Poisson mixture model

Description

Functions to calculate the log likelihood of a Poisson mixture model, given a set of parameters, and the difference in log likelihoods for two different sets of parameters of a Poisson mixture model.

Usage

logLikePoisMix(y, mean, pi, smoothing = TRUE)
logLikePoisMixDiff(y, mean.new, pi.new, mean.old, pi.old)

Arguments

y
(n x q) matrix of observed counts for n observations and q variables
mean
List of length g containing the (n x q) matrices of conditional mean expression for all observations as calculated by the PoisMixMean function, where g represents the
pi
Vector of length g containing an estimate for $\hat{\ensuremath\boldsymbol{\pi}}$
smoothing
TRUE (default value) if smoothing should be used in calculation of log likelihood to avoid numerical problems
mean.new
List of length g containing the (n x q) matrices of conditional mean expression for all observations for one set of parameters, as calculated by the PoisMixMean function, wher
mean.old
List of length g containing the (n x q) matrices of conditional mean expression for all observations for another set of parameters, as calculated by the PoisMixMean function,
pi.new
Vector of length g containing one estimate for $\hat{\ensuremath\boldsymbol{\pi}}$
pi.old
Vector of length g containing another estimate for $\hat{\ensuremath\boldsymbol{\pi}}$

Value

  • llLog likelihood for a given set of parameters (logLikePoisMix function) or difference in log likelihoods for two different sets of parameters (in the case of the logLikePoisMixDiff function)

Details

The logLikePoisMix function calculates the log likelihood for a given set of parameters in a Poisson mixture model and is used in the PoisMixClus function for the calculation of the BIC and ICL. The logLikePoisMixDiff function is used to calculate the difference in log likelihood for two different sets of parameters in a Poisson mixture model; it is used to determine convergence in the EM algorithm run by the PoisMixClus function.

References

Loader, C. (2000) Fast and accurate computation of binomial probabilities. Available at http://projects.scipy.org/scipy/raw-attachment/ticket/620/loader2000Fast.pdf. Rau, A., Celeux, G., Martin-Magniette, M.-L., Maugis-Rabusseau, C (2011). Clustering high-throughput sequencing data with Poisson mixture models. Inria Research Report 7786. Available at http://hal.inria.fr/inria-00638082.

See Also

PoisMixClus for Poisson mixture model estimation and model selection; PoisMixMean to calculate the per-cluster conditional mean of each observation

Examples

Run this code
set.seed(12345)

## Simulate data as shown in Rau et al. (2011)
## Library size setting "A", low cluster separation
## n = 200 observations

simulate <- PoisMixSim(n = 200, libsize = "A", separation = "low")
y <- simulate$y
conds <- simulate$conditions
w <- rowSums(y)               ## Estimate of w
r <- table(conds)             ## Number of replicates per condition
d <- length(unique(conds))    ## Number of conditions
s <- colSums(y) / sum(y)      ## TC estimate of lib size
s.dot <- rep(NA, d)           ## Summing lib size within conditions
for(j in 1:d) s.dot[j] <- sum(s[which(conds == unique(conds)[j])]);

## Initial guess for pi and lambda
g.true <- 4
pi.guess <- simulate$pi
## Recalibrate so that (s.dot * lambda.guess) = 1
lambda.sim <- simulate$lambda
lambda.guess <- matrix(NA, nrow = d, ncol = g.true)
for(k in 1:g.true) {
    tmp <- lambda.sim[,k]/sum(lambda.sim[,k])
    lambda.guess[,k] <- tmp/s.dot
}

## Run the PMM-II model for g = 4
## with EM algorithm and "TC" library size parameter
run <- PoisMixClus(y, gmin = 4, gmax = 4, lib.size = TRUE, 
   lib.type = "TC", conds = conds) 
pi.est <- run$pi[[1]]
lambda.est <- run$lambda[[1]]

## Mean values for each of the parameter sets
mean.guess <- PoisMixMean(y, 4, conds, s, lambda.guess)
mean.est <- PoisMixMean(y, 4, conds, s, lambda.est)

## Log likelihood of each parameter set,
## and difference in log likelihoods
LL.guess <- logLikePoisMix(y, mean.guess, pi.guess, smoothing = FALSE)$ll  
LL.est <- logLikePoisMix(y, mean.est, pi.est, smoothing = FALSE)$ll         
LL.diff <- logLikePoisMixDiff(y, mean.guess, pi.guess, mean.est, pi.est)

LL.guess - LL.est   ## -13474.95
LL.diff             ## -13474.95

Run the code above in your browser using DataLab