smap: A Segmental Maximum A Posteriori Approach to Array-CGH Copy Number Profiling

Description

This function fits a Hidden Markov Model (HMM) to a set of observed microarray intensity ratios and outputs the most plausible state sequence in the HMM through segmental a posteriori maximization.

Briefly, given an HMM with initial parameter settings lambda and a set of observations O, the method maximizes the joint posterior probability p(Q,lambda|O) of the state sequence Q and lambda given O by alternating between two steps: maximization over Q (using a modified Viterbi algorithm) and maximization over lambda (using a gradient descent scheme with individual learning rate adaptation).
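To make the state-sequence step more concrete, the following is a minimal sketch of a standard log-space Viterbi decoder for an HMM with Gaussian emissions, written in base R. It is not the modified Viterbi algorithm used by smap (which can also account for clone overlap and genomic distance). The function name viterbi_gaussian and the toy parameters are assumptions made for illustration only.

```r
# Illustrative only: plain Viterbi decoding for a Gaussian-emission HMM.
# viterbi_gaussian() is a hypothetical helper, not part of the SMAP package.
viterbi_gaussian <- function(obs, means, sds, trans, init) {
  n <- length(obs)
  k <- length(means)
  # Log emission density of every observation under every state (n x k).
  log_emit <- sapply(seq_len(k), function(s) dnorm(obs, means[s], sds[s], log = TRUE))
  log_emit <- matrix(log_emit, nrow = n, ncol = k)
  log_trans <- log(trans)
  delta <- matrix(-Inf, n, k)  # best log score of a path ending in each state
  back <- matrix(0L, n, k)     # back-pointers to the best previous state
  delta[1, ] <- log(init) + log_emit[1, ]
  if (n > 1) for (t in 2:n) {
    for (j in seq_len(k)) {
      cand <- delta[t - 1, ] + log_trans[, j]
      back[t, j] <- which.max(cand)
      delta[t, j] <- max(cand) + log_emit[t, j]
    }
  }
  # Trace the best path backwards from the best final state.
  path <- integer(n)
  path[n] <- which.max(delta[n, ])
  if (n > 1) for (t in (n - 1):1) path[t] <- back[t + 1, path[t + 1]]
  path
}

# Toy example: three states centred on ratios 0.7, 1.0 and 1.3.
means <- c(0.7, 1.0, 1.3)
sds <- rep(0.1, 3)
trans <- matrix(0.01, 3, 3)
diag(trans) <- 0.98
init <- rep(1 / 3, 3)
obs <- c(1.02, 0.98, 1.01, 0.69, 0.72, 0.70, 1.31, 1.29)
states <- viterbi_gaussian(obs, means, sds, trans, init)
states
```

In the full method, a decoding step of this kind would be alternated with a parameter update until the improvement in log probability falls below tau.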

Usage

smap(x, Obs, sd.min=0.05, mean.sd=0.05, max.iters=Inf, gd.max.iters=Inf, tau=0.05, eta=0.01, e.change=0.5, e.same=1.2, e.min=0.0001, e.max=0.5, adaptive=TRUE, overlap=TRUE, distance=TRUE, chrom.wise=FALSE, verbose=1, L=5000000)

Arguments

x

An object of class SMAPHMM-class.

Obs

An object of class SMAPObservations-class.

sd.min

The minimum allowed standard deviation of state associated Gaussian distributions (numeric).

mean.sd

Prior standard deviation of state associated Gaussian means (numeric).

max.iters

Maximum number of iterations in the SMAP algorithm (numeric).

gd.max.iters

Maximum number of iterations in the gradient descent algorithm per SMAP iteration (numeric).

tau

Minimum log probability improvement required in the SMAP and gradient descent optimization (numeric).

eta

Initial learning rate in the gradient descent optimization (numeric).

e.change

Multiplier for individual learning rate adaptation if the sign of partial derivative changes (numeric). Only used if adaptive == TRUE.

e.same

Multiplier for individual learning rate adaptation if the sign of partial derivative stays the same (numeric). Only used if adaptive == TRUE.

e.min

Minimum allowed learning rate (numeric).

e.max

Maximum allowed learning rate (numeric).

adaptive

If TRUE, individual learning rate adaptation according to Algorithm 1 in Bagos et al. (2004) is used in the gradient descent optimization.

overlap

If TRUE, genomic overlap of clones is considered in the optimization.

distance

If TRUE, genomic distance between clones is considered in the optimization, in terms of distance based transition probabilities.

chrom.wise

If TRUE, the observations are analyzed chromosome-wise rather than genome-wise.

verbose

Specifies the amount of output produced; 0 means no information and 3 a lot of information (numeric).

L

A positive length parameter that controls the convergence of distance based transition probabilities towards 1 / noStates(x) (numeric).
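The learning rate arguments above (eta, e.change, e.same, e.min, e.max) interact as follows when adaptive is TRUE: each parameter keeps its own rate, which shrinks when the sign of its partial derivative flips and grows when the sign is unchanged. The sketch below illustrates that idea in base R on a toy quadratic objective. It is written in the spirit of the scheme described by Bagos et al. (2004) and is not a transcription of their Algorithm 1 or of the package internals. The helper name adapt_rates and the toy objective are assumptions.

```r
# Illustrative only: per-parameter learning rate adaptation.
# adapt_rates() is a hypothetical helper, not part of the SMAP package.
adapt_rates <- function(rates, grad, prev_grad,
                        e.change = 0.5, e.same = 1.2,
                        e.min = 0.0001, e.max = 0.5) {
  same_sign <- grad * prev_grad > 0  # derivative kept its sign: speed up
  flipped <- grad * prev_grad < 0    # derivative changed sign: slow down
  rates[same_sign] <- rates[same_sign] * e.same
  rates[flipped] <- rates[flipped] * e.change
  # Keep every rate inside [e.min, e.max].
  pmin(pmax(rates, e.min), e.max)
}

# Toy objective (an assumption): f(w) = sum((w - target)^2).
target <- c(1, -2)
w <- c(0, 0)
rates <- rep(0.01, 2)  # every rate starts at eta
prev_grad <- c(0, 0)
for (i in 1:200) {
  grad <- 2 * (w - target)
  rates <- adapt_rates(rates, grad, prev_grad)
  w <- w - rates * grad
  prev_grad <- grad
}
w
```

This also suggests why e.change is restricted to (0,1] and e.same to values of at least 1: the first can only reduce a rate and the second can only increase it.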

Value

The method returns an object of class SMAPProfile-class if chrom.wise is FALSE, or an object of class SMAPProfiles-class if chrom.wise is TRUE.

Details

sd.min, mean.sd, and eta must all be greater than 0. tau must be greater than 0 if max.iters or gd.max.iters is infinite, and can be 0 otherwise. If adaptive is TRUE, then e.change must lie in the interval (0,1], e.same must be greater than or equal to 1, and e.max must be greater than 0.
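The constraints above can be checked before a long run. The following base R sketch encodes them as a stand-alone function. The name check_smap_args is hypothetical and not part of the package, and the rejection of negative tau is an assumption based on reading "can be 0 otherwise" as "must be non-negative".

```r
# Illustrative only: pre-flight check of the documented argument constraints.
# check_smap_args() is a hypothetical helper, not part of the SMAP package.
check_smap_args <- function(sd.min = 0.05, mean.sd = 0.05, max.iters = Inf,
                            gd.max.iters = Inf, tau = 0.05, eta = 0.01,
                            e.change = 0.5, e.same = 1.2, e.max = 0.5,
                            adaptive = TRUE) {
  if (sd.min <= 0 || mean.sd <= 0 || eta <= 0) {
    stop("sd.min, mean.sd and eta must all be greater than 0")
  }
  if (is.infinite(max.iters) || is.infinite(gd.max.iters)) {
    # Without an iteration cap, tau is the only stopping criterion.
    if (tau <= 0) stop("tau must be > 0 when an iteration limit is infinite")
  } else if (tau < 0) {
    # Assumption: negative tau is treated as invalid.
    stop("tau must be >= 0")
  }
  if (adaptive) {
    if (e.change <= 0 || e.change > 1) stop("e.change must be in (0,1]")
    if (e.same < 1) stop("e.same must be >= 1")
    if (e.max <= 0) stop("e.max must be > 0")
  }
  invisible(TRUE)
}

# The defaults from the Usage section pass the check.
ok <- check_smap_args()
```

A call such as check_smap_args(tau = 0) fails because both iteration limits default to Inf, whereas check_smap_args(tau = 0, max.iters = 100, gd.max.iters = 100) passes.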

References

Andersson, R., Bruder, C. E. G., Piotrowski, A., Menzel, U., Nord, H., Sandgren, J., Hvidsten, T. R., Diaz de Stahl, T., Dumanski, J. P., Komorowski, J., A Segmental Maximum A Posteriori Approach to Array-CGH Copy Number Profiling, submitted.

Bagos, P. G., Liakopoulos, T. D., Hamodrakas, S. J. (2004) Faster Gradient Descent Training of Hidden Markov Models, Using Individual Learning Rate Adaptation. In Paliouras, G., Sakakibara, Y., editors, ICGI, volume 3264 of Lecture Notes in Computer Science, pages 40--52.

Examples

## Load Glioblastoma multiforme data
data(GBM)
observations <- SMAPObservations(value=as.numeric(GBM[,2]),
                                 chromosome=as.character(GBM[,3]),
                                 startPosition=as.numeric(GBM[,4]),
                                 endPosition=as.numeric(GBM[,5]),
                                 name="G24460",
                                 reporterId=as.character(GBM[,1]))
plot(observations, ylim=c(0,2))
## Initiate HMM
init.means <- c(0.4, 0.7, 1, 1.3, 1.6, 3)
init.sds <- rep(0.1, 6)
phi <- cbind(init.means, init.sds)
hmm <- SMAPHMM(6, phi, initTrans=0.02)
hmm
## RUN SMAP:
profile <- smap(hmm, observations, verbose=2)
## genome profile
plot(profile, ylim=c(0,2))
## chromosome 9 profile
ids <- which(chromosome(observations) == "9")
plot(profile[ids], ylim=c(0,2), main="chromosome 9")
## output results for chromosome 9
#cbind(reporterId(observations[ids]), Q(profile[ids]))
