viterbi2Wrapper: Wrapper function for fitting the Viterbi algorithm

Description

The Viterbi algorithm, implemented in C, estimates the optimal state path as well as the forward and backward variables that are used for updating the means and variances in a copy number HMM. The function viterbi2Wrapper should not be called directly by the user. Rather, users should fit the HMM by passing an appropriate container to the method hmm. We document the viterbi2Wrapper arguments because several of them, in particular nupdates, p.hom, and prOutlierBAF, can be changed from their default values by passing them through the ... argument of the hmm method.
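The forwarding of arguments through ... can be illustrated with a minimal, self-contained sketch. The functions inner_wrapper and outer_method are hypothetical stand-ins, not part of the package; they only show how a default set in an internal wrapper is overridden by a named argument supplied to the user-facing function.

```r
# Hypothetical stand-in for an internal wrapper that defines defaults.
inner_wrapper <- function(x, nupdates = 10, p.hom = 0.05) {
  list(n = length(x), nupdates = nupdates, p.hom = p.hom)
}

# Hypothetical stand-in for a user-facing method that forwards `...`.
outer_method <- function(x, ...) {
  inner_wrapper(x, ...)
}

# No extra arguments: the wrapper's defaults are used.
defaults <- outer_method(1:5)

# Named arguments travel through `...` and replace the defaults.
custom <- outer_method(1:5, nupdates = 3, p.hom = 0.01)
```

Under this pattern, any argument that the user-facing function does not name itself is handed to the wrapper unchanged.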

Usage

viterbi2Wrapper(index.samples, cnStates, prOutlierBAF = list(initial = 1e-05, max = 0.001, maxROH = 1e-05), p.hom = 0.05, is.log, limits, normalIndex = 3L, nupdates = 10, tolerance = 5, computeLLR = TRUE, returnEmission = FALSE, verbose = FALSE, grFun, matrixFun, snp.index, anyNP)

Arguments

index.samples

Index for the samples that are to be processed.

cnStates

numeric vector for the initial copy number state means.

prOutlierBAF

A list with elements 'initial', 'max', and 'maxROH' corresponding to the initial estimate of the probability that a B allele frequency (BAF) is an outlier, the maximum value for this parameter over states that do not involve homozygous genotypes, and the maximum value over states that assume homozygous genotypes. This parameter is experimental and could be used to fine-tune the HMM for different platforms. For example, the BAFs for the Affy platform are typically noisier than the BAFs for Illumina. One may want to set small values of these parameters for Illumina (e.g., 1e-5, 1e-3, and 1e-5) and larger values for Affy (e.g., 1e-3, 0.01, and 1e-3).
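The two example settings above can be written as R lists. The values are the ones quoted in this section; the helper check_prOutlierBAF is a hypothetical convenience for this sketch and is not a package function.

```r
# Example settings taken from the text above.
pr_illumina <- list(initial = 1e-5, max = 1e-3, maxROH = 1e-5)
pr_affy     <- list(initial = 1e-3, max = 0.01, maxROH = 1e-3)

# Hypothetical helper: confirm the list has the three documented
# elements and that each one is a valid probability.
check_prOutlierBAF <- function(x) {
  required <- c("initial", "max", "maxROH")
  has_names <- is.list(x) && all(required %in% names(x))
  if (!has_names) return(FALSE)
  vals <- unlist(x[required])
  all(vals >= 0 & vals <= 1)
}

ok_illumina <- check_prOutlierBAF(pr_illumina)
ok_affy     <- check_prOutlierBAF(pr_affy)
```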

p.hom

numeric: weight for observing homozygous genotypes. For a value of 0, homozygous genotypes / B allele frequencies have the same emission probability in the 'normal' state as in the hemizygous deletion state and the copy-neutral region of homozygosity state. Regions of homozygosity are common in normal genomes. For small values of p.hom, hemizygous deletions will only be called if the copy number estimates show evidence of a decrease from normal.

is.log

logical: Whether the copy number estimates in the r matrix are on the log-scale.

limits

numeric vector of length two specifying the range of the copy number estimates in r. Values of r outside of this range are truncated. See copyNumberLimits.
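A minimal sketch of the truncation step, assuming that "truncated" means each out-of-range value is set to the nearest limit. The function truncate_to_limits and the limits c(-2.5, 3) are illustrative choices for this sketch, not package defaults.

```r
# Hypothetical helper: clamp copy number estimates to [limits[1], limits[2]].
truncate_to_limits <- function(r, limits) {
  stopifnot(length(limits) == 2, limits[1] < limits[2])
  # pmax raises values below the lower limit; pmin lowers values
  # above the upper limit. The matrix dimensions of r are preserved.
  pmin(pmax(r, limits[1]), limits[2])
}

# Toy 2 x 2 matrix of copy number estimates (markers x samples).
r <- matrix(c(-5, -0.2, 0.1, 4), nrow = 2)
truncated <- truncate_to_limits(r, limits = c(-2.5, 3))
```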

normalIndex

integer specifying the index for the normal state. Note that states must be ordered by the mean of the copy number state. E.g., state 1 is homozygous deletion (0 copies), state 2 is hemizygous deletion (1 copy), state 3 is normal (2 copies), and so on. In a 6-state HMM, normalIndex should be 3.
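The ordering requirement can be checked with a short sketch. Only the first three states are described in this section; the labels and copy numbers for states 4 to 6 below are assumptions made for illustration.

```r
# States ordered by copy number mean. States 4-6 are assumed labels.
state_labels <- c("homozygous deletion", "hemizygous deletion", "normal",
                  "copy-neutral ROH", "single-copy gain", "two-copy gain")
copy_number  <- c(0, 1, 2, 2, 3, 4)

# States must be in non-decreasing order of copy number.
is_ordered <- !is.unsorted(copy_number)

# The index of the normal state is what normalIndex should hold.
normal_index <- which(state_labels == "normal")
```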

nupdates

integer specifying the maximum number of iterations for reestimating the mean and variance for each of the copy number states. The number of iterations may be fewer than nupdates if the difference in the log-likelihood between successive iterations is less than tolerance.

tolerance

numeric value indicating convergence of the log-likelihood. If the difference in the log-likelihood of the observed data given the HMM between iterations i and i-1 is less than tolerance, no additional updates of the model parameters using the EM algorithm are needed.
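How nupdates and tolerance interact can be sketched with a toy loop. This is not the package's C implementation: run_updates and toy_loglik are hypothetical, and the log-likelihood values are made up so that successive improvements shrink at each iteration.

```r
# Hypothetical loop: stop after nupdates iterations, or earlier if the
# change in log-likelihood between iterations falls below tolerance.
run_updates <- function(loglik_at, nupdates = 10, tolerance = 5) {
  previous <- loglik_at(0)
  iterations <- 0
  for (i in seq_len(nupdates)) {
    current <- loglik_at(i)
    iterations <- i
    if (abs(current - previous) < tolerance) break
    previous <- current
  }
  iterations
}

# Made-up log-likelihood whose improvement halves at every iteration.
toy_loglik <- function(i) -1000 + 500 * (1 - 0.5^i)

n_iter_default <- run_updates(toy_loglik)               # stops on tolerance
n_iter_capped  <- run_updates(toy_loglik, nupdates = 3) # stops on nupdates
```

With the default arguments the toy loop ends before reaching nupdates because the improvement drops below tolerance; with nupdates = 3 the cap is reached first.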

computeLLR

Logical. Whether to compute a log likelihood ratio (LLR) comparing the predicted state to the normal state. This is calculated post-hoc and is not precisely the likelihood estimated from the Viterbi algorithm. When FALSE, the LLR is not calculated and the algorithm is slightly faster.

returnEmission

Logical. If TRUE, an array of emission probabilities is returned. The dimensions of the array are SNPs, samples, and copy number states.
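The layout of such an array can be sketched as follows. The sizes, dimnames, and zero fill values are placeholders chosen for illustration; only the order of the dimensions (SNPs, samples, states) comes from the description above.

```r
# Placeholder sizes for a toy array.
n_snps <- 4
n_samples <- 2
n_states <- 6

# Array laid out as SNPs x samples x copy number states.
emission <- array(
  0,
  dim = c(n_snps, n_samples, n_states),
  dimnames = list(
    paste0("snp", seq_len(n_snps)),
    paste0("sample", seq_len(n_samples)),
    paste0("state", seq_len(n_states))
  )
)

# Values for every SNP in the first sample under state 3.
state3_sample1 <- emission[, "sample1", "state3"]
```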

verbose

Logical. Whether to print some of the details of the processing.

grFun

An R function for coercing the state-path from the HMM to a GRanges object. Takes advantage of lexical scope.

matrixFun

An R function for subsetting the assay data (takes advantage of lexical scope).

snp.index

The SNP indices.

anyNP

An indicator for whether any of the markers are nonpolymorphic, in which case BAFs / genotypes are ignored.

Value

A GRanges object if returnEmission is FALSE. Otherwise, an array of emission probabilities is returned.