Performs the EM algorithm for a given subclone configuration matrix.
ITH_optim(
my_data,
my_purity,
init_eS,
pi_eps0 = NULL,
my_unc_q = NULL,
max_iter = 4000,
my_epsilon = 1e-06
)

If the EM algorithm converges, the output will be a list containing:
iter: number of iterations
converge: convergence status
unc_q0: initial unconstrained subclone proportions parameter
unc_q: unconstrained estimate of q
q: estimated subclone proportions among cancer cells
CN_MA_pi: estimated mixture probabilities of multiplicities and allocations given copy number states
eta: estimated subclone proportion among tumor cells
purity: user-inputted tumor purity
entropy: estimated entropy
infer: An R data frame containing inferred variant allocations (infer_A), multiplicities (infer_M), and cellular prevalences (infer_CP).
ms: model size, the number of parameters within the parameter space
LL: The observed log likelihood evaluated at the maximum likelihood estimates.
AIC = 2 * LL - 2 * ms: Negative AIC, used for model selection
BIC = 2 * LL - ms * log(LOCI): Negative BIC, used for model selection
LOCI: The number of input somatic variants.
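
As a minimal sketch of how the two model-selection quantities relate to LL, ms, and LOCI, the formulas listed in the output can be reproduced in base R. The helper name neg_ic and the input values are hypothetical and are not part of the package; they only illustrate the arithmetic.

```r
# Hypothetical helper: recompute the negative AIC and negative BIC
# from a log likelihood (LL), model size (ms), and number of loci (LOCI),
# using the formulas given in the output description.
neg_ic <- function(LL, ms, LOCI) {
  c(
    AIC = 2 * LL - 2 * ms,
    BIC = 2 * LL - ms * log(LOCI)
  )
}

# Illustrative values only, not results from a real fit.
ic <- neg_ic(LL = -1234.5, ms = 4, LOCI = 200)
print(ic)
```

Because both quantities are reported with the sign flipped relative to the usual AIC and BIC definitions, a larger value would favor a model when comparing fits across configuration matrices on the same data.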

my_data: An R data frame containing the following columns:
  tAD: tumor alternate read counts
  tRD: tumor reference read counts
  CN_1: minor allele count
  CN_2: major allele count, where CN_1 <= CN_2
  tCN: CN_1 + CN_2
my_purity: A single numeric value of known/estimated purity
init_eS: A subclone configuration matrix pre-defined in the R list eS
pi_eps0: A user-specified parameter denoting the proportion of loci not explained by the combinations of purity, copy number, multiplicity, and allocation. If NULL, it is initialized at 1e-3. If set to 0.0, the parameter is not estimated.
my_unc_q: An optimal initial vector for the unconstrained q vector, useful after running grid_ITH_optim. If this variable is NULL, then the subclone proportions, q, are randomly initialized. For instance, if my_unc_q = ( x1 , x2 ), then q = ( exp(x1) / (1 + exp(x1) + exp(x2)) , exp(x2) / (1 + exp(x1) + exp(x2)) , 1 / (1 + exp(x1) + exp(x2)) ).
max_iter: Positive integer, preferably 1000 or more, setting the maximum number of iterations
my_epsilon: Convergence criterion threshold for changes in the log likelihood, preferably 1e-6 or smaller
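
The input layout and the mapping from the unconstrained vector to q described above can be sketched in base R. This is only an illustration under stated assumptions: the read counts, copy numbers, and unconstrained values are made up, and the helper unc_q_to_q is hypothetical rather than a function exported by the package.

```r
# Made-up input in the column layout described for my_data.
my_data <- data.frame(
  tAD  = c(12L, 30L, 7L),   # tumor alternate read counts
  tRD  = c(88L, 70L, 43L),  # tumor reference read counts
  CN_1 = c(1L, 0L, 1L),     # minor allele count
  CN_2 = c(1L, 2L, 2L)      # major allele count
)
my_data$tCN <- my_data$CN_1 + my_data$CN_2

# The documented constraint: minor count never exceeds major count.
stopifnot(all(my_data$CN_1 <= my_data$CN_2))

# Hypothetical helper: map an unconstrained vector (x1, ..., xk) to
# k + 1 proportions, following the formula given for my_unc_q.
unc_q_to_q <- function(x) {
  denom <- 1 + sum(exp(x))
  c(exp(x), 1) / denom
}

# Illustrative unconstrained values for a three-subclone configuration.
my_unc_q <- c(0.5, -1.0)
q <- unc_q_to_q(my_unc_q)
print(q)
print(sum(q))
```

The transformation keeps every entry of q strictly positive and makes the entries sum to one, which is why the optimizer can work on the unconstrained scale without enforcing those constraints directly.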