runExPANdS: Main Function

Description

Given a set of mutations, ExPANdS predicts the number of clonal expansions in a tumor, the size of the resulting subpopulations in the tumor bulk, and which mutations accumulate in a cell prior to its clonal expansion. The input parameters SNV and CBS hold the paths to tab-delimited files containing the mutations and the copy numbers, respectively. Alternatively, SNV and CBS can be read into the workspace and passed to runExPANdS as numeric matrices. The robustness of the subpopulation predictions increases with the number of mutations provided. It is recommended that SNV contain at least 200 mutations to obtain stable results.
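The second input route, reading a tab-delimited file into the workspace and passing a numeric matrix, can be sketched with base R as follows. The temporary file and its three toy rows are made up for illustration, and only the columns chr, startpos and AF_Tumor are included; real input must carry all columns required for SNV.

```r
# Hypothetical toy input written to a temporary file so the sketch is
# self-contained; the values are invented, not real mutation calls.
snv_path <- tempfile(fileext = ".snv")
toy <- data.frame(
  chr      = c(1, 1, 2),
  startpos = c(10500, 20750, 30100),
  AF_Tumor = c(0.25, 0.40, 0.10)
)
write.table(toy, snv_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the tab-delimited file and convert it to a numeric matrix.
snv_mat <- as.matrix(read.table(snv_path, sep = "\t", header = TRUE))

# Basic sanity checks before handing the matrix to runExPANdS as SNV.
stopifnot(is.numeric(snv_mat))
print(colnames(snv_mat))
```

A matrix built this way would take the place of the file path in the SNV argument; the same pattern applies to CBS.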

Usage

runExPANdS(SNV, CBS, maxScore=2.5, max_PM=6, precision=NA, plotF=1, snvF="out.expands")

Arguments

SNV

Matrix in which each row corresponds to a mutation. Columns in SNV must include: chr - the chromosome on which each mutation is located; startpos - the genomic position of each mutation; AF_Tumor - the allele-frequency of each mutati

CBS

Matrix in which each row corresponds to a copy number segment. CBS is typically the output of a circular binary segmentation algorithm. Columns in CBS must include: chr - chromosome; startpos - the first genomic position of a copy number se

maxScore

Upper threshold for the confidence of subpopulation detection. Only subpopulations identified at a score below $maxScore$ (default 2.5) are kept.
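The effect of this threshold can be illustrated on a toy matrix. This is a sketch of the filtering rule only, not the package's internal code; the matrix candidates and its values are hypothetical, with column names borrowed from the finalSPs output.

```r
# Hypothetical candidate subpopulations; sizes and scores are made up.
candidates <- cbind(
  "Mean Weighted" = c(0.85, 0.40, 0.15),
  score           = c(0.9, 2.1, 3.4)
)

maxScore <- 2.5

# Keep only rows whose score lies below the threshold.
kept <- candidates[candidates[, "score"] < maxScore, , drop = FALSE]
print(kept)
```

With these toy values the third candidate (score 3.4) is discarded and the other two are kept.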

max_PM

Upper threshold for the number of amplicons per mutated cell (default: 6). Increasing the value of this variable is not recommended unless extensive depth and breadth of coverage underlie the measurements of copy numbers and allele frequencies. See also

precision

Precision with which subpopulation size is predicted. A small value reflects a high resolution and can trigger a higher number of predicted subpopulations (default: 0.1/log(n/7), where n is the number of mutations).
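To get a feel for how the default scales with the number of mutations, the formula can be evaluated directly. The helper default_precision is hypothetical and not part of the package, and the sketch assumes that log denotes the natural logarithm, as it does in R.

```r
# Hypothetical helper mirroring the documented default, 0.1 / log(n / 7).
# Assumption: log is the natural logarithm (R's default base).
default_precision <- function(n) {
  stopifnot(all(n > 7))  # log(n / 7) must be positive
  0.1 / log(n / 7)
}

n_mut <- c(60, 200, 1000)
p <- default_precision(n_mut)
print(data.frame(n = n_mut, precision = p))
```

Under that assumption the default shrinks as n grows, so larger mutation sets are analyzed at a finer resolution.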

plotF

Option for displaying a visual representation of the identified subpopulations (0 - no display; 1 - display subpopulation size; 2 - display subpopulation size and cell-frequency probability clusters; default: 1).

snvF

The name of the file from which mutations have been read.

Value

List with three fields:
finalSPs

Matrix of predicted subpopulations. Each row corresponds to a subpopulation and each column contains information about that subpopulation, such as the size in the sequenced tumor bulk (column Mean Weighted) and the confidence with which the subpopulation has been detected (column score).

dm

Matrix containing the input mutations with at least two additional columns: SP - the subpopulation to which the mutation has been assigned; %maxP - the confidence of assignment.

densities

Matrix as obtained by computeCellFrequencyDistributions. Each row corresponds to a mutation and each column corresponds to a cellular frequency. Each value $densities[i,j]$ represents the probability that mutation $i$ is present in a fraction $f$ of cells, where $f$ is given by $colnames(densities)[j]$.
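As an illustration of how the densities field can be read, the sketch below looks up the most probable cellular frequency for each mutation. The matrix is a made-up stand-in with two mutations and four frequencies; real output from runExPANdS will have different dimensions and values.

```r
# Toy densities matrix (invented numbers): rows are mutations,
# column names are cellular frequencies.
densities <- matrix(
  c(0.10, 0.60, 0.20, 0.10,
    0.05, 0.15, 0.30, 0.50),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("mut1", "mut2"), c("0.1", "0.4", "0.7", "1"))
)

# The cellular frequencies are stored in the column names.
freqs <- as.numeric(colnames(densities))

# For each mutation, pick the frequency with the highest probability.
best_f <- freqs[apply(densities, 1, which.max)]
names(best_f) <- rownames(densities)
print(best_f)
```

Here mut1 peaks at a cellular frequency of 0.4 and mut2 at 1.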

References

Noemi Andor, Julie Harness, Hans Werner Mewes and Claudia Petritsch. (2013) ExPANdS: Expanding Ploidy and Allele Frequency on Nested Subpopulations. Bioinformatics. In Review.

Examples

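data(snv)
data(cbs)
maxScore <- 2.5
# Draw a reproducible random subset of 60 mutations
set.seed(4)
idx <- sample(seq_len(nrow(snv)), 60, replace = FALSE)
# Uncomment to run the prediction on the subset
# out <- runExPANdS(snv[idx, ], cbs, maxScore)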
