expands (version 1.2)

runExPANdS: Main Function

Description

Given a set of mutations, ExPANdS predicts the number of clonal expansions in a tumor, the size of the resulting subpopulations in the tumor bulk, and which mutations accumulate in a cell prior to its clonal expansion. The input parameters SNV and CBS hold the paths to tab-delimited files containing the mutations and the copy numbers, respectively. Alternatively, SNV and CBS can be read into the workspace and passed to runExPANdS as numeric matrices. The robustness of the subpopulation predictions increases with the number of mutations provided; SNV should contain at least 200 mutations to obtain stable results.
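The matrix route described above can be sketched in base R. This is a minimal illustration, not the package's own loader: the toy file, the variable names and the column check are assumptions, and only the SNV columns whose names are visible on this page (chr, startpos, AF_Tumor) are used. A real input needs the full set of columns listed under Arguments.

```r
# Hypothetical toy SNV table; a real file must carry every column
# that runExPANdS requires, not just the three shown here.
snv_path <- tempfile(fileext = ".txt")
toy <- data.frame(chr = c(1, 1, 2),
                  startpos = c(1000, 5000, 7500),
                  AF_Tumor = c(0.25, 0.50, 0.10))
write.table(toy, snv_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the tab-delimited file and convert it to a numeric matrix
snv_mat <- as.matrix(read.table(snv_path, header = TRUE, sep = "\t"))
stopifnot(is.numeric(snv_mat))

# Check for the column names documented on this page
required <- c("chr", "startpos", "AF_Tumor")
missing_cols <- setdiff(required, colnames(snv_mat))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Flag inputs below the recommended 200 mutations
if (nrow(snv_mat) < 200) {
  message("Only ", nrow(snv_mat), " mutations; predictions may be unstable.")
}
```

The resulting matrix could then be passed as the SNV argument in place of a file path.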

Usage

runExPANdS(SNV, CBS, maxScore=2.5, max_PM=6, precision=NA, plotF=1, snvF="out.expands")

Arguments

SNV
Matrix in which each row corresponds to a mutation. Columns in SNV must include: chr - the chromosome on which each mutation is located; startpos - the genomic position of each mutation; AF_Tumor - the allele-frequency of each mutati
CBS
Matrix in which each row corresponds to a copy number segment. CBS is typically the output of a circular binary segmentation algorithm. Columns in CBS must include: chr - chromosome; startpos - the first genomic position of a copy number se
maxScore
Upper threshold for the confidence of subpopulation detection. Only subpopulations identified at a score below $maxScore$ (default 2.5) are kept.
max_PM
Upper threshold for the number of amplicons per mutated cell (default: 6). Increasing the value of this variable is not recommended unless extensive depth and breadth of coverage underlie the measurements of copy numbers and allele frequencies. See also
precision
Precision with which subpopulation size is predicted. A small value reflects a high resolution and can trigger a higher number of predicted subpopulations (default: 0.1/log(n/7), where n is the number of mutations).
plotF
Option for displaying a visual representation of the identified SPs (0 - no display; 1 - display subpopulation size; 2 - display subpopulation size and cell-frequency probability clusters; default: 1).
snvF
The name of the file from which mutations have been read.
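To give a feel for the default precision formula quoted above, the sketch below evaluates 0.1/log(n/7) for a few mutation counts. The helper name default_precision is hypothetical and not part of the package, and reading log as the natural logarithm (as in R's log) is an assumption.

```r
# Hypothetical helper mirroring the documented default: 0.1 / log(n / 7).
# Assumes natural log; only meaningful for n > 7, since log(n / 7) <= 0 otherwise.
default_precision <- function(n) {
  stopifnot(all(n > 7))
  0.1 / log(n / 7)
}

n_mutations <- c(60, 200, 1000)
data.frame(n = n_mutations, precision = default_precision(n_mutations))
```

Under this reading the default shrinks as n grows, so larger mutation sets are resolved at a finer subpopulation-size resolution.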

Value

List with three fields:
  • finalSPs: Matrix of predicted subpopulations. Each row corresponds to a subpopulation and each column contains information about that subpopulation, such as the size in the sequenced tumor bulk (column Mean Weighted) and the confidence with which the subpopulation has been detected (column score).
  • dm: Matrix containing the input mutations with at least two additional columns: SP - the subpopulation to which the mutation has been assigned; %maxP - the confidence of assignment.
  • densities: Matrix as obtained by computeCellFrequencyDistributions. Each row corresponds to a mutation and each column corresponds to a cellular frequency. Each value $densities[i,j]$ represents the probability that mutation $i$ is present in a fraction $f$ of cells, where $f$ is given by $colnames(densities)[j]$.
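The layout of the densities matrix can be illustrated with made-up numbers. The matrix below is a hypothetical stand-in, not output of runExPANdS; it only shows how column names map to cellular frequencies and how the most probable frequency per mutation could be looked up.

```r
# Hypothetical densities matrix: 2 mutations x 4 candidate cellular frequencies.
# Values are invented for illustration only.
freqs <- c(0.1, 0.2, 0.3, 0.4)
densities <- matrix(c(0.10, 0.60, 0.20, 0.10,
                      0.05, 0.15, 0.30, 0.50),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("mut1", "mut2"), as.character(freqs)))

# For each mutation (row), find the column with the highest probability
best_col <- apply(densities, 1, which.max)

# Translate column indices back to cellular frequencies via the column names
best_freq <- as.numeric(colnames(densities)[best_col])
names(best_freq) <- rownames(densities)
best_freq
```

Here mut1 peaks at a cellular frequency of 0.2 and mut2 at 0.4.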

References

Noemi Andor, Julie Harness, Hans Werner Mewes and Claudia Petritsch. (2013) ExPANdS: Expanding Ploidy and Allele Frequency on Nested Subpopulations. Bioinformatics. In Review.

Examples

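library(expands)
data(snv)
data(cbs)
maxScore <- 2.5
# Draw a reproducible subsample of 60 mutations
set.seed(4)
idx <- sample(1:nrow(snv), 60, replace=FALSE)
# Uncomment to run the prediction on the subsample:
# out <- runExPANdS(snv[idx,], cbs, maxScore)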