
expands (version 1.4)

runExPANdS: Main Function

Description

Given a set of mutations, ExPANdS predicts the number of clonal expansions in a tumor, the size of the resulting subpopulations in the tumor bulk, and which mutations accumulate in a cell prior to its clonal expansion. The input parameters SNV and CBS hold the paths to tab-delimited files containing the mutations and the copy numbers, respectively. Alternatively, SNV and CBS can be read into the workspace and passed to runExPANdS as numeric matrices. The robustness of the subpopulation predictions increases with the number of mutations provided; SNV should contain at least 200 mutations to obtain stable results.
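As a minimal sketch of the matrix route described above, the following base R snippet writes a small tab-delimited mutation table to a temporary file, reads it back as a numeric matrix, and checks the 200-mutation recommendation. The helper name read_snv_matrix, the simulated values, and the restriction to the columns chr, startpos and AF_Tumor are assumptions for illustration; real input may require further columns.

```r
# Hypothetical helper (not part of expands): read a tab-delimited file
# into a numeric matrix of the kind runExPANdS accepts for SNV and CBS.
read_snv_matrix <- function(path) {
  tab <- read.table(path, header = TRUE, sep = "\t", check.names = FALSE)
  as.matrix(tab)
}

# Simulated mutations, for illustration only.
set.seed(1)
n <- 250
toy <- data.frame(
  chr      = sample(1:22, n, replace = TRUE),
  startpos = sample.int(1e8, n),
  AF_Tumor = round(runif(n, 0.05, 0.95), 3)
)
path <- tempfile(fileext = ".tsv")
write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)

snv_mat <- read_snv_matrix(path)

# Warn when fewer mutations are supplied than the documentation recommends.
if (nrow(snv_mat) < 200) {
  warning("Fewer than 200 mutations: subpopulation predictions may be unstable")
}
```

Reading the table once and passing the matrix is convenient when the same mutations are filtered or subsampled before several runs.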

Usage

runExPANdS(SNV, CBS, maxScore=2.5, max_PM=6, precision=NA,
 plotF=2,snvF="out.expands",maxN=8000,region=NA)

Arguments

SNV
Matrix in which each row corresponds to a mutation. Columns in SNV must include: chr - the chromosome on which each mutation is located; startpos - the genomic position of each mutation; AF_Tumor - the allele-frequency of each mutati
CBS
Matrix in which each row corresponds to a copy number segment. CBS is typically the output of a circular binary segmentation algorithm. Columns in CBS must include: chr - chromosome; startpos - the first genomic position of a copy number se
maxScore
Upper threshold for the confidence of subpopulation detection. Only subpopulations identified at a score below $maxScore$ (default 2.5) are kept.
max_PM
Upper threshold for the number of amplicons per mutated cell (default: 6). Increasing the value of this variable is not recommended unless extensive depth and breadth of coverage underlie the measurements of copy numbers and allele frequencies. See also
precision
Precision with which subpopulation size is predicted. A small value reflects a high resolution and can trigger a higher number of predicted subpopulations (default: 0.1/log(n/7), where n is the number of mutations).
plotF
Option for displaying a visual representation of the identified SPs (0 - no display; 1 - display subpopulation size; 2 - display subpopulation size and phylogeny; 3 - display subpopulation size, phylogeny and cell-frequency probability clusters; default:
snvF
The name of the file from which mutations have been read.
maxN
Upper limit for the number of SNVs used during clustering. If the number of user-supplied SNVs exceeds maxN, the clustering of cellular frequency distributions will be restricted to SNVs found within region (default: 8000; increasing value of this parameter not recommende
region
Regional boundary for mutations included during clustering. Matrix in which each row corresponds to a genomic segment. Columns must include: chr - the chromosome of the segment; start - the first genomic position of the segment; end
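
The default for precision depends only on the number of mutations. The sketch below evaluates 0.1/log(n/7) for a few values of n, assuming that log denotes the natural logarithm (as R's log() does). The name default_precision is a hypothetical helper, not a function exported by expands.

```r
# Hypothetical helper: default precision as documented, 0.1 / log(n / 7).
# Assumes the natural logarithm, which is what R's log() computes.
default_precision <- function(n) {
  stopifnot(n > 7)  # log(n / 7) is positive only for n > 7
  0.1 / log(n / 7)
}

n_mutations <- c(60, 200, 1000, 8000)
precisions <- default_precision(n_mutations)
print(data.frame(n = n_mutations, precision = signif(precisions, 3)))
```

Under this reading the default shrinks as n grows, so larger inputs are resolved at a finer subpopulation-size resolution.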

Value

  • List with five fields:
  • finalSPs - Matrix of predicted subpopulations. Each row corresponds to a subpopulation and each column contains information about that subpopulation, such as the size in the sequenced tumor bulk (column Mean Weighted) and the confidence with which the subpopulation has been detected (column score).
  • dm - Matrix containing the input mutations with at least two additional columns: SP - the subpopulation to which the mutation has been assigned; %maxP - the confidence of assignment.
  • densities - Matrix as obtained by computeCellFrequencyDistributions. Each row corresponds to a mutation and each column corresponds to a cellular frequency. Each value $densities[i,j]$ represents the probability that mutation $i$ is present in a fraction $f$ of cells, where $f$ is given by $colnames(densities)[j]$.
  • ploidy - Matrix as obtained by assignQuantityToSP. Each row corresponds to a copy number segment as obtained by CBS. Includes one additional column for each predicted SP, holding the ploidy of each segment in the corresponding SP.
  • tree - An object of class "phylo" (library ape) as obtained by buildPhylo. Holds the inferred phylogenetic relationships between subpopulations.
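
To illustrate how the returned fields can be read, the sketch below builds a small mock result with made-up numbers. It filters subpopulations by score (mirroring the maxScore threshold) and recovers cellular frequencies from the column names of densities. The object mock and all of its values are hypothetical; only the field and column names (finalSPs, densities, Mean Weighted, score) come from this page.

```r
# Mock result with made-up values; a real one comes from runExPANdS().
mock <- list(
  finalSPs = cbind("Mean Weighted" = c(0.85, 0.40, 0.15),
                   score           = c(0.9, 2.1, 3.2)),
  densities = matrix(c(0.1, 0.2, 0.7,
                       0.6, 0.3, 0.1),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(NULL, c("0.2", "0.5", "0.8")))
)

# Keep subpopulations detected at a score below maxScore.
maxScore <- 2.5
kept <- mock$finalSPs[mock$finalSPs[, "score"] < maxScore, , drop = FALSE]

# The cellular frequency f for column j is encoded in the column name.
freqs <- as.numeric(colnames(mock$densities))

# Most probable cellular frequency for each mutation (row).
best_f <- freqs[apply(mock$densities, 1, which.max)]
```

Using drop = FALSE keeps kept a matrix even when a single subpopulation passes the threshold.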

References

Noemi Andor, Julie Harness, Hans Werner Mewes and Claudia Petritsch. (2013) ExPANdS: Expanding Ploidy and Allele Frequency on Nested Subpopulations. Bioinformatics. In Review.

Examples

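library(expands)
data(snv)
data(cbs)
maxScore <- 2.5
# Draw a reproducible random subset of 60 mutations
set.seed(4)
idx <- sample(1:nrow(snv), 60, replace = FALSE)
# Uncomment to run the analysis on the subset:
# out <- runExPANdS(snv[idx, ], cbs, maxScore)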
