Analyses of selection using the dNdScv and dNdSloc models. Default parameters typically increase the performance of the method on cancer genomic studies. Reference files are currently only available for the GRCh37/hg19 version of the human genome.
dndscv(mutations, gene_list = NULL, refdb = "hg19", sm = "192r_3w",
       kc = "cgc81", cv = "hg19", max_muts_per_gene_per_sample = 3,
       max_coding_muts_per_sample = 3000, use_indel_sites = T, min_indels = 5,
       maxcovs = 20, constrain_wnon_wspl = T, outp = 3)
- mutations: Table of mutations with 5 columns (sampleID, chr, pos, ref, alt). List only independent events as mutations.
- gene_list: List of genes to restrict the analysis to (use for targeted sequencing studies).
- refdb: Reference database (path to .rda file).
- sm: Substitution model (precomputed models are available in the data directory).
- kc: List of a priori known cancer genes (to be excluded from the indel background model).
- cv: Covariates (a matrix with one row per gene and one column per covariate). Defaults to the reference covariates; cv = NULL runs dndscv without covariates.
- max_muts_per_gene_per_sample: Maximum number of mutations kept per gene per sample. If n < Inf, only the first n mutations by chromosomal position are kept (an arbitrary choice).
- max_coding_muts_per_sample: Maximum number of coding mutations per sample; samples exceeding it are excluded, since hypermutator samples often reduce the power to detect selection.
- use_indel_sites: Use unique indel sites instead of the total number of indels (this tends to be more robust).
- min_indels: Minimum number of indels required to run the indel recurrence module.
- maxcovs: Maximum number of covariates that will be considered (additional columns in the matrix of covariates are excluded).
- constrain_wnon_wspl: Constrains wnon == wspl (this typically increases the power to detect selection).
- outp: Output level: 1 = global dN/dS values; 2 = global dN/dS and dNdSloc; 3 = global dN/dS, dNdSloc and dNdScv.
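The input table and the two per-sample caps described above can be sketched in base R. This is only an illustration of the ideas, not the package's internal code: the helpers drop_hypermutators and cap_muts_per_gene are hypothetical, the gene column is assigned by hand, all values are invented, and the hypermutator filter here counts every row rather than annotated coding mutations only.

```r
# Toy mutation table with the five expected columns (all values invented).
mutations <- data.frame(
  sampleID = c("S1", "S1", "S1", "S1", "S2", "S2"),
  chr      = c("17", "17", "17", "17", "17", "12"),
  pos      = c(7577120L, 7577120L, 7578406L, 7579472L, 7577120L, 25398284L),
  ref      = c("C", "C", "C", "G", "C", "C"),
  alt      = c("T", "T", "T", "C", "T", "A"),
  stringsAsFactors = FALSE
)

# Exact duplicate records are not independent events, so drop them.
mutations <- unique(mutations)

# Hypothetical helper: drop samples carrying more than `max_muts` mutations
# (a simplification of the idea behind max_coding_muts_per_sample).
drop_hypermutators <- function(muts, max_muts) {
  n_per_sample <- table(muts$sampleID)
  keep <- names(n_per_sample)[n_per_sample <= max_muts]
  muts[muts$sampleID %in% keep, , drop = FALSE]
}

# Hypothetical helper: keep the first `n` mutations per sample and gene,
# ordered by position (the idea behind max_muts_per_gene_per_sample).
# Assumes a `gene` column has already been added to the table.
cap_muts_per_gene <- function(muts, n) {
  muts <- muts[order(muts$sampleID, muts$gene, muts$pos), , drop = FALSE]
  idx <- ave(seq_len(nrow(muts)), muts$sampleID, muts$gene, FUN = seq_along)
  muts[idx <= n, , drop = FALSE]
}

# Gene labels assigned by hand purely for illustration.
annotated <- transform(mutations, gene = ifelse(chr == "17", "GENE_A", "GENE_B"))
capped    <- cap_muts_per_gene(annotated, n = 2)
filtered  <- drop_hypermutators(mutations, max_muts = 3000)
```

In a real analysis these filters are applied by dndscv itself through the corresponding arguments; the sketch only makes explicit which records each cap would discard.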
'dndscv' returns a list of objects:
- globaldnds: Global dN/dS estimates across all genes.
- sel_cv: Gene-wise selection results using dNdScv.
- sel_loc: Gene-wise selection results using dNdSloc.
- annotmuts: Annotated coding mutations.
- genemuts: Observed and expected numbers of mutations per gene.
- mle_submodel: MLEs of the substitution model.
- exclsamples: Samples excluded from the analysis.
- exclmuts: Coding mutations excluded from the analysis.
- nbreg: Negative binomial regression model for substitutions.
- nbregind: Negative binomial regression model for indels.
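As a rough sketch of how the returned list might be inspected, the example below builds a mock object with a sel_cv element and extracts genes below a q-value threshold. The column names gene_name and qglobal_cv are assumptions not stated on this page, the 0.1 cutoff is illustrative, and the values are toy numbers; a real object would come from a call such as dndscv(mutations).

```r
# Mock result holding only the sel_cv element (toy values, not real output).
# Column names `gene_name` and `qglobal_cv` are assumed, not taken from this page.
dndsout <- list(
  sel_cv = data.frame(
    gene_name  = c("GENE_A", "GENE_B", "GENE_C"),
    qglobal_cv = c(0.001, 0.30, 0.04),
    stringsAsFactors = FALSE
  )
)

# Genes passing an illustrative q-value threshold of 0.1, most significant first.
sel_cv <- dndsout$sel_cv
signif_genes <- sel_cv[sel_cv$qglobal_cv < 0.1, c("gene_name", "qglobal_cv")]
signif_genes <- signif_genes[order(signif_genes$qglobal_cv), ]
print(signif_genes)
```

The other elements of the list (for example annotmuts or exclsamples) can be accessed the same way with the $ operator.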
Martincorena I, et al. (2017) Universal patterns of selection in cancer and somatic tissues. Under revision. Preprint available on bioRxiv: https://doi.org/10.1101/132324