dndscv (version 0.0.0.9)

dndscv: dNdScv

Description

Analyses of selection using the dNdScv and dNdSloc models. Default parameters typically increase the performance of the method on cancer genomic studies. Reference files are currently only available for the GRCh37/hg19 version of the human genome.

Usage

dndscv(mutations, gene_list = NULL, refdb = "hg19", sm = "192r_3w",
  kc = "cgc81", cv = "hg19", max_muts_per_gene_per_sample = 3,
  max_coding_muts_per_sample = 3000, use_indel_sites = T, min_indels = 5,
  maxcovs = 20, constrain_wnon_wspl = T, outp = 3)

Arguments

mutations

Table of mutations (5 columns: sampleID, chr, pos, ref, alt). Only list independent events as mutations.
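
A minimal sketch of the expected input layout, assuming a data frame is an acceptable table type. The sample IDs, coordinates and alleles are invented for illustration and are not real data.

```r
# Toy mutation table with the five documented columns.
# All values below are made up for illustration only.
mutations <- data.frame(
  sampleID = c("S1", "S1", "S2", "S2"),
  chr      = c("1", "17", "17", "17"),
  pos      = c(115256529L, 7577121L, 7577121L, 7578406L),
  ref      = c("T", "G", "G", "C"),
  alt      = c("C", "A", "A", "T"),
  stringsAsFactors = FALSE
)

# Drop exact duplicate rows so that each event is listed once per sample.
mutations <- unique(mutations)
```

Removing exact duplicates is only a first step towards listing independent events. Mutations shared by several samples from the same individual would still need to be collapsed by hand.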

gene_list

List of genes to restrict the analysis to (use for targeted sequencing studies)

refdb

Reference database (path to .rda file)

sm

Substitution model (precomputed models are available in the data directory)

kc

List of a priori known cancer genes (to be excluded from the indel background model)

cv

Covariates: a matrix with one row per gene and one column per covariate. Defaults to the reference covariates; cv = NULL runs dndscv without covariates.

max_muts_per_gene_per_sample

Maximum number of mutations kept per gene per sample. If n < Inf, only the first n mutations by chromosome position are kept (an arbitrary choice).
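
The capping rule can be sketched in base R as below. This is an illustration only, not the package's internal code: the `gene` column, the cap `n` and the toy data are hypothetical, and all mutations are assumed to lie on one chromosome so that sorting by `pos` is enough.

```r
# Toy annotated mutations; the `gene` column is a hypothetical annotation.
muts <- data.frame(
  sampleID = c("S1", "S1", "S1", "S1", "S2"),
  gene     = c("A", "A", "A", "B", "A"),
  pos      = c(300L, 100L, 200L, 50L, 100L),
  stringsAsFactors = FALSE
)

n <- 2  # hypothetical cap, standing in for max_muts_per_gene_per_sample

# Sort by position, then number the mutations within each sample/gene pair.
muts <- muts[order(muts$sampleID, muts$gene, muts$pos), ]
rank_in_group <- ave(seq_len(nrow(muts)), muts$sampleID, muts$gene,
                     FUN = seq_along)

# Keep only the first n mutations of each sample/gene pair.
kept <- muts[rank_in_group <= n, ]
```

Here sample S1 has three mutations in gene A, so the one at the highest position is discarded and the other rows are kept.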

max_coding_muts_per_sample

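Maximum number of coding mutations allowed per sample. Samples exceeding this limit are excluded, since hypermutator samples often reduce power to detect selection.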
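
A rough sketch of such a hypermutator filter, assuming that samples above the threshold are dropped entirely. The toy data and the small threshold are hypothetical, and this is not the package's own implementation.

```r
# Toy table with one row per coding mutation (invented data).
muts <- data.frame(
  sampleID = c(rep("S1", 2), rep("S2", 5), rep("S3", 1)),
  stringsAsFactors = FALSE
)

max_coding <- 3  # hypothetical threshold; the documented default is 3000

# Count coding mutations per sample and flag samples above the threshold.
n_per_sample <- table(muts$sampleID)
excl_samples <- names(n_per_sample)[n_per_sample > max_coding]

# Remove every mutation that belongs to a flagged sample.
muts_kept <- muts[!(muts$sampleID %in% excl_samples), , drop = FALSE]
```

Only S2 carries more than three mutations in this toy table, so it is the only sample flagged for exclusion.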

use_indel_sites

Use unique indel sites instead of the total number of indels (it tends to be more robust)
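
The difference between the two counts can be illustrated as follows. This is a sketch under the assumption that a "site" is defined by chromosome, position and alleles; the indel records are invented.

```r
# Toy indel calls: the same deletion is seen in three samples (invented data).
indels <- data.frame(
  sampleID = c("S1", "S2", "S3", "S3"),
  chr      = c("17", "17", "17", "5"),
  pos      = c(7578200L, 7578200L, 7578200L, 112175211L),
  ref      = c("AG", "AG", "AG", "C"),
  alt      = c("A", "A", "A", "CT"),
  stringsAsFactors = FALSE
)

# Total number of indels: every row counts.
n_total <- nrow(indels)

# Number of unique indel sites: identical events across samples count once.
site_cols <- c("chr", "pos", "ref", "alt")
n_sites <- nrow(unique(indels[, site_cols]))
```

Counting sites rather than events means that a single recurrent indel, which may be a calling artefact, contributes only once to the count.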

min_indels

Minimum number of indels required to run the indel recurrence module

maxcovs

Maximum number of covariates that will be considered (additional columns in the matrix of covariates will be excluded)
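
As an illustration, truncating a covariate matrix to a maximum number of columns could look like the sketch below. It assumes that the leftmost columns are the ones retained; the matrix, its dimension names and the limit are hypothetical.

```r
# Toy covariate matrix: genes in rows, covariates in columns (random values).
set.seed(1)
covs <- matrix(
  rnorm(5 * 4), nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cov", 1:4))
)

maxcovs <- 2  # hypothetical limit; the documented default is 20

# Keep at most maxcovs columns; drop = FALSE preserves the matrix shape.
covs_used <- covs[, seq_len(min(ncol(covs), maxcovs)), drop = FALSE]
```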

constrain_wnon_wspl

Constrains wnon == wspl, i.e. forces the dN/dS ratios for nonsense and essential splice site mutations to be equal (this typically leads to higher power to detect selection)

outp

Output: 1 = Global dN/dS values; 2 = Global dN/dS and dNdSloc; 3 = Global dN/dS, dNdSloc and dNdScv

Value

'dndscv' returns a list of objects:

- globaldnds: Global dN/dS estimates across all genes.

- sel_cv: Gene-wise selection results using dNdScv.

- sel_loc: Gene-wise selection results using dNdSloc.

- annotmuts: Annotated coding mutations.

- genemuts: Observed and expected numbers of mutations per gene.

- mle_submodel: MLEs of the substitution model.

- exclsamples: Samples excluded from the analysis.

- exclmuts: Coding mutations excluded from the analysis.

- nbreg: Negative binomial regression model for substitutions.

- nbregind: Negative binomial regression model for indels.

Details

Martincorena I, et al. (2017) Universal patterns of selection in cancer and somatic tissues. Under revision. Preprint available in BioRxiv: https://doi.org/10.1101/132324