sigminer (version 0.1.11)

prepare_maf: Prepare NMF input matrix for mutational signature analysis

Description

The NMF input matrix here is a trinucleotide matrix. This function calls trinucleotideMatrix, provided by maftools, to extract 96 mutation motifs.

Usage

prepare_maf(maf, ref_genome = NULL, prefix = NULL, add = TRUE,
  ignoreChr = NULL, useSyn = TRUE, fn = NULL)

Arguments

maf

an MAF object generated by read.maf

ref_genome

A BSgenome object or the name of an installed BSgenome package, e.g. BSgenome.Hsapiens.UCSC.hg19. Default is NULL, in which case the function tries to auto-detect a genome from the installed BSgenome packages.

prefix

Prefix to add or remove from contig names in MAF file.

add

Logical. If prefix is given, the default (TRUE) adds the prefix to contig names in the MAF file. If FALSE, the prefix is removed from contig names.

ignoreChr

Chromosomes to ignore in the analysis, e.g. chrM.

useSyn

Logical. Whether to include synonymous variants in the analysis. Defaults to TRUE.

fn

If given, writes APOBEC results to an output file with basename fn. Default NULL.

Value

A list of 2: a matrix of dimension n x 96, where n is the number of samples in the MAF, and a table describing APOBEC enrichment per sample.

Details

Extracts immediate 5' and 3' bases flanking the mutated site and classifies them into 96 substitution classes. Requires BSgenome data packages for sequence extraction.
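The 96 substitution classes mentioned above can be enumerated in a few lines of base R. This is a minimal sketch for illustration only, not the code used by maftools or sigminer: the motif label format (e.g. T[C>T]A) and the helper classify_snv are assumptions introduced here. Mutations with a purine reference base are folded onto the pyrimidine strand by reverse complementing the trinucleotide context.

```r
# Six substitution types, expressed with a pyrimidine (C or T) reference base
subs  <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")
bases <- c("A", "C", "G", "T")

# Every combination of 5' base, substitution type and 3' base: 4 * 6 * 4 = 96
grid   <- expand.grid(three = bases, five = bases, sub = subs,
                      stringsAsFactors = FALSE)
motifs <- paste0(grid$five, "[", grid$sub, "]", grid$three)

# Hypothetical helper: label one SNV given its flanking bases.
# If the reference base is a purine, reverse complement the context first.
classify_snv <- function(five, ref, alt, three) {
  comp <- c(A = "T", C = "G", G = "C", T = "A")
  if (ref %in% c("A", "G")) {
    new_five  <- comp[[three]]  # the 3' base becomes the 5' base on the other strand
    new_three <- comp[[five]]
    ref   <- comp[[ref]]
    alt   <- comp[[alt]]
    five  <- new_five
    three <- new_three
  }
  paste0(five, "[", ref, ">", alt, "]", three)
}

length(motifs)                      # number of distinct classes
classify_snv("T", "G", "A", "A")    # a G>A change reported on the purine strand
```

Counting such labels per sample gives one row of the n x 96 matrix described under Value.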

APOBEC Enrichment: Enrichment score is calculated using the same method described by Roberts et al.

E = (n_tcw * background_C) / (n_C * background_tcw)

where n_tcw = number of mutations within the T[C>T]W and T[C>G]W contexts (W = A or T),

n_C = number of mutated C and G,

and background_C and background_tcw are the numbers of C and TCW motifs occurring within +/- 20 bp of each mutation.

A one-sided Fisher's exact test is performed to determine the enrichment of APOBEC TCW mutations over background.
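As a worked illustration of the enrichment score and the test, the sketch below uses made-up counts for a single sample. The counts are hypothetical, and the layout of the 2 x 2 contingency table is an assumption chosen to match the formula; it may differ from what maftools does internally.

```r
# Hypothetical counts for one sample (not real data)
n_tcw          <- 30    # mutations in T[C>T]W or T[C>G]W context
n_C            <- 100   # all mutated C and G
background_tcw <- 150   # TCW motifs within +/- 20 bp of the mutations
background_C   <- 1000  # C (and G) bases within +/- 20 bp of the mutations

# Enrichment score as defined above
enrichment <- (n_tcw * background_C) / (n_C * background_tcw)

# Assumed 2 x 2 table: TCW vs non-TCW, in mutations vs background
tab <- matrix(
  c(n_tcw, n_C - n_tcw,
    background_tcw, background_C - background_tcw),
  nrow = 2,
  dimnames = list(context = c("tcw", "non_tcw"),
                  source  = c("mutated", "background"))
)

# One-sided test: are TCW mutations over-represented relative to background?
p_value <- fisher.test(tab, alternative = "greater")$p.value

enrichment
p_value
```

With these counts, TCW accounts for 30% of mutated cytosines but only 15% of background cytosines, so the score is 30 * 1000 / (100 * 150) = 2.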

References

Roberts SA, Lawrence MS, Klimczak LJ, et al. An APOBEC Cytidine Deaminase Mutagenesis Pattern is Widespread in Human Cancers. Nature genetics. 2013;45(9):970-976. doi:10.1038/ng.2702.

See Also

Other signature analysis prepare function series: prepare_copynumber

Examples

# NOT RUN {
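library(sigminer)
# Load the example TCGA LAML MAF file shipped with maftools
laml.maf <- system.file("extdata", "tcga_laml.maf.gz", package = "maftools")
laml <- read_maf(maf = laml.maf)
# The reference genome package must be installed for sequence extraction
library(BSgenome.Hsapiens.UCSC.hg19)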
laml.tnm <- prepare_maf(
  maf = laml, ref_genome = "BSgenome.Hsapiens.UCSC.hg19",
  prefix = "chr", add = TRUE, useSyn = TRUE
)
# }