sigminer: an easy-to-use and scalable toolkit for genomic alteration signature analysis and visualization in R

Overview

Genomic alterations, including single base substitutions (SBS), copy number alterations (CNA), etc., are the major force behind cancer initiation and development. Because the molecular lesions caused by genomic alterations are specific, we can generate characteristic alteration spectra, called ‘signatures’. This package helps users extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer research.

Example signature profiles (figures): SBS signatures, copy number signatures, DBS signatures, and INDEL (i.e. ID) signatures.

Features

  • supports a standard de novo pipeline for identifying 4 types of signatures: copy number, SBS, DBS and INDEL
  • supports quantifying exposures for a single sample based on known signatures
  • supports two methods for calling copy number signatures: one from Macintyre et al. 2018 and one created by us
  • supports association and group analysis and visualization for signatures
  • supports a Bayesian variant of the NMF algorithm that infers the optimal number of signatures through the automatic relevance determination technique from the SignatureAnalyzer package
  • supports two plot styles for signature profiles: ‘default’ (like the SignatureAnalyzer package) and ‘cosmic’ (like the COSMIC database)
  • supports two types of signature exposures: relative exposure (relative contribution of signatures in each sample) and absolute exposure (estimated number of variation records attributed to each signature in each sample)
  • supports basic summary and visualization for mutation profiles (powered by maftools) and copy number profiles
  • supports parallel computation through the R packages foreach, future and NMF
  • efficient code powered by the R packages data.table and tidyverse
  • elegant plots powered by the R packages ggplot2, ggpubr, cowplot and patchwork
  • well tested with the R package testthat and documented with the R packages roxygen2, roxytest, pkgdown, etc. for reliable and reproducible research
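To make the idea of fitting exposures to known signatures, and the difference between absolute and relative exposure, more concrete, here is a minimal base R sketch. It is not sigminer's implementation (the package provides sig_fit for this task); the signature matrix W, the sample catalogue v and the use of optim with non-negativity bounds are all assumptions chosen for illustration.

```r
# Toy signature matrix: 4 components (rows) x 2 signatures (columns).
# Each column sums to 1. All values are invented for illustration.
W <- cbind(
  Sig1 = c(0.7, 0.1, 0.1, 0.1),
  Sig2 = c(0.1, 0.1, 0.2, 0.6)
)

# Toy catalogue for one sample, built from known exposures (30 and 10),
# so the fit can be compared against the truth.
true_exposure <- c(Sig1 = 30, Sig2 = 10)
v <- as.vector(W %*% true_exposure)

# Squared reconstruction error for a candidate exposure vector h.
loss <- function(h) sum((v - W %*% h)^2)

# Minimize the error subject to h >= 0 (box-constrained quasi-Newton).
fit <- optim(
  par = rep(1, ncol(W)),
  fn = loss,
  method = "L-BFGS-B",
  lower = 0
)

# Absolute exposure: estimated number of records attributed to each signature.
absolute_exposure <- setNames(fit$par, colnames(W))

# Relative exposure: each signature's share of the sample's total exposure.
relative_exposure <- absolute_exposure / sum(absolute_exposure)

print(round(absolute_exposure, 2))
print(round(relative_exposure, 2))
```

Because the toy catalogue is an exact mixture of the two signatures, the fitted absolute exposures should land close to the values used to build it, and the relative exposures always sum to 1.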

Installation

You can install the stable release of sigminer from CRAN with:

install.packages("sigminer", dependencies = TRUE)
# Or
BiocManager::install("sigminer", dependencies = TRUE)

You can install the development version of sigminer from GitHub with:

remotes::install_github("ShixiangWang/sigminer", dependencies = TRUE)
# For users in China, run
remotes::install_git("https://gitee.com/ShixiangWang/sigminer", dependencies = TRUE)

Usage

Complete documentation for sigminer can be read online at https://shixiangwang.github.io/sigminer-doc/ (users in China can also read it at https://shixiangwang.gitee.io/sigminer-doc). All functions are organized and documented at https://shixiangwang.github.io/sigminer/reference/index.html (users in China can also read it at https://shixiangwang.gitee.io/sigminer/reference/index.html). For usage of a specific function fun, run ?fun in your R console to see its documentation.
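As a rough orientation, a de novo SBS workflow could be sketched as below. This is an illustrative sketch, not an official example: the wrapper run_sbs_pipeline and its arguments maf_path, n_sig and nrun are hypothetical, the argument names passed to the sigminer functions are assumptions that should be checked against each function's help page, and the reference genome package BSgenome.Hsapiens.UCSC.hg19 is assumed to be installed. The NMF package may also need to be attached before extraction.

```r
# Hypothetical wrapper around a de novo SBS workflow; defining it does not
# require sigminer, but calling it does.
run_sbs_pipeline <- function(maf_path, n_sig = 3, nrun = 10) {
  if (!requireNamespace("sigminer", quietly = TRUE)) {
    stop("Package 'sigminer' is required for this sketch.")
  }

  # Read mutation records from a MAF file.
  maf <- sigminer::read_maf(maf = maf_path)

  # Tally the alterations into a sample-by-component matrix
  # (assumes the hg19 BSgenome package is available).
  tally <- sigminer::sig_tally(
    maf,
    ref_genome = "BSgenome.Hsapiens.UCSC.hg19"
  )

  # Extract n_sig signatures through NMF.
  sig <- sigminer::sig_extract(tally$nmf_matrix, n_sig = n_sig, nrun = nrun)

  # Return the signature object, relative exposures and a COSMIC-style plot.
  list(
    signature = sig,
    exposure = sigminer::get_sig_exposure(sig, type = "relative"),
    profile = sigminer::show_sig_profile(sig, mode = "SBS", style = "cosmic")
  )
}
```

In practice the number of signatures would usually be chosen first, for example with sig_estimate or sig_auto_extract from the function list below, rather than fixed at a default.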

Citation


Wang, Shixiang, et al. “Copy number signature analyses in prostate cancer reveal distinct etiologies and clinical outcomes” medRxiv (2020) https://www.medrxiv.org/content/early/2020/04/29/2020.04.27.20082404


Acknowledgments

If you use the NMF package in R, please also cite:

Gaujoux, Renaud, and Cathal Seoighe. "A Flexible R Package for
    Nonnegative Matrix Factorization." BMC Bioinformatics 11, no. 1 (December 2010).

The method “M” for extracting copy number signatures is based in part on the source code from the paper Copy number signatures and mutational processes in ovarian carcinoma. If you use this feature, please also cite:

Macintyre, Geoff, et al. "Copy number signatures and mutational
    processes in ovarian carcinoma." Nature genetics 50.9 (2018): 1262.

The code for extracting SBS signatures is based in part on the source code of the maftools package. If you use this feature, please also cite:

Mayakonda, Anand, et al. "Maftools: efficient and comprehensive analysis
    of somatic variants in cancer." Genome research 28.11 (2018): 1747-1756.

The code for extracting mutational signatures is based in part on the source code of the SignatureAnalyzer package. If you use this feature, please also cite:

Kim, Jaegil, et al. "Somatic ERCC2 mutations are associated with a distinct genomic
    signature in urothelial tumors." Nature genetics 48.6 (2016): 600.

References

  1. Alexandrov, Ludmil B., et al. “The repertoire of mutational signatures in human cancer.” Nature 578.7793 (2020): 94-101.
  2. Macintyre, Geoff, et al. “Copy number signatures and mutational processes in ovarian carcinoma.” Nature genetics 50.9 (2018): 1262.
  3. Mayakonda, Anand, et al. “Maftools: efficient and comprehensive analysis of somatic variants in cancer.” Genome research 28.11 (2018): 1747-1756.
  4. Gaujoux, Renaud, and Cathal Seoighe. “A Flexible R Package for Nonnegative Matrix Factorization.” BMC Bioinformatics 11, no. 1 (December 2010).
  5. Wickham, H. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
  6. Tan, Vincent YF, and Cédric Févotte. “Automatic relevance determination in nonnegative matrix factorization with the β-divergence.” IEEE Transactions on Pattern Analysis and Machine Intelligence 35.7 (2012): 1592-1605.
  7. Kim, Jaegil, et al. “Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors.” Nature genetics 48.6 (2016): 600.
  8. Bergstrom, E. N., et al. “SigProfilerMatrixGenerator: a tool for visualizing and exploring patterns of small mutational events.” BMC Genomics 20 (2019): 685. https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-019-6041-2

LICENSE

The software is made available for non-commercial research purposes only under the MIT License. However, notwithstanding any provision of the MIT License, the software currently may not be used for commercial purposes without explicit written permission obtained by contacting Shixiang Wang (wangshx@shanghaitech.edu.cn) or Xue-Song Liu (liuxs@shanghaitech.edu.cn).

MIT © 2019-2020 Shixiang Wang, Xue-Song Liu

MIT © 2018 Geoffrey Macintyre

MIT © 2018 Anand Mayakonda


Cancer Biology Group @ShanghaiTech

Research group led by Xue-Song Liu at ShanghaiTech University

Version: 1.0.5

License: MIT + file LICENSE

Maintainer: Shixiang Wang

Last Published: May 14th, 2020

Monthly Downloads: 494

Functions in sigminer (1.0.5)

  • cytobands.hg38: Location of Chromosome Cytobands at Genome Build hg38
  • get_sig_feature_association: Calculate Association between Signature Exposures and Other Features
  • get_sig_exposure: Get Signature Exposure from 'Signature' Object
  • get_bayesian_result: Get Specified Bayesian NMF Result from Run
  • get_adj_p: Get Adjust P Values from Group Comparison
  • %>%: Pipe operator
  • hello: Say Hello to Users
  • cytobands.hg19: Location of Chromosome Cytobands at Genome Build hg19
  • get_groups: Get Sample Groups from Signature Decomposition Information
  • get_group_comparison: Get Comparison Result between Signature Groups
  • get_cn_ploidy: Get Ploidy from Absolute Copy Number Profile
  • read_copynumber: Read Absolute Copy Number Profile
  • get_genome_annotation: Get Genome Annotation
  • enrich_component_strand_bias: Performs Strand Bias Enrichment Analysis for a Given Sample-by-Component Matrix
  • get_tidy_parameter: Get Tidy Parameter from Flexmix Model
  • handle_hyper_mutation: Handle Hypermutant Samples
  • report_bootstrap_p_value: Report P Values from bootstrap Results
  • show_cn_components: Show Copy Number Components
  • show_catalogue: Show Alteration Catalogue Profile
  • show_cn_circos: Show Copy Number Profile in Circos
  • scoring: Score Copy Number Profile
  • show_cn_distribution: Show Copy Number Distribution either by Length or Chromosome
  • show_sig_bootstrap: Show Signature Bootstrap Analysis Results
  • subset.CopyNumber: Subsetting CopyNumber object
  • show_sig_fit: Show Signature Fit Result
  • show_group_mapping: Map Groups using Sankey
  • sigminer: sigminer: Extract, Analyze and Visualize Signatures for Genomic Variations
  • show_sig_feature_corrplot: Draw Corrplot for Signature Exposures and Other Features
  • read_maf: Read MAF Files
  • show_cn_features: Show Copy Number Feature Distributions
  • show_sig_consensusmap: Show Signature Consensus Map
  • show_sig_exposure: Plot Signature Exposure
  • get_sig_similarity: Calculate Similarity between Identified Signatures and Reference Signatures
  • sig_convert: Convert Signatures between different Genomic Distribution of Components
  • sig_estimate: Estimate Signature Number
  • sig_fit_bootstrap: Obtain Bootstrap Distribution of Signature Exposures of a Certain Tumor Sample
  • sig_fit_bootstrap_batch: Exposure Instability Analysis of Signature Exposures with Bootstrap
  • sig_names: Obtain or Modify Signature Information
  • show_cn_profile: Show Sample Copy Number Profile
  • transcript.hg19: Merged Transcript Location at Genome Build hg19
  • show_sig_number_survey2: Show Comprehensive Signature Number Survey
  • sig_tally: Tally a Genomic Alteration Object
  • show_sig_number_survey: Show Simplified Signature Number Survey
  • tidyeval: Tidy eval helpers
  • get_tidy_association: Get Tidy Signature Association Results
  • use_color_style: Set Color Style for Plotting
  • transcript.hg38: Merged Transcript Location at Genome Build hg38
  • show_cosmic_sig_profile: Plot COSMIC Signature Profile
  • show_sig_profile: Show Signature Profile
  • show_group_comparison: Plot Group Comparison Result
  • sig_fit: Fit Signature Exposures with Linear Combination Decomposition
  • sig_extract: Extract Signatures through NMF
  • sig_auto_extract: Extract Signatures through the Automatic Relevance Determination Technique
  • add_labels: Add Text Labels to a ggplot
  • add_h_arrow: Add Horizontal Arrow with Text Label to a ggplot
  • centromeres.hg19: Location of Centromeres at Genome Build hg19
  • chromsize.hg19: Chromosome Size of Genome Build hg19
  • centromeres.hg38: Location of Centromeres at Genome Build hg38
  • CN.features: Classification Table of Copy Number Features Devised by Wang et al.
  • MAF-class: Class MAF
  • CopyNumber-class: Class CopyNumber
  • chromsize.hg38: Chromosome Size of Genome Build hg38