ICAMS

In-depth Characterization and Analysis of Mutational Signatures (‘ICAMS’)

Purpose

Analysis and visualization of experimentally elucidated mutational signatures, of the kind presented in Boot et al., “In-depth characterization of the cisplatin mutational signature in human cell lines and in esophageal and liver tumors”, Genome Research 2018, https://doi.org/10.1101/gr.230219.117, and in “Characterization of colibactin-associated mutational signature in an Asian oral squamous cell carcinoma and in other mucosal tumor types”, Genome Research 2020, https://doi.org/10.1101/gr.255620.119. ‘ICAMS’ stands for In-depth Characterization and Analysis of Mutational Signatures. ‘ICAMS’ has functions to read variant call files (VCFs), to collate the corresponding catalogs of mutational spectra, and to analyze and plot catalogs of mutational spectra and signatures. It handles both “counts-based” and “density-based” catalogs of mutational spectra or signatures.
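The read, collate, and plot steps described above might look roughly like the sketch below. It is not taken from the package vignettes. The argument names follow the ICAMS documentation as best understood here, and the example file names under extdata are assumptions, so each step is guarded and skipped when a file or package is missing. The wrapper function run_icams_sketch is hypothetical and exists only to keep the example self-contained.

```r
# Hedged sketch of a read -> catalog -> plot workflow with ICAMS.
# run_icams_sketch() is a hypothetical helper, not part of ICAMS.
run_icams_sketch <- function(out.dir = tempdir()) {
  if (!requireNamespace("ICAMS", quietly = TRUE)) {
    message("ICAMS is not installed; nothing to do.")
    return(invisible(NULL))
  }
  result <- list()

  # 1. Read an existing SBS96 catalog and plot it to a PDF.
  #    The file name is an assumption about what ships in extdata.
  cat.file <- system.file("extdata", "strelka.regress.cat.sbs.96.csv",
                          package = "ICAMS")
  if (nzchar(cat.file)) {
    result$catalog <- ICAMS::ReadCatalog(cat.file)
    result$pdf <- file.path(out.dir, "sbs96-sketch.pdf")
    ICAMS::PlotCatalogToPdf(result$catalog, file = result$pdf)
  }

  # 2. Build SBS and DBS catalogs from a Strelka SBS VCF.
  #    This needs the GRCh37 reference genome package from Bioconductor;
  #    the VCF file name is again an assumption.
  vcf.file <- system.file("extdata/Strelka-SBS-vcf",
                          "Strelka.SBS.GRCh37.s1.vcf", package = "ICAMS")
  if (nzchar(vcf.file) &&
      requireNamespace("BSgenome.Hsapiens.1000genomes.hs37d5",
                       quietly = TRUE)) {
    result$vcf.catalogs <- ICAMS::StrelkaSBSVCFFilesToCatalog(
      vcf.file,
      ref.genome   = "hg19",
      trans.ranges = ICAMS::trans.ranges.GRCh37,
      region       = "genome")
  }
  invisible(result)
}

res <- run_icams_sketch()
```

Check ?ReadCatalog, ?PlotCatalogToPdf, and ?StrelkaSBSVCFFilesToCatalog in the installed version before relying on these argument names.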

Installation

To begin with, install the necessary dependency package from Bioconductor for ICAMS:

install.packages("BiocManager")
BiocManager::install("BSgenome")

A first-time installation may take a long time; please be patient.

Afterwards, install the stable version of ICAMS from CRAN from the R command line:

install.packages("ICAMS")
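If some of these packages may already be present, a small check such as the following can report what is still missing before anything is installed. This is a base R sketch; the helper name missing_packages is hypothetical and not part of ICAMS or BiocManager.

```r
# Return the subset of `pkgs` whose namespaces cannot be loaded.
# missing_packages() is a hypothetical helper for illustration only.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages named in the installation steps above.
to.install <- missing_packages(c("BiocManager", "BSgenome", "ICAMS"))
print(to.install)
```

Anything listed in to.install can then be installed with the commands shown above.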

Get the development version

To use features in the development version, you can install ICAMS from the master branch on GitHub, which may not be stable:

install.packages("remotes")
remotes::install_github(repo = "steverozen/ICAMS", ref = "master")

Binaries of recent stable development versions are available as a Windows binary or a macOS binary. These are for users who cannot install from source because they do not have Rtools (Windows) or XCode (Mac). To use these binaries, download the .zip (Windows) or .tgz (Mac) file for your operating system.

Then do:

install.packages(pkgs = "path-to-binary-file-on-your-computer", repos = NULL)

Reference manual

https://github.com/steverozen/ICAMS/blob/master/data-raw/ICAMS_2.2.3.pdf

Frequently asked questions

How do I normalize “counts-based” catalogs of mutational spectra or signatures to account for differing abundances of the source sequences of the mutations?

Use the exported function TransformCatalog in ICAMS to normalize the data. Please refer to the documentation and examples of TransformCatalog for more details.
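As a rough illustration of what this normalization means, the sketch below first divides made-up counts by made-up source trinucleotide abundances in base R, then shows a guarded call to TransformCatalog. The numbers are invented, the base R part is not ICAMS code and does not reproduce its exact scaling, and the argument names and the example file name follow the package documentation as best understood here, so they should be treated as assumptions.

```r
# Conceptual illustration in base R (not ICAMS code): a count is divided by
# the abundance of the source trinucleotide it arose from.
# All numbers below are made up for illustration.
counts    <- c(ACA = 120, ACG = 30)    # mutations observed at each source 3-mer
abundance <- c(ACA = 6e7, ACG = 1e7)   # occurrences of each 3-mer in the region
density   <- counts / abundance[names(counts)]
print(density)  # ACG has fewer mutations but a higher per-site rate

# Hedged sketch of the corresponding ICAMS call. It is skipped unless ICAMS
# and the GRCh37 genome package are installed and the example file exists.
if (requireNamespace("ICAMS", quietly = TRUE) &&
    requireNamespace("BSgenome.Hsapiens.1000genomes.hs37d5", quietly = TRUE)) {
  file <- system.file("extdata", "strelka.regress.cat.sbs.96.csv",
                      package = "ICAMS")
  if (nzchar(file)) {
    cat.counts <- ICAMS::ReadCatalog(file, ref.genome = "hg19",
                                     region = "genome",
                                     catalog.type = "counts")
    cat.density <- ICAMS::TransformCatalog(cat.counts,
                                           target.ref.genome = "hg19",
                                           target.region = "genome",
                                           target.catalog.type = "density")
  }
}
```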

Version

2.2.3

License

GPL-3 | file LICENSE

Maintainer

Steve Rozen

Last Published

September 18th, 2020

Functions in ICAMS (2.2.3)

AnnotateSBSVCF

Add sequence context and transcript information to an in-memory SBS VCF
AddRunInformation

Create a run-information text file when generating a zip archive from VCF files.
AnnotateIDVCF

Add sequence context to an in-memory ID (insertion/deletion) VCF, and confirm that its records match the given reference genome
AnnotateDBSVCF

Add sequence context and transcript information to an in-memory DBS VCF
AddDBSClass

Add DBS mutation class to an annotated DBS VCF
AddAndCheckDBSClassInVCF

Add and check DBS class in an annotated VCF with the corresponding DBS mutation matrix
AddSBSClass

Add SBS mutation class to an annotated SBS VCF
AddTranscript

Add transcript information to a data frame with mutation records
AddAndCheckSBSClassInVCF

Add and check SBS class in an annotated VCF with the corresponding SBS mutation matrix
AddSeqContext

Add sequence context to a data frame with mutation records
Canonicalize1Del

Given a deletion and its sequence context, categorize it
CalBaseCountsFrom3MerAbundance

Calculate base counts from 3-mer abundance
CheckAndReturnDBSMatrix

Check and return the DBS mutation matrix
CalculateNumberOfSpace

Calculate the number of spaces needed to add strand bias statistics to the run-information.txt file.
Canonicalize1INS

Given an insertion and its sequence context, categorize it.
Canonicalize1ID

Given a single insertion or deletion in context, categorize it.
CreateDinucAbundance

Create dinucleotide abundance
CatalogRowOrder

Standard order of row names in a catalog
CollapseCatalog

"Collapse" a catalog
CheckAndReturnIDCatalog

Check and return ID catalog
CheckAndReturnDBSCatalogs

Check and return DBS catalogs
CreateOnePPMFromSBSVCF

Create position probability matrix (PPM) for *one* sample from a Variant Call Format (VCF) file.
CheckDBSClassInVCF

Check DBS mutation class in VCF with the corresponding DBS mutation matrix
CheckAndReturnSBSMatrix

Check and return the SBS mutation matrix
CreateOneColSBSMatrix

Create the matrix for an SBS catalog for *one* sample from an in-memory VCF.
GeneExpressionData

Example gene expression data from two cell lines
CreateStrandedDinucAbundance

Create stranded dinucleotide abundance
ConvertICAMSCatalogToSigProSBS96

Convert an ICAMS SBS96 Catalog to SigProfiler format
CheckAndFixChrNames

Check and, if possible, correct the chromosome names in a VCF data.frame.
CreateTrinucAbundance

Create trinucleotide abundance
CreateStrandedTrinucAbundance

Create stranded trinucleotide abundance
FindDelMH

Return the length of microhomology at a deletion
CreateExomeStrandedRanges

Create exome transcriptionally stranded regions
GetMutationLoadsFromStrelkaSBSVCFs

Get mutation loads information from Strelka SBS VCF files.
GetMutationLoadsFromStrelkaIDVCFs

Get mutation loads information from Strelka ID VCF files.
CreateTransRanges

Create a transcript range file from the raw GFF3 File
CheckSBSClassInVCF

Check SBS mutation class in VCF with the corresponding SBS mutation matrix
GenerateEmptyKmerCounts

Generate an empty matrix of k-mer abundance
CreateTetranucAbundance

Create tetranucleotide abundance
MakeVCFDBSdf

Take DBS ranges and the original VCF and generate a VCF with dinucleotide REF and ALT alleles.
CheckAndReturnIDMatrix

Check and return the ID mutation matrix
InferAbundance

Infer abundance given a matrix-like object and additional information.
MutectVCFFilesToZipFile

Create a zip file which contains catalogs and plot PDFs from Mutect VCF files
FindMaxRepeatIns

Return the number of repeat units in which an insertion is embedded.
Plot96PartOfCompositeToPDF

Plot the SBS96 part of a SignatureAnalyzer COMPOSITE signature or catalog
CheckAndReturnSBSCatalogs

Check and return SBS catalogs
CanonicalizeID

Determine the mutation types of insertions and deletions.
MakeDataFrameFromVCF

Read in the data lines of a Variant Call Format (VCF) file
PlotPPMToPdf

Plot position probability matrices (PPM) to a PDF file
ReadVCF

Read in the data lines of a Variant Call Format (VCF) file
CreateOneColDBSMatrix

Create the matrix for a DBS catalog for *one* sample from an in-memory VCF.
IsGRCh38

Test if object is BSgenome.Hsapiens.UCSC.hg38.
ReadMutectVCF

Read in the data lines of a Variant Call Format (VCF) file created by Mutect
StandardChromNameNew

Standardize the chromosome name annotations for a data frame.
CheckSeqContextInVCF

Check that the sequence context information is consistent with the value of the column REF.
InferClassOfCatalogForRead

Infer the class of catalog in a file.
ICAMS

ICAMS: In-depth Characterization and Analysis of Mutational Signatures
GetVAF

Extract the VAFs (variant allele frequencies) and read depth information from a VCF file
IsGRCm38

Test if object is BSgenome.Mmusculus.UCSC.mm10.
GetCustomKmerCounts

Generate custom k-mer abundance from a given reference genome
ReadStapleGT96SBS

Read a 96-channel spectra (or signature) catalog where rownames are e.g. "A[C>A]T"
PlotTransBiasGeneExpToPdf

Plot transcription strand bias with respect to gene expression values to a PDF file
ReadAndSplitMutectVCFs

Read and split Mutect VCF files
PlotTransBiasGeneExp

Plot transcription strand bias with respect to gene expression values
RevcSBS96

Reverse complement strings that represent stranded SBSs
StrelkaSBSVCFFilesToCatalogAndPlotToPdf

Create SBS and DBS catalogs from Strelka SBS VCF files and plot them to PDF
GenerateKmer

Generate all possible k-mers of length k.
GetMutationLoadsFromMutectVCFs

Get mutation loads information from Mutect VCF files.
CreateOneColIDMatrix

Create one column of the matrix for an indel catalog from *one* in-memory VCF.
GetSequenceKmerCounts

Generate k-mer abundance from given nucleotide sequences
CreatePPMFromSBSVCFs

Create position probability matrices (PPM) from a list of SBS VCFs
CreatePentanucAbundance

Create pentanucleotide abundance
ReadStrelkaSBSVCFs

Read Strelka SBS (single base substitutions) VCF files.
GetGenomeKmerCounts

Generate k-mer abundance from a given genome
TestMakeCatalogFromStrelkaSBSVCFs

Make catalogs from the sample Strelka SBS VCF files and compare them with the expected catalog information.
VCFsToIDCatalogs

Create ID (small insertion and deletion) catalog from ID VCFs
RenameColumnsWithNameVAF

If df1 has a column named "VAF", rename it to "VAF_old" so that it will not conflict with code in other parts of the ICAMS package.
PlotCatalog

Plot one spectrum or signature
StrelkaIDVCFFilesToCatalogAndPlotToPdf

Create ID (small insertion and deletion) catalog from Strelka ID VCF files and plot them to PDF
StrelkaIDVCFFilesToCatalog

Create ID (small insertion and deletion) catalog from Strelka ID VCF files
ReadStrelkaIDVCFs

Read Strelka ID (small insertion and deletion) VCF files
ReadStrelkaSBSVCF

Read in the data lines of an SBS VCF created by Strelka version 1
ReadStrelkaIDVCF

Read in the data lines of an ID VCF created by Strelka version 1
ReadMutectVCFs

Read Mutect VCF files.
NormalizeGenomeArg

WriteCatalog

Write a catalog
InferRownames

Infer the correct rownames for a matrix based on its number of rows
StopIfRegionIllegal

Stop if region is illegal.
Restaple96

Convert 96-channel mutation-type identifiers like this "ACTA" -> "A[C>A]T"
RevcDBS144

Reverse complement strings that represent stranded DBSs
FindMaxRepeatDel

Return the number of repeat units in which a deletion is embedded
ReadAndSplitStrelkaSBSVCFs

Read and split Strelka SBS VCF files
MutectVCFFilesToCatalogAndPlotToPdf

Create SBS, DBS and Indel catalogs from Mutect VCF files and plot them to PDF
IsGRCh37

Test if object is BSgenome.Hsapiens.1000genomes.hs37d5.
ReadDukeNUSCat192

Read a 192-channel spectra (or signature) catalog in Duke-NUS format
TransRownames.ID.SigPro.PCAWG

For indels, convert SigProfiler rownames into ICAMS/PCAWG7 rownames
ReadVCFs

Read VCF files
SplitStrelkaSBSVCF

Split an in-memory Strelka VCF into SBS, DBS, and variants involving > 2 consecutive bases
SplitOneMutectVCF

Split a mutect2 VCF into SBS, DBS, and ID VCFs, plus a list of other mutations
StopIfTranscribedRegionIllegal

Stop if region is illegal for in-transcript catalogs
all.abundance

K-mer abundances
PlotCatalogToPdf

Plot catalog to a PDF file
PlotPPM

Plot position probability matrix (PPM) for *one* sample from a Variant Call Format (VCF) file.
MutectVCFFilesToCatalog

Create SBS, DBS and Indel catalogs from Mutect VCF files
ReadCatalog

Read catalog
StrelkaSBSVCFFilesToCatalog

Create SBS and DBS catalogs from Strelka SBS VCF files
StrelkaIDVCFFilesToZipFile

Create a zip file which contains ID (small insertion and deletion) catalog and plot PDF from Strelka ID VCF files
GetStrandedKmerCounts

Generate stranded k-mer abundance from a given genome and gene annotation file
SplitListOfMutectVCFs

Split each Mutect VCF into SBS, DBS, and ID VCFs (plus two VCF-like data frames with left-over rows).
ReadTranscriptRanges

Read transcript ranges and strand information from a gff3 format file. Use this one for the new, cut down gff3 file (2018 11 24)
VCFsToSBSCatalogs

Create SBS catalogs from SBS VCFs
ReadBedRanges

Read chromosome and position information from a bed format file.
StandardChromName

Standardize the chromosome name annotations for a data frame.
SplitListOfStrelkaSBSVCFs

Split a list of in-memory Strelka SBS VCFs into SBS, DBS, and variants involving > 2 consecutive bases
StopIfCatalogTypeIllegal

Stop if catalog.type is illegal.
TCFromDenSigDen

density -> <anything>; density.signature -> density.signature, counts.signature
revc

Reverse complement every string in string.vec
RemoveRangesOnBothStrand

Remove ranges that fall on both strands
TransRownames.ID.PCAWG.SigPro

For indels, convert ICAMS/PCAWG7 rownames into SigProfiler rownames
TestPlotCatCOMPOSITE

Plot a SignatureAnalyzer COMPOSITE signature or catalog into separate PDFs
as.catalog

Create a catalog from a matrix, data.frame, or vector
StopIfNrowIllegal

Stop if the number of rows in object is illegal
TCFromCouSigCou

Source catalog type is counts or counts.signature
RenameColumnsWithNameStrand

If df has a column named "strand", rename it to "strand_old" so that it will not conflict with code in other parts of the ICAMS package.
Unstaple96

Convert 96-channel mutation-type identifiers like this "A[C>A]T" -> "ACTA"
TranscriptRanges

Transcript ranges data
TestMakeCatalogFromStrelkaIDVCFs

Make catalogs from the sample Strelka ID VCF files and compare them with the expected catalog information.
WriteCat

Write a catalog to a file.
StrelkaSBSVCFFilesToZipFile

Create a zip file which contains catalogs and plot PDFs from Strelka SBS VCF files
WriteCatalogIndelSigPro

Write Indel Catalogs in SigProExtractor format
TransformCatalog

Transform between counts and density spectrum catalogs and counts and density signature catalogs
VCFsToDBSCatalogs

Create DBS catalogs from VCFs
TestMakeCatalogFromMutectVCFs

This function makes catalogs from the sample Mutect VCF file and compares them with the expected catalog information.
CheckAndReorderRownames

Check whether the rownames of object are correct; if so, put the rows in the correct order.
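
The identifier conversions listed above for Unstaple96, Restaple96, and revc can be illustrated in base R. The functions below are hypothetical stand-ins written for this sketch, not the ICAMS implementations, and they assume well-formed uppercase input over A, C, G, and T.

```r
# "A[C>A]T" -> "ACTA": 5' base, reference base, 3' base, then the alternate base.
unstaple96_sketch <- function(x) {
  sub("^(.)\\[(.)>(.)\\](.)$", "\\1\\2\\4\\3", x)
}

# "ACTA" -> "A[C>A]T": the inverse of the conversion above.
restaple96_sketch <- function(x) {
  sub("^(.)(.)(.)(.)$", "\\1[\\2>\\4]\\3", x)
}

# Reverse complement every string in string.vec.
revc_sketch <- function(string.vec) {
  vapply(string.vec, function(s) {
    complemented <- chartr("ACGT", "TGCA", s)
    paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

print(unstaple96_sketch("A[C>A]T"))
print(restaple96_sketch("ACTA"))
print(revc_sketch(c("AAGC", "ACG")))
```

With ICAMS installed, the exported functions themselves are the ones to use; the sketch only shows the string layout the two identifier formats share.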