Check and, if possible, correct the chromosome names in a VCF data.frame.
Create position probability matrix (PPM) for *one* sample from
a Variant Call Format (VCF) file.
CheckAndFixChrNamesForTransRanges
Check and, if possible, correct the chromosome names in a trans.ranges data.table
Check and return the ID mutation matrix
Check and return ID catalog
Create one column of the matrix for an indel catalog from *one* in-memory VCF.
Check DBS mutation class in VCF with the corresponding DBS mutation matrix
Check SBS mutation class in VCF with the corresponding SBS mutation matrix
Check whether the rownames of object are correct; if so, put the
rows in the correct order.
Return the length of microhomology at a deletion
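A minimal sketch of one common way to measure microhomology at a deletion, not the ICAMS implementation: count how far the deletion can be shifted right and left without changing the resulting sequence, and add the two. The function name, the 0-based `pos` argument, and the assumption that the deletion is not inside a tandem repeat are all illustrative choices.

```python
def microhomology_length(context: str, pos: int, del_len: int) -> int:
    """Count alignment-ambiguous bases around a deletion.

    context: sequence that contains the deleted bases plus flanks
    pos:     0-based start of the deleted bases in context (assumed convention)
    del_len: number of deleted bases
    Assumes the deletion is not embedded in a tandem repeat.
    """
    # Shifting the deletion right by one keeps the result unchanged when the
    # first deleted base equals the first base after the deletion.
    right = 0
    while (pos + del_len + right < len(context)
           and context[pos + right] == context[pos + del_len + right]):
        right += 1

    # Shifting left works the same way with the last deleted base and the
    # base just before the deletion.
    left = 0
    while (pos - 1 - left >= 0
           and context[pos - 1 - left] == context[pos + del_len - 1 - left]):
        left += 1

    return left + right


# Deleting "AGAACT" from GGCT|AGAACT|AGTT leaves GGCTAGTT; the deletion can
# slide two bases either way, so the microhomology ("CTAG") has length 4.
print(microhomology_length("GGCTAGAACTAGTT", 4, 6))
```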
Return the number of repeat units in which a deletion is embedded
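As an illustration only, the repeat count can be sketched by walking outward from the deleted unit and counting adjacent identical copies. Whether the deleted unit itself is included in the count is an assumption here (this sketch excludes it), and the function name and 0-based `pos` are hypothetical.

```python
def count_flanking_repeats(context: str, unit: str, pos: int) -> int:
    """Count copies of `unit` directly adjacent to the deleted copy.

    context: sequence containing the deleted unit and its flanks
    unit:    the deleted sequence, treated as the repeat unit
    pos:     0-based start of the deleted unit in context (assumed convention)
    The deleted copy itself is NOT counted (an assumption of this sketch).
    """
    n = len(unit)
    if context[pos:pos + n] != unit:
        raise ValueError("unit does not occur at pos in context")

    count = 0
    # Walk right, one repeat unit at a time.
    i = pos + n
    while context[i:i + n] == unit:
        count += 1
        i += n
    # Walk left, one repeat unit at a time.
    i = pos - n
    while i >= 0 and context[i:i + n] == unit:
        count += 1
        i -= n
    return count


# "CA" deleted from GAT CA CA CA GT: two other copies flank the deleted one.
print(count_flanking_repeats("GATCACACAGT", "CA", 3))
```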
Extract the VAFs (variant allele frequencies) and read depth information from
a VCF file
ICAMS: In-depth Characterization and Analysis of Mutational Signatures
CheckAndReturnDBSCatalogs
Check and return DBS catalogs
ConvertICAMSCatalogToSigProSBS96
Convert an ICAMS SBS96 Catalog to SigProfiler format
Check and return the DBS mutation matrix
Create the matrix for a DBS catalog for *one* sample from an in-memory VCF.
Create the matrix for an SBS catalog for *one* sample from an in-memory VCF.
Generate an empty matrix of k-mer abundance
CreateExomeStrandedRanges
Create exome transcriptionally stranded regions
Generate all possible k-mers of length k.
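For illustration, enumerating every k-mer is a Cartesian product of the alphabet with itself k times. The sketch below assumes the DNA alphabet A, C, G, T and lexicographic output order; the function name is hypothetical.

```python
from itertools import product


def generate_kmers(k: int, alphabet: str = "ACGT") -> list[str]:
    """Return all len(alphabet)**k strings of length k, in lexicographic order."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


kmers = generate_kmers(3)
print(len(kmers))   # 4**3 = 64 trinucleotides
print(kmers[:5])
```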
MakeVCFDBSdf Take DBS ranges and the original VCF and generate a VCF with
dinucleotide REF and ALT alleles.
Generate k-mer abundance from a given genome
GetMutationLoadsFromMutectVCFs
Get mutation loads information from Mutect VCF files.
Create SBS, DBS and Indel catalogs from Mutect VCF files
Plot transcription strand bias with respect to gene expression values
"Collapse" a catalog
Create pentanucleotide abundance
Plot position probability matrix (PPM) for *one* sample from a Variant Call Format
(VCF) file.
Create trinucleotide abundance
Create dinucleotide abundance
Plot position probability matrices (PPM) to a PDF file
Read in the data lines of a Variant Call Format (VCF) file
Check that the sequence context information is consistent with the value of
the column REF.
Test if object is BSgenome.Mmusculus.UCSC.mm10.
Create a transcript range file from the raw GFF3 File
Create position probability matrices (PPM) from a list of SBS VCFs
Internal read catalog function to be wrapped in a tryCatch
PlotTransBiasGeneExpToPdf
Plot transcription strand bias with respect to gene expression values to a
PDF file
Read a 192-channel spectra (or signature) catalog in Duke-NUS format
Remove ranges that fall on both strands
CreateStrandedTrinucAbundance
Create stranded trinucleotide abundance
GetMutationLoadsFromStrelkaIDVCFs
Get mutation loads information from Strelka ID VCF files.
StrelkaIDVCFFilesToCatalogAndPlotToPdf
Create ID (small insertion and deletion) catalog from Strelka ID VCF files
and plot them to PDF
StrelkaIDVCFFilesToCatalog
Create ID (small insertion and deletion) catalog from Strelka ID VCF files
Read Strelka SBS (single base substitutions) VCF files.
Infer the correct rownames for a matrix based on its number of rows
Generate k-mer abundance from given nucleotide sequences
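A rough sketch of counting k-mer abundance with a sliding window over each sequence. Skipping windows that contain anything other than A, C, G, T (for example N) and counting only the given strand are assumptions of this example, not documented ICAMS behavior; the function name is hypothetical.

```python
from collections import Counter

DNA = set("ACGT")


def kmer_abundance(sequences: list[str], k: int) -> Counter:
    """Count every length-k window across the sequences.

    Windows containing non-ACGT characters are skipped (assumption).
    """
    counts: Counter = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= DNA:
                counts[kmer] += 1
    return counts


print(kmer_abundance(["ACGTAC", "acnac"], 2))
```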
Read Mutect VCF files.
Read transcript ranges and strand information from a gff3 format file.
Use this one for the new, cut down gff3 file (2018 11 24)
MutectVCFFilesToCatalogAndPlotToPdf
Create SBS, DBS and Indel catalogs from Mutect VCF files and plot them to PDF
Create a zip file which contains catalogs and plot PDFs from Mutect VCF files
This function converts a data.table imported
from an external catalog text file into an ICAMS
internal catalog object of the appropriate type.
CreateStrandedDinucAbundance
Create stranded dinucleotide abundance
Read in the data lines of a Variant Call Format (VCF) file created by Mutect
RenameColumnsWithNameStrand
Is there any column in df with name "strand"?
If there is, change its name to "strand_old" so that it will
not conflict with code in other parts of the ICAMS package.
Create tetranucleotide abundance
Generate stranded k-mer abundance from a given genome and gene annotation file
GetMutationLoadsFromStrelkaSBSVCFs
Get mutation loads information from Strelka SBS VCF files.
Read Strelka ID (small insertion and deletion) VCF files
Read in the data lines of an SBS VCF created by Strelka version 1
These two functions are applicable only to
internal ICAMS-formatted catalog objects.
K-mer abundances
Write Indel Catalogs in SigProExtractor format
Infer abundance given a matrix-like object and additional information.
VCFsToCatalogsAndPlotToPdf
Create SBS, DBS and Indel catalogs from VCFs and plot them to PDF
Read catalog
Split an in-memory Strelka VCF into SBS, DBS, and variants involving
> 2 consecutive bases
TestMakeCatalogFromStrelkaSBSVCFs
This function makes catalogs from the sample Strelka SBS VCF files
and compares them with the expected catalog information.
Split a VCF into SBS, DBS, and ID VCFs, plus a list of other mutations
Split an in-memory SBS VCF into pure SBSs, pure DBSs, and variants involving
> 2 consecutive bases
Plot a SignatureAnalyzer COMPOSITE signature or catalog into separate PDFs
Create DBS catalogs from VCFs
Plot96PartOfCompositeToPDF
Plot the SBS96 part of a SignatureAnalyzer COMPOSITE signature or catalog
Create a catalog from a matrix, data.frame, or vector
Get error message and either stop or create a null error output for read catalog
Take strings representing a genome and return the BSgenome
object.
Is there any column in df1 with name "VAF"?
If there is, change its name to "VAF_old" so that it will
not conflict with code in other parts of the ICAMS package.
Split a mutect2 VCF into SBS, DBS, and ID VCFs, plus a list of other mutations
Reverse complement every string in string.vec
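As a sketch of the operation, each string is complemented base by base and then reversed. The function name below is hypothetical, and only the unambiguous bases A, C, G, T (either case) are handled.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement_all(string_vec: list[str]) -> list[str]:
    """Reverse complement every string in string_vec."""
    return [s.translate(_COMPLEMENT)[::-1] for s in string_vec]


print(reverse_complement_all(["ACT", "GGA"]))
```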
Convert 1536-channel mutation-type identifiers like this "ACCGTA" -> "AC[C>A]GT"
Reverse complement strings that represent stranded SBSs
Stop if catalog.type is illegal.
Transform between counts and density spectrum catalogs
and counts and density signature catalogs
Write a catalog to a file.
Split each Mutect VCF into SBS, DBS, and ID VCFs (plus
VCF-like data frame with left-over rows)
Source catalog type is counts or counts.signature
density -> <anything>
density.signature -> density.signature, counts.signature
Transcript ranges data
Write a catalog
Return the number of repeat units in which an insertion
is embedded.
Example gene expression data from two cell lines
TestMakeCatalogFromMutectVCFs
This function makes catalogs from the sample Mutect VCF file
and compares them with the expected catalog information.
Create ID (small insertion and deletion) catalog from ID VCFs
Create SBS catalogs from SBS VCFs
Stop if the number of rows in object is illegal.
Convert SBS96-channel mutation-type identifiers like this "A[C>A]T" -> "ACTA"
TestMakeCatalogFromStrelkaIDVCFs
This function makes catalogs from the sample Strelka ID VCF files
and compares them with the expected catalog information.
Create SBS, DBS and Indel catalogs from VCFs
Read and split VCF files
Read a 96-channel spectra (or signature) catalog where rownames are e.g. "A[C>A]T"
StrelkaSBSVCFFilesToCatalogAndPlotToPdf
Create SBS and DBS catalogs from Strelka SBS VCF files and plot them to PDF
StrelkaSBSVCFFilesToZipFile
Create a zip file which contains catalogs and plot PDFs from Strelka SBS VCF files
Test if object is BSgenome.Hsapiens.1000genome.hs37d5.
Read chromosome and position information from a bed format file.
Reverse complement strings that represent stranded DBSs
Standardize the chromosome name annotations for a data frame.
Convert 96-channel mutation-type identifiers like this "ACTA" -> "A[C>A]T"
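Reading the example "ACTA" -> "A[C>A]T", the first three characters are the trinucleotide context and the fourth is the mutated-to base. A small sketch of that conversion, with a hypothetical function name:

```python
def compact_to_bracket_96(mut: str) -> str:
    """Convert a 4-character identifier such as "ACTA" to "A[C>A]T".

    Characters 1-3 are the 5' base, reference base and 3' base;
    character 4 is the alternate base.
    """
    if len(mut) != 4:
        raise ValueError(f"expected 4 characters, got {mut!r}")
    five_prime, ref, three_prime, alt = mut
    return f"{five_prime}[{ref}>{alt}]{three_prime}"


print(compact_to_bracket_96("ACTA"))  # A[C>A]T
```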
Standardize the chromosome name annotations for a data frame.
Read in the data lines of an ID VCF created by Strelka version 1
TransRownames.ID.SigPro.PCAWG
For indels, convert SigProfiler rownames into ICAMS/PCAWG7 rownames
Generate custom k-mer abundance from a given reference genome
Analogous to GetMutectVAF, calculating VAF and read depth
from PCAWG7 consensus VCFs
Test if object is BSgenome.Hsapiens.UCSC.hg38.
TransRownames.ID.PCAWG.SigPro
For indels, convert ICAMS/PCAWG7 rownames into SigProfiler rownames
Plot catalog to a PDF file
Plot one spectrum or signature
Create a zip file which contains catalogs and plot PDFs from VCFs
Analogous to VCFsToZipFile, also generates density CSV and PDF files in the zip
archive.
ReadAndSplitStrelkaSBSVCFs
Read and split Strelka SBS VCF files
Read and split Mutect VCF files
Read in the data lines of a Variant Call Format (VCF) file
SplitListOfStrelkaSBSVCFs
Split a list of in-memory Strelka SBS VCFs into SBS, DBS, and variants involving
> 2 consecutive bases
Read VCF files
Split each VCF into SBS, DBS, and ID VCFs (plus
VCF-like data frame with left-over rows)
Stop if region is illegal.
StrelkaIDVCFFilesToZipFile
Create a zip file which contains ID (small insertion and deletion) catalog
and plot PDF from Strelka ID VCF files
StrelkaSBSVCFFilesToCatalog
Create SBS and DBS catalogs from Strelka SBS VCF files
StopIfTranscribedRegionIllegal
Stop if region is illegal for an in-transcript catalog
Convert SBS1536-channel mutation-type identifiers like this "AC[C>A]GT" -> "ACCGTA"
Convert DBS78-channel mutation-type identifiers like this "AC>GA" -> "ACGA"
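For the DBS78 case the conversion amounts to dropping the ">" between the reference and alternate dinucleotides. The sketch below also validates the input shape; the function name and the validation step are illustrative additions, not part of the documented behavior.

```python
import re

# Two reference bases, ">", two alternate bases.
_DBS78_PATTERN = re.compile(r"^([ACGT]{2})>([ACGT]{2})$")


def dbs78_to_compact(mut: str) -> str:
    """Convert an identifier such as "AC>GA" to "ACGA"."""
    match = _DBS78_PATTERN.match(mut)
    if match is None:
        raise ValueError(f"not a DBS78-style identifier: {mut!r}")
    return match.group(1) + match.group(2)


print(dbs78_to_compact("AC>GA"))  # ACGA
```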
Add SBS mutation class to an annotated SBS VCF
Add sequence context to an in-memory ID (insertion/deletion) VCF, and
confirm that they match the given reference genome
Add sequence context and transcript information to an in-memory DBS VCF
Add sequence context and transcript information to an in-memory SBS VCF
Add and check SBS class in an annotated VCF with the corresponding SBS
mutation matrix
Add transcript information to a data frame with mutation records
Add sequence context to a data frame with mutation records
Determine the mutation types of insertions and deletions.
CalBaseCountsFrom3MerAbundance
Calculate base counts from 3-mer abundance
CheckAndReturnSBSCatalogs
Check and return SBS catalogs
Create a run-information text file when generating a zip archive from VCF
files.
Add and check DBS class in an annotated VCF with the corresponding DBS
mutation matrix
Add DBS mutation class to an annotated DBS VCF
Given an insertion and its sequence context, categorize it.
Standard order of row names in a catalog
Calculate the number of spaces needed to add strand bias statistics to
the run-information.txt file.
Given a deletion and its sequence context, categorize it
Given a single insertion or deletion in context categorize it.
Check and return the SBS mutation matrix