Add sequence context and transcript information to an in-memory DBS VCF
Add sequence context to a data frame with mutation records
Add sequence context to an in-memory ID (insertion/deletion) VCF, and
confirm that they match the given reference genome
Add sequence context and transcript information to an in-memory SBS VCF
Add SBS mutation class to an annotated SBS VCF
Add and check SBS class in an annotated VCF with the corresponding SBS
mutation matrix
Add transcript information to a data frame with mutation records
Add DBS mutation class to an annotated DBS VCF
Add and check DBS class in an annotated VCF with the corresponding DBS
mutation matrix
Create a run information text file when generating a zip archive from VCF
files.
Standard order of row names in a catalog
Check and, if possible, correct the chromosome names in a VCF data.frame.
Given an insertion and its sequence context, categorize it.
Check and return the ID mutation matrix
Return the length of microhomology at a deletion
CheckAndReturnSBSCatalogs
Check and return SBS catalogs
Create trinucleotide abundance
Create the matrix for a DBS catalog for *one* sample from an in-memory VCF.
GetMutationLoadsFromStrelkaIDVCFs
Get mutation loads information from Strelka ID VCF files.
Given a deletion and its sequence context, categorize it
Create one column of the matrix for an indel catalog from *one* in-memory VCF.
Check and return the DBS mutation matrix
Given a single insertion or deletion in context, categorize it.
Determine the mutation types of insertions and deletions.
Check DBS mutation class in VCF with the corresponding DBS mutation matrix
Check and return ID catalog
CheckAndReturnDBSCatalogs
Check and return DBS catalogs
CreateStrandedDinucAbundance
Create stranded dinucleotide abundance
Check and return the SBS mutation matrix
Create the matrix for an SBS catalog for *one* sample from an in-memory VCF.
Check whether the rownames of object are correct; if so, put the rows in the
correct order.
"Collapse" a catalog
Example gene expression data from two cell lines
Create position probability matrix (PPM) for *one* sample from
a Variant Call Format (VCF) file.
Return the number of repeat units in which a deletion is embedded
Return the number of repeat units in which an insertion
is embedded.
CreateStrandedTrinucAbundance
Create stranded trinucleotide abundance
Test if object is BSgenome.Mmusculus.UCSC.mm10.
Test if object is BSgenome.Hsapiens.UCSC.hg38.
ConvertICAMSCatalogToSigProSBS96
Convert an ICAMS SBS96 Catalog to SigProfiler format
Generate an empty matrix of k-mer abundance
GetMutationLoadsFromStrelkaSBSVCFs
Get mutation loads information from Strelka SBS VCF files.
Generate k-mer abundance from a given genome
Create pentanucleotide abundance
Create position probability matrices (PPM) from a list of SBS vcfs
ReadAndSplitStrelkaSBSVCFs
Read and split Strelka SBS VCF files
Extract the VAFs (variant allele frequencies) and read depth information from
a VCF file
GetMutationLoadsFromMutectVCFs
Get mutation loads information from Mutect VCF files.
Read in the data lines of a Variant Call Format (VCF) file
ICAMS: In-depth Characterization and Analysis of Mutational Signatures
Plot position probability matrices (PPM) to a PDF file
CalBaseCountsFrom3MerAbundance
Calculate base counts from 3-mer abundance
MakeVCFDBSdf
Take DBS ranges and the original VCF and generate a VCF with
dinucleotide REF and ALT alleles.
Read Strelka ID (small insertion and deletion) VCF files
Read in the data lines of an SBS VCF created by Strelka version 1
Reverse complement strings that represent stranded SBSs
Create tetranucleotide abundance
Read chromosome and position information from a bed format file.
Reverse complement strings that represent stranded DBSs
Infer the correct rownames for a matrix based on its number of rows
Create a transcript range file from the raw GFF3 File
Plot transcription strand bias with respect to gene expression values
Infer abundance given a matrix-like object and additional information.
Create a zip file which contains catalogs and plot PDFs from Mutect VCF files
density -> <anything>
density.signature -> density.signature, counts.signature
Split an in-memory Strelka VCF into SBS, DBS, and variants involving
> 2 consecutive bases
Source catalog type is counts or counts.signature
Split a mutect2 VCF into SBS, DBS, and ID VCFs, plus a list of other mutations
InferClassOfCatalogForRead
Infer the class of catalog in a file.
Take strings representing a genome and return the BSgenome object.
Calculate the number of spaces needed to add strand bias statistics to
the run-information.txt file.
Test if object is BSgenome.Hsapiens.1000genome.hs37d5.
PlotTransBiasGeneExpToPdf
Plot transcription strand bias with respect to gene expression values to a
PDF file
Read and split Mutect VCF files
Plot96PartOfCompositeToPDF
Plot the SBS96 part of a SignatureAnalyzer COMPOSITE signature or catalog
Read in the data lines of a Variant Call Format (VCF) file created by Mutect
Plot one spectrum or signature
Read Mutect VCF files.
Convert 96-channel mutation-type identifiers like this "ACTA" -> "A[C>A]T"
Stop if catalog.type is illegal.
Is there any column in df1 with name "VAF"?
If there is, change its name to "VAF_old" so that it will
not conflict with code in other parts of the ICAMS package.
Read catalog
Stop if the number of rows in object is illegal
Plot a SignatureAnalyzer COMPOSITE signature or catalog into separate PDFs
TestMakeCatalogFromStrelkaSBSVCFs
Make catalogs from the sample Strelka SBS VCF files
and compare them with the expected catalog information.
Write a catalog to a file.
Read a 192-channel spectra (or signature) catalog in Duke-NUS format
CreateExomeStrandedRanges
Create exome transcriptionally stranded regions
Create dinucleotide abundance
Check that the sequence context information is consistent with the value of
the column REF.
Read a 96-channel spectra (or signature) catalog where rownames are e.g. "A[C>A]T"
Check SBS mutation class in VCF with the corresponding SBS mutation matrix
Read in the data lines of an ID VCF created by Strelka version 1
Read in the data lines of a Variant Call Format (VCF) file
Read VCF files
Convert 96-channel mutations-type identifiers like this "A[C>A]T" -> "ACTA"
Split each Mutect VCF into SBS, DBS, and ID VCFs (plus two
VCF-like data frames with left-over rows).
Generate all possible k-mers of length k.
Write a catalog
StrelkaIDVCFFilesToCatalogAndPlotToPdf
Create ID (small insertion and deletion) catalog from Strelka ID VCF files
and plot them to PDF
Create DBS catalogs from VCFs
K-mer abundances
Write Indel Catalogs in SigProExtractor format
StrelkaIDVCFFilesToCatalog
Create ID (small insertion and deletion) catalog from Strelka ID VCF files
Create ID (small insertion and deletion) catalog from ID VCFs
TestMakeCatalogFromStrelkaIDVCFs
Make catalogs from the sample Strelka ID VCF files
and compare them with the expected catalog information.
TestMakeCatalogFromMutectVCFs
Make catalogs from the sample Mutect VCF file
and compare them with the expected catalog information.
Generate custom k-mer abundance from a given reference genome
Generate k-mer abundance from given nucleotide sequences
Create SBS catalogs from SBS VCFs
Standardize the chromosome name annotations for a data frame.
StrelkaSBSVCFFilesToCatalogAndPlotToPdf
Create SBS and DBS catalogs from Strelka SBS VCF files and plot them to PDF
StrelkaSBSVCFFilesToZipFile
Create a zip file which contains catalogs and plot PDFs from Strelka SBS VCF files
Transform between counts and density spectrum catalogs
and counts and density signature catalogs
Generate stranded k-mer abundance from a given genome and gene annotation file
Transcript ranges data
SplitListOfStrelkaSBSVCFs
Split a list of in-memory Strelka SBS VCF into SBS, DBS, and variants involving
> 2 consecutive bases
Create SBS, DBS and Indel catalogs from Mutect VCF files
MutectVCFFilesToCatalogAndPlotToPdf
Create SBS, DBS and Indel catalogs from Mutect VCF files and plot them to PDF
Standardize the chromosome name annotations for a data frame.
Plot catalog to a PDF file
Plot position probability matrix (PPM) for *one* sample from a Variant Call Format
(VCF) file.
Read transcript ranges and strand information from a gff3 format file.
Use this one for the new, cut down gff3 file (2018 11 24)
Read Strelka SBS (single base substitutions) VCF files.
RenameColumnsWithNameStrand
Is there any column in df with name "strand"?
If there is, change its name to "strand_old" so that it will
not conflict with code in other parts of the ICAMS package.
Remove ranges that fall on both strands
StrelkaIDVCFFilesToZipFile
Create a zip file which contains ID (small insertion and deletion) catalog
and plot PDF from Strelka ID VCF files
StrelkaSBSVCFFilesToCatalog
Create SBS and DBS catalogs from Strelka SBS VCF files
Stop if region is illegal.
StopIfTranscribedRegionIllegal
Stop if region is illegal for an in-transcript catalog
Create a catalog from a matrix, data.frame, or vector
Reverse complement every string in string.vec
TransRownames.ID.SigPro.PCAWG
For indels, convert SigProfiler rownames into ICAMS/PCAWG7 rownames
TransRownames.ID.PCAWG.SigPro
For indels, convert ICAMS/PCAWG7 rownames into SigProfiler rownames
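
Several entries above describe small string operations: converting 96-channel
mutation-type identifiers between "ACTA" and "A[C>A]T", reverse complementing
every string in a vector, and generating all possible k-mers of length k. The
Python sketch below illustrates those operations only. It is not the ICAMS
implementation (ICAMS is an R package), and every function name in it is
hypothetical, chosen for this example.

```python
from itertools import product

# Complement table for the four DNA bases (assumption: upper-case ACGT only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strings):
    """Reverse complement every string in a list of DNA strings."""
    return [s.translate(_COMPLEMENT)[::-1] for s in strings]


def compact_to_bracket(mut):
    """Convert a 4-character identifier such as "ACTA" to "A[C>A]T".

    The first three characters are the trinucleotide context and the
    fourth is the alternate base, as in the "ACTA" -> "A[C>A]T" entry.
    """
    if len(mut) != 4:
        raise ValueError(f"expected 4 characters, got {mut!r}")
    five_prime, ref, three_prime, alt = mut
    return f"{five_prime}[{ref}>{alt}]{three_prime}"


def bracket_to_compact(mut):
    """Convert an identifier such as "A[C>A]T" back to "ACTA"."""
    if len(mut) != 7 or mut[1] != "[" or mut[3] != ">" or mut[5] != "]":
        raise ValueError(f"not in X[Y>Z]W form: {mut!r}")
    # Layout: 0 = 5' base, 2 = REF, 4 = ALT, 6 = 3' base.
    return mut[0] + mut[2] + mut[6] + mut[4]


def all_kmers(k):
    """Generate all 4**k possible k-mers over A, C, G, T in sorted order."""
    return ["".join(bases) for bases in product("ACGT", repeat=k)]
```

For example, `compact_to_bracket("ACTA")` gives `"A[C>A]T"`, and
`bracket_to_compact` reverses that conversion. The sketch assumes upper-case
input without ambiguity codes such as N.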