ICAMS (version 2.0.7)

SplitListOfStrelkaSBSVCFs: Split a list of in-memory Strelka SBS VCFs into SBS, DBS, and variants involving > 2 consecutive bases

Description

SBSs are single base substitutions, e.g. C>T, A>G, ... DBSs are double base substitutions, e.g. CC>TT, AT>GG, ... Variants involving > 2 consecutive bases are rare, so this function only records them and does not analyze them further. These would be variants such as ATG>CCT, AGAT>TCTA, ...
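
The three categories can be illustrated with a minimal sketch in base R. It is not the ICAMS implementation. It assumes that each row of the VCF is a single-base call with CHROM and POS columns, and that calls at adjacent positions on the same chromosome belong to one multi-base variant. The helper name ClassifyByRunLength and the class column are hypothetical and exist only for this illustration.

```r
# Hypothetical helper, not part of ICAMS: label each row of one VCF data
# frame by the length of the run of adjacent positions it belongs to.
ClassifyByRunLength <- function(vcf) {
  vcf <- vcf[order(vcf$CHROM, vcf$POS), ]
  n <- nrow(vcf)
  if (n == 0) {
    vcf$class <- character(0)
    return(vcf)
  }
  # A new run starts wherever the chromosome changes or the position
  # is not exactly one past the position in the previous row.
  new.run <- c(TRUE,
               vcf$CHROM[-1] != vcf$CHROM[-n] |
                 vcf$POS[-1] != vcf$POS[-n] + 1)
  run.id  <- cumsum(new.run)
  run.len <- tabulate(run.id)[run.id]  # Length of the run each row is in
  vcf$class <- ifelse(run.len == 1, "SBS",
                      ifelse(run.len == 2, "DBS", "ThreePlus"))
  vcf
}

# Made-up example data: one isolated call, one adjacent pair (CC>TT),
# one run of three (ATG>CCT), and another isolated call.
vcf <- data.frame(
  CHROM = c("1", "1", "1", "2", "2", "2", "2"),
  POS   = c(100, 200, 201, 50, 51, 52, 300),
  REF   = c("C", "C", "C", "A", "T", "G", "A"),
  ALT   = c("T", "T", "T", "C", "C", "T", "G"),
  stringsAsFactors = FALSE
)

classified <- ClassifyByRunLength(vcf)
split.vcfs <- split(classified, classified$class)
```

Here split.vcfs is a plain list with the elements SBS, DBS, and ThreePlus. The real function may apply further checks that this sketch omits.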

Usage

SplitListOfStrelkaSBSVCFs(list.of.vcfs)

Arguments

list.of.vcfs

A list of in-memory data frames containing Strelka SBS VCF file contents.

Value

A list of 3 in-memory objects with the elements:

SBS.vcfs: List of data frames of pure SBS mutations (no DBS or 3+BS mutations).

DBS.vcfs: List of data frames of pure DBS mutations (no SBS or 3+BS mutations).

ThreePlus: List of data tables with the key CHROM, LOW.POS, HIGH.POS and additional information (reference sequence, alternative sequence, context, etc.). The additional information is not fully implemented at this point because of limited immediate biological interest.