SplitOneVCF: Split a VCF into SBS, DBS, and ID VCFs, plus a list of other mutations

Description

Split a VCF into SBS, DBS, and ID VCFs, plus a list of other mutations

Usage

SplitOneVCF(
  vcf.df,
  max.vaf.diff = 0.02,
  name.of.VCF = NULL,
  always.merge.SBS = FALSE,
  chr.names.to.process = NULL
)

Value

A list with three in-memory VCFs, plus the discarded variants that were not incorporated into any of the three:

* SBS: VCF with only single base substitutions.

* DBS: VCF with only doublet base substitutions.

* ID: VCF with only small insertions and deletions.

* discarded.variants: Non-NULL only if some variants were excluded from the analysis. See the added extra column discarded.reason for more details.

Arguments

vcf.df: An in-memory data.frame representing a VCF, including VAFs, which are added by ReadVCF.
max.vaf.diff: The maximum difference of VAF; the default value is 0.02. If the absolute difference of VAFs for adjacent SBSs is bigger than max.vaf.diff, then these adjacent SBSs are likely to be "merely" asynchronous single base mutations, as opposed to a simultaneous doublet mutation or variants involving more than two consecutive bases. Use a negative value (e.g. -1) to suppress merging adjacent SBSs to DBS.
name.of.VCF: Name of the VCF file.
always.merge.SBS: If TRUE, merge adjacent SBSs as DBSs regardless of their VAFs and regardless of the value of max.vaf.diff.
chr.names.to.process: A character vector specifying the chromosome names in the VCF whose variants will be kept and processed; variants on all other chromosomes will be discarded. If NULL (default), all variants will be kept except those on chromosomes with names that contain any of the strings "GL", "KI", "random", "Hs", "M", "JH", "fix", "alt".
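
To make the interaction between max.vaf.diff and always.merge.SBS concrete, the following base R sketch applies the rule described above to a toy set of SBSs. It is only an illustration, not the package's implementation: the helper IsMergeCandidate and the toy data are hypothetical, and the sketch assumes that the SBSs are on one chromosome, sorted by position, with no missing VAFs.

```r
# Hypothetical helper (not part of the package): for each pair of
# consecutive SBSs, report whether the pair would be a candidate for
# merging under the rule described for max.vaf.diff.
IsMergeCandidate <- function(pos, vaf, max.vaf.diff = 0.02,
                             always.merge.SBS = FALSE) {
  # TRUE where an SBS is immediately followed by another SBS
  adjacent <- diff(pos) == 1
  if (always.merge.SBS) {
    # VAFs and max.vaf.diff are ignored entirely
    return(adjacent)
  }
  # Adjacent SBSs whose VAFs differ by more than max.vaf.diff are
  # treated as separate single base mutations
  adjacent & (abs(diff(vaf)) <= max.vaf.diff)
}

# Toy data (assumed): two adjacent pairs, one with similar VAFs and
# one with very different VAFs
pos <- c(100, 101, 250, 251)
vaf <- c(0.30, 0.31, 0.40, 0.10)

default.rule <- IsMergeCandidate(pos, vaf)
no.merging   <- IsMergeCandidate(pos, vaf, max.vaf.diff = -1)
force.merge  <- IsMergeCandidate(pos, vaf, always.merge.SBS = TRUE)
```

With the default threshold only the first pair (VAF difference 0.01) qualifies. A negative max.vaf.diff can never be matched by an absolute difference, which is why it suppresses merging, while always.merge.SBS = TRUE accepts both adjacent pairs.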
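
The default behaviour of chr.names.to.process can be sketched in the same way. The helper KeepChromosome below is hypothetical, and it assumes that "contain" means case-sensitive substring matching and that a non-NULL chr.names.to.process is matched exactly; the package's actual matching may differ.

```r
# Hypothetical helper (not part of the package): decide which
# chromosome names would be kept under the documented rule.
KeepChromosome <- function(chrom, chr.names.to.process = NULL) {
  if (!is.null(chr.names.to.process)) {
    # Keep only the chromosomes that were explicitly requested
    return(chrom %in% chr.names.to.process)
  }
  # Default: drop names containing any of the documented strings
  excluded <- c("GL", "KI", "random", "Hs", "M", "JH", "fix", "alt")
  !grepl(paste(excluded, collapse = "|"), chrom)
}

# Toy chromosome names (assumed for illustration)
chrom <- c("1", "chrX", "chrM", "GL000191.1",
           "chr1_KI270706v1_random", "MT")

kept.by.default <- KeepChromosome(chrom)
kept.explicit   <- KeepChromosome(chrom, chr.names.to.process = "chrM")
```

Under these assumptions the default keeps only "1" and "chrX". Because the string "M" is among the exclusions, mitochondrial variants ("chrM", "MT") are dropped unless the chromosome is named explicitly in chr.names.to.process.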