
ICAMS (version 2.1.2)

AnnotateIDVCF: Add sequence context to an in-memory ID (insertion/deletion) VCF, and confirm that its variants match the given reference genome.

Description

Add sequence context to an in-memory ID (insertion/deletion) VCF, and confirm that its variants match the given reference genome.

Usage

AnnotateIDVCF(ID.vcf, ref.genome, flag.mismatches = 0, name.of.VCF = NULL)

Arguments

ID.vcf

An in-memory ID (insertion/deletion) VCF as a data.frame. This function expects that there is a "context base" to the left, for example REF = ACG, ALT = A (deletion of CG) or REF = A, ALT = ACC (insertion of CC).
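The "context base" convention can be checked with base R alone. The sketch below is only an illustration, not part of ICAMS: it builds a two-row data frame from the REF/ALT pairs given above (the CHROM and POS values are made up) and derives the indel type and the inserted or deleted bases.

```r
# Toy ID VCF using the two REF/ALT examples from the argument description.
# CHROM and POS are hypothetical values chosen only for illustration.
ID.vcf <- data.frame(
  CHROM = c("1", "1"),
  POS   = c(100L, 200L),
  REF   = c("ACG", "A"),
  ALT   = c("A", "ACC"),
  stringsAsFactors = FALSE
)

# With a context base to the left, REF and ALT share their first base.
same.first.base <- substr(ID.vcf$REF, 1, 1) == substr(ID.vcf$ALT, 1, 1)

# A deletion has a longer REF; an insertion has a longer ALT.
is.deletion <- nchar(ID.vcf$REF) > nchar(ID.vcf$ALT)
indel.type  <- ifelse(is.deletion, "deletion", "insertion")

# The bases removed or added are whatever follows the shared prefix.
indel.bases <- ifelse(
  is.deletion,
  substring(ID.vcf$REF, nchar(ID.vcf$ALT) + 1L),
  substring(ID.vcf$ALT, nchar(ID.vcf$REF) + 1L)
)

print(data.frame(ID.vcf, same.first.base, indel.type, indel.bases))
```

Here the first row is a deletion of CG and the second an insertion of CC, matching the description above.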

ref.genome

A ref.genome argument as described in ICAMS.

flag.mismatches

Deprecated. Rows that do not match the reference sequence are now discarded automatically; see the discarded.variants element of the return value for details.

name.of.VCF

Name of the VCF file.

Value

A list whose first element, "annotated.vcf", is the input VCF data frame with 2 new columns added:

  1. seq.context The sequence embedding the variant.

  2. seq.context.width The width of seq.context to the left.
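To make the two columns concrete, here is a toy illustration in base R. It is a sketch under stated assumptions, not the ICAMS implementation: the reference is a made-up string rather than a genome object, and the flank width (`flank`) is an arbitrary choice applied equally on both sides; how ICAMS actually picks the width is not described on this page.

```r
# Hypothetical 12-base "reference" standing in for a chromosome.
toy.ref <- "TTGACGTACGGA"

# One deletion record: REF = ACG, ALT = A, context base A at position 4.
pos <- 4L
ref <- "ACG"

# Assumed flank width; the real function chooses its own value.
flank <- 3L

# Extract the REF allele plus `flank` bases on each side.
start <- pos - flank
end   <- pos + nchar(ref) - 1L + flank
seq.context       <- substr(toy.ref, start, end)
seq.context.width <- flank

# The REF allele sits immediately after the left flank.
embedded.ref <- substr(seq.context,
                       seq.context.width + 1L,
                       seq.context.width + nchar(ref))

print(seq.context)
print(embedded.ref == ref)
```

Knowing the left width is what lets downstream code locate the variant inside seq.context, as the last step shows.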

If any rows are discarded from the input VCF data frame, the function generates a warning and the return value includes a second element, "discarded.variants". Discarded variants belong to one of the following types:

  1. Variants which have the same number of bases for REF and ALT alleles.

  2. Variants which have empty REF or ALT alleles.

  3. Complex indels.

  4. Variants with mismatches between VCF and reference sequence.
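The four categories above can be mimicked on toy data with base R. This is a rough sketch, not the ICAMS code: `toy.ref`, `ClassifyRow`, and the `discarded.reason` column are hypothetical names, the reference is a short made-up string, and "complex indel" is assumed here to mean that the shorter allele is not a prefix of the longer one.

```r
# Hypothetical 12-base "reference" standing in for a chromosome.
toy.ref <- "TTGACGTACGGA"

# Six toy records: two valid indels and one example of each discard type.
vcf <- data.frame(
  POS = c(4L, 8L, 2L, 5L, 10L, 3L),
  REF = c("ACG", "A", "TG", "", "GGA", "GTC"),
  ALT = c("A", "ACC", "CA", "C", "TA", "G"),
  stringsAsFactors = FALSE
)

# Classify one record; the order of the checks is an assumption.
ClassifyRow <- function(pos, ref, alt) {
  if (ref == "" || alt == "") return("empty REF or ALT")
  if (nchar(ref) == nchar(alt)) return("same REF and ALT length")
  shorter <- if (nchar(ref) < nchar(alt)) ref else alt
  longer  <- if (nchar(ref) < nchar(alt)) alt else ref
  # Assumed definition: not a pure insertion or deletion after a shared prefix.
  if (!startsWith(longer, shorter)) return("complex indel")
  # Compare REF with the bases at POS in the toy reference.
  if (substr(toy.ref, pos, pos + nchar(ref) - 1L) != ref) {
    return("mismatch with reference")
  }
  "keep"
}

status <- mapply(ClassifyRow, vcf$POS, vcf$REF, vcf$ALT, USE.NAMES = FALSE)

# Split into kept and discarded rows, recording why each row was dropped.
annotated <- vcf[status == "keep", ]
discarded <- vcf[status != "keep", ]
discarded$discarded.reason <- status[status != "keep"]

print(annotated)
print(discarded)
```

Keeping the discarded rows together with a reason, rather than dropping them silently, mirrors the idea behind the discarded.variants element.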

Examples

# NOT RUN {
file <- c(system.file("extdata/Strelka-ID-vcf/",
                      "Strelka.ID.GRCh37.s1.vcf",
                      package = "ICAMS"))
ID.vcf <- ReadStrelkaIDVCFs(file)[[1]]
if (requireNamespace("BSgenome.Hsapiens.1000genomes.hs37d5", quietly = TRUE)) {
  # Use a name other than "list" so that base::list is not masked
  result <- AnnotateIDVCF(ID.vcf, ref.genome = "hg19")
  annotated.ID.vcf <- result$annotated.vcf
}
# }
