ensemblVEP (version 1.12.0)

parseCSQToGRanges: Parse the CSQ column of a VCF object into a GRanges object

Description

Parse the CSQ column in a VCF object returned from the Ensembl Variant Effect Predictor (VEP).

Usage

parseCSQToGRanges(x, VCFRowID=character(), ..., info.key = "CSQ")

Arguments

x
The character name of a VCF file on disk, or a VCF object.
VCFRowID
A character vector of rownames from the original VCF. When provided, the result includes a metadata column named ‘VCFRowID’ which maps the result back to the row (variant) in the original VCF.

When VCFRowID is not provided, no ‘VCFRowID’ column is included.

info.key
The name of the INFO key that VEP writes the consequences to in the output (default is CSQ). This should only be changed if something other than CSQ was passed to the --vcf_info_field flag in the VEP output options.
...
Arguments passed to other methods. Currently not used.

Value

Returns a GRanges object with consequence data as the metadata columns.

Details

When ensemblVEP returns a VCF object, the consequence data are returned unparsed in the 'CSQ' INFO column. parseCSQToGRanges parses these data into a GRanges object that is expanded to match the dimension of the 'CSQ' data. Because each variant can have multiple consequence entries, its range is repeated in the GRanges once per entry.

If rownames from the original VCF are provided as VCFRowID, a metadata column is included in the result that maps each range back to the row (variant) in the original VCF.
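
The expansion described above can be illustrated with a small base R sketch. This is not the package's implementation, and it returns a data.frame rather than a GRanges. It assumes the usual VEP layout in which a variant's consequence entries are comma-separated and the fields within an entry are pipe-separated. The toy strings, the variant names and the three field names are made up for illustration; in real VEP output the field names are listed in the VCF header.

```r
# Toy CSQ strings, one per variant. These are illustrative, NOT real VEP output.
csq <- c(
  rs1 = "A|missense_variant|GENE1,A|upstream_gene_variant|GENE2",
  rs2 = "T|intron_variant|GENE3"
)
# Hypothetical field names; real ones come from the VCF header.
fields <- c("Allele", "Consequence", "SYMBOL")

# Split each variant's string into its individual consequence entries.
entries <- strsplit(csq, ",", fixed = TRUE)
n_per_variant <- lengths(entries)

# Split every entry into its fields and stack them, one row per entry.
# (strsplit() drops a trailing empty field; the toy data has none.)
parts <- strsplit(unlist(entries, use.names = FALSE), "|", fixed = TRUE)
parsed <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(parsed) <- fields

# Repeat each variant's name once per entry, mirroring the 'VCFRowID' column.
parsed$VCFRowID <- rep(names(csq), times = n_per_variant)
print(parsed)
```

Here the first variant contributes two rows and the second contributes one, which is the same repetition that parseCSQToGRanges applies to the ranges.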

References

Ensembl VEP Home: http://uswest.ensembl.org/info/docs/tools/vep/index.html

See Also

ensemblVEP, VEPParam-class

Examples

  library(ensemblVEP)
  library(VariantAnnotation)

  ## Run VEP on an example VCF file and return the results as a VCF object.
  file <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
  vep <- ensemblVEP(file, param=VEPParam(dataformat=c(vcf=TRUE)))

  ## The returned 'CSQ' data are unparsed.
  info(vep)$CSQ

  ## Parse into a GRanges and include the 'VCFRowID' column.
  vcf <- readVcf(file, "hg19")
  csq <- parseCSQToGRanges(vep, VCFRowID=rownames(vcf))
  csq[1:4]