The VCF (Variant Call Format) file format stores variant data and its metadata. Depending on the analysis program used (e.g. GATK, freebayes), details within the VCF file can differ slightly. For example, some variant analysis programs do not report the type of mutation. The "read_vcf" function ignores the leading header/metadata lines and converts the data directly into a tidy dataframe. The function extracts the type of mutation; if it is absent, the type is derived from the "ref" and "alt" columns.
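As an illustration of how a mutation type could be derived from the "ref" and "alt" columns, here is a minimal sketch in base R. The helper name derive_type and the labels "SNP", "MNP", "INS" and "DEL" are assumptions made for this example; the actual rules and labels used by read_vcf may differ, and multi-allelic "alt" values (e.g. "G,T") are not handled.

```r
# Hypothetical helper, not part of the package: classify a variant by
# comparing the lengths of its reference and alternative alleles.
derive_type <- function(ref, alt) {
  ref_len <- nchar(ref)
  alt_len <- nchar(alt)
  ifelse(ref_len == 1 & alt_len == 1, "SNP",
    ifelse(ref_len == alt_len, "MNP",
      ifelse(ref_len < alt_len, "INS", "DEL")))
}

# Toy data invented for this example.
variants <- data.frame(
  ref = c("A", "AT", "G", "CTT"),
  alt = c("G", "GC", "GAA", "C"),
  stringsAsFactors = FALSE
)
variants$type <- derive_type(variants$ref, variants$alt)
```

Comparing allele lengths is only one possible convention; it is shown here because it needs nothing beyond the two columns named in the description.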
read_vcf(
  file,
  parse_info = FALSE,
  col_names = def_names("vcf"),
  col_types = def_types("vcf")
)
Returns a dataframe.
file: Either a path to a file, a connection, or literal data (either a
single string or a raw vector). file can also be a character vector
containing multiple filepaths or a list containing multiple connections.
Files ending in .gz, .bz2, .xz, or .zip will be automatically
decompressed. Files starting with http://, https://, ftp://, or
ftps:// will be automatically downloaded. Remote compressed files
(.gz, .bz2, .xz, .zip) will be automatically downloaded and
decompressed.
Literal data is most useful for examples and tests. To be recognised as
literal data, wrap the input with I().
parse_info: If set to TRUE, read_vcf splits the metadata stored in the "info" column and stores it in separate columns. Defaults to FALSE.
col_names: Column names to use. Defaults to def_names("vcf") (see def_names).
col_types: Column types to use. Defaults to def_types("vcf") (see def_types).
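To make the effect of splitting the "info" column more concrete, the following base R sketch turns VCF-style INFO strings (semicolon-separated key=value pairs, with value-less flags) into one column per key. It is an assumption-laden illustration: the helper split_info, the sample strings, and the choice to record flags as "TRUE" are invented here and are not taken from the implementation of read_vcf.

```r
# Hypothetical helper, not the package's implementation: turn one INFO
# string such as "DP=14;AF=0.5" into a named character vector.
split_info <- function(info) {
  fields <- strsplit(info, ";", fixed = TRUE)[[1]]
  keys <- sub("=.*$", "", fields)
  # Flags without "=" (e.g. "DB") carry no value; record them as "TRUE".
  values <- ifelse(grepl("=", fields, fixed = TRUE),
                   sub("^[^=]*=", "", fields), "TRUE")
  setNames(values, keys)
}

# Toy INFO strings invented for this example.
info <- c("DP=14;AF=0.5;DB", "DP=11;AF=0.017")
parsed <- lapply(info, split_info)
keys <- unique(unlist(lapply(parsed, names)))

# One column per key; keys missing from a record become NA.
wide <- as.data.frame(
  do.call(rbind, lapply(parsed, function(p) unname(p[keys]))),
  stringsAsFactors = FALSE
)
names(wide) <- keys
```

All values stay character strings in this sketch; converting them to numeric or logical types would be a separate step. With the function itself, the corresponding call would follow the usage shown above, i.e. read_vcf(file, parse_info = TRUE).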