Reads information about variants from a VCF file and returns it as a VariantInfo object.

## S4 method for signature 'TabixFile,GRanges'
readVariantInfo(file, regions, subset,
                noIndels=TRUE, onlyPass=TRUE,
                na.limit=1, MAF.limit=1,
                na.action=c("impute.major", "omit", "fail"),
                MAF.action=c("ignore", "omit", "invert", "fail"),
                omitZeroMAF=TRUE, refAlt=FALSE, sex=NULL)
## S4 method for signature 'TabixFile,missing'
readVariantInfo(file, regions, ...)
## S4 method for signature 'character,GRanges'
readVariantInfo(file, regions, ...)
## S4 method for signature 'character,missing'
readVariantInfo(file, regions, ...)
file: a TabixFile object or a character string with a file name of the VCF file to read from; if file is a file name, the method internally creates a TabixFile object for this file name.

regions: a GRanges object that specifies which genomic regions to read from the VCF file; if missing, the entire VCF file is read.

noIndels: if TRUE (default), only single-nucleotide variants (SNVs) are considered and indels are skipped.

onlyPass: if TRUE (default), only variants are considered whose value in the FILTER column is "PASS".

MAF.action: determines how variants with an MAF greater than 0.5 are treated. Note that, if refAlt=TRUE, the MAFs of the variants that have been inverted no longer correspond to the true alternate allele.

omitZeroMAF: if TRUE (default), variants with an MAF of 0 are not considered and omitted from the output object.

refAlt: if TRUE, two metadata columns holding the reference and alternate alleles are added to the output object; the default is FALSE.

sex: if NULL, all samples are treated the same without any modifications; if sex is a factor with levels F (female) and M (male) whose length equals the length of subset or the number of samples in the VCF file, this argument is interpreted as the sex of the samples. In this case, the genotypes corresponding to male samples are doubled before computing MAFs. The option to supply the sex argument is meant to yield MAF estimates that match those computed by readGenotypeMatrix and assocTest. Note, however, that the MAFs computed in this way do not correspond to the true MAFs contained in the data.

...: for the methods other than the first one, further arguments are passed on to the method with signature TabixFile,GRanges.
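The effect of the sex argument described above can be illustrated with a small base R sketch. It does not call podkat; the toy data, the helper maf(), and the formula "sum of minor allele counts divided by twice the number of samples" are assumptions made for illustration only, as is the "true" frequency, which assumes a locus where males carry a single allele.

```r
# Toy data (hypothetical): minor allele counts for one variant in six samples
geno <- c(0, 1, 0, 1, 0, 0)
sex <- factor(c("F", "F", "F", "M", "M", "M"), levels = c("F", "M"))

# Assumed MAF formula: every sample is treated as carrying two alleles
maf <- function(g) sum(g) / (2 * length(g))

# MAF without any modification (corresponds to sex=NULL)
mafPlain <- maf(geno)

# Genotypes of male samples are doubled before the MAF is computed
genoDoubled <- ifelse(sex == "M", 2 * geno, geno)
mafDoubled <- maf(genoDoubled)

# Assumed "true" frequency at a locus where females carry two alleles
# and males carry one
nAlleles <- sum(ifelse(sex == "M", 1, 2))
mafTrue <- sum(geno) / nAlleles

c(plain = mafPlain, doubled = mafDoubled, true = mafTrue)
```

In this toy setting the three values differ, which is in line with the note that MAFs computed with doubled male genotypes are not the true MAFs contained in the data.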
Returns an object of class VariantInfo.

The readVariantInfo method considers each variant and determines its minor allele frequency (MAF) and the type of the variant. The result is returned as a VariantInfo object, i.e. a GRanges object with two metadata columns holding the MAF and the variant type. If refAlt is TRUE, two further metadata columns holding the reference and alternate alleles are included.

For all variants, filters in terms of missing values and MAFs can be applied. Moreover, variants with MAFs greater than 0.5 can be filtered out or inverted. For details, see the descriptions of the parameters na.limit, MAF.limit, na.action, and MAF.action above.
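The filtering steps described above can be sketched in base R on a toy genotype matrix. This is not podkat's implementation: the matrix, the helper applyMafAction(), the MAF formula, and the imputation rule (missing values replaced by 0 copies of the minor allele) are assumptions made for illustration; only the option names are borrowed from the usage of readVariantInfo.

```r
# Toy genotype matrix (hypothetical): rows are samples, columns are variants;
# entries count copies of the alternate allele, NA marks a missing genotype
geno <- matrix(c(0, 1, 0, NA,
                 2, 2, 1, 2,
                 0, 0, 0, 0),
               nrow = 4,
               dimnames = list(NULL, c("v1", "v2", "v3")))

# Ratio of missing values per variant, the quantity a limit such as
# na.limit would be compared against
naRatio <- colMeans(is.na(geno))

# In the spirit of na.action="impute.major": replace missing values by
# 0 copies, assuming the major allele is the reference allele here
imputed <- geno
imputed[is.na(imputed)] <- 0

# Assumed MAF formula: mean allele count divided by two
maf <- colMeans(imputed) / 2

# Hypothetical helper mimicking the choices offered by MAF.action
applyMafAction <- function(maf, action = c("ignore", "omit", "invert", "fail")) {
    action <- match.arg(action)
    high <- maf > 0.5
    switch(action,
           ignore = maf,
           omit   = maf[!high],
           invert = ifelse(high, 1 - maf, maf),
           fail   = {
               if (any(high)) stop("variant(s) with MAF greater than 0.5")
               maf
           })
}

mafInverted <- applyMafAction(maf, "invert")
mafOmitted <- applyMafAction(maf, "omit")

# In the spirit of omitZeroMAF=TRUE: drop variants with an MAF of 0
mafKept <- mafInverted[mafInverted > 0]
```

With these toy data, v2 has an MAF above 0.5 and is either inverted or dropped, while v3 has an MAF of 0 and is removed in the last step.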
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and 1000 Genome Project Data Processing Subgroup (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079.
GenotypeMatrix
library(podkat)

## path to the example VCF file shipped with the package
vcfFile <- system.file("examples/example1.vcf.gz", package="podkat")
## default parameters
vInfo <- readVariantInfo(vcfFile)
vInfo
summary(vInfo)
## including zero MAF variants and reference/alternate alleles
vInfo <- readVariantInfo(vcfFile, omitZeroMAF=FALSE, refAlt=TRUE)
vInfo
summary(vInfo)