Usage
vcf2geno(vcf, ped, none = "0/0", one = c("0/1"), both = "1/1", na.string = ".",
    use.rownames = FALSE, allowDifference = FALSE, removeMonomorphic = TRUE,
    removeNonBiallelic = TRUE, changeMinor = FALSE)
Arguments
vcf
a matrix resulting from reading a VCF file into R, or an object of class CollapsedVCF (i.e. the output of, e.g., the function readVcf from the VariantAnnotation package). If use.rownames = FALSE, the column names of the genotype matrix must correspond to the personal IDs in ped (i.e. either the column pid of ped, if the entries in pid are unique, or otherwise a combination of the columns famid and pid from ped, joined by an underscore). If use.rownames = TRUE, the column names of the genotype matrix specified by vcf must correspond to the row names of ped.
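The naming rule for use.rownames = FALSE can be sketched in base R. This is only an illustration of the rule described above, not the internal code of vcf2geno; the pedigree data and the variable expected_ids are made up for the example.

```r
# Hypothetical pedigree; the column names follow the description of ped.
ped <- data.frame(
  famid = c("F1", "F1", "F1", "F2", "F2", "F2"),
  pid   = c("1", "2", "3", "1", "2", "3"),
  fatid = c("0", "0", "1", "0", "0", "1"),
  motid = c("0", "0", "2", "0", "0", "2"),
  stringsAsFactors = FALSE
)

# Use pid alone when its entries are unique; otherwise join famid and pid
# with an underscore. Here pid repeats across families, so the combined
# form is used.
expected_ids <- if (anyDuplicated(ped$pid) == 0) {
  ped$pid
} else {
  paste(ped$famid, ped$pid, sep = "_")
}
expected_ids
```

Comparing expected_ids with the column names of the genotype matrix before calling vcf2geno is a cheap way to spot a naming mismatch early.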
ped
a data frame containing the family information for the subjects in vcf (might also contain information for other subjects, see allowDifference). This data frame must contain the columns famid, pid, fatid, and motid comprising the family ID, the personal ID, and the IDs of the father and the mother, respectively.
none
a character string or vector specifying the coding for the homozygous reference genotype.
one
a character string or vector specifying the coding for the heterozygous genotype.
both
a character string or vector specifying the coding for the homozygous variant genotype.
na.string
a character string or vector specifying how missing values are coded in the vcf file.
use.rownames
a logical value specifying whether the row names of ped correspond to the sample names in vcf. For details, see vcf.
allowDifference
a logical value specifying whether ped and vcf are allowed to also contain samples not available in the respective other object. If FALSE, all samples in ped must also be available in vcf, and vice versa (matched as described in vcf). If TRUE, at least 10% of the samples must be contained in both vcf and ped.
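Before choosing a value for allowDifference, it can help to look at how the two sample sets overlap. The following base R sketch uses made-up ID vectors (vcf_ids, ped_ids) and does not reproduce the check performed inside vcf2geno; in particular, how the 10% threshold is computed is not specified here, so only the raw overlap is shown.

```r
# Hypothetical sample IDs, already in the matching format described for vcf.
vcf_ids <- c("F1_1", "F1_2", "F1_3", "F3_1")
ped_ids <- c("F1_1", "F1_2", "F1_3", "F2_1")

shared   <- intersect(vcf_ids, ped_ids)  # samples present in both objects
only_vcf <- setdiff(vcf_ids, ped_ids)    # samples missing from ped
only_ped <- setdiff(ped_ids, vcf_ids)    # samples missing from vcf

# With allowDifference = FALSE both difference sets would have to be empty.
same_samples <- length(only_vcf) == 0 && length(only_ped) == 0
same_samples
```

In this example same_samples is FALSE, so the call would need allowDifference = TRUE or the two inputs would have to be reduced to the shared samples first.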
removeMonomorphic
a logical value specifying whether monomorphic SNVs should be removed from the output.
removeNonBiallelic
a logical value specifying whether SNVs showing genotypes other than the ones specified by none, one, and both (which are, therefore, assumed to show more than two alleles) should be removed.
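The two filters controlled by removeMonomorphic and removeNonBiallelic can be illustrated on a small character matrix. This is a sketch under stated assumptions, not the package's implementation: the matrix gt is made up, variants are assumed to be in rows and samples in columns, and "monomorphic" is taken to mean that at most one distinct non-missing genotype occurs.

```r
# Hypothetical genotype matrix: rows are SNVs, columns are samples.
gt <- matrix(
  c("0/0", "0/1", "1/1", "0/0",   # snv1: biallelic and polymorphic
    "0/0", "0/0", "0/0", "0/0",   # snv2: monomorphic
    "0/1", "1/2", "0/0", "."),    # snv3: shows a third allele
  nrow = 3, byrow = TRUE,
  dimnames = list(c("snv1", "snv2", "snv3"), c("s1", "s2", "s3", "s4"))
)

# Codes given by none, one, both, and na.string (the defaults from Usage).
known <- c("0/0", "0/1", "1/1", ".")

# An SNV is flagged as non-biallelic if any genotype falls outside the known codes.
non_biallelic <- apply(gt, 1, function(g) any(!g %in% known))

# An SNV is flagged as monomorphic if it has at most one distinct non-missing genotype.
monomorphic <- apply(gt, 1, function(g) length(unique(g[g != "."])) <= 1)

keep <- !non_biallelic & !monomorphic
rownames(gt)[keep]
```

Only snv1 passes both filters in this toy example.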
changeMinor
a logical value specifying whether the coding of the genotypes should be changed for SNVs for which the default coding leads to a minor allele frequency larger than 0.5. The genotypes are coded by the number of minor alleles, i.e. the genotype(s) specified by none are coded as 0, the genotype(s) specified by one are coded as 1, and the genotype(s) specified by both are coded as 2. If for an SNV this leads to a minor allele frequency larger than 0.5 and changeMinor = TRUE, this 0, 1, 2-coding will be changed into a 2, 1, 0-coding.
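The recoding described for changeMinor can be worked through for a single SNV in base R. This is a minimal sketch of the arithmetic only; the genotype vector gt and the lookup table code are made up, and the frequency is assumed to be computed from non-missing genotypes.

```r
# Hypothetical genotypes of five samples at one SNV ("." is missing).
gt <- c("1/1", "1/1", "0/1", "0/0", ".")

# Default coding: none -> 0, one -> 1, both -> 2.
code <- c("0/0" = 0, "0/1" = 1, "1/1" = 2)
geno <- unname(code[gt])   # "." has no entry in code and becomes NA

# Frequency of the allele counted by the default coding:
# each sample carries two alleles, so divide the mean count by 2.
freq <- mean(geno, na.rm = TRUE) / 2   # (2 + 2 + 1 + 0) / 4 / 2 = 0.625

# With changeMinor = TRUE the 0, 1, 2-coding is reversed when freq exceeds 0.5.
if (freq > 0.5) {
  geno <- 2 - geno
}
geno
```

After the flip the vector counts copies of the rarer allele, and the missing genotype stays NA.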