QTLRel (version 0.1)

genoImpute: Impute Genotypic Data

Description

Impute missing genotypic data in advanced intercross lines (AIL).

Usage

genoImpute(gdat, gmap, prd=NULL, step=Inf, gr=2, pos=NULL,
   method=c("Haldane","Kosambi"), na.str="NA", verbose=FALSE)

Arguments

gdat
genotype data. Should be a matrix or a data frame, with each row representing an observation and each column a marker locus. The column names should be marker names. Optional if an object prd from genoPr
gmap
a genetic map. Should be a data frame (snp, chr, dist, ...), where "snp" is the SNP (marker) name, "chr" is the chromosome on which the SNP is located, and "dist" is the genetic distance in centimorgans (cM) from the leftmost SNP (marker) on the chromosome.
prd
an object from genoProb if not NULL. See "Details" for more information.
step
the maximum distance (in cM) between two adjacent loci for which the probabilities are calculated. The distance corresponds to the "cumulative" recombination rate at the gr-th generation.
gr
the generation under consideration.
pos
data frame (chr, dist, snp, ...). If given, step will be ignored.
method
whether "Haldane" or "Kosambi" mapping function should be used.
na.str
string for missing values.
verbose
a logical variable. If TRUE, certain information will be printed out during calculation.
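
The method argument selects a mapping function, which converts a map distance into a recombination fraction. The sketch below shows the textbook Haldane and Kosambi formulas in base R. The function names haldane and kosambi are hypothetical helpers written for illustration, not part of QTLRel, and the package's internal calculation (including how gr enters) may differ.

```r
# Textbook mapping functions: map distance (cM) -> recombination fraction.
# `haldane` and `kosambi` are illustrative names, not QTLRel functions.
haldane <- function(d_cM) {
  d <- d_cM / 100                 # convert centimorgans to Morgans
  0.5 * (1 - exp(-2 * d))         # assumes no crossover interference
}

kosambi <- function(d_cM) {
  d <- d_cM / 100                 # convert centimorgans to Morgans
  0.5 * tanh(2 * d)               # allows for crossover interference
}

# Compare the two functions over a range of distances
d <- c(0, 1, 10, 50)
cbind(cM = d, haldane = haldane(d), kosambi = kosambi(d))
```

Both functions return values between 0 and 0.5; they agree closely for short distances and diverge as the distance grows.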

Value

  • A matrix with the number of rows being the same as gdat and with the number of columns depending on the SNP set in both gdat and gmap and the step length.

Details

Each missing genotype is randomly assigned with a probability conditional on the genotypes of the flanking SNPs (markers). An object, prd, from genoProb can be used on its own for imputation, in which case the output (especially the putative loci) is determined by prd. Optionally, prd can be used together with gdat, so that missing values in gdat are imputed where possible, depending on whether the loci in the columns of gdat can be identified in the third dimension of prd; this does not change the original genotypic data. See Examples.
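
The random assignment described above can be sketched in base R as a draw from a per-locus genotype distribution. This is a toy illustration, not the package's implementation: the array pr is made-up data laid out as individuals x genotypes x loci (mirroring the indexing prDat$pr[11:13,,21:25] in the examples), and draw_genotypes is a hypothetical helper.

```r
genotypes <- c("AA", "AB", "BB")

# Made-up conditional probabilities for 2 individuals at 2 loci
pr <- array(0, dim = c(2, 3, 2),
            dimnames = list(c("id1", "id2"), genotypes, c("m1", "m2")))
pr["id1", , "m1"] <- c(0.90, 0.09, 0.01)
pr["id2", , "m1"] <- c(0.25, 0.50, 0.25)
pr["id1", , "m2"] <- c(0.00, 1.00, 0.00)
pr["id2", , "m2"] <- c(0.05, 0.15, 0.80)

# Hypothetical helper: for every individual (dim 1) and locus (dim 3),
# draw one genotype label using that slice's three probabilities.
draw_genotypes <- function(pr, labels = dimnames(pr)[[2]]) {
  apply(pr, c(1, 3), function(p) sample(labels, size = 1, prob = p))
}

set.seed(1)                       # make the random draw reproducible
imputed <- draw_genotypes(pr)
imputed
```

Because the assignment is random, repeated calls can give different genotypes wherever more than one genotype has non-zero probability; setting a seed makes a run reproducible.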

See Also

genoProb

Examples

data(miscEx)

sum(is.na(gdat))
gdat[11:13,21:25]

# Recode genotypes as integers (AA = 1, AB = 2, BB = 3) and missing values as 0
gdtmp<- (gdat=="AA") + (gdat=="AB")*2 + (gdat=="BB")*3
gdtmp<- replace(gdtmp,is.na(gdtmp),0)
prDat<- genoProb(gdat=gdtmp, gmap=genMap, step=Inf,
   gr=2, method="Haldane", verbose=TRUE)
prDat$pr[11:13,,21:25]

tmp<- genoImpute(prd=prDat)
dim(gdat)
sum(is.na(tmp))
tmp[11:13,21:25]

tmp<- genoImpute(gdat[1:15,],prd=prDat)
dim(gdat)
sum(is.na(tmp))
tmp[11:13,21:25]

tmp<- genoImpute(gdat, gmap=genMap, step=Inf,
   gr=2, na.str=NA)
dim(gdat)
sum(is.na(tmp))
tmp[11:13,21:25]

tmp<- genoImpute(gdat[1:15,], gmap=genMap, step=Inf,
   gr=2, na.str=NA, verbose=TRUE)
dim(gdat)
sum(is.na(tmp))
tmp[11:13,21:25]
