beadarrayMSV (version 1.1.0)

locateParalogues: Match paralogs with chromosomes

Description

Matches patterns of parentally inherited alleles among half-siblings between MSV-5 paralogs and genetic map SNPs. The matches for each MSV-5 marker are summed to enable mapping of the paralogs to individual chromosomes.

Usage

locateParalogues(BSSnp, paraCalls, inheritP, offspringLim = 7,
    ratioLim = 0.9)

plotCountsChrom(chromHits, markers = 1:16, ...)

Arguments

BSSnp
"AlleleSetIllumina" (or "MultiSet") object containing only SNP markers, with an assayData entry call (see
paraCalls
List with two matrix elements father and mother containing paralogue calls representing paternal and maternal inherited alleles in offspring, respectively (see unm
inheritP
List with two matrix elements father and mother containing genetic map marker calls representing paternal and maternal inherited alleles in offspring, respectively (see
offspringLim
In order for a match between a paralogue and a chromosome to be detected, the number of (informative) half-siblings must equal or exceed this numeric value (see setMergeOptions)
ratioLim
The patterns of paternal and maternal inherited alleles among half-sib family offspring are compared between MSV-5 paralogs and genetic map SNPs. The ratio of matching allele patterns between the two must equal or exceed this numeric value in
chromHits
A numeric array of size (markers x chromosomes x 2) with the average number of matches per chromosome for mothers and fathers separately. Part of the output from locateParalogues.
markers
Index to subset of MSV-5 markers to plot
...
Additional arguments to axis, to be used on the x-axes
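
As a rough illustration of how offspringLim and ratioLim act together, the following toy sketch compares one paralogue pattern with one mapped marker pattern. It is not the package's implementation: the function countMatches, the "A"/"B" coding and the use of NA for uninformative offspring are assumptions made for this example only.

```r
# Toy sketch (assumed coding, not the package internals): inherited alleles
# are "A"/"B", and NA marks an offspring that is uninformative.
countMatches <- function(para, marker, offspringLim = 7, ratioLim = 0.9) {
  informative <- !is.na(para) & !is.na(marker)
  n <- sum(informative)
  same <- sum(para[informative] == marker[informative])
  # Allele phase is unknown, so keep the better of matches and mis-matches
  best <- max(same, n - same)
  # Short-circuit so that n == 0 never reaches the division
  hit <- n >= offspringLim && best / n >= ratioLim
  list(nInformative = n, best = best, hit = hit)
}

# Hypothetical patterns: every informative offspring carries the opposite
# allele, which still counts as a consistent (phase-flipped) association
para   <- c("A", "B", "A", "A", "B", NA,  "B", "A", "B")
marker <- c("B", "A", "B", "B", "A", "A", "A", "B", "A")
res <- countMatches(para, marker)

# With only five offspring the same pattern fails the offspringLim filter
resShort <- countMatches(para[1:5], marker[1:5])
```

Here res reports eight informative offspring that are all consistent, so the pair passes both thresholds, whereas resShort is rejected because it has fewer than offspringLim informative offspring.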

Value

The function locateParalogues returns a list with elements

  • cPerMarker: A numeric array of size (markers x chromosomes x 2) with the average number of matches per chromosome for mothers and fathers separately
  • nCountsTot: Matrix of size (markers x 2) with the total sum of matches per marker for mothers and fathers

plotCountsChrom is used for its side effects.

Details

The individual paralogs in paraCalls are associated with the genetic map markers in inheritP. A matching offspring is registered each time an informative allele in the paralogue corresponds with an informative allele in the mapped marker, and the degree of association between the two is determined by counting the number of matches. It is not known whether an A-allele in the paralogue matches an A- or a B-allele in the tested marker, so the combination that produces the highest number of matches is assumed. This means that, for two unlinked loci, any pattern of random mis-matches is just as probable as the same number of matches. The chance of falsely declaring linkage between two loci decreases, however, as the number and ratio of matches increase. Associations supported by too few informative meioses are therefore filtered away.

There is a 50% chance of inheriting either allele (A or B) at any segregating locus, which means that a single match is produced by chance 50% of the time. This gives, for instance, a 6/2^5 (19%) probability that the alleles of two unlinked loci will match for at least four out of five offspring. Also, as we cannot tell mis-matches from matches, the probability of a false detection is doubled. As such a filter would yield far too many false positives, we need to reduce the probability of random associations further. The default filter counts only markers with at least offspringLim=7 informative meioses and at least ratioLim=90% matches (or mis-matches) to the paralogue. With ten informative meioses, for example, this threshold implies that a random false positive match will occur in 2*11/2^10 (2.1%) of the tests.

The total number of matches across markers within each chromosome is divided by the number of tested markers, such that the chromosomes with the highest average number of matches can be found.
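
The chance probabilities quoted above can be reproduced with the binomial distribution in base R. This is only a check of the arithmetic, assuming independent meioses with a 0.5 match probability; chanceMatch is a hypothetical helper and not part of beadarrayMSV.

```r
# Probability of at least k matches among n informative offspring when each
# match occurs by chance with probability 0.5 (unlinked loci)
chanceMatch <- function(n, k) {
  pbinom(k - 1, size = n, prob = 0.5, lower.tail = FALSE)
}

# At least four out of five offspring match: 6/2^5
p4of5 <- chanceMatch(5, 4)

# At least nine out of ten match, doubled because matches cannot be told
# apart from mis-matches: 2*11/2^10
p9of10 <- 2 * chanceMatch(10, 9)

# With exactly seven informative offspring, a 90% ratio requires all seven
# to agree: 2*1/2^7
p7of7 <- 2 * chanceMatch(7, 7)

round(c(p4of5 = p4of5, p9of10 = p9of10, p7of7 = p7of7), 4)
```

The doubling is exact in the last two cases because "nearly all match" and "nearly all mis-match" cannot both occur for the same set of offspring.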

The plots produced by plotCountsChrom visualize the average scores produced by locateParalogues. A red (fathers) and black (mothers) line is plotted for each MSV-5 marker, with one or two peaks indicating the chromosome(s) the paralogs map to.
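
The peaks can also be read off numerically. The sketch below builds a small made-up array with the same (markers x chromosomes x 2) layout as cPerMarker and reports the chromosome with the highest average score for each marker and parent. The marker names, chromosome names, parent labels and scores are all invented for illustration; the dimnames of real output may differ.

```r
# Toy array shaped like cPerMarker; all names and values are hypothetical
cPerMarker <- array(0, dim = c(2, 4, 2),
                    dimnames = list(c("msv1", "msv2"),
                                    paste0("chr", 1:4),
                                    c("father", "mother")))
cPerMarker["msv1", "chr2", "father"] <- 0.6
cPerMarker["msv1", "chr4", "mother"] <- 0.5
cPerMarker["msv2", "chr3", ] <- c(0.4, 0.7)

# For each marker (dim 1) and parent (dim 3), name the chromosome whose
# average number of matches is highest
peakChrom <- apply(cPerMarker, c(1, 3), function(x) names(which.max(x)))
peakChrom
```

Note that which.max returns only the first maximum, so a marker whose two paralogs map to different chromosomes within the same parent would need the two largest values inspected instead.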

See Also

plotCountsChrom, setMergeOptions, unmixParalogues, resolveInheritanceSNP, MultiSet, AlleleSetIllumina, assignParalogues

Examples

#Read markers into an AlleleSetIllumina object
rPath <- system.file("extdata", package="beadarrayMSV")
normOpts <- setNormOptions()
dataFiles <- makeFilenames('testdata',normOpts,rPath)
beadFile <- paste(rPath,'beadData_testdata.txt',sep='/')
beadInfo <- read.table(beadFile,sep='\t',header=TRUE,as.is=TRUE)
BSRed <- createAlleleSetFromFiles(dataFiles[1:4],beadInfo=beadInfo)

#Genotype calling and splitting of MSV-5 paralogs
BSRed <- callGenotypes(BSRed)
BSRed <- validateCallsPedigree(BSRed)
iMSV5 <- fData(BSRed)$Classification %in% 'MSV-5' &
    fData(BSRed)$Ped.Errors %in% 0
paraCalls <- unmixParalogues(BSRed[iMSV5,])

#Genetic map SNPs and inherited parental alleles
iSNP <- fData(BSRed)$Classification %in% 'SNP' &
    !is.na(fData(BSRed)$Chromosome)
inheritP <- resolveInheritanceSNP(BSRed[iSNP,])

#Match paralogs with map
chromHits <- locateParalogues(BSRed[iSNP,],paraCalls,inheritP)

#The example data and map are too small to detect most homeologies
plotCountsChrom(chromHits$cPerMarker,1:sum(iMSV5),at=1:15,
    labels=dimnames(chromHits$cPerMarker)[[2]],las=2)
