beadarrayMSV (version 1.0.3)

assignParalogues: Assign MSV-5 paralogs to chromosomes

Description

Based on linkage information and a set of MSV-5 markers which have been split into individual paralogs within half-sib families, this function attempts to map the paralogs to their respective chromosomes and name them accordingly.

Usage

setMergeOptions(minC = NULL, noiseQuantile = 0.75,
    offspringLim = 7, ratioLim = 0.9, rngLD = 5)

assignParalogues(BSSnp, BSRed,
    paraCalls = unmixParalogues(BSRed, singleCalls),
    inheritP = resolveInheritanceSNP(BSSnp),
    singleCalls = getSingleCalls(BSRed),
    cHits = locateParalogues(BSSnp, paraCalls, inheritP,
        mO$offspringLim, mO$ratioLim)$cPerMarker,
    mO = setMergeOptions())

Arguments

minC
A numeric threshold applied to the elements of cHits; chromosomes scoring below this value are not detected
noiseQuantile
The quantile of the scores of the third highest ranking chromosomes across markers, from which minC is estimated if it is NULL
offspringLim
In order for a match between a paralogue and a chromosome to be detected, the number of (informative) half-siblings must equal or exceed this numeric value
ratioLim
The patterns of paternal and maternal inherited alleles among half-sib family offspring are compared between MSV-5 paralogs (see unmixParalogues) and genetic map SNPs (see
rngLD
Numeric indicating how many map-units (e.g. cM) to include on each side of the genetic map marker to increase the number of informative meioses and the power of the associations with the paralogs.
BSSnp
"AlleleSetIllumina" (or "MultiSet") object containing SNPs of known location on the chromosomes, including an assayData entry call. A
BSRed
"AlleleSetIllumina" (or "MultiSet") object containing MSV-5's to be mapped, with a required assayData-list entry call. Must contain t
paraCalls
List containing two matrices, mother and father, with the parental inherited alleles of individual paralogs assuming unknown alternate parent (see unmixParalogues)
inheritP
List containing two matrices, mother and father, with the parental inherited alleles for the markers in BSSnp (see resolveInheritanceSNP)
singleCalls
Matrix containing MSV-5s for which both paralogs are either monomorphic or polymorphic (see getSingleCalls)
cHits
A three-dimensional array of size (markers x chromosomes x 2) containing an average number of matches of a paralogue to a chromosome for both the mothers and fathers (average across the number of markers in the map for that chromosome; see
mO
List with options used in the mapping of paralogs (see setMergeOptions)

Value

A list containing

  • x: a matrix holding the calls for those paralogs that are successfully mapped to a chromosome. The rownames reflect the chromosome as well as the marker name
  • chromPairs: a matrix with 0, 1, or 2 chromosomes to which the MSV-5s have been successfully mapped
  • positionFemale: a matrix holding the mapped paralogue positions as estimated by the female parent half-sib families
  • positionMale: a matrix holding the mapped paralogue positions as estimated by the male parent half-sib families

Details

While the function locateParalogues allows for matching of paralogs to any chromosome, assignParalogues uses the output of the former and limits the allowed choices to one or two chromosomes. The paralogs are given names reflecting these chromosomes, which allows the linkage information in paraCalls to be merged into a single, much more informative data table.

Initially, the larger of the maternal and paternal values in cHits is chosen, and the resulting scores are sorted in decreasing order among chromosomes, one marker at a time. Up to two of the highest scoring chromosomes are selected if their values exceed mO$minC. If this element is NULL, it is estimated as the mO$noiseQuantile quantile of the scores of the third highest ranking chromosomes across markers. In addition, the second ranking chromosome is not selected unless it scores twice as high as the third ranking chromosome.

Using the maternal and paternal half-sib families in turn, each paralogue is mapped to either of the selected chromosomes if sufficient association is detected. For each half-sib family and each (informative) paralogue, only genetic map markers for which the parent in question is heterozygous are useful. This reduces the number of genetic map markers to which the paralogs can be associated. Similarly, the parental inherited alleles in each paralogue are known for only a subset of the half-siblings, which tends to reduce the number of informative offspring in each family drastically. Missing parental alleles among the genetic map markers further reduce the number of informative offspring; however, these may sometimes be imputed using neighbouring markers assumed to be in linkage disequilibrium (LD) with the marker in question. The option rngLD indirectly controls the number of helping markers to use.
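The chromosome selection rule described at the start of this section can be illustrated with a small base R sketch. This is not the package's implementation: the helper selectChromosomes, the toy scores, the ordering of the third array dimension (mother, then father) and the reading of "twice as high" as "at least twice" are all assumptions made for illustration only.

```r
# Toy cHits array: 4 markers x 5 chromosomes x 2 parents (hypothetical values)
mother <- matrix(c(9, 1, 1, 0, 1,
                   6, 5, 2, 1, 0,
                   1, 1, 1, 1, 1,
                   2, 8, 1, 7, 3), nrow = 4, byrow = TRUE)
father <- matrix(c(4, 1, 2, 0, 0,
                   3, 7, 1, 1, 1,
                   0, 1, 0, 1, 0,
                   1, 2, 1, 3, 2), nrow = 4, byrow = TRUE)
cHits <- array(c(mother, father), dim = c(4, 5, 2),
               dimnames = list(paste0("m", 1:4), paste0("chr", 1:5),
                               c("mother", "father")))

# Hypothetical helper (not part of beadarrayMSV); needs >= 3 chromosomes
selectChromosomes <- function(cHits, minC = NULL, noiseQuantile = 0.75) {
  # Larger of the maternal and paternal score per marker and chromosome
  best <- apply(cHits, c(1, 2), max)
  # Scores sorted decreasingly within each marker (markers x ranks)
  ranked <- t(apply(best, 1, sort, decreasing = TRUE))
  # Estimate the threshold from the third-ranked scores if none is given
  if (is.null(minC)) {
    minC <- unname(quantile(ranked[, 3], probs = noiseQuantile))
  }
  t(apply(best, 1, function(score) {
    ord <- order(score, decreasing = TRUE)
    top <- unname(score[ord[1:3]])
    # Assumption: "twice as high" is read as "at least twice"
    keep <- c(top[1] > minC,
              top[2] > minC && top[2] >= 2 * top[3])
    picked <- names(score)[ord[1:2]]
    picked[!keep] <- NA
    picked
  }))
}

chromPairsToy <- selectChromosomes(cHits, minC = 2.5)
print(chromPairsToy)
```

Each row of the result holds zero, one or two chromosome names per marker, with NA marking a rank that did not pass the thresholds. Calling the helper with minC = NULL instead derives the threshold from the third-ranked scores via quantile.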

The mapping itself proceeds by applying the filter defined in mO to the genetic map markers on a specific chromosome (see locateParalogues for specifics about the filter). A set of statistics is then calculated to find the marker that matches the chromosome most closely. If there are two candidate chromosomes, the one with the highest ranked marker is selected. If there is only one candidate, it is selected if it outranks all the other chromosomes in terms of the calculated statistics. If a successful match is found, the parental inherited alleles for that family are assigned to the paralogue whose name reflects the chromosome match.

See Also

plotCountsChrom, setMergeOptions, unmixParalogues, resolveInheritanceSNP, MultiSet, AlleleSetIllumina, locateParalogues, getSingleCalls

Examples

library(beadarrayMSV)

#Read markers into an AlleleSetIllumina object
rPath <- system.file("extdata", package="beadarrayMSV")
normOpts <- setNormOptions()
dataFiles <- makeFilenames('testdata',normOpts,rPath)
beadFile <- paste(rPath,'beadData_testdata.txt',sep='/')
beadInfo <- read.table(beadFile,sep='\t',header=TRUE,as.is=TRUE)
BSRed <- createAlleleSetFromFiles(dataFiles[1:4],beadInfo=beadInfo)

#Genotype calling and splitting of MSV-5 paralogs
BSRed <- callGenotypes(BSRed)
BSRed <- validateCallsPedigree(BSRed)
iMSV5 <- fData(BSRed)$Classification %in% 'MSV-5' &
    fData(BSRed)$Ped.Errors %in% 0
singleCalls <- getSingleCalls(BSRed[iMSV5,])
paraCalls <- unmixParalogues(BSRed[iMSV5,],singleCalls)

#Genetic map SNPs and inherited parental alleles
iSNP <- fData(BSRed)$Classification %in% 'SNP' &
    !is.na(fData(BSRed)$Chromosome)
inheritP <- resolveInheritanceSNP(BSRed[iSNP,])

#Match paralogs with map
mO <- setMergeOptions(minC=1)
chromHits <- locateParalogues(BSRed[iSNP,],paraCalls,
   inheritP,mO$offspringLim,mO$ratioLim)

#The example data and map are too small to detect most homeologies
plotCountsChrom(chromHits$cPerMarker,1:sum(iMSV5),at=1:15,
   labels=dimnames(chromHits$c)[[2]],las=2)

#Only a few, single paralogs are successfully assigned to chromosomes
mergedCalls <- assignParalogues(BSRed[iSNP,],BSRed[iMSV5,],paraCalls,
   inheritP,singleCalls,cHits=chromHits$cPerMarker,mO=mO)
print(mergedCalls$chromPairs)
print(mergedCalls$x[,1:4])
