Learn R Programming

mappoly (version 0.2.3)

est_pairwise_rf: Pairwise two-point analysis

Description

Performs the two-point pairwise analysis between all markers in a sequence. For each pair, the function estimates the recombination fraction for all possible linkage phase configurations and associated LOD Scores.

Usage

est_pairwise_rf(
  input.seq,
  count.cache = NULL,
  ncpus = 1L,
  mrk.pairs = NULL,
  n.batches = 1L,
  est.type = c("disc", "prob"),
  verbose = TRUE,
  memory.warning = TRUE,
  parallelization.type = c("PSOCK", "FORK"),
  tol = .Machine$double.eps^0.25
)

Arguments

input.seq

an object of class mappoly.sequence

count.cache

an object of class cache.info containing pre-computed genotype frequencies, obtained with cache_counts_twopt. If NULL (default), genotype frequencies are internally loaded.

ncpus

Number of parallel processes (cores) to spawn (default = 1)

mrk.pairs

a matrix of dimensions 2*N, containing N pairs of markers to be analyzed. If NULL (default), all pairs are considered

n.batches

The number of batches of marker pairs that should be analyzed in parallel. Using n.batches > 1, will usually result in more processing time. However, it will require less memory. See examples.

est.type

Indicates whether to use the discrete ("disc") or the probabilistic ("prob") dosage scoring when estimating the two-point recombination fractions.

verbose

If TRUE (default), current progress is shown; if FALSE, no output is produced

memory.warning

if TRUE, prints a memory warning if the number of markers is greater than 10000 for ploidy levels up to 4, and 3000 for ploidy levels > 4.

parallelization.type

one of the supported cluster types. This should be either PSOCK (default) or FORK.

tol

the desired accuracy. See optimize() for details

Value

An object of class poly.est.two.pts.pairwise which is a list containing the following components:

data.name

name of the object of class mappoly.data with the raw data

n.mrk

number of markers in the sequence

seq.num

a vector containing the (ordered) indices of markers in the sequence, according to the input file

pairwise

a list of size choose(length(input.seq$seq.num), 2), each of them containing a matrix where the name of the rows have the form x-y, where x and y indicate how many homologues share the same allelic variant in parents P and Q, respectively (see Mollinari and Garcia, 2019 for notation). The first column indicates the LOD Score in relation to the most likely linkage phase configuration. The second column shows the estimated recombination fraction for each configuration, and the third indicates the LOD Score comparing the likelihood under no linkage (r=0.5) with the estimated recombination fraction (evidence of linkage).

chisq.pval.thresthreshold used to perform the segregation tests chisq.pvalp-values associated with the performed segregation tests

References

Mollinari, M., and Garcia, A. A. F. (2019) Linkage analysis and haplotype phasing in experimental autopolyploid populations with high ploidy level using hidden Markov models, _G3: Genes, Genomes, Genetics_. 10.1534/g3.119.400378

Examples

Run this code
# NOT RUN {
  ## Tetraploid example (first 50 markers) 
  all.mrk <- make_seq_mappoly(tetra.solcap, 1:50)
  red.mrk <- elim_redundant(all.mrk)
  unique.mrks <- make_seq_mappoly(red.mrk)
  all.pairs <- est_pairwise_rf(input.seq = unique.mrks,
                               ncpus = 1, 
                               verbose=TRUE)
   all.pairs
   plot(all.pairs, 20, 21)
   mat <- rf_list_to_matrix(all.pairs)
   plot(mat)
# }

Run the code above in your browser using DataLab