reduce_markers: Reduce markers to a subset of more-evenly-spaced ones

Description

Find the largest subset of markers such that no two adjacent markers are separated by less than some distance.

Usage

reduce_markers(
  map,
  min_distance = 1,
  weights = NULL,
  max_batch = 10000,
  batch_distance_mult = 1,
  cores = 1
)

Value

A list like the input map, but with the selected subset of markers.

Arguments

map: A list with each component being a vector with the marker positions for a chromosome.
min_distance: Minimum distance between markers.
weights: An optional list of weights on the markers; same size as map.
max_batch: Maximum number of markers to consider in a batch.
batch_distance_mult: If working with batches of markers, reduce min_distance by this multiple.
cores: Number of CPU cores to use, for parallel calculations. (If 0, use parallel::detectCores().) Alternatively, this can be links to a set of cluster sockets, as produced by parallel::makeCluster().
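
As a rough illustration of how map and weights line up, the sketch below builds a toy map and a matching weights list. The marker names, positions, and weight values are made up for illustration; only reduce_markers() and its documented arguments come from this page, and the call is guarded so the snippet still runs without qtl2 installed.

```r
# Toy map (made-up positions), in the same shape as the `map` argument:
# one named numeric vector of marker positions per chromosome.
toy_map <- list(
  "1" = c(m1 = 0, m2 = 0.4, m3 = 1.0, m4 = 1.3, m5 = 2.5),
  "2" = c(m6 = 0, m7 = 2.0, m8 = 2.2)
)

# Weights mirror the map: one vector per chromosome, one value per marker.
# Every marker gets weight 1 here, except m4, which is favored (arbitrary choice).
toy_weights <- lapply(toy_map, function(pos) {
  w <- rep(1, length(pos))
  names(w) <- names(pos)
  w
})
toy_weights[["1"]]["m4"] <- 5

# Call reduce_markers() only if qtl2 is available.
if (requireNamespace("qtl2", quietly = TRUE)) {
  toy_sub <- qtl2::reduce_markers(toy_map, min_distance = 1,
                                  weights = toy_weights)
}
```

Larger weights make a marker more likely to be retained, so weights can encode a preference such as favoring markers with less missing data.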

Details

Uses a dynamic programming algorithm to find, for each chromosome, the subset of markers for which sum(weights) is maximal, subject to the constraint that no two adjacent markers may be separated by less than min_distance.

The computation time for the algorithm grows like the square of the number of markers: roughly 1 sec for 10k markers but 30 sec for 50k markers. If the number of markers on a chromosome is greater than max_batch, the markers are split into batches, the algorithm is applied to each batch with min_distance smaller by a factor batch_distance_mult, and the results are merged for one last pass.
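
The single-chromosome step can be sketched in base R as follows. This is a minimal illustration of a quadratic-time dynamic program for the stated problem, not the qtl2 implementation: pick_subset() is a hypothetical helper, it assumes sorted positions and non-negative weights, and it ignores batching.

```r
# Hypothetical helper (not part of qtl2): choose a maximum-weight subset of
# sorted positions such that adjacent chosen markers are >= min_distance apart.
pick_subset <- function(pos, min_distance = 1, weights = NULL) {
  n <- length(pos)
  if (n == 0) return(pos)
  if (is.null(weights)) weights <- rep(1, n)
  stopifnot(length(weights) == n, !is.unsorted(pos))

  best <- numeric(n)  # best[i]: max total weight of a valid subset ending at i
  prev <- integer(n)  # prev[i]: previous chosen marker in that subset (0 = none)

  for (i in seq_len(n)) {
    best[i] <- weights[i]
    # earlier markers that are far enough to the left of marker i
    ok <- which(pos[i] - pos[seq_len(i - 1)] >= min_distance)
    if (length(ok) > 0) {
      j <- ok[which.max(best[ok])]
      if (best[j] > 0) {
        best[i] <- best[i] + best[j]
        prev[i] <- j
      }
    }
  }

  # trace back from the best endpoint to recover the chosen markers
  i <- which.max(best)
  keep <- integer(0)
  while (i > 0) {
    keep <- c(i, keep)
    i <- prev[i]
  }
  pos[keep]
}

pos <- c(m1 = 0, m2 = 0.4, m3 = 1.0, m4 = 1.3, m5 = 2.5)
unweighted <- pick_subset(pos, min_distance = 1)
weighted <- pick_subset(pos, min_distance = 1, weights = c(1, 1, 1, 5, 1))
```

With equal weights the sketch keeps m1, m3, and m5; giving m4 a weight of 5 shifts the choice to m1, m4, and m5. The inner scan over all earlier markers is what makes the running time grow with the square of the number of markers.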

References

Broman KW, Weber JL (1999) Method for constructing confidently ordered linkage maps. Genet Epidemiol 16:337-343.

Examples

# load the package and read example data
library(qtl2)
grav2 <- read_cross2(system.file("extdata", "grav2.zip", package="qtl2"))

# grab genetic map
gmap <- grav2$gmap

# subset to markers that are >= 1 cM apart
gmap_sub <- reduce_markers(gmap, 1)

# drop all of the other markers from the cross
markers2keep <- unlist(lapply(gmap_sub, names))
grav2_sub <- pull_markers(grav2, markers2keep)