Find the largest subset of markers such that no two adjacent markers are separated by less than some distance.
reduce_markers(
map,
min_distance = 1,
weights = NULL,
max_batch = 10000,
batch_distance_mult = 1,
cores = 1
)
A list like the input map, but with the selected subset of markers.
A list with each component being a vector with the marker positions for a chromosome.
Minimum distance between markers.
An optional list of weights on the markers; same size as map.
Maximum number of markers to consider in a batch.
If working with batches of markers, min_distance is made smaller by this factor within each batch.
Number of CPU cores to use, for parallel calculations. (If 0, use parallel::detectCores().) Alternatively, this can be links to a set of cluster sockets, as produced by parallel::makeCluster().
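The argument descriptions above can be made concrete with a small sketch of the expected inputs. The toy map, the marker names, and the choice to up-weight one marker are all made up for illustration; the call itself uses only the arguments listed in the usage, and it is skipped if qtl2 is not installed.

```r
# A toy map: one named numeric vector of marker positions per chromosome
# (chromosome and marker names here are invented for illustration)
map <- list(
    "1" = c(m1 = 0.0, m2 = 0.4, m3 = 1.2, m4 = 1.5),
    "2" = c(m5 = 0.0, m6 = 2.0)
)

# weights mirrors the shape of map: same chromosomes, same lengths
weights <- lapply(map, function(pos) rep(1, length(pos)))
weights[["1"]][2] <- 5   # hypothetical preference for marker m2

# Only call into qtl2 if the package is available
if (requireNamespace("qtl2", quietly = TRUE)) {
    map_sub <- qtl2::reduce_markers(map, min_distance = 1,
                                    weights = weights, cores = 1)
    print(map_sub)
}
```

Building weights with lapply() over the map is one easy way to keep the two lists aligned, since any mismatch in chromosome names or lengths would make the weights ambiguous.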
Uses a dynamic programming algorithm to find, for each chromosome, the subset of markers for which sum(weights) is maximal, subject to the constraint that no two adjacent markers may be separated by less than min_distance.
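To illustrate the idea for a single chromosome, here is a minimal base-R sketch of a quadratic-time dynamic program that maximizes the total weight of the retained markers while keeping adjacent retained markers at least min_distance apart. It is not the package's implementation; the function name thin_markers_dp and the assumptions of sorted positions and positive weights are choices made for this example.

```r
# Sketch only: weighted marker thinning for one chromosome.
# Assumes pos is sorted in increasing order and all weights are positive.
thin_markers_dp <- function(pos, min_distance = 1,
                            weights = rep(1, length(pos))) {
    n <- length(pos)
    if (n == 0) return(pos)
    stopifnot(length(weights) == n, !is.unsorted(pos), all(weights > 0))

    best <- numeric(n)   # best[i]: max total weight of a valid subset ending at i
    prev <- integer(n)   # prev[i]: previous retained marker (0 = none)

    for (i in seq_len(n)) {
        best[i] <- weights[i]
        # earlier markers that are far enough from marker i
        ok <- which(pos[i] - pos[seq_len(i - 1)] >= min_distance)
        if (length(ok) > 0) {
            j <- ok[which.max(best[ok])]
            best[i] <- weights[i] + best[j]
            prev[i] <- j
        }
    }

    # backtrack from the best endpoint
    keep <- integer(0)
    i <- which.max(best)
    while (i > 0) {
        keep <- c(i, keep)
        i <- prev[i]
    }
    pos[keep]
}

pos <- c(m1 = 0, m2 = 0.4, m3 = 1.2, m4 = 1.5, m5 = 2.5)
unweighted <- thin_markers_dp(pos, min_distance = 1)
weighted <- thin_markers_dp(pos, min_distance = 1, weights = c(1, 5, 1, 1, 1))
```

The inner scan over all earlier markers is what makes this sketch quadratic in the number of markers. With unit weights the result is a largest valid subset; giving m2 a larger weight pulls it into the selection.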
The computation time for the algorithm grows like the square of the number of markers: roughly 1 sec for 10k markers but 30 sec for 50k markers. If the number of markers on a chromosome is greater than max_batch, the markers are split into batches and the algorithm is applied to each batch with min_distance smaller by a factor batch_distance_mult; the results are then merged together for one last pass.
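The batching strategy described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper names thin_greedy and thin_in_batches are hypothetical, batches are taken as consecutive runs of at most max_batch markers, and a simple greedy left-to-right pass stands in for the per-batch solver (which matches the largest-subset answer only in the unweighted case).

```r
# Greedy stand-in for the per-batch solver (unweighted case only)
thin_greedy <- function(pos, min_distance) {
    keep <- logical(length(pos))
    last <- -Inf
    for (i in seq_along(pos)) {
        if (pos[i] - last >= min_distance) {
            keep[i] <- TRUE
            last <- pos[i]
        }
    }
    pos[keep]
}

# Sketch: thin each batch with a reduced distance, merge, then one final pass
thin_in_batches <- function(pos, min_distance = 1, max_batch = 10000,
                            batch_distance_mult = 1) {
    if (length(pos) <= max_batch) return(thin_greedy(pos, min_distance))

    # consecutive batches of at most max_batch markers (an assumption)
    batch <- ceiling(seq_along(pos) / max_batch)
    batch_dist <- min_distance / batch_distance_mult

    thinned <- lapply(split(pos, batch), thin_greedy, min_distance = batch_dist)
    merged <- unlist(unname(thinned))

    # last pass over the merged markers with the full min_distance
    thin_greedy(merged, min_distance)
}

pos <- setNames(as.numeric(0:99), paste0("m", 1:100))
res <- thin_in_batches(pos, min_distance = 5, max_batch = 30,
                       batch_distance_mult = 2)
```

Because each batch is thinned independently, the merged result is generally an approximation to what a single pass over the whole chromosome would give; the final pass only guarantees that the min_distance constraint holds.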
Broman KW, Weber JL (1999) Method for constructing confidently ordered linkage maps. Genet Epidemiol 16:337-343.
find_dup_markers(), drop_markers()
# load the package and read data
library(qtl2)
grav2 <- read_cross2(system.file("extdata", "grav2.zip", package="qtl2"))
# grab genetic map
gmap <- grav2$gmap
# subset to markers that are >= 1 cM apart
gmap_sub <- reduce_markers(gmap, 1)
# drop all of the other markers from the cross
markers2keep <- unlist(lapply(gmap_sub, names))
grav2_sub <- pull_markers(grav2, markers2keep)