MAPpoly

MAPpoly (v. 0.3.3) is an R package to construct genetic maps in diploids and autopolyploids with even ploidy levels. In its current version, MAPpoly can handle ploidy levels up to 8 when using hidden Markov models (HMM) and up to 12 when using the two-point simplification. When dealing with large numbers of markers (> 10,000), we strongly recommend using high-performance computing (HPC).

In its current version, MAPpoly can handle the following types of datasets:

  1. CSV files
  2. MAPpoly files
    • Dosage based
    • Probability based
  3. fitPoly files
  4. VCF files

MAPpoly can also import objects generated by the following R packages:

  1. updog
  2. polyRAD
  3. polymapR
    • Datasets
    • Maps

The mapping strategy uses pairwise recombination fraction estimation as the first source of information to sequentially position allelic variants in specific homologs. Where pairwise analysis has limited power, the algorithm relies on the multilocus likelihood obtained through a hidden Markov model (HMM). The derivation of the HMM used in MAPpoly can be found in Mollinari and Garcia, 2019. The computation of offspring genotype probabilities and haplotype reconstruction, as well as the preferential pairing profiles, are presented in Mollinari et al., 2020.
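The two stages described above can be sketched on a small example. This is a minimal illustration, not the recommended pipeline: it assumes the mappoly package is installed, uses the simulated hexafake dataset bundled with it, and picks the first 20 markers and all thresholds arbitrarily for speed. The object names (mrk_subset, two_pt, small_map) are placeholders.

```r
# Sketch only: needs the mappoly package; skipped when it is not installed.
small_map <- NULL
if (requireNamespace("mappoly", quietly = TRUE)) {
  library(mappoly)

  # 1. Take a small subset of markers from the simulated hexaploid dataset
  mrk_subset <- make_seq_mappoly(hexafake, 1:20)

  # 2. Pairwise (two-point) recombination fractions: the first source of information
  two_pt <- est_pairwise_rf(input.seq = mrk_subset, ncpus = 1)

  # 3. Sequential phasing: two-point results prune candidate linkage phases,
  #    and the HMM multilocus likelihood decides among those that remain.
  #    Threshold values below are illustrative, not tuned recommendations.
  small_map <- est_rf_hmm_sequential(
    input.seq = mrk_subset,
    twopt = two_pt,
    thres.twopt = 5,
    thres.hmm = 10,
    extend.tail = 10,
    phase.number.limit = 5
  )
  print(small_map)
}
```

For real datasets, marker filtering, grouping, and ordering would normally come before this step; see the package vignettes for the full procedure.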

Installation

From CRAN (stable version)

To install MAPpoly from the Comprehensive R Archive Network (CRAN), use

install.packages("mappoly")

From GitHub (development version)

You can install the development version from GitHub. Within R, first install devtools:

install.packages("devtools")

If you are using Windows, please install the latest recommended version of Rtools.

To install MAPpoly from GitHub, use

devtools::install_github("mmollina/mappoly", dependencies=TRUE)
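Because the CRAN and GitHub routes can leave different versions on the same machine, it may help to confirm which one R actually finds. The snippet below is a small sketch using base R only; installed_version is a hypothetical helper written for this example and is not part of MAPpoly.

```r
# Return the installed version of a package as a string, or NA if it is absent.
# 'installed_version' is an illustrative helper, not a MAPpoly function.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

v <- installed_version("mappoly")
if (is.na(v)) {
  message("mappoly is not installed yet.")
} else {
  message("mappoly ", v, " is installed.")
}
```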

For further QTL analysis, we recommend our QTLpoly package. QTLpoly performs random-effect multiple interval mapping (REMIM) in full-sib families of autopolyploid species based on restricted maximum likelihood (REML) estimation and score statistics, as described in Pereira et al. 2020.

We recently released VIEWpoly. VIEWpoly provides a graphical user interface to integrate, visualize and explore results from linkage and quantitative trait loci analysis, together with genomic information for autopolyploid (and diploid) species. The app is meant for interactive use and allows users to optionally upload different sources of information, including gene annotation and alignment files, enabling the exploitation and search for candidate genes in a genome browser. VIEWpoly supports inputs other than MAPpoly's, including polymapR, diaQTL, QTLpoly, and polyqtlR.

Related software

# Enable this universe
options(repos = c(
    polyploids = 'https://polyploids.r-universe.dev',
    CRAN = 'https://cloud.r-project.org'))

# Install some packages
install.packages('mappoly')

Miscellaneous

Articles referencing MAPpoly

  1. Rose Rosette Disease Resistance Loci Detected in Two Interconnected Tetraploid Garden Rose Populations (Lau et al., 2022)
  2. VIEWpoly: a visualization tool to integrate and explore results of polyploid genetic analysis (Taniguti et al., 2022)
  3. Genetic Dissection of Early Blight Resistance in Tetraploid Potato (Xue et al., 2022)
  4. Haplotype reconstruction in connected tetraploid F1 populations (Zheng et al., 2021)
  5. QTL mapping in outbred tetraploid (and diploid) diallel populations (Amadeu et al., 2021)
  6. Using probabilistic genotypes in linkage analysis of polyploids (Liao et al., 2021)
  7. Discovery of a major QTL for root-knot nematode Meloidogyne incognita resistance in cultivated sweetpotato Ipomoea batatas (Oloka et al., 2021)
  8. Quantitative trait locus mapping for common scab resistance in a tetraploid potato full-sib population (Pereira et al., 2021)
  9. The recombination landscape and multiple QTL mapping in a Solanum tuberosum cv. 'Atlantic'-derived F1 population (Pereira et al., 2021)
  10. High-Resolution Linkage Map and QTL Analyses of Fruit Firmness in Autotetraploid Blueberry (Cappai et al., 2020)
  11. When a phenotype is not the genotype: Implications of phenotype misclassification and pedigree errors in genomics-assisted breeding of sweetpotato Ipomoea batatas (L.) Lam. (Gemenet et al., 2020)
  12. Quantitative trait loci and differential gene expression analyses reveal the genetic basis for negatively associated beta-carotene and starch content in hexaploid sweetpotato [Ipomoea batatas (L.) Lam.] (Gemenet et al., 2020)
  13. Multiple QTL Mapping in Autopolyploids: A Random-Effect Model Approach with Application in a Hexaploid Sweetpotato Full-Sib Population (Pereira et al., 2020)

Acknowledgment

This package has been developed as part of the Genomic Tools for Sweetpotato Improvement project (GT4SP) and SweetGAINS, both funded by the Bill & Melinda Gates Foundation. Its continuous improvement is made possible by the projects AFRI-Grant: A Genetics-Based Data Analysis System for Breeders in Polyploid Breeding Programs and SCRI-Grant: Tools for polyploids, funded by USDA NIFA.


NC State University promotes equal opportunity and prohibits discrimination and harassment based upon one’s age, color, disability, gender identity, genetic information, national origin, race, religion, sex (including pregnancy), sexual orientation and veteran status.

Package details

  • Version: 0.4.1
  • License: GPL-3
  • Maintainer: Marcelo Mollinari
  • Last published: March 6th, 2024
  • Monthly downloads: 345

Functions in mappoly (0.4.1)

cache_counts_twopt

Frequency of genotypes for two-point recombination fraction estimation
calc_homologprob

Homolog probabilities
check_data_sanity

Data sanity check
check_pairwise

Check if all pairwise combinations of elements of input.seq are contained in twopt
est_map_haplo_given_genoprob

Estimate a genetic map given a sequence of block markers given the conditional probabilities of the genotypes
calc_genoprob_error

Compute conditional probabilities of the genotypes using global error
create_map

Create a map with pseudomarkers at a given step
check_ls_phase

Compare a list of linkage phases and return the markers for which they are different.
concatenate_ph_list

concatenate two linkage phase lists
est_haplo_hmm

Estimate a genetic map given a sequence of block markers
edit_order

Edit sequence ordered by reference genome positions comparing to another set order
compare_maps

Compare a list of maps
compare_haplotypes

Compare two polyploid haplotypes stored in list format
export_data_to_polymapR

Export data to polymapR
export_map_list

Export a genetic map to a CSV file
cross_simulate

Simulate an autopolyploid full-sib population
concatenate_new_marker

Concatenate new marker
elim_conf_using_two_pts

Eliminate configurations using two-point information
filter_missing_ind

Filter individuals based on missing genotypes
draw_phases

Plot the linkage phase configuration given a list of homologous chromosomes
est_rf_hmm_sequential

Multipoint analysis using Hidden Markov Models: Sequential phase elimination
est_rf_hmm

Multipoint analysis using Hidden Markov Models in autopolyploids
filter_missing_mrk

Filter markers based on missing genotypes
detect_info_par

Detects which parent is informative
calc_prefpair_profiles

Preferential pairing profiles
dist_prob_to_class

Returns the class with the highest probability in a genotype probability distribution
framework_map

Design linkage map framework in two steps: i) estimating the recombination fraction with HMM approach for each parent separately using only markers segregating individually (e.g. map 1 - P1:3 x P2:0, P1:2 x P2:4; map 2 - P1:0 x P2:3, P1:4 x P2:2); ii) merging both maps and re-estimating recombination fractions.
generate_all_link_phase_elim_equivalent

Generate all possible linkage phases in matrix form given the dose and the number of shared alleles between an inserted marker and a pre-computed linkage configuration.
get_LOD

Extract the LOD Scores in a 'mappoly.map' object
genotyping_global_error

Prior probability for genotyping error
get_full_info_tail

Get the tail of a marker sequence up to the point where the markers provide no additional information.
drop_marker

Remove markers from a map
est_full_hmm_with_global_error

Re-estimate genetic map given a global genotyping error
est_full_hmm_with_prior_prob

Re-estimate genetic map using dosage prior probability distribution
draw_cross

Draw simple parental linkage phase configurations
est_rf_hmm_single_phase_single_parent

Multilocus analysis using Hidden Markov Models (single parent, single phase)
filter_non_conforming_classes

Filter non-conforming classes in F1, non double reduced population.
est_pairwise_rf2

Pairwise two-point analysis - RcppParallel version
est_pairwise_rf

Pairwise two-point analysis
est_rf_hmm_single_phase

Multipoint analysis using Hidden Markov Models (single phase)
filter_segregation

Filter markers based on chi-square test
elim_redundant

Eliminate redundant markers
get_cache_two_pts_from_web

Access a remote server to get Counts for recombinant classes
get_counts

Counts for recombinant classes
get_ij

Given a character string of the form 'i-j' indicating the numbers i and j, returns the numeric pair c(i, j)
elim_equiv

Eliminates equivalent linkage phase configurations
mds_mappoly

Estimates loci position using Multidimensional Scaling
get_genomic_order

Get the genomic position of markers in a sequence
export_qtlpoly

Export to QTLpoly
get_indices_from_selected_phases

Get the indices of selected linkage phases given a threshold
extract_map

Extract the marker position from an object of class 'mappoly.map'
get_submap

Extract sub-map from map
get_counts_two_parents

Counts for recombinant classes
gg_color_hue

ggplot-like color palette
get_dosage_type

Get Dosage Type in a Sequence
get_w_m

Get the number of bivalent configurations
filter_missing

Filter missing genotypes
group_mappoly

Assign markers to linkage groups
get_ols_map

Get weighted ordinary least squares map given a sequence and rf matrix
make_pairs_mappoly

Subset pairwise recombination fractions
make_seq_mappoly

Create a Sequence of Markers
filter_aneuploid

Filter aneuploid chromosomes from progeny individuals
find_blocks

Allocate markers into linkage blocks
get_ph_conf_ret_sh

Given a homology group in matrix form, returns the number of shared homologs for all pairs of markers in this group
filter_map_at_hmm_thres

Filter MAPpoly Map Configurations by Loglikelihood Threshold
filter_individuals

Filter out individuals
get_tab_mrks

Get table of dosage combinations
hexafake

Simulated autohexaploid dataset.
genetic-mapping-functions

Genetic Mapping Functions
get_ph_list_subset

subset of a linkage phase list
parallel_block

Auxiliary function to estimate a map in a block of markers using parallel processing
format_rf

Format results from pairwise two-point estimation in C++
generate_all_link_phases_elim_equivalent_haplo

Eliminate equivalent linkage phases
import_data_from_polymapR

Import data from polymapR
plot_GIC

Genotypic information content
mappoly-color-palettes

MAPpoly Color Palettes
get_counts_all_phases

Counts for recombinant classes
paralell_pairwise_probability

Parallel Pairwise Probability Estimation
get_rf_from_list

Get the recombination fraction for a sequence of markers given an object of class mappoly.twopt and a list containing the linkage phase configuration. This list can be found in any object of class two.pts.linkage.phases, in x$config.to.test$'Conf-i', where x is the object of class two.pts.linkage.phases and i is one of the possible configurations.
hexafake.geno.dist

Simulated autohexaploid dataset with genotype probabilities.
is.prob.data

Check if Object is a Probability Dataset in MAPpoly
merge_datasets

Merge datasets
merge_maps

Merge two maps
merge_parental_maps

Build merged parental maps
plot_compare_haplotypes

Plot Two Overlapped Haplotypes
print_mrk

Summary of a set of markers
plot_genome_vs_map

Physical versus genetic distance
plot_map_list

Plot a genetic map
loglike_hmm

Multipoint log-likelihood computation
paralell_pairwise_discrete

Parallel Pairwise Discrete Estimation
paralell_pairwise_discrete_rcpp

Wrapper function to discrete-based pairwise two-point estimation in C++
solcap.prior.map

Resulting maps from tetra.solcap.geno.dist
print_ph

cat for graphical representation of the phases
import_from_updog

Import from updog
import_phased_maplist_from_polymapR

Import phased map list from polymapR
read_geno_csv

Data Input in CSV format
get_counts_single_parent

Counts for recombinant classes in a polyploid parent.
get_states_and_emission_single_parent

Get states and emission in one informative parent
get_rf_from_mat

Get recombination fraction from a matrix
mrk_chisq_test

Chi-square test
split_and_rephase

Divides map in sub-maps and re-phase them
msg

Msg function
plot_mappoly.map2

Plot object mappoly.map2
ls_linkage_phases

List of linkage phases
plot_mrk_info

Plot marker information
plot_one_map

plot a single linkage group with no phase
sim_cross_one_informative_parent

Simulate mapping population (one parent)
make_mat_mappoly

Subset recombination fraction matrices
table_to_mappoly

Conversion of data.frame to mappoly.data
perm_pars

N!/2 combination
sim_cross_two_informative_parents

Simulate mapping population (two parents)
tetra.solcap

Autotetraploid potato dataset.
plot.mappoly.homoprob

Plots mappoly.homoprob
segreg_poly

Polysomic segregation frequency
plot.mappoly.prefpair.profiles

Plots mappoly.prefpair.profiles
prepare_map

prepare maps for plot
maps.hexafake

Resulting maps from hexafake
select_rf

Select rf and lod based on thresholds
poly_hmm_est

Estimate genetic map using as input the probability distribution of genotypes (wrapper function to C++)
update_ph_list_at_hmm_thres

makes a phase list from map, selecting only configurations under a certain threshold
perm_tot

N! combination
solcap.err.map

Resulting maps from tetra.solcap
read_geno_prob

Data Input
read_vcf

Data Input VCF
reest_rf

Re-estimate the recombination fractions in a genetic map
tetra.solcap.geno.dist

Autotetraploid potato dataset with genotype probabilities.
sim_homologous

Simulate homology groups
update_framework_map

Add markers that are informative in both parents using HMM approach and evaluating difference in LOD and gap size
ph_list_to_matrix

Linkage phase format conversion: list to matrix
solcap.dose.map

Resulting maps from tetra.solcap
rf_snp_filter

Remove markers that do not meet a LOD criteria
read_geno

Data Input
read_fitpoly

Data Input in fitPoly format
ph_matrix_to_list

Linkage phase format conversion: matrix to list
v_2_m

Conversion: vector to matrix
plot_progeny_dosage_change

Display genotypes imputed or changed by the HMM chain given a global genotypic error
rev_map

Reverse map
rf_list_to_matrix

Recombination fraction list to matrix
update_missing

Update missing information
update_map

Update map
solcap.mds.map

Resulting maps from tetra.solcap
summary_maps

Summary maps
split_mappoly

Split map into sub maps given a gap threshold
sample_data

Random sampling of dataset
calc_genoprob_haplo

Compute conditional probabilities of the genotypes given a sequence of block markers
calc_genoprob_dist

Compute conditional probabilities of the genotypes using probability distribution of dosages
add_marker

Add a single marker to a map
calc_genoprob_single_parent

Compute conditional probabilities of the genotype (one informative parent)
calc_genoprob

Compute conditional probabilities of the genotypes
aggregate_matrix

Aggregate matrix cells (lower the resolution by a factor)
add_md_markers

Add markers to a pre-existing sequence using HMM analysis and evaluating difference in LOD
add_mrk_at_tail_ph_list

add a single marker at the tail of a linkage phase list
check_if_rf_is_possible

Check if it is possible to estimate the recombination fraction between neighbor markers using two-point estimation
cat_phase

cat for phase information
check_data_dose_sanity

Checks the consistency of dataset (dosage)
check_data_dist_sanity

Checks the consistency of dataset (probability distribution)