crosshap

What does it do?

crosshap is an LD-based local haplotype analysis and visualization tool.

Given genomic variant data for a region of interest, crosshap performs LD-based local haplotyping. Tightly linked variants are clustered into Marker Groups (MGs), and individuals are grouped into local haplotypes by shared allelic combinations of MGs. crosshap then provides a range of visualization options for examining the characteristics of the linked Marker Groups and local haplotypes.
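
The two-step idea described above can be illustrated with a minimal base-R sketch on toy data. This is not crosshap's implementation: hierarchical clustering on 1 - r^2 stands in for the package's own clustering (performed by run_haplotyping), and the toy genotype matrix, the 0.5 cut height, and the helper mode_value are all hypothetical choices made for illustration.

```r
# Toy genotypes: 40 individuals (rows) x 5 variants (columns), coded 0/2.
# snp1-snp3 follow one pattern, snp4-snp5 follow an independent one.
block_a <- rep(c(0, 2), each = 20)
block_b <- rep(c(0, 2), times = 20)
geno <- cbind(snp1 = block_a, snp2 = block_a, snp3 = block_a,
              snp4 = block_b, snp5 = block_b)
geno[1, "snp2"] <- 2  # one discordant call, so linkage is tight but imperfect

# Step 1: cluster tightly linked variants into "Marker Groups".
# Stand-in technique: average-linkage clustering on 1 - r^2 (an assumption).
r2 <- cor(geno)^2
tree <- hclust(as.dist(1 - r2), method = "average")
marker_group <- cutree(tree, h = 0.5)

# Step 2: call one allele per individual per Marker Group (most common allele
# across the group's variants), then group individuals by the combination.
mode_value <- function(x) as.numeric(names(which.max(table(x))))
mg_calls <- sapply(split(colnames(geno), marker_group), function(snps) {
  apply(geno[, snps, drop = FALSE], 1, mode_value)
})
haplotype <- apply(mg_calls, 1, paste, collapse = "_")

print(marker_group)
print(table(haplotype))
```

In this toy case the five variants fall into two groups, and individuals separate into four haplotypes defined by the allele they carry at each group.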

Why would I use it?

crosshap was originally designed to explore local haplotype patterns that may underlie phenotypic variability in quantitative trait locus (QTL) regions. It is ideally suited to complement and follow up GWAS results, since it takes the same inputs. crosshap equips users with the tools to explain why a region reported a GWAS hit, which variants are causal candidates, which populations they are present or absent in, and what the features of those populations are.

Alternatively, crosshap can simply be a tool to identify patterns of linkage among local variants, and to classify individuals based on shared haplotypes.

Note: crosshap is designed for in-depth, user-driven analysis of inheritance patterns in specific regions of interest, not genome-wide scans.

Installation

crosshap is available on CRAN:

install.packages("crosshap")

For the latest features, you can install the development version of crosshap from GitHub with:

# install.packages("devtools")
devtools::install_github("JacobIMarsh/crosshap")

Usage

Documentation

In short, a typical crosshap analysis workflow involves the following steps. For a detailed explanation and walkthrough, see our Getting started vignette.

  1. Read in raw inputs
vcf <- read_vcf("region.vcf")
LD <- read_LD("plink.ld")
metadata <- read_metadata("metadata.txt")
pheno <- read_pheno("pheno.txt")
  2. Run local haplotyping at a range of epsilon values
HapObject <- run_haplotyping(vcf = vcf, LD = LD, pheno = pheno, metadata = metadata, epsilon = epsilon, MGmin = MGmin)
  3. Build clustering tree to optimize epsilon value
clustree_viz(HapObject)
  4. Visualize local haplotypes and Marker Groups
crosshap_viz(HapObject, epsilon)
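
The steps above can be tried without any input files by using the example datasets listed in the package's function index (vcf, LD, pheno, metadata). The following is a sketch under assumptions rather than a prescribed recipe: the epsilon grid and MGmin = 30 are illustrative values, epsilon = 0.6 is picked arbitrarily for the final plot, and the example objects are assumed to be available once the package is attached.

```r
library(crosshap)

# Run local haplotyping on the bundled example data over several epsilon
# values (illustrative grid; tune for your own region).
HapObject <- run_haplotyping(
  vcf = vcf,
  LD = LD,
  pheno = pheno,
  metadata = metadata,
  epsilon = c(0.2, 0.4, 0.6, 0.8, 1),
  MGmin = 30
)

# Each element of the result holds the haplotyping output for one epsilon.
print(names(HapObject))

# Clustering tree across epsilon values, used to choose a suitable epsilon.
clustree_viz(HapObject)

# Full visualization at the chosen epsilon (0.6 is an arbitrary pick here).
crosshap_viz(HapObject, epsilon = 0.6)
```

Named arguments are used in the run_haplotyping call so that the phenotype and metadata tables cannot be swapped by position.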

From here you can examine haplotype and Marker Group features from the visualization, and export relevant information from the haplotype object.

HapObject$Haplotypes_MGmin30_E0.6$Indfile
HapObject$Haplotypes_MGmin30_E0.6$Hapfile
HapObject$Haplotypes_MGmin30_E0.6$Varfile
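
As a hedged illustration of exporting these tables, the sketch below pulls one result out of the bundled example HapObject and writes the haplotype table to a CSV file. It assumes that the example object contains an element named Haplotypes_MGmin30_E0.6 (matching the accessors above) and that Hapfile is a plain data frame; the output location in tempdir() and the file name are hypothetical.

```r
library(crosshap)

hap_obj <- crosshap::HapObject  # bundled example haplotype object

# Assumed element name, taken from the accessors shown above.
key <- "Haplotypes_MGmin30_E0.6"
stopifnot(key %in% names(hap_obj))
res <- hap_obj[[key]]

# Peek at the per-individual, per-haplotype, and per-variant tables.
print(head(res$Indfile))
print(head(res$Hapfile))
print(head(res$Varfile))

# Write the haplotype table to a temporary CSV file (hypothetical file name).
out_file <- file.path(tempdir(), "crosshap_hapfile_E0.6.csv")
write.csv(res$Hapfile, out_file, row.names = FALSE)
```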

Contact

For technical queries, feel free to contact me: jacob.marsh@unc.edu. Please contact Prof. David Edwards for all other queries: dave.edwards@uwa.edu.au.

Monthly Downloads: 290
Version: 1.4.0
License: MIT + file LICENSE
Maintainer: Jacob Marsh
Last Published: March 31st, 2024

Functions in crosshap (1.4.0)

vcf: Example VCF
build_top_metaplot: Top metadata-hap bar plot
read_metadata: Read metadata to tibble
clustree_viz: Clustering tree
read_LD: Read LD correlation matrix to tibble
read_vcf: Read VCF to tibble
crosshap-package: crosshap: Local Haplotype Clustering and Visualization
read_pheno: Read phenotype data to tibble
crosshap_viz: Visualize haplotypes
run_haplotyping: Cluster SNPs and identify haplotypes
run_hdbscan_haplotyping: Cluster SNPs with HDBSCAN and identify haplotypes
build_right_phenoplot: Right SNP-pheno phenoplot
build_summary_tables: Hap/MG summary tables
build_left_posplot: Left SNP-position plot
build_left_alleleplot: Left SNP-allele plot
build_mid_dotplot: Middle MG/hap dot plot
HapObject: Example Haplotype object
arith_mode: Mode utility function
build_bot_halfeyeplot: Bot hap-pheno raincloud plot
build_right_clusterplot: Right intra-cluster linkage plot
mean_na.rm: Mean utility function
LD: Example LD matrix
pheno: Example phenotype data
metadata: Example Domestication metadata
tagphenos: Calculate SNP phenotypic associations
prepare_hap_umap: UMAP haplotype visualization helper
%>%: Pipe operator
pseudo_haps: Identify haplotypes from clustered SNPs