detectRUNS

detectRUNS is an R package for detecting runs of homozygosity (ROH/ROHom) and runs of heterozygosity (ROHet, a.k.a. "heterozygosity-rich regions") in diploid genomes. Besides run detection, it provides several functions to summarize and plot the results.
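
To make the idea of a "run" concrete, here is a minimal base-R sketch that finds stretches of consecutive identical genotype states in a toy vector. It is not the package's implementation: the 0/1 coding (0 = homozygous, 1 = heterozygous), the helper name find_runs and the min_snp argument are assumptions made for this illustration, and missing or opposite genotypes are not tolerated as they can be in detectRUNS.

```r
# Toy genotypes for one individual along a chromosome.
# Assumed coding for this sketch: 0 = homozygous, 1 = heterozygous.
geno <- c(0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0)

# Hypothetical helper: return start/end indices of runs of `target`
# that span at least `min_snp` consecutive SNPs.
find_runs <- function(geno, target = 0, min_snp = 3) {
  r <- rle(geno == target)          # run-length encoding of TRUE/FALSE
  ends <- cumsum(r$lengths)         # last index of each stretch
  starts <- ends - r$lengths + 1    # first index of each stretch
  keep <- r$values & r$lengths >= min_snp
  data.frame(start = starts[keep], end = ends[keep], nSNP = r$lengths[keep])
}

roh <- find_runs(geno, target = 0, min_snp = 3)    # homozygous runs
rohet <- find_runs(geno, target = 1, min_snp = 2)  # heterozygous runs
print(roh)
print(rohet)
```

With target = 0 the helper reports homozygous runs; with target = 1 it reports heterozygous ones, mirroring the ROHom/ROHet distinction above.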

Installation

detectRUNS is installed as a standard R package. Some core functions are written in C++ to make the calculations more efficient; this relies on the R package Rcpp. detectRUNS also uses other R packages for data manipulation and plotting. These are declared as Imports, so any that are missing will be installed together with detectRUNS.

Dependencies

detectRUNS imports: plyr, iterators, itertools, ggplot2, reshape2, Rcpp, gridExtra, data.table

detectRUNS suggests: testthat, knitr, rmarkdown, prettydoc

Documentation

Please see the package vignette for a complete tutorial. What follows is a minimal working example to give the gist of the tool.

Example

A basic example showing how to detect runs of homozygosity (ROH):

# 1) detectRUNS (sliding-window method)
library(detectRUNS)
genotypeFile <- system.file("extdata", "Kijas2016_Sheep_subset.ped", package = "detectRUNS")
mapFile <- system.file("extdata", "Kijas2016_Sheep_subset.map", package = "detectRUNS")

# Calculating runs with the sliding-window approach.
# The call is commented out: the calculation is skipped here and
# pre-calculated results are loaded below instead.
# runs <- slidingRUNS.run(genotypeFile, mapFile, windowSize = 15, threshold = 0.1,
#   minSNP = 15, ROHet = FALSE, maxOppWindow = 1, maxMissWindow = 1, maxGap = 10^6,
#   minLengthBps = 100000, minDensity = 1/10000)

# loading pre-calculated data
runsFile <- system.file("extdata", "Kijas2016_Sheep_subset.sliding.csv", package = "detectRUNS")
colClasses <- c(rep("character", 3), rep("numeric", 4))
runs <- read.csv2(runsFile, header = TRUE, stringsAsFactors = FALSE, colClasses = colClasses)

# 2) summarise results (uses the same map and genotype files defined above)
summaryList <- summaryRuns(runs = runs, mapFile = mapFile, genotypeFile = genotypeFile, Class = 6, snpInRuns = TRUE)

# 3) plot results
plot_Runs(runs = runs)
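
The package also lists consecutiveRUNS.run, the main function for the consecutive method. The sketch below shows how that call might look on the same example files. The argument values are illustrative assumptions rather than recommendations, and the argument names should be checked against the package manual for your installed version. The call is guarded so that the snippet does nothing if detectRUNS is not installed.

```r
# Sketch: detect ROH with the consecutive method instead of sliding windows.
# Parameter values below are illustrative assumptions, not recommendations.
runs_consecutive <- NULL
if (requireNamespace("detectRUNS", quietly = TRUE)) {
  genotypeFile <- system.file("extdata", "Kijas2016_Sheep_subset.ped", package = "detectRUNS")
  mapFile <- system.file("extdata", "Kijas2016_Sheep_subset.map", package = "detectRUNS")

  runs_consecutive <- detectRUNS::consecutiveRUNS.run(
    genotypeFile = genotypeFile,
    mapFile = mapFile,
    ROHet = FALSE,          # FALSE: homozygous runs; TRUE: heterozygous runs
    maxOppRun = 1,          # opposite genotypes tolerated per run
    maxMissRun = 1,         # missing genotypes tolerated per run
    minSNP = 15,
    minLengthBps = 100000,
    maxGap = 10^6
  )
  print(head(runs_consecutive))
}
```

Unlike the sliding-window call, the tolerance parameters here apply to the run itself (maxOppRun, maxMissRun) rather than to a window.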

Install

install.packages('detectRUNS')

Monthly Downloads: 705
Version: 0.9.6
License: GPL-3
Maintainer: Filippo Biscarini
Last Published: October 24th, 2019

Functions in detectRUNS (0.9.6)

plot_Runs: Function to plot runs per individual
slidingRuns: Function to detect runs using the sliding-window approach
homoZygotTestCpp: Function to check whether a window is (loosely) homozygous or not
slidingRUNS.run: Main function to detect RUNS (ROHom/ROHet) using sliding windows (à la PLINK)
slidingWindow: Function to slide a window over a vector (individual's genotypes)
pedConvertCpp: Convert ped genotypes to 0/1
writeRUN: Function to write out RUNS per individual animal
plot_DistributionRuns: Plot the distribution of runs
slidingWindowCpp: Function to slide a window over a vector (individual's genotypes)
plot_InbreedingChr: Plot Froh-based inbreeding coefficients by group
heteroZygotTest: Function to check whether a window is (loosely) heterozygous or not
heteroZygotTestCpp: Function to check whether a window is (loosely) heterozygous or not
plot_StackedRuns: Plot stacked runs
plot_PatternRuns: Plot the sum of run lengths (or average run length) against the number of runs per individual
readPOPCpp: Function to return a dataframe of population (POP, ID)
summaryRuns: Summary statistics on detected runs
reorderDF: Function to reorder data frames by CHROMOSOME
plot_SnpsInRuns: Plot the number of times each SNP falls inside runs
plot_manhattanRuns: Plot the proportion of times SNPs are inside runs (Manhattan plot)
readExternalRuns: Read runs from external files
Froh_inbreeding: Function to calculate Froh genome-wide or chromosome-wide
tableRuns: Function to retrieve the most common runs in the population
Froh_inbreedingClass: Function to calculate Froh using a ROH class
plot_ViolinRuns: Violin plot of run length per individual (either sum or mean)
snpInsideRuns: Function to count the number of times a SNP is in a RUN
snpInRun: Function to return a vector of T/F for whether or not a SNP is in a RUN
snpInsideRunsCpp: Function to count the number of times a SNP is in a RUN
snpInRunCpp: Function to return a vector of T/F for whether or not a SNP is in a RUN
chromosomeLength: Function to find the max position for each chromosome
genoConvertCpp: Convert 0/1/2 genotypes to 0/1
consecutiveRUNS.run: Main function to detect genomic RUNS (ROHom/ROHet) using the consecutive method
findOppositeAndMissing: Function to calculate the oppositeAndMissingGenotypes array
genoConvert: Convert 0/1/2 genotypes to 0/1
createRUNdf: Function to create a dataframe of RUNS per individual animal. Requires a map file (either a filename to read or an R object). Parameters on the maximum number of missing and opposite genotypes in the run (not the window) are implemented here.
consecutiveRuns: Function to detect consecutive runs in a vector (individual's genotypes)
consecutiveRunsCpp: Function to detect consecutive runs in a vector (individual's genotypes)
homoZygotTest: Function to check whether a window is (loosely) homozygous or not
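
Several of the functions above deal with Froh, the ROH-based inbreeding coefficient, commonly defined as the total length of an individual's ROH divided by the length of the genome covered by markers. The base-R sketch below illustrates that ratio on made-up data. It is not the package's Froh_inbreeding implementation: the toy data frame, its column names (chosen to resemble a runs table) and the fixed genome_length are all assumptions for illustration. In practice the genome length would be derived from the map file rather than hard-coded.

```r
# Toy runs table (fabricated values, for illustration only).
runs_toy <- data.frame(
  group     = c("A", "A", "A", "B"),
  id        = c("ind1", "ind1", "ind2", "ind3"),
  chrom     = c("1", "2", "1", "1"),
  nSNP      = c(40, 25, 60, 30),
  from      = c(1000000, 5000000, 2000000, 100000),
  to        = c(3000000, 6000000, 7000000, 1100000),
  lengthBps = c(2000000, 1000000, 5000000, 1000000),
  stringsAsFactors = FALSE
)

# Hypothetical genome length in base pairs (assumption for this sketch).
genome_length <- 1e8

# Sum ROH length per individual, then divide by the genome length.
roh_sum <- aggregate(lengthBps ~ group + id, data = runs_toy, FUN = sum)
roh_sum$Froh <- roh_sum$lengthBps / genome_length

# Named vector for easy lookup by individual.
froh <- setNames(roh_sum$Froh, roh_sum$id)
print(roh_sum)
```

A chromosome-wide version would aggregate by chrom as well and divide by each chromosome's length instead of the whole-genome value.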