admixr

Introduction

ADMIXTOOLS is a widely used software package for calculating admixture statistics and testing population admixture hypotheses.

A typical ADMIXTOOLS workflow often involves a combination of sed/awk/shell scripting and manual editing to create different configuration files. These are then passed as command-line arguments to one of the ADMIXTOOLS commands, and control how a particular analysis is run. The results are then redirected to another file, which the user has to parse to extract values of interest, often using command-line utilities again or (worse) by manual copy-pasting. Finally, the processed results are analysed in R, Excel or another program.

This workflow can be a little cumbersome, especially if one wants to explore many hypotheses involving different combinations of populations. Most importantly, however, it makes it difficult to follow the rules of best practice for reproducible science, as it is nearly impossible to construct reproducible automated "pipelines".

This R package makes it possible to perform all stages of an ADMIXTOOLS analysis entirely from R. It provides a set of convenient functions that completely remove the need for "low level" configuration of individual ADMIXTOOLS programs, allowing users to focus on the analysis itself.

How to cite

admixr is now published as an Application Note in the journal Bioinformatics. If you use it in your work, please cite the paper!

Installation instructions

To install admixr from GitHub, you first need to install the devtools package. You can simply run:

install.packages("devtools")
devtools::install_github("bodkan/admixr")

If you want to update admixr to a more recent version, simply run devtools::install_github("bodkan/admixr") again.

Note that in order to use the admixr package, you need a working installation of ADMIXTOOLS! You can find installation instructions here.

Furthermore, you need to make sure that R can find ADMIXTOOLS binaries on the $PATH. If this is not the case, running library(admixr) will show a warning message with instructions on how to fix this.
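As a rough sketch of how to check this yourself, the base R function Sys.which() reports where (if anywhere) a command is found on the $PATH. The helper names admixtools_on_path and add_to_path below are hypothetical and not part of admixr; qpDstat is used only as a representative ADMIXTOOLS binary, and the directory in the usage comment is a placeholder for wherever ADMIXTOOLS is installed on your machine.

```r
# Hypothetical helper (not part of admixr): TRUE if `cmd` can be found
# on the PATH as seen by the current R session.
admixtools_on_path <- function(cmd = "qpDstat") {
  unname(nzchar(Sys.which(cmd)))
}

# Hypothetical helper (not part of admixr): append a directory to PATH
# for the current R session only.
add_to_path <- function(dir) {
  new_path <- paste(Sys.getenv("PATH"), dir, sep = .Platform$path.sep)
  Sys.setenv(PATH = new_path)
}

found <- admixtools_on_path()
if (!found) {
  message("qpDstat was not found on the PATH visible to R")
  # Placeholder directory; replace with your own installation path:
  # add_to_path("/path/to/AdmixTools/bin")
}
```

A change made with Sys.setenv() lasts only for the current session. A permanent fix would normally go into a shell profile or an .Renviron file instead.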

Follow me on Twitter if you want to stay updated on new admixr developments.

Example

This is all the code you need to perform an ADMIXTOOLS analysis with this package: no shell scripting, no copy-pasting, and no manual editing of text files. All you need is a working ADMIXTOOLS installation and a path to EIGENSTRAT data (a trio of ind/snp/geno files sharing a common prefix). Here, download_data() supplies that path and eigenstrat() wraps it in the snp_data object.

library(admixr)

# download a small testing dataset to a temporary directory and
# process it for use in R
snp_data <- eigenstrat(download_data())

result <- d(
  W = c("French", "Sardinian"), X = "Yoruba", Y = "Vindija", Z = "Chimp",
  data = snp_data
)

result
#> # A tibble: 2 x 10
#>   W         X      Y       Z          D  stderr Zscore  BABA  ABBA  nsnps
#>   <chr>     <chr>  <chr>   <chr>  <dbl>   <dbl>  <dbl> <dbl> <dbl>  <dbl>
#> 1 French    Yoruba Vindija Chimp 0.0313 0.00693   4.51 15802 14844 487753
#> 2 Sardinian Yoruba Vindija Chimp 0.0287 0.00679   4.22 15729 14852 487646

Note that a single call to the d function generates all required intermediate config and population files, runs ADMIXTOOLS, parses its log output and returns the result as a data.frame object. It does all of this behind the scenes, without the user having to deal with low-level technical details.
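To make the columns of the result easier to read, here is a small sketch in base R of how they appear to relate to each other. It assumes the usual definition of the D statistic as the normalized difference of BABA and ABBA site counts, with the Z-score being D divided by its standard error. The numbers are copied by hand from the output above; nothing here calls ADMIXTOOLS or admixr.

```r
# Values copied from the example output above.
res <- data.frame(
  W      = c("French", "Sardinian"),
  BABA   = c(15802, 15729),
  ABBA   = c(14844, 14852),
  stderr = c(0.00693, 0.00679)
)

# Assumed definition: D = (BABA - ABBA) / (BABA + ABBA)
res$D <- (res$BABA - res$ABBA) / (res$BABA + res$ABBA)

# Z-score as D in units of its standard error
res$Zscore <- res$D / res$stderr

print(res)
```

The recomputed D values round to the 0.0313 and 0.0287 shown in the table. The Z-scores agree only approximately, because the standard errors used here are the rounded values printed in the output.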

To see many more examples, please check out the tutorial vignette.

Install

install.packages('admixr')

Monthly Downloads: 284

Version: 0.9.1

License: MIT + file LICENSE

Maintainer: Martin Petr

Last Published: July 3rd, 2020

Functions in admixr (0.9.1)

count_snps
Count the number/proportion of present/missing sites in each sample

f4ratio
Calculate the D, f4, f4-ratio, or f3 statistic.

print.EIGENSTRAT
EIGENSTRAT print method

merge_eigenstrat
Merge two EIGENSTRAT datasets

filter_bed
Filter EIGENSTRAT data based on a given BED file

loginfo
Print the full log output of an admixr wrapper to the console.

eigenstrat
EIGENSTRAT data constructor

%>%
Pipe operator

keep_transitions
Keep only transitions (C->T and G->A substitutions), removing transversions

reset
Reset modifications to an EIGENSTRAT object

relabel
Change labels of populations or samples

read_output
Read an output file from one of the ADMIXTOOLS programs.

download_data
Download example SNP data.

transversions_only
Remove transitions (C->T and G->A substitutions), keeping only transversions

write_ind
Write an EIGENSTRAT ind/snp/geno file.

read_ind
Read an EIGENSTRAT ind/snp/geno file.

qpWave
Find the most likely number of ancestry waves using the qpWave method.

print.admixr_result
Print out the admixr result object (dataframe or a list) without showing the hidden attributes.

qpAdm
Calculate ancestry proportions in a set of target populations.

qpAdm_filter
Filter qpAdm rotation results for only 'sensible' models

qpAdm_rotation
Fit qpAdm models based on the rotation strategy described in Harney et al. 2020 (bioRxiv)