pophelper 2.0.0

pophelper is an R package and web app to analyse and visualise population structure. pophelper currently supports run files generated by population analysis programs such as STRUCTURE and TESS, as well as numeric delimited formats such as those produced by ADMIXTURE or fastSTRUCTURE. The pophelper package can be used to tabulate runs, summarise runs, estimate K using the Evanno method, export files for CLUMPP, export files for DISTRUCT and generate barplot figures.

For a detailed demonstration and walkthrough, refer to the online vignette. A quick intro video is available here. New versions and updates are shown only on this GitHub page.

Installation

You need to have the R statistical package (> 3.3.0) installed on your system. R is open-source and freely available to download for Windows, Mac and other OS. Next, install the devtools package, then use it to install pophelper from GitHub.
Note that functions, function names, workflow and arguments are significantly different in this version (2.0.0) compared to previous versions. This version is not backward compatible with previous versions. ggplot2 version 2.2.0 or above is required.

# install devtools package from CRAN
install.packages('devtools',dependencies=TRUE)
library(devtools)

# install pophelper package from GitHub
install_github('royfrancis/pophelper')

# load library for use
library(pophelper)
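
The version requirements stated above can be checked before installing. The following is a minimal sketch in base R, not part of pophelper; the variable names r_ok and ggplot2_ok are made up for illustration, and the thresholds are copied from the text above.

```r
# Check the R version against the requirement stated above (> 3.3.0).
r_ok <- getRversion() > "3.3.0"

# ggplot2 may not be installed yet, so test for it before asking for its version.
ggplot2_ok <- requireNamespace("ggplot2", quietly = TRUE) &&
  utils::packageVersion("ggplot2") >= "2.2.0"

if (!r_ok) message("R is older than required; please update R.")
if (!ggplot2_ok) message("ggplot2 >= 2.2.0 not found; install or update it.")
```

If both values are TRUE, the installation steps above should not be blocked by these two requirements.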

Note that pophelper 1.2.0 and later includes binary executables for CLUMPP and DISTRUCT. This is experimental and may not work on all OS and versions.

pophelper 2.0.0 has been tested on the following systems:

  • Windows 10 64bit, R 3.3.2
  • Windows 10 64bit, R 3.3.1
  • Windows 7 64bit, R 3.3.2
  • Windows 7 64bit, R 3.3.0
  • Ubuntu Linux 16.04.1 LTS 64bit, R 3.3.2 (DISTRUCT executable does not work)
  • Scientific Linux 6.8 (Carbon) 64bit, R 3.3.1

Web App

An online interactive version of pophelper is available at pophelper.com. Not all features of the R package are supported in the online app.

List of Functions

For help on any function, use
?tabulateQ
?evannoMethodStructure

readQ()                   # Convert q-matrix run files to qlist
tabulateQ()               # Tabulate a qlist
summariseQ()              # Summarise an output from tabulateQ()
clumppExport()            # Generate CLUMPP input/output files
collectClumppOutput()     # Collect CLUMPP output into a common directory
plotQ()                   # Create barplots from qlist
plotQMultiline()          # Create multi-line barplots from qlist
distructExport()          # Export files for DISTRUCT from qlist

evannoMethodStructure()   # Perform the Evanno method for STRUCTURE data
collectRunsTess()         # Collect TESS output from multiple directories into one

analyseQ()                # A wrapper function to quickly tabulate, summarise,
                          # perform the Evanno method, generate CLUMPP output
                          # and generate barplots from filenames/paths.
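
As a rough illustration of how the first few functions in the list chain together, here is a short sketch. It assumes that the installed package ships sample STRUCTURE runs under files/structure (an assumption; substitute the paths to your own run files) and that default arguments are acceptable. The variable names sfiles, qlist, tab and summ are arbitrary. Check ?readQ, ?tabulateQ and ?summariseQ for the arguments available in your version.

```r
# Sketch only: the whole block is skipped when pophelper is not installed.
if (requireNamespace("pophelper", quietly = TRUE)) {
  library(pophelper)

  # Assumption: sample STRUCTURE runs live under "files/structure" in the
  # installed package. Replace with your own file paths if they do not.
  sfiles <- list.files(system.file("files/structure", package = "pophelper"),
                       full.names = TRUE)

  if (length(sfiles) > 0) {
    qlist <- readQ(files = sfiles)  # run files -> qlist
    tab   <- tabulateQ(qlist)       # tabulate the runs in the qlist
    summ  <- summariseQ(tab)        # summarise the tabulated runs
    print(summ)
  }
}
```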

Workflow


Fig. 1: Workflow for all filetypes. Files/objects are indicated in black text and functions are indicated in blue. External executables are indicated in green. For CLUMPP and DISTRUCT results, the executables must be run manually, or the argument useexe=T must be set. The analyseQ() function is a wrapper function which can be used to run several functions together. This is indicated by the orange path. Functions indicated with superscripts 1 and 2 are only applicable to specific filetypes. collectRunsTess() is used to consolidate and rename TESS runs. evannoMethodStructure() is used to perform the Evanno method for STRUCTURE runs.
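
To make the Evanno step less of a black box, the following base R sketch computes delta K by hand using the common form of the calculation: the absolute second difference of the mean log-likelihood L(K) across consecutive K, divided by the standard deviation of L(K) across runs. This is not pophelper's implementation. The data frame runs, its column names and all the values in it are made up for illustration, and the code assumes that K values are consecutive with at least two runs each.

```r
# Hypothetical input: one row per STRUCTURE run, with K and the estimated
# log probability of the data for that run. All values are invented.
runs <- data.frame(
  k    = rep(1:5, each = 3),
  elpd = c(-1000, -1002, -998,
            -900,  -903, -897,
            -700,  -701, -699,
            -690,  -694, -686,
            -685,  -690, -680)
)

ks     <- sort(unique(runs$k))                       # assumed consecutive
mean_l <- as.vector(tapply(runs$elpd, runs$k, mean)) # mean L(K) per K
sd_l   <- as.vector(tapply(runs$elpd, runs$k, sd))   # sd of L(K) per K

# |L''(K)| = |L(K+1) - 2 L(K) + L(K-1)|, defined only for interior K values.
abs_l2 <- abs(diff(mean_l, differences = 2))

# Delta K for the interior K values (the first and last K have no value).
interior <- 2:(length(ks) - 1)
evanno <- data.frame(k = ks[interior],
                     delta_k = abs_l2 / sd_l[interior])

# K with the largest delta K.
best_k <- evanno$k[which.max(evanno$delta_k)]
print(evanno)
print(best_k)  # 3 for these made-up values
```

Delta K is undefined for the smallest and largest K tested, which is why the result has two fewer rows than the number of K values.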


Fig. 2: Plots from Evanno method.


Fig. 3: Left: Single run plotted separately without group labels (top) and with group labels (bottom). Right: Two runs joined together in one image without group labels (top) and with group labels (bottom).


Fig. 4: (A) Left: Single run plotted with two sets of group labels. (B) Right: Two runs joined together along with two sets of group labels and strip panel turned off (sp=F).


Fig. 5: Left: Single run plotted separately with group labels sorted by one cluster (top) and sorted by all clusters (bottom). Right Top: Single run with group labels reordered (Pop B before Pop A). Right Bottom: Two runs joined together in one image with group labels reordered and individuals sorted by all clusters.


Fig. 6: Left: Combined files (Three STRUCTURE runs for K=4). Middle: Aligned files (Three STRUCTURE runs for K=4 aligned using CLUMPP). Right: Merged file (Three runs for K=4 merged into one table/figure using CLUMPP).


Fig. 7: Left: plotQMultiline default output. Middle Left: Modified output where samples per line and lines per page were defined manually. Middle Right: Individuals sorted by cluster 1. Right: Individuals sorted by all clusters.

Fig. 8: Multiline plots with (left) standard colours, (middle) rich.colors() from gplots package and (right) brewer.pal(8,"Spectral") from RColorBrewer package.

For a detailed demonstration and description, refer to the vignette.

Disclaimer

The pophelper R package is offered free and without warranty of any kind, either expressed or implied. I will not be held liable to you for any damage arising out of the use, modification or inability to use this program. The pophelper R package can be used, redistributed and/or modified freely for non-commercial purposes subject to the original source being properly cited. Licensed under GPL-3. Please make sure you verify all your results.

Contact

If you have any comments, suggestions, corrections or ideas on ways to improve or extend this package, feel free to contact me. Submit a report on the GitHub issues page.

2016 Roy M Francis

Version: 2.0.0
License: GPL-3
Maintainer: Roy Francis
Last Published: December 29th, 2020

Functions in pophelper (2.0.0)

clumppExport: Generate CLUMPP output from a qlist
clumppExportAdmixture: Deprecated
getColours: Internal: Get Colours
plotMultiline: Deprecated
summariseQ: Summarise a tabulated dataframe
getColors: Internal: Generate colours based on number of K
summariseRunsStructure: Deprecated
llToUtmzone: Internal: Find UTM zone from a latitude and longitude
summariseRunsTess: Deprecated
runsToDfTess: Deprecated
clumppExportMatrix: Deprecated
clumppExportStructure: Deprecated
getDim: Internal: Get dimensions for figures
getOS: Internal: Find current OS
getPlotParams: Internal: Generate parameters for plots with labels
grpLabels: Internal: Handles grp subset/order
readQClumpp: Convert CLUMPP format numeric text files to qlist
readQStructure: Convert STRUCTURE run files to qlist
summariseRunsAdmixture: Deprecated
summariseRunsMatrix: Deprecated
checkQ: Internal: Check input filetype
checkRuns: Internal: Check if a selected run is STRUCTURE, TESS, MATRIX or TAB file
collectRunsTess: Collect TESS cluster run files from multiple folders
determineRowsAndCols: Internal: Determine rows and columns for arbitrary number of plots
plotRuns: Deprecated
plotRunsInterpolate: Deprecated
readQ: Convert run files (q-matrices) to qlist
readQBasic: Convert delimited text files to qlist
tabulateRunsMatrix: Deprecated
tabulateRunsStructure: Deprecated
clumppExportTess: Deprecated
collectClumppOutput: Collect CLUMPP output files from multiple folders
distructColours: Internal: Vector of 90 Distruct colours
distructExport: Generate files for DISTRUCT
plotQ: Generate barplots from qlists
plotQMultiline: Plot a qlist as individual-level multiline barplot
tabulateQ: Tabulate runs from a qlist
tabulateRunsAdmixture: Deprecated
plotRunsSpatial: Deprecated
popLabels: Internal: Handles pop subset/order
runsToDfMatrix: Deprecated
runsToDfStructure: Deprecated
analyseQ: Analyse STRUCTURE, TESS or BASIC text runs. Wrapper around several smaller functions.
analyseRuns: Deprecated
ellipseCI: Internal: ellipseCI
evannoMethodStructure: Perform the Evanno method for STRUCTURE runs
readQTess: Convert TESS cluster files to qlist
runsToDfAdmixture: Deprecated
tabulateRunsTess: Deprecated
unitConverter: Internal: Convert value between dimension units