spider

Overview

The official GitHub repository for the R package "SPecies IDentity and Evolution in R" (spider).

spider provides functions for analysing species limits and DNA barcoding data. It includes functions for generating summary statistics from DNA barcode data, assessing specimen identification efficacy, testing and optimising divergence thresholds, assessing diagnostic nucleotides, and calculating the probability of reciprocal monophyly. In addition, a sliding window function allows information to be analysed across a gene, an approach often used for marker design in degraded DNA studies. Further information on the package has been published in Brown et al. (2012).
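
To make the sliding window idea concrete, here is a minimal base-R sketch on a toy alignment. It does not call spider's own slidingWindow function; the helper window_coords, the toy sequences, and the choice of "number of variable sites" as the per-window statistic are all assumptions made for illustration only.

```r
# Hypothetical helper (not part of spider): start and end coordinates
# of windows of a given width, stepped along an alignment.
window_coords <- function(aln_length, width, interval = 1) {
  starts <- seq(1, aln_length - width + 1, by = interval)
  data.frame(start = starts, end = starts + width - 1)
}

# Toy alignment: rows are sequences, columns are sites (made-up data).
aln <- rbind(
  s1 = strsplit("ACGTACGTACGT", "")[[1]],
  s2 = strsplit("ACGTACGAACGA", "")[[1]],
  s3 = strsplit("ACTTACGAACCA", "")[[1]]
)

# A site is variable if more than one base occurs in its column.
is_variable <- apply(aln, 2, function(site) length(unique(site)) > 1)

# Count the variable sites that fall inside each non-overlapping window.
win <- window_coords(ncol(aln), width = 4, interval = 4)
win$variable_sites <- mapply(
  function(s, e) sum(is_variable[s:e]),
  win$start, win$end
)
print(win)
```

Windows with more variable sites are, in this simplified view, the more informative regions of the gene; a real analysis would typically compare distance or identification statistics per window rather than a raw count.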

For an introduction to the package, visit our spider tutorial and manual. Over time, the tutorial will be expanded and moved into GitHub vignettes and project pages.

If you are interested in previous versions (before v1.5.0) of the spider source code, check out our old repository hosted at r-forge.

Installation

Stable CRAN version (NOT YET WORKING).

install.packages("spider")

Or development version from GitHub (WORKING).

devtools::install_github("boopsboops/spider")

Examples

Here, we will do a quick "best close match" analysis (Meier et al., 2006) on an Anoteropsis wolf spider dataset (Vink & Paterson, 2003) to see how well DNA barcodes can identify individuals in a simulated identification scenario.

# load up the data
library("spider")
data(anoteropsis)
# make a quick species vector (unique species name for each individual) from the taxon labels
anoSpp <- sapply(strsplit(rownames(anoteropsis), split="_"), function(x) paste(x[1], x[2]))
head(anoSpp, n=4)
#> [1] "Artoria flavimanus" "Artoria separata" "Anoteropsis adumbrata" "Anoteropsis adumbrata"
# get some statistics about the sequence lengths
seqStat(anoteropsis)
#> Min    Max   Mean Median Thresh 
#> 395    409    408    409     33
# make a distance matrix from raw p-distances
anoDist <- ape::dist.dna(anoteropsis, model="raw", pairwise.deletion=TRUE)
# calculate identification success based on a 1% interspecific threshold
table(bestCloseMatch(distobj=anoDist, sppVector=anoSpp, threshold=0.01))
#> correct incorrect     no id 
#>      11         2        20 
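
For readers who want to see the logic behind the three outcome categories, the following base-R sketch applies a simplified best close match rule to a small made-up distance matrix. The function bcm_sketch and the toy data are hypothetical and written for illustration; this is not spider's implementation and may differ from it in detail (for example in how ties are reported).

```r
# Hypothetical helper (not spider's code): a simplified best close match.
# Each individual is treated as a query against all other individuals.
bcm_sketch <- function(distmat, spp, threshold = 0.01) {
  d <- as.matrix(distmat)
  diag(d) <- NA  # a query must not match itself
  vapply(seq_len(nrow(d)), function(i) {
    nearest <- min(d[i, ], na.rm = TRUE)
    # No match close enough: leave the query unidentified.
    if (nearest > threshold) {
      return("no id")
    }
    # Species of all individuals tied at the smallest distance.
    hits <- unique(spp[which(d[i, ] == nearest)])
    if (length(hits) == 1 && hits == spp[i]) {
      "correct"
    } else if (spp[i] %in% hits) {
      "ambiguous"
    } else {
      "incorrect"
    }
  }, character(1))
}

# Made-up pairwise distances for five individuals of three species.
toy <- matrix(c(
  0.000, 0.002, 0.050, 0.060, 0.080,
  0.002, 0.000, 0.055, 0.058, 0.082,
  0.050, 0.055, 0.000, 0.004, 0.008,
  0.060, 0.058, 0.004, 0.000, 0.070,
  0.080, 0.082, 0.008, 0.070, 0.000
), nrow = 5, byrow = TRUE)
toySpp <- c("A", "A", "B", "B", "C")

res <- bcm_sketch(toy, toySpp, threshold = 0.01)
print(table(res))
```

In this toy data the single individual of species "C" has no conspecific to match, so it can only be scored as "incorrect" or, under a stricter threshold, "no id". Species represented by one individual behave the same way in any leave-one-out identification test, which is worth keeping in mind when reading the counts above.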

Meta

Monthly Downloads: 330
Version: 1.5.0
License: MIT + file LICENSE
Maintainer: Rupert Collins
Last Published: February 16th, 2018

Functions in spider (1.5.0)

chaoHaplo: Chao estimator of haplotype number
sarkar: Dummy sequences illustrating the categories of diagnostic nucleotides
rnucDiag: Nucleotide diagnostics for species alignments
heatmapSpp: Visualise a distance matrix using a heatmap
search.BOLD: Downloads DNA sequences from the Barcode of Life Database (BOLD)
threshOpt: Threshold optimisation
slidingWindow: Create windows along an alignment
stats.BOLD: Downloads DNA sequences from the Barcode of Life Database (BOLD)
tajima.K: Calculate Tajima's K index of divergence
tiporder: Orders tip labels by their position on the tree
slideNucDiag: Sliding nucleotide diagnostics
minInDist: Nearest non-conspecific and maximum intra-specific distances
paa: Population Aggregate Analysis
read.GB: Download sequences from Genbank with metadata
plot.haploAccum: Plotting haplotype accumulation curves
nucDiag: Nucleotide diagnostics for species alignments
rosenberg: Rosenberg's probability of reciprocal monophyly
seeBarcode: Create illustrative barcodes
rankSlidWin: Rank a 'slidWin' object
ordinDNA: Calculates a Principal Components Ordination of genetic distances
salticidae: Cytochrome oxidase I (COI) sequences of world-wide species of Salticidae
polyBalance: Balance of a phylogenetic tree with polytomies
slideAnalyses: Sliding window analyses
slideBoxplots: Boxplots across windows
spider-package: Species Identity and Evolution in R
sppDistMatrix: Mean intra- and inter-specific distance matrix
titv: Number of pairwise transitions and transversions in an alignment
sppDist: Intra and inter-specific distances
seqStat: Sequence statistics
sppVector: Species Vectors
tree.comp: Tree comparisons
tclust: Clustering by a threshold
threshID: Measures of identification accuracy
woodmouse: Cytochrome b Gene Sequences of Woodmice
anoteropsis: Cytochrome oxidase I (COI) sequences of New Zealand _Anoteropsis_ species
blockAlignment: Make all sequences the same length
dolomedes: Cytochrome oxidase I (COI) sequences of New Zealand _Dolomedes_ species
dataStat: Taxa statistics
cgraph: Complete graph
monophylyBoot: Species monophyly over a tree
checkDNA: Check a DNA alignment for missing data
monophyly: Species monophyly over a tree
nearNeighbour: Measures of identification accuracy
nonConDist: Nearest non-conspecific and maximum intra-specific distances
bestCloseMatch: Measures of identification accuracy
is.ambig: Missing bases in alignments
localMinima: Determine thresholds from a density plot
maxInDist: Nearest non-conspecific and maximum intra-specific distances
plot.ordinDNA: Plot an 'ordinDNA' object
plot.slidWin: Plot a 'slidWin' object
haploAccum: Haplotype accumulation curves
read.BOLD: Downloads DNA sequences from the Barcode of Life Database (BOLD)
rmSingletons: Detect and remove singletons