slideAnalyses: Sliding window analyses

Description

Wraps a number of measures used in sliding window analyses into one easy-to-use function.

Usage

slideAnalyses(DNAbin, sppVector, width, interval = 1, 
    distMeasures = TRUE, treeMeasures = FALSE)

Arguments

DNAbin

A DNA alignment of class 'DNAbin'.

sppVector

Species vector (see sppVector).

width

Desired width of windows in number of nucleotides.

interval

Distance between successive windows in number of nucleotides. Default of 1. Passing 'codons' sets the interval to 3.

distMeasures

Logical. Should distance measures be calculated? Default of TRUE.

treeMeasures

Logical. Should tree-based measures be calculated? Default of FALSE.
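
The way width and interval combine can be sketched in base R. This is only an illustration under stated assumptions: window_bounds is a hypothetical helper, not part of the package, and it assumes that windows start at column 1 and that only full-width windows are kept. The package's own handling of the final window may differ.

```r
# Hypothetical helper (not part of the package): lists the first and last
# alignment column of each window for a given width and interval.
window_bounds <- function(aln_length, width, interval = 1) {
  # Mirror the documented shortcut: 'codons' means a step of 3 nucleotides.
  if (identical(interval, "codons")) interval <- 3
  stopifnot(width >= 1, width <= aln_length, interval >= 1)
  # Assumption: windows start at column 1 and only full-width windows are kept.
  starts <- seq(1, aln_length - width + 1, by = interval)
  data.frame(start = starts, end = starts + width - 1)
}

# A 20-column alignment, windows of 5 nucleotides, stepping 5 at a time:
w <- window_bounds(aln_length = 20, width = 5, interval = 5)
print(w)  # four windows: 1-5, 6-10, 11-15, 16-20
```

A smaller interval gives overlapping windows and more of them, which is the main driver of run time.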

Value

An object of class 'slidWin' which is a list containing the following elements:
win_mono_out: Proportion of species that are monophyletic.
comp_out: Proportion of clades that are identical between the NJ tree calculated for the window and the tree calculated for the full dataset.
comp_depth_out: Proportion of shallow clades that are identical.
pos_tr_out: Index of window position for tree-based analyses.
noncon_out: Proportion of zero non-conspecific distances.
nd_out: The sum of diagnostic nucleotides for each species.
zero_out: The number of zero-length distances.
dist_mean_out: Overall mean K2P distance of each window.
pos_out: Index of window position.
dat_zero_out: Number of zero inter-specific distances in the full dataset.
boxplot_out: Always FALSE. Required for plot.slidWin.
distMeasures: Value of argument. Required for plot.slidWin.
treeMeasures: Value of argument. Required for plot.slidWin.
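
Because the result is a list, its elements can be read by name. The following is a minimal sketch of that, assuming the spider package (which provides slideAnalyses and the dolomedes data) is installed. Only distance measures are requested so that it runs quickly, and res is just an illustrative variable name.

```r
library(spider)  # assumed to provide slideAnalyses() and the dolomedes data

data(dolomedes)
# Species labels taken from the first five characters of each sequence name
doloSpp <- substr(dimnames(dolomedes)[[1]], 1, 5)

# Distance measures only (the default), 200 bp windows every 10 bp
res <- slideAnalyses(dolomedes, doloSpp, 200, interval = 10)

class(res)                # expected to include 'slidWin'
head(res$pos_out)         # window positions
head(res$dist_mean_out)   # mean K2P distance per window

# Dispatches to plot.slidWin, which uses the stored argument values
plot(res)
```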

Details

Distance measures include the following: proportion of zero non-conspecific distances, number of diagnostic nucleotides, number of zero-length distances, and overall mean distance.

Tree-based measures include the following: proportion of species that are monophyletic, proportion of clades that are identical between the neighbour-joining tree calculated for the window and the tree calculated for the full dataset, and the same comparison restricted to shallow clades (method="shallow").

Tree-based measures are considerably more time-intensive than distance measures. With many taxa and short windows, this part of the function can take hours.

Both distance and tree measures are calculated from a K2P distance matrix created from the data with the option pairwise.deletion = TRUE. When a sequence with missing data is compared with other sequences, an NA distance can result. These NA values are ignored when the slideAnalyses distance metrics are calculated. The tree measures, however, cannot cope with missing distances, so no result is returned for windows in which some sequences consist solely of missing data.
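
The distance matrix described above can be approximated for a single window with dist.dna from the ape package. This is a sketch of the idea rather than the function's internal code: the window columns (1 to 200) and the variable names win and d are chosen for illustration, and the spider package is assumed to be installed so that the dolomedes data are available.

```r
library(spider)  # attaches ape, which provides dist.dna()

data(dolomedes)

# Illustrative window: the first 200 columns of the alignment
win <- dolomedes[, 1:200]

# K2P ("K80") distances, dropping missing sites pair by pair
d <- dist.dna(win, model = "K80", pairwise.deletion = TRUE)

# Pairs with no usable sites in common would appear as NA
sum(is.na(d))

# Distance summaries skip NA values, as described above
mean(d, na.rm = TRUE)
```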

Examples

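library(spider)

data(dolomedes)
# Distance matrix for the full alignment (not passed to slideAnalyses)
doloDist <- dist.dna(dolomedes)
# Species vector: the first five characters of each sequence name
doloSpp <- substr(dimnames(dolomedes)[[1]], 1, 5)

# 200 bp windows every 10 bp; tree-based measures can be slow
slideAnalyses(dolomedes, doloSpp, 200, interval=10, treeMeasures=TRUE)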
