ia(pop, sample = 0, method = 1, quiet = FALSE, missing = "ignore",
hist = TRUE, valuereturn = FALSE)
pop: a genind object OR any fstat, structure, genetix, genpop, or genalex formatted file.
method: see shufflepop for details.
quiet: TRUE prints nothing. FALSE (default) will print the population name and progress bar.
missing: see missingno for details.
hist: logical. If TRUE, a histogram will be printed for each population if there is sampling.
valuereturn: logical. If TRUE, the index values from the reshuffled data are returned. If FALSE (default), the index is returned with associated p-values in a 4-element numeric vector.
The calculation for the distance between two individuals at a single locus with $a$ allelic states and a ploidy of $k$ is as follows (except for Presence/Absence data):
$$d = \displaystyle \frac{k}{2}\sum_{i=1}^{a} \mid A_{i} - B_{i}\mid$$
To find the total number of differences between two individuals over all loci, sum d over the m loci to get a value we'll call D:
$$D = \displaystyle \sum_{i=1}^{m} d_i$$
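As a rough illustration of the two formulas above, the sketch below computes d and D for one pair of diploid individuals. It assumes that $A_i$ and $B_i$ are each individual's allele frequencies at the locus (allele counts divided by ploidy); `locus_dist` and the genotypes are hypothetical and are not part of poppr.

```r
# Hypothetical helper: distance between two individuals at one locus.
# A and B are allele-frequency vectors (allele counts / ploidy); this
# encoding is an assumption made for the example.
locus_dist <- function(A, B, k = 2) {
  (k / 2) * sum(abs(A - B))
}

# Two diploid individuals scored at m = 2 loci with 3 and 2 alleles.
ind1 <- list(c(1, 0, 0), c(0.5, 0.5))   # genotypes 1/1 and 1/2
ind2 <- list(c(0.5, 0.5, 0), c(0, 1))   # genotypes 1/2 and 2/2

d <- mapply(locus_dist, ind1, ind2)     # per-locus distances d
D <- sum(d)                             # total distance D over all loci
```

Here each locus differs by one allele, so both values of d are 1 and D is 2.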
These values are calculated over all ${n \choose 2}$ possible pairs of individuals in the data set, after which you end up with ${n \choose 2}\cdot{}m$ values of d and ${n \choose 2}$ values of D. Calculating the observed variance is fairly straightforward (modified from Agapow and Burt, 2001):
$$V_O = \frac{\displaystyle \sum_{i=1}^{n \choose 2} D_{i}^2 - \frac{(\displaystyle\sum_{i=1}^{n \choose 2} D_{i})^2}{{n \choose 2}}}{{n \choose 2}}$$
The expected variance is the sum of the variances of the individual loci. The calculation at a single locus, j, is the same as the previous equation, substituting values of d for D:
$$var_j = \frac{\displaystyle \sum_{i=1}^{n \choose 2} d_{i}^2 - \frac{(\displaystyle\sum_{i=1}^{n \choose 2} d_i)^2}{{n \choose 2}}}{{n \choose 2}}$$
The expected variance is then the sum of all the variances over all m loci:
$$V_E = \displaystyle \sum_{j=1}^{m} var_j$$
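The variance formulas above can be sketched in a few lines of R. This is only an illustration on made-up distances, not the code poppr runs internally; the matrix `d` and the helper `pair_var` are hypothetical names.

```r
# Rows are the choose(n, 2) pairs of individuals, columns are the m loci.
# Values are invented for illustration (n = 3 individuals, m = 2 loci).
d <- matrix(c(0, 1,
              1, 2,
              2, 0), ncol = 2, byrow = TRUE)

# Variance with the number of pairs in the denominator, as in the equations.
pair_var <- function(x) {
  np <- length(x)
  (sum(x^2) - sum(x)^2 / np) / np
}

D   <- rowSums(d)                   # total distance for each pair
V_O <- pair_var(D)                  # observed variance of D
V_E <- sum(apply(d, 2, pair_var))   # sum of the per-locus variances
```

Note that `pair_var` divides by the number of pairs, whereas base R's `var()` divides by one less than that, so the two are not interchangeable here.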
The index of association is then $I_A = \frac{V_O}{V_E} - 1$. Agapow and Burt showed that $I_A$ increases steadily with the number of loci, so they came up with a widely used alternative, $\bar r_d$, that corrects for this. For the derivation, see the manual for multilocus.
$$\bar r_d = \frac{V_O - V_E} {2\displaystyle \sum_{j=1}^{m}\displaystyle \sum_{k=j+1}^{m}\sqrt{var_j\cdot{}var_k}}$$
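A minimal sketch of this last step, assuming the double sum in the denominator counts each pair of loci once (j < k). The input values are invented for illustration and the variable names are hypothetical.

```r
# Made-up inputs: per-locus variances var_j and the observed variance V_O.
var_j <- c(0.5, 0.5, 2)
V_O   <- 4
V_E   <- sum(var_j)

# Enumerate each unordered pair of loci once; combn() returns a 2 x 3 matrix
# whose columns are the pairs (1,2), (1,3), (2,3).
pairs <- combn(length(var_j), 2)
denom <- 2 * sum(sqrt(var_j[pairs[1, ]] * var_j[pairs[2, ]]))

rbarD <- (V_O - V_E) / denom
```

With these numbers the denominator is 5 and `rbarD` is 0.2.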
P.-M. Agapow and A. Burt. Indices of multilocus linkage disequilibrium. Molecular Ecology Notes, 1(1-2):101-102, 2001.
A.H.D. Brown, M.W. Feldman, and E. Nevo. Multilocus structure of natural populations of Hordeum spontaneum. Genetics, 96(2):523-536, 1980.
J.M. Smith, N.H. Smith, M. O'Rourke, and B.G. Spratt. How clonal are bacteria? Proceedings of the National Academy of Sciences, 90(10):4384-4388, 1993.
poppr, missingno, import2genind, read.genalex, clonecorrect
data(nancycats)
ia(nancycats)
# Get the indices back and plot them using base R graphics:
nansamp <- ia(nancycats, sample = 999, valuereturn = TRUE)
layout(matrix(c(1, 1, 2, 2), 2, 2, byrow = TRUE))
hist(nansamp$samples$Ia); abline(v = nansamp$index[1])
hist(nansamp$samples$rbarD); abline(v = nansamp$index[3])
# Get the index for each population.
lapply(seppop(nancycats), ia)
# With sampling
lapply(seppop(nancycats), ia, sample=999)