The function computes a pairwise matrix of genetic distances between populations, using one of several available formulas.
mat_gen_dist(x, dist = "basic", null_val = FALSE)
An object of class matrix
An object of class genind that contains the multilocus genotypes (format 'locus') of the individuals as well as their populations.
A character string indicating the method used to compute the multilocus genetic distance between populations.
If dist = 'basic' (default), then the multilocus genetic distance is computed using a formula of Euclidean genetic distance (Excoffier et al., 1992).
If dist = 'weight', then the multilocus genetic distance is computed as in Fortuna et al. (2009). It is a Euclidean genetic distance giving more weight to rare alleles.
If dist = 'PG', then the multilocus genetic distance is computed as in the popgraph::popgraph function, following several steps of PCA and SVD (Dyer and Nason, 2004).
If dist = 'DPS', then the genetic distance used is equal to 1 minus the proportion of shared alleles (Bowcock, 1994).
If dist = 'FST', then the genetic distance used is the pairwise FST (Weir and Cockerham, 1984).
If dist = 'FST_lin', then the genetic distance used is the linearised pairwise FST (Weir and Cockerham, 1984), with FST_lin = FST/(1-FST).
If dist = 'PCA', then the genetic distance is computed following a PCA of the matrix of allelic frequencies by population. It is a Euclidean genetic distance between populations in the multidimensional space defined by all the independent principal components.
If dist = 'GST', then the genetic distance used is the G'ST (Hedrick, 2005). Available in graph4lg <= 1.6.0 only, because it relied on the diveRsity package.
If dist = 'D', then the genetic distance used is Jost's D (Jost, 2008). Available in graph4lg <= 1.6.0 only, because it relied on the diveRsity package.
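As a rough illustration of the FST linearisation mentioned above, the following base R sketch applies FST/(1-FST) element-wise to a small pairwise matrix. The matrix values and the object names (fst, fst_lin) are made up for the example; this is not the package's internal code.

```r
# Hypothetical pairwise FST matrix for three populations (made-up values)
pops <- paste0("pop", 1:3)
fst <- matrix(c(0.00, 0.10, 0.25,
                0.10, 0.00, 0.20,
                0.25, 0.20, 0.00),
              nrow = 3, byrow = TRUE,
              dimnames = list(pops, pops))

# Linearised FST: FST / (1 - FST), applied element-wise
fst_lin <- fst / (1 - fst)
```

The transformation leaves the zero diagonal unchanged and stretches larger FST values more than smaller ones.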
(optional) Logical. Should negative and zero FST, FST_lin, GST or D values be replaced by half the minimum positive value? This option makes it possible to compute Gabriel graphs from these "distances". Default is null_val = FALSE. This option only works if dist = 'FST', 'FST_lin', 'GST' or 'D'.
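The replacement rule described for null_val can be sketched in base R as follows. The helper name replace_null and the matrix values are hypothetical, and the sketch assumes that only off-diagonal entries are replaced while the diagonal stays at 0; the package may handle the diagonal differently.

```r
# Hypothetical pairwise FST matrix containing a negative and a zero value
pops <- paste0("pop", 1:3)
fst <- matrix(c( 0.00, -0.01, 0.04,
                -0.01,  0.00, 0.00,
                 0.04,  0.00, 0.00),
              nrow = 3, byrow = TRUE,
              dimnames = list(pops, pops))

# Hypothetical helper: replace off-diagonal values <= 0
# by half the minimum positive value of the matrix
replace_null <- function(m) {
  off_diag <- row(m) != col(m)          # assumption: diagonal is left at 0
  min_pos  <- min(m[off_diag & m > 0])  # smallest strictly positive value
  m[off_diag & m <= 0] <- min_pos / 2
  m
}

fst_pos <- replace_null(fst)
```

After the replacement, every pair of distinct populations has a strictly positive "distance", which is what a Gabriel graph computation needs.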
P. Savary
Negative values are converted into 0.

Euclidean genetic distance \(d_{ij}\) between populations i and j is computed as follows:
$$d_{ij}^{2} = \sum_{k=1}^{n} (x_{ki} - x_{kj})^{2}$$
where \(x_{ki}\) is the allelic frequency of allele k in population i and n is the total number of alleles.

When dist = 'weight', the formula becomes
$$d_{ij}^{2} = \sum_{k=1}^{n} \frac{1}{K p_{k}} (x_{ki} - x_{kj})^{2}$$
where K is the number of alleles at the locus of allele k and \(p_{k}\) is the frequency of allele k in all populations.

When dist = 'PCA', n is the number of retained independent principal components and \(x_{ki}\) is the value taken by principal component k in population i.
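The two Euclidean formulas above can be reproduced on a toy table of allelic frequencies with base R. This is a minimal sketch, not the package's implementation: the frequency values and object names are made up, and \(p_{k}\) is assumed here to be the unweighted mean frequency of allele k across populations.

```r
# Hypothetical allele-frequency table: rows = populations, columns = alleles.
# Two loci (L1, L2) with two alleles each; values are made up.
freq <- rbind(
  pop1 = c(0.8, 0.2, 0.5, 0.5),
  pop2 = c(0.6, 0.4, 0.1, 0.9),
  pop3 = c(0.8, 0.2, 0.3, 0.7)
)
colnames(freq) <- c("L1.a", "L1.b", "L2.a", "L2.b")

# Unweighted case: d_ij = sqrt(sum_k (x_ki - x_kj)^2)
d_basic <- as.matrix(dist(freq, method = "euclidean"))

# Weighted case: each squared difference is multiplied by 1 / (K * p_k)
K   <- c(2, 2, 2, 2)   # number of alleles at the locus of each allele
p_k <- colMeans(freq)  # assumption: unweighted mean across populations
w   <- 1 / (K * p_k)

# Scaling each column by sqrt(w) makes the plain Euclidean distance
# equal to the weighted one
d_weight <- as.matrix(dist(sweep(freq, 2, sqrt(w), "*")))
```

Rare alleles have a small \(p_{k}\) and therefore a large weight, so a given frequency difference contributes more to d_weight when it concerns a rare allele.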
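Bowcock et al. (1994). High resolution of human evolutionary trees with polymorphic microsatellites. Nature.
Excoffier et al. (1992). Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics.
Dyer and Nason (2004). Population graphs: the graph theoretic shape of genetic structure. Molecular Ecology.
Fortuna et al. (2009). Networks of spatial genetic variation across species. Proceedings of the National Academy of Sciences.
Weir and Cockerham (1984). Estimating F-statistics for the analysis of population structure. Evolution.
Hedrick (2005). A standardized genetic differentiation measure. Evolution.
Jost (2008). GST and its relatives do not measure differentiation. Molecular Ecology.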
library(graph4lg)
data(data_ex_genind)
x <- data_ex_genind
# Pairwise matrix of Euclidean genetic distances between populations
D <- mat_gen_dist(x = x, dist = "basic")