We recommend that imputation be performed at the level of sampling
locations, before any aggregation. Missing values are replaced using one
of four methods:
If "frequency", genotypes scored as missing at a locus in an individual
are imputed using the average allele frequencies at that locus in the
population from which the individual was drawn.
If "HW", genotypes scored as missing at a locus in an individual are
imputed by sampling at random assuming Hardy-Weinberg equilibrium. Applies
only to genotype data.
If "random", missing data are substituted by random values (0, 1 or 2).
If "neighbour", missing values for the focal individual are substituted
with the values taken from its nearest neighbour, then from the next
nearest, and so on until all missing values are replaced. The nearest
neighbour is the individual at the smallest Euclidean distance from the
focal individual across the whole dataset.
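The nearest-neighbour procedure can be sketched as follows. This is a
minimal illustration in Python, not the package's implementation. It assumes
genotypes are coded 0, 1 or 2 with None for missing, and that distance is
computed only over loci scored in both individuals; the function names
pairwise_distance and impute_neighbour are hypothetical.

```python
import math

def pairwise_distance(a, b):
    """Euclidean distance over loci scored in both individuals.

    Restricting to shared loci is an assumption of this sketch.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no shared loci: treat as infinitely distant
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))

def impute_neighbour(genotypes):
    """Fill each individual's missing loci from successively nearer neighbours."""
    imputed = [row[:] for row in genotypes]
    for i, row in enumerate(genotypes):
        # Rank all other individuals from nearest to farthest.
        others = sorted(
            (j for j in range(len(genotypes)) if j != i),
            key=lambda j: pairwise_distance(row, genotypes[j]),
        )
        for j in others:
            if None not in imputed[i]:
                break  # nothing left to fill for this individual
            for locus, value in enumerate(imputed[i]):
                # Donor values come from the original, unimputed data.
                if value is None and genotypes[j][locus] is not None:
                    imputed[i][locus] = genotypes[j][locus]
    return imputed

genotypes = [
    [0, 1, None, 2],
    [0, 1, 2, 2],
    [2, 0, 0, None],
    [2, 0, 1, 0],
]
result = impute_neighbour(genotypes)
```

In this toy dataset the first individual is closest to the second and the
third is closest to the fourth, so each missing value is copied from that
neighbour, whatever population the individuals belong to.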
The advantage of the nearest-neighbour approach is that it works regardless
of how many individuals are in the population to which the focal individual
belongs, and that the displacement of the individual is haphazard, as
opposed to:
(a) Drawing the individual toward the population centroid ("HW" and
"frequency").
(b) Drawing the individual toward the global centroid (glPCA).
Note that loci that are missing for all individuals in a population are not
imputed with method "frequency" or "HW". Consider using the function
gl.filter.allna with by.pop=TRUE to remove them first.
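To see why such loci cannot be imputed from within the population, consider
this minimal Python sketch of Hardy-Weinberg sampling. It is an illustration
under stated assumptions, not the package's code. It assumes genotypes count
copies of the alternate allele (0, 1 or 2), with None for missing; the
function names alt_allele_frequency and impute_hw are hypothetical. When a
locus has no scored genotype in the population, the allele frequency is
undefined and the locus is left as missing.

```python
import random

def alt_allele_frequency(column):
    """Alternate-allele frequency from scored genotypes, or None if none are scored."""
    scored = [g for g in column if g is not None]
    if not scored:
        return None
    return sum(scored) / (2 * len(scored))

def impute_hw(population, rng):
    """Sample missing genotypes from Hardy-Weinberg proportions within one population."""
    imputed = [row[:] for row in population]
    n_loci = len(population[0])
    for locus in range(n_loci):
        q = alt_allele_frequency([row[locus] for row in population])
        if q is None:
            continue  # locus missing for everyone: nothing to estimate from
        # Hardy-Weinberg genotype probabilities for 0, 1 and 2 alternate alleles.
        weights = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]
        for row in imputed:
            if row[locus] is None:
                row[locus] = rng.choices([0, 1, 2], weights=weights)[0]
    return imputed

population = [
    [0, None, None],
    [2, None, 2],
    [None, None, 2],
]
imputed = impute_hw(population, random.Random(1))
```

Here the second locus is missing in every individual and so remains missing
after imputation, while the third locus is fixed for the alternate allele in
the scored individuals and is therefore always imputed as 2.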