This script generates a fixed difference matrix from a genlight object {adegenet} and from it generates a population recode table used to amalgamate populations with a fixed difference count less than or equal to a specified threshold, tpop. The script then repeats the process until there is no further amalgamation of populations.
gl.collapse.recursive(x, prefix = "collapse", tloc = 0, tpop = 1,
test = TRUE, alpha = 0.05, delta = 0.02, reps = 1000, v = 2)
x -- name of the genlight object from which the distance matrices are to be calculated [required]
prefix -- a string to be used as a prefix in generating the matrices of fixed differences (stored to disk) and the recode tables (also stored to disk) [default "collapse"]
tloc -- threshold defining a fixed difference (e.g. 0.05 implies 95:5 vs 5:95 is fixed) [default 0]
tpop -- maximum number of fixed differences allowed when amalgamating populations [default 1]
test -- if TRUE, calculate p values for the observed fixed differences [default TRUE]
alpha -- significance level for the test of false positives [default 0.05]
delta -- threshold value for the population minor allele frequency (MAF) from which resultant sample fixed differences are considered true positives [default 0.02]
reps -- number of replications to undertake in the simulation to estimate the probability of false positives [default 1000]
v -- verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2]
A list containing the gl object x and the following square matrices:
[[1]] $gl -- the input genlight object;
[[2]] $fd -- raw fixed differences;
[[3]] $pcfd -- percent fixed differences;
[[4]] $nobs -- mean no. of individuals used in each comparison;
[[5]] $nloc -- total number of loci used in each comparison;
[[6]] $expobs -- if test=TRUE, the expected count of false positives for each comparison [by simulation], otherwise NAs;
[[7]] $prob -- if test=TRUE, the significance of the count of fixed differences [by simulation], otherwise NAs
The distance matrices are generated by gl.fixed.diff(), a recode table is generated using gl.collapse(), and the resultant recode table is applied to the genlight object using gl.recode.pop(). The process is repeated as many times as necessary to yield a final table with no fixed differences less than or equal to the specified threshold, tpop.
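The repeat-until-stable logic described above can be illustrated on a toy table of allele frequencies using base R only. This is a minimal sketch, not the package's implementation: the helper names count_fixed() and collapse_recursive() are hypothetical, the frequencies are made up, populations are grouped by simple single linkage, and merged frequencies are taken as unweighted means (which assumes equal sample sizes).

```r
# Toy allele frequencies: rows = populations, columns = loci (made-up values).
freq <- rbind(
  A = c(0, 0, 1, 1, 0),
  B = c(0, 0, 1, 1, 1),
  C = c(1, 1, 0, 0, 1),
  D = c(1, 1, 0, 1, 1)
)

# Hypothetical helper: count loci fixed for alternate alleles in each pair.
# A locus counts as fixed when one frequency is <= tloc and the other >= 1 - tloc.
count_fixed <- function(freq, tloc = 0) {
  n <- nrow(freq)
  fd <- matrix(0, n, n, dimnames = list(rownames(freq), rownames(freq)))
  for (i in seq_len(n)) for (j in seq_len(n)) {
    a <- freq[i, ]
    b <- freq[j, ]
    fd[i, j] <- sum((a <= tloc & b >= 1 - tloc) | (a >= 1 - tloc & b <= tloc))
  }
  fd
}

# Hypothetical helper: amalgamate populations until no pair has fd <= tpop.
collapse_recursive <- function(freq, tpop = 1, tloc = 0) {
  repeat {
    fd <- count_fixed(freq, tloc)
    if (!any(fd <= tpop & upper.tri(fd))) break   # nothing left to merge
    pops <- rownames(freq)
    group <- setNames(pops, pops)                 # recode table: old -> group
    for (i in seq_along(pops)) for (j in seq_along(pops)) {
      if (i < j && fd[i, j] <= tpop) group[group == group[j]] <- group[i]
    }
    idx <- split(seq_along(pops), group)
    # Assumption: equal sample sizes, so pooled frequency = unweighted mean.
    merged <- do.call(rbind, lapply(idx, function(k) colMeans(freq[k, , drop = FALSE])))
    rownames(merged) <- unname(vapply(idx, function(k) paste(pops[k], collapse = "+"),
                                      character(1)))
    freq <- merged
  }
  list(freq = freq, fd = count_fixed(freq, tloc))
}

res <- collapse_recursive(freq, tpop = 1, tloc = 0)
print(res$fd)
```

With tpop = 1, A and B (one fixed difference) and C and D (one fixed difference) are merged in the first pass; the two amalgamated populations then differ by three fixed differences, so the loop stops.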
Optionally, if test=TRUE, the script will test the fixed differences between final OTUs for statistical significance, using simulation, and then further amalgamate populations for which there are no significant fixed differences at a specified level of significance (alpha). To avoid conflation of true fixed differences with false positives in the simulations, it is necessary to decide a threshold value (delta) for extreme true allele frequencies that will be considered fixed for practical purposes. That is, fixed differences in the sample set will be considered to be positives (not false positives) if they arise from true allele frequencies of less than 1-delta in one or both populations. The parameter delta is typically set to be small (e.g. delta = 0.02).
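To see why a threshold such as delta is needed, the following base R sketch estimates how often two samples show a fixed difference at a single locus even though neither population is truly fixed. It is an illustration only and is not the simulation used by the package: the function names are hypothetical, and it assumes a biallelic locus, diploid individuals, and independent sampling of allele copies.

```r
set.seed(1)

# Hypothetical helper: simulated probability that samples of n1 and n2 diploid
# individuals are fixed for alternate alleles, given true frequencies p1 and p2.
sim_fixed_prob <- function(p1, p2, n1, n2, reps = 1000) {
  k1 <- rbinom(reps, size = 2 * n1, prob = p1)  # allele copies seen in sample 1
  k2 <- rbinom(reps, size = 2 * n2, prob = p2)  # allele copies seen in sample 2
  fixed <- (k1 == 0 & k2 == 2 * n2) | (k1 == 2 * n1 & k2 == 0)
  mean(fixed)
}

# Closed-form counterpart under the same assumptions.
exact_fixed_prob <- function(p1, p2, n1, n2) {
  (1 - p1)^(2 * n1) * p2^(2 * n2) + p1^(2 * n1) * (1 - p2)^(2 * n2)
}

delta <- 0.02
n <- 10
p_sim   <- sim_fixed_prob(delta, 1 - delta, n, n, reps = 10000)
p_exact <- exact_fixed_prob(delta, 1 - delta, n, n)
print(c(simulated = p_sim, exact = p_exact))
```

Under these assumptions, populations sitting exactly at the delta boundary (true frequencies 0.02 and 0.98) yield a sample fixed difference in a little under half of all draws of 10 individuals each, which is why such loci are treated as fixed for practical purposes rather than counted as false positives.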
The intermediate and final recode tables and distance matrices are stored to disk as csv files for use with other analyses. In particular, the recode tables can be edited to replace population labels with meaningful names and reapplied in sequence.
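The idea of editing recode tables and reapplying them in sequence can be sketched in base R on a plain vector of population labels. The table layout used here (two columns, old label then new label, no header row), the labels, and the helper apply_recode() are assumptions made for illustration; check the files actually written by the script before relying on this layout.

```r
# Hypothetical recode tables from two successive rounds of amalgamation.
round1 <- data.frame(old = c("A", "B", "C", "D"),
                     new = c("A+B", "A+B", "C", "D"),
                     stringsAsFactors = FALSE)
round2 <- data.frame(old = c("A+B", "C", "D"),
                     new = c("A+B+C", "A+B+C", "D"),
                     stringsAsFactors = FALSE)

# Write the final table to disk and read it back, as one would before editing.
recode_file <- tempfile(fileext = ".csv")
write.table(round2, recode_file, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
round2 <- read.csv(recode_file, header = FALSE, col.names = c("old", "new"),
                   stringsAsFactors = FALSE)

# Edit the table: replace the generated labels with meaningful names.
round2$new[round2$new == "A+B+C"] <- "Lowland"
round2$new[round2$new == "D"] <- "Highland"

# Hypothetical helper: map labels through one table, leaving unknown labels as-is.
apply_recode <- function(labels, tab) {
  hit <- match(labels, tab$old)
  ifelse(is.na(hit), labels, tab$new[hit])
}

# Reapply the tables in sequence to the original population labels.
pop_labels <- c("A", "B", "C", "D", "D")
new_labels <- Reduce(apply_recode, list(round1, round2), pop_labels)
print(new_labels)
```

Applying the tables in the order they were generated matters, because each later table refers to labels produced by the one before it.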
# NOT RUN {
fd <- gl.collapse.recursive(testset.gl, prefix="testset", test=TRUE, tloc=0, tpop=2, v=2)
# }