Performs a recursive hierarchical clustering on an opposing-homozygotes (OH)
matrix using Ward clustering. Clusters are split until the maximum within-
cluster OH value is below a threshold computed from the number of SNPs
(snpNooh) using a linear rule.
.prSimple(oh, snpNooh, intercept = 26.3415, coefficient = 77.3171)
A data.frame with columns:
Individual ID (character).
An integer-like group code (generated randomly; not reproducible).
oh: A numeric matrix of opposing-homozygotes (OH) counts between individuals. Row and column names should be individual IDs. The matrix is expected to be square and symmetric.
snpNooh: Numeric scalar. Number of SNPs used for the OH calculation (or a proxy for SNP density), from which the stopping threshold is derived.
intercept: Numeric scalar. Intercept of the linear threshold rule.
coefficient: Numeric scalar. Slope of the linear threshold rule.
This function writes to and reads from a file named "temp.txt" in the
current working directory, and then deletes it.
The threshold is computed as: $$\mathrm{maxsnpnooh} = (\mathrm{intercept} + \mathrm{coefficient} \times \mathrm{snpNooh}) - 15 \times \mathrm{snpNooh}$$
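The rule is linear in snpNooh, since it simplifies to intercept + (coefficient - 15) * snpNooh. A minimal sketch of that arithmetic follows; oh_threshold is a hypothetical helper name used only for illustration and is not part of the package.

```r
# Hypothetical helper: reproduces the documented threshold formula only.
oh_threshold <- function(snpNooh, intercept = 26.3415, coefficient = 77.3171) {
  (intercept + coefficient * snpNooh) - 15 * snpNooh
}

oh_threshold(0)  # equals the intercept, 26.3415
oh_threshold(1)  # intercept plus one unit of the net slope (coefficient - 15)
```

With the default arguments the net slope is 77.3171 - 15 = 62.3171 per unit of snpNooh.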
The function returns a two-column data frame with individual IDs and a group
code. Group codes are generated randomly (via rnorm()) and therefore
are not stable across runs.
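Because the codes are arbitrary, two runs are best compared by partition rather than by label. The sketch below relabels groups with consecutive integers in ID order. The helper canonical_groups and the column names id and group are assumptions made for illustration; the actual column names of the returned data frame are not stated here.

```r
# Hypothetical helper: map arbitrary group codes to 1, 2, ... in ID order,
# so that identical partitions receive identical labels.
canonical_groups <- function(id, group) {
  ord <- order(id)                                  # fix a row order by ID
  relabelled <- match(group[ord], unique(group[ord]))
  out <- integer(length(id))
  out[ord] <- relabelled                            # restore original row order
  out
}

# Two made-up results with the same partition but different codes and row order.
run1 <- data.frame(id = c("a", "b", "c", "d"), group = c(5, 5, 9, 9),
                   stringsAsFactors = FALSE)
run2 <- data.frame(id = c("d", "c", "b", "a"), group = c(1, 1, 7, 7),
                   stringsAsFactors = FALSE)

run1$stable <- canonical_groups(run1$id, run1$group)
run2$stable <- canonical_groups(run2$id, run2$group)
```

After relabelling, sorting both data frames by id gives the same stable column whenever the two partitions agree.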
The recursion proceeds as follows:
1. Compute a distance object from oh using .fastdist and convert it to a dist object.
2. Apply hierarchical clustering using hclust with method = "ward.D".
3. Cut the dendrogram into two groups using cutree.
4. For each group, compute the maximum within-group OH value; if it exceeds maxsnpnooh and the group has more than two individuals, recurse into that subgroup. Otherwise, write group assignments and stop.
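The steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: it uses stats::as.dist in place of the internal .fastdist helper, collects groups in a list instead of writing to "temp.txt", and numbers the groups sequentially rather than randomly. The function name split_by_oh and the toy matrix are hypothetical.

```r
# Hypothetical sketch of the recursive two-way split.
split_by_oh <- function(oh, max_oh) {
  ids <- rownames(oh)
  if (length(ids) < 2) return(list(ids))           # hclust needs >= 2 objects
  tree <- hclust(as.dist(oh), method = "ward.D")   # as.dist stands in for .fastdist
  halves <- cutree(tree, k = 2)                    # cut the dendrogram in two
  groups <- list()
  for (g in 1:2) {
    members <- ids[halves == g]
    sub <- oh[members, members, drop = FALSE]
    if (length(members) > 2 && max(sub) > max_oh) {
      groups <- c(groups, split_by_oh(sub, max_oh))  # still too heterogeneous
    } else {
      groups <- c(groups, list(members))             # accept this group
    }
  }
  groups
}

# Toy OH matrix: two made-up families with low OH within and high OH between.
ids <- c("a", "b", "c", "d", "e", "f")
family <- c(1, 1, 1, 2, 2, 2)
oh <- ifelse(outer(family, family, "=="), 2, 200)
diag(oh) <- 0
dimnames(oh) <- list(ids, ids)

groups <- split_by_oh(oh, max_oh = 88.6586)
result <- data.frame(id = unlist(groups),
                     group = rep(seq_along(groups), lengths(groups)),
                     stringsAsFactors = FALSE)
```

Accumulating groups in a list avoids the shared temporary file, and sequential codes make the output reproducible; both are design choices of this sketch rather than behaviour of .prSimple.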