Adjacency-constrained hierarchical agglomerative clustering (HAC) is HAC in
which each observation is associated with a position, and the clustering is
constrained so that only adjacent clusters are merged. SNPs are clustered based
on their similarity as measured by linkage disequilibrium (LD).
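
The adjacency constraint can be illustrated with a minimal sketch. This is
not the package's implementation: the function name adjacent_hac, the toy
similarity matrix, and the use of average similarity as linkage are all
assumptions made for brevity. The only point shown is that merge candidates
are restricted to neighbouring clusters.

```python
def adjacent_hac(sim):
    """Merge only neighbouring clusters and return the merge history.

    sim: symmetric similarity matrix (list of lists), items in positional order.
    Linkage is the average similarity between two clusters (an assumption of
    this sketch, chosen for simplicity).
    """
    clusters = [[i] for i in range(len(sim))]  # contiguous blocks, left to right
    merges = []

    def avg_sim(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Only pairs (k, k + 1) are candidates: this is the adjacency constraint.
        scores = [avg_sim(clusters[k], clusters[k + 1])
                  for k in range(len(clusters) - 1)]
        k = max(range(len(scores)), key=scores.__getitem__)
        merges.append((clusters[k], clusters[k + 1], scores[k]))
        clusters[k:k + 2] = [clusters[k] + clusters[k + 1]]
    return merges


# Hypothetical similarity matrix for four positions.
toy_sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
history = adjacent_hac(toy_sim)
```

Because every cluster is a contiguous block of positions, each step compares
at most one candidate per pair of neighbours rather than every pair of
clusters.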
In the special case where genotypes are given as input and the corresponding
LD matrix has missing entries, the clustering cannot be performed. This
typically happens when there is insufficient variability in the sample
genotypes. In this case, the indices of the SNP pairs that yield missing
values are returned.
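
To see why insufficient variability leads to missing values, the sketch below
uses the squared Pearson correlation between genotype columns as a stand-in
for an LD measure. This is an assumption for illustration and may differ from
the estimator used by snpStats::ld; the names r_squared and missing_pairs and
the toy data are hypothetical. A SNP with identical genotypes in every
individual has zero variance, so its correlation with any other SNP is
undefined.

```python
from itertools import combinations
from statistics import pvariance


def r_squared(a, b):
    """Squared Pearson correlation of two genotype columns, or None if undefined."""
    va, vb = pvariance(a), pvariance(b)
    if va == 0 or vb == 0:
        return None  # zero variance: the correlation would be 0/0
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)
    return cov * cov / (va * vb)


def missing_pairs(genotypes):
    """Return the (i, j) index pairs, i < j, whose similarity is undefined.

    genotypes: list of SNP columns, each a list of 0/1/2 allele counts.
    Indices are 0-based here, unlike in R.
    """
    return [(i, j)
            for i, j in combinations(range(len(genotypes)), 2)
            if r_squared(genotypes[i], genotypes[j]) is None]


# Hypothetical genotypes for five individuals at three SNPs.
snps = [
    [0, 1, 2, 1, 0],
    [0, 1, 2, 2, 0],
    [1, 1, 1, 1, 1],  # no variability in the sample
]
undefined = missing_pairs(snps)
```

Here every pair involving the third SNP is reported, which mirrors the kind of
index list described above.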
If x is of class SnpMatrix or matrix, it is assumed to be an \(n \times p\)
matrix of \(p\) genotypes for \(n\) individuals. This input is converted to an
LD similarity matrix using snpStats::ld. If x is of class dgCMatrix, it is
assumed to be a (squared) LD matrix.
Clustering on an LD similarity other than "R.squared" or "D.prime" can be
performed by providing the LD values directly as argument x. These values are
expected to lie in [0,1]; values outside this range are truncated to [0,1].
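
The truncation described above amounts to clamping each entry to the unit
interval. A minimal sketch, in which the helper name truncate_unit_interval
and the input values are hypothetical:

```python
def truncate_unit_interval(values):
    """Clamp every entry of a matrix (list of lists) to the interval [0, 1]."""
    return [[min(1.0, max(0.0, v)) for v in row] for row in values]


# Hypothetical user-supplied similarities with two out-of-range entries.
raw = [[1.0, 1.2], [-0.1, 0.5]]
clamped = truncate_unit_interval(raw)
```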