Let \(X \in \mathbb{R}^{n \times p}\) be a
numeric matrix with rows as observations and columns as variables, and let
\(\mathbf{1} \in \mathbb{R}^n\) denote the all-ones vector. Define the column
means \(\mu = (1/n)\,X^\top \mathbf{1} \in \mathbb{R}^p\) and the centred
cross-product matrix
$$ S \;=\; (X - \mathbf{1}\mu^\top)^\top (X - \mathbf{1}\mu^\top)
\;=\; X^\top \!\Big(I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top\Big) X
\;=\; X^\top X \;-\; n\,\mu\,\mu^\top. $$
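The equivalence of the first and last forms can be checked numerically. The following is an illustrative pure-Python sketch on a tiny matrix (the names `matmul_t`, `S_direct`, and `S_corrected` are ours, not part of any implementation); a real implementation would use NumPy or BLAS.

```python
def matmul_t(A, B):
    """Return A^T B for matrices stored as lists of rows."""
    n, p, q = len(A), len(A[0]), len(B[0])
    return [[sum(A[k][i] * B[k][j] for k in range(n)) for j in range(q)]
            for i in range(p)]

X = [[1.0, 2.0],
     [3.0, 5.0],
     [5.0, 4.0]]
n, p = len(X), len(X[0])
mu = [sum(row[j] for row in X) / n for j in range(p)]   # column means

# Direct form: centre the columns, then take the cross-product.
Xc = [[X[i][j] - mu[j] for j in range(p)] for i in range(n)]
S_direct = matmul_t(Xc, Xc)

# Corrected form: X^T X minus the rank-1 term n * mu * mu^T.
XtX = matmul_t(X, X)
S_corrected = [[XtX[i][j] - n * mu[i] * mu[j] for j in range(p)]
               for i in range(p)]
```

Both routes produce the same \(S\) up to floating-point round-off, which is what lets an implementation avoid materialising the centred matrix.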
The (unbiased) sample covariance is
$$ \widehat{\Sigma} \;=\; \tfrac{1}{n-1}\,S, $$
and the sample standard deviations are \(s_i = \sqrt{\widehat{\Sigma}_{ii}}\).
The Pearson correlation matrix is obtained by standardising \(\widehat{\Sigma}\):
$$ R \;=\; D^{-1/2}\,\widehat{\Sigma}\,D^{-1/2}, \qquad
D \;=\; \mathrm{diag}(\widehat{\Sigma}_{11},\ldots,\widehat{\Sigma}_{pp}), $$
equivalently, entrywise \(R_{ij} = \widehat{\Sigma}_{ij}/(s_i s_j)\) for
\(i \neq j\) and \(R_{ii} = 1\). Although the \(1/(n-1)\) scaling makes
\(\widehat{\Sigma}\) unbiased for the covariance, the induced correlations
remain biased in finite samples, since each \(R_{ij}\) is a nonlinear
function of the covariance entries.
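The entrywise definitions above can be sketched directly in pure Python on a small dataset (the variable names are illustrative only):

```python
import math

X = [[1.0, 2.0],
     [3.0, 5.0],
     [5.0, 4.0]]
n, p = len(X), len(X[0])
mu = [sum(row[j] for row in X) / n for j in range(p)]

# Unbiased sample covariance: Sigma_ij = S_ij / (n - 1).
Sigma = [[sum((X[k][i] - mu[i]) * (X[k][j] - mu[j]) for k in range(n)) / (n - 1)
          for j in range(p)] for i in range(p)]

# Standard deviations s_i and Pearson correlations R_ij = Sigma_ij / (s_i s_j).
s = [math.sqrt(Sigma[i][i]) for i in range(p)]
R = [[1.0 if i == j else Sigma[i][j] / (s[i] * s[j]) for j in range(p)]
     for i in range(p)]
```

Here \(R_{12} = \widehat{\Sigma}_{12}/(s_1 s_2)\), and the diagonal is set to exactly 1 rather than recomputed, matching \(R_{ii} = 1\).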
The implementation forms \(X^\top X\) via a BLAS
symmetric rank-\(k\) update (SYRK) on the upper triangle, then applies the
rank-1 correction \(-\,n\,\mu\,\mu^\top\) (with \(\mu\) treated as a column
vector) to obtain \(S\) without explicitly materialising the centred
matrix. After scaling by
\(1/(n-1)\), triangular normalisation by \(D^{-1/2}\) yields \(R\),
which is then symmetrised to remove round-off asymmetry. Tiny negative values
on the covariance diagonal due to floating-point rounding are truncated to
zero before taking square roots.
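The compute path described above can be sketched as follows. This is a minimal pure-Python illustration of the same sequence of steps (upper-triangular accumulation standing in for SYRK, rank-1 correction, diagonal clamping, normalisation, symmetrisation); the function name `correlation_upper` is ours, and the sketch omits the zero-variance NA handling described below.

```python
import math

def correlation_upper(X):
    n, p = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / n for j in range(p)]

    # Upper triangle of X^T X, as a SYRK-style update would produce.
    C = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            C[i][j] = sum(X[k][i] * X[k][j] for k in range(n))

    # Rank-1 correction and scaling: C <- (X^T X - n mu mu^T) / (n - 1).
    for i in range(p):
        for j in range(i, p):
            C[i][j] = (C[i][j] - n * mu[i] * mu[j]) / (n - 1)

    # Clamp tiny negative diagonal round-off to zero before square roots.
    d = [math.sqrt(max(C[i][i], 0.0)) for i in range(p)]

    # Triangular normalisation by D^{-1/2}, unit diagonal.
    for i in range(p):
        for j in range(i, p):
            C[i][j] = 1.0 if i == j else C[i][j] / (d[i] * d[j])

    # Symmetrise: copy the upper triangle into the lower.
    for i in range(p):
        for j in range(i + 1, p):
            C[j][i] = C[i][j]
    return C
```

Because only the upper triangle is computed and then mirrored, round-off cannot introduce asymmetry between \(R_{ij}\) and \(R_{ji}\).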
If a variable has zero variance (\(s_i = 0\)), the corresponding row and
column of \(R\) are set to NA. Missing values are not permitted in
\(X\), and each column must contain at least two distinct values.
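As a hypothetical illustration of the zero-variance rule, the sketch below marks the affected row and column with NaN (standing in for NA) when a column is constant; the layout of the loop is ours, not the actual implementation.

```python
import math

X = [[1.0, 7.0],
     [3.0, 7.0],
     [5.0, 7.0]]   # second column is constant, so s_2 = 0
n, p = len(X), len(X[0])
mu = [sum(row[j] for row in X) / n for j in range(p)]
var = [sum((X[k][i] - mu[i]) ** 2 for k in range(n)) / (n - 1) for i in range(p)]
s = [math.sqrt(max(v, 0.0)) for v in var]

# Start from all-NaN; fill only entries where both variances are positive.
R = [[float('nan')] * p for _ in range(p)]
for i in range(p):
    for j in range(p):
        if s[i] > 0.0 and s[j] > 0.0:
            cov = sum((X[k][i] - mu[i]) * (X[k][j] - mu[j])
                      for k in range(n)) / (n - 1)
            R[i][j] = 1.0 if i == j else cov / (s[i] * s[j])
```

The entire second row and column of `R`, including the diagonal entry, end up NaN, while the remaining diagonal entry stays 1.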
Computational complexity. The dominant cost is the \(O(n p^2)\) flops of
the \(X^\top X\) update, with \(O(p^2)\) additional memory for the
\(p \times p\) output.