Visualizes the most influential feature interactions (based on the L2 norm) from Ridge Redundancy Analysis (RRDA) as a heatmap.
Let the (rank-\(r\) truncated) decomposition of \(\hat{B}(\lambda)\) be
$$\hat{B}(\lambda, r) = U_{\hat{B}(\lambda)} \, D_{\hat{B}(\lambda)} \, V_{\hat{B}(\lambda)}^{\prime}.$$
The following three biplot scalings are defined:
Symmetric scaling (default):
$$\tilde{F} = U_{\hat{B}(\lambda)} \, D_{\hat{B}(\lambda)}^{1/2}, \qquad
\tilde{G} = V_{\hat{B}(\lambda)} \, D_{\hat{B}(\lambda)}^{1/2}.$$
X scaling:
$$\tilde{F} = U_{\hat{B}(\lambda)} \, D_{\hat{B}(\lambda)}, \qquad
\tilde{G} = V_{\hat{B}(\lambda)}.$$
Y scaling:
$$\tilde{F} = U_{\hat{B}(\lambda)}, \qquad
\tilde{G} = V_{\hat{B}(\lambda)} \, D_{\hat{B}(\lambda)}.$$
In all three cases, \(\hat{B}(\lambda, r) = \tilde{F} \, \tilde{G}^{\prime}\).
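The three scalings can be checked numerically. The following is a minimal sketch, not the package's implementation: it assumes the decomposition is a truncated singular value decomposition of a small dense matrix `B` standing in for \(\hat{B}(\lambda)\). The function name `biplot_factors` and the scaling labels `"symmetric"`, `"x"`, and `"y"` are hypothetical names chosen for illustration.

```python
import numpy as np

def biplot_factors(B, r, scaling="symmetric"):
    """Return (F, G) such that F @ G.T is the rank-r truncation of B.

    Hypothetical helper: assumes U, D, V come from an SVD of B.
    """
    U, d, Vt = np.linalg.svd(B, full_matrices=False)
    U, d, V = U[:, :r], d[:r], Vt[:r, :].T  # keep the r leading components
    if scaling == "symmetric":
        # F = U D^{1/2}, G = V D^{1/2}
        return U * np.sqrt(d), V * np.sqrt(d)
    if scaling == "x":
        # F = U D, G = V
        return U * d, V
    if scaling == "y":
        # F = U, G = V D
        return U, V * d
    raise ValueError(f"unknown scaling: {scaling!r}")

# Toy stand-in for the coefficient matrix (6 predictors, 4 responses).
rng = np.random.default_rng(0)
B = rng.normal(size=(6, 4))
r = 2

# Reference rank-r truncation U D V'.
U, d, Vt = np.linalg.svd(B, full_matrices=False)
B_r = (U[:, :r] * d[:r]) @ Vt[:r, :]

# Every scaling reproduces the same rank-r matrix.
for scaling in ("symmetric", "x", "y"):
    F, G = biplot_factors(B, r, scaling)
    assert np.allclose(F @ G.T, B_r)
```

Multiplying by `d` or `np.sqrt(d)` with broadcasting scales the columns, which is equivalent to right-multiplying by the diagonal matrix. The scalings differ only in how the diagonal factor is shared between \(\tilde{F}\) and \(\tilde{G}\), so the product is unchanged while the row norms are not.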
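Variable importance is scored by the row-wise \(\ell_2\)-norms of the scaled factor matrices, where row \(i\) of \(\tilde{F}\) corresponds to the \(i\)-th predictor and row \(j\) of \(\tilde{G}\) to the \(j\)-th response: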
$$s_i^{(\tilde{F})} = \| \tilde{F}_{i,\cdot} \|_2, \qquad
s_j^{(\tilde{G})} = \| \tilde{G}_{j,\cdot} \|_2.$$
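Selecting the top \(m_x\) predictors by \(s_i^{(\tilde{F})}\) and the top \(m_y\) responses by \(s_j^{(\tilde{G})}\) yields the submatrices \(\tilde{F}_{\mathrm{sub}}\) (\(m_x \times r\)) and \(\tilde{G}_{\mathrm{sub}}\) (\(m_y \times r\)) of the scaled factor matrices.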
The reduced coefficient submatrix is then
$$\hat{B}_{\mathrm{sub}}(\lambda, r) =
\tilde{F}_{\mathrm{sub}} \, \tilde{G}_{\mathrm{sub}}^{\prime}.$$
The matrix \(\hat{B}_{\mathrm{sub}}(\lambda, r)\) retains the dominant low-rank structure and is visualized as a heatmap (with \(m_x = m_y = 20\) by default).
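The scoring and selection steps can be sketched as follows. This is an illustrative sketch rather than the package's code: the function name `top_submatrix` is hypothetical, and clamping \(m_x\) and \(m_y\) to the number of available rows is an assumption made here so the example also works on small inputs.

```python
import numpy as np

def top_submatrix(F, G, m_x=20, m_y=20):
    """Select the top rows of F and G by l2-norm and form F_sub @ G_sub.T.

    Hypothetical helper; returns the submatrix and the selected indices.
    """
    s_F = np.linalg.norm(F, axis=1)  # one score per predictor
    s_G = np.linalg.norm(G, axis=1)  # one score per response
    # Assumption: never request more rows than exist.
    m_x = min(m_x, F.shape[0])
    m_y = min(m_y, G.shape[0])
    rows = np.argsort(s_F)[::-1][:m_x]  # indices of the largest scores first
    cols = np.argsort(s_G)[::-1][:m_y]
    B_sub = F[rows] @ G[cols].T         # (m_x x r) times (r x m_y)
    return B_sub, rows, cols

# Small hand-made factors with r = 2 columns.
F = np.array([[3.0, 4.0], [0.0, 1.0], [1.0, 0.0], [6.0, 8.0]])  # norms 5, 1, 1, 10
G = np.array([[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]])              # norms 1, 2, 0.5

B_sub, rows, cols = top_submatrix(F, G, m_x=2, m_y=2)
print(rows, cols)  # [3 0] [1 0]
print(B_sub)       # [[16.  6.] [ 8.  3.]]

# The result is the same as slicing the full product F @ G.T.
assert np.allclose(B_sub, (F @ G.T)[np.ix_(rows, cols)])
```

Forming the product from the two submatrices avoids building the full coefficient matrix, which matters when the number of predictors or responses is large.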