rSVDdpd: Robust Singular Value Decomposition using Density Power Divergence

Description

rSVDdpd returns the singular value decomposition of a matrix, with singular values that are robust in the presence of outliers.

Usage

rSVDdpd(
  X,
  alpha,
  nd = NA,
  maxrank = NA,
  tol = 1e-04,
  eps = 1e-04,
  maxiter = 100L,
  initu = NULL,
  initv = NULL
)

Value

A list containing the components of the decomposition $X = U D V'$:

d - The robust singular values, namely the diagonal entries of $D$.
u - The matrix of left singular vectors $U$. Each column is a singular vector.
v - The matrix of right singular vectors $V$. Each column is a singular vector.
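
As a rough illustration of how the three components fit together, the sketch below rebuilds a matrix from `d`, `u` and `v`. It uses base R's `svd()` as a stand-in because that function returns components with the same names; substituting the result of `rSVDdpd` in its place is an assumption, not something verified here.

```r
# Stand-in for a fitted decomposition: base R's svd() also returns d, u and v.
X <- matrix(1:20, nrow = 4, ncol = 5)
res <- svd(X)

# Rebuild X as U %*% D %*% t(V).
# diag(d, nrow = length(d)) keeps D a matrix even if only one value is returned.
X_hat <- res$u %*% diag(res$d, nrow = length(res$d)) %*% t(res$v)

# Largest absolute reconstruction error; tiny for an exact decomposition.
max_err <- max(abs(X - X_hat))
print(max_err)
```

When fewer components are kept than the rank of the matrix, the same product gives a low-rank approximation of $X$ rather than an exact reconstruction.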

Arguments

X: matrix, whose singular value decomposition is required
alpha: numeric, robustness parameter between 0 and 1. See Details.
nd: integer, must be less than both nrow(X) and ncol(X). If NA, determined by rank.rSVDdpd(X, alpha, maxrank)
maxrank: integer, maximum rank to be considered if nd is not specified. If NA, defaults to min(nrow(X), ncol(X))
tol: numeric, a tolerance level. If the residual matrix has lower norm than this, then subsequent singular values will be taken as 0.
eps: numeric, a tolerance level for the convergence of singular vectors. If the norm of the change in the singular vectors between successive iterations does not exceed this, the iteration stops.
maxiter: integer, maximum number of iterations.
initu: matrix, initializing vectors for left singular vectors. Must be of dimension nrow(X) $\times$ min(nrow(X), ncol(X)). If NULL, defaults to random initialization.
initv: matrix, initializing vectors for right singular vectors. Must be of dimension ncol(X) $\times$ min(nrow(X), ncol(X)). If NULL, defaults to random initialization.
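
To make the roles of `eps` and `maxiter` concrete, here is a minimal sketch of a stopping rule of this kind. It is a plain alternating least-squares (power) iteration for the leading singular triplet, not the density power divergence algorithm used by the package; the helper name `rank_one_als` and its deterministic starting vector are hypothetical choices made for illustration.

```r
# Hypothetical helper, NOT the package's algorithm: ordinary alternating
# least-squares iteration for the leading singular value and vectors.
rank_one_als <- function(X, eps = 1e-4, maxiter = 100L) {
  v <- rep(1 / sqrt(ncol(X)), ncol(X))  # deterministic unit-norm start
  u <- rep(0, nrow(X))
  d <- NA_real_
  for (iter in seq_len(maxiter)) {
    u_new <- X %*% v                     # update left vector given v
    u_new <- u_new / sqrt(sum(u_new^2))
    v_new <- crossprod(X, u_new)         # update right vector given u
    d <- sqrt(sum(v_new^2))              # current singular value estimate
    v_new <- v_new / d
    # Norm of the change in the singular vectors since the last iteration.
    change <- sqrt(sum((u_new - u)^2) + sum((v_new - v)^2))
    u <- u_new
    v <- v_new
    if (change < eps) break              # converged: stop before maxiter
  }
  list(d = d, u = drop(u), v = drop(v), iterations = iter)
}

X <- matrix(1:20, nrow = 4, ncol = 5)
fit <- rank_one_als(X, eps = 1e-8, maxiter = 100L)
print(fit$iterations)
```

A smaller `eps` asks for more stable singular vectors at the cost of more iterations, while `maxiter` bounds the work if that tolerance is never reached.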

Details

The usual singular value decomposition is highly sensitive to outliers, because it minimizes the $L_2$ norm of the errors between the matrix $X$ and its best lower-rank approximation. Considerable effort has gone into imposing robustness by using the $L_1$ norm of the errors instead of the $L_2$ norm, but such estimation lacks efficiency. The density power divergence bridges this gap:

$$DPD(f|g) = \int f^{1+\alpha} - \left(1 + \frac{1}{\alpha}\right) \int f^{\alpha} g + \frac{1}{\alpha} \int g^{1+\alpha}$$

The parameter alpha should be between 0 and 1; otherwise, a warning is shown. A lower alpha gives less robustness but more efficient estimation, while a higher alpha gives more robustness but less efficient estimation. The recommended value of alpha is 0.3.

The function tries to obtain the best rank-one approximation of a matrix by minimizing the density power divergence between the distribution of the true errors and a normal distribution centered at the origin.
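
The divergence formula can be checked numerically with base R. The sketch below evaluates the three integrals with `integrate()` for two univariate densities; the helper name `dpd` and the choice of normal densities are assumptions made for illustration and are not part of the package.

```r
# Density power divergence between densities f and g, by numerical integration.
# Mirrors the three terms of the formula; requires alpha > 0.
dpd <- function(f, g, alpha, lower = -Inf, upper = Inf) {
  term1 <- integrate(function(x) f(x)^(1 + alpha), lower, upper)$value
  term2 <- integrate(function(x) f(x)^alpha * g(x), lower, upper)$value
  term3 <- integrate(function(x) g(x)^(1 + alpha), lower, upper)$value
  term1 - (1 + 1 / alpha) * term2 + (1 / alpha) * term3
}

f <- function(x) dnorm(x, mean = 0, sd = 1)        # standard normal
g_shift <- function(x) dnorm(x, mean = 2, sd = 1)  # shifted normal

d_same <- dpd(f, f, alpha = 0.3)         # divergence of a density from itself
d_shift <- dpd(f, g_shift, alpha = 0.3)  # divergence from a shifted density
print(c(d_same, d_shift))
```

The divergence vanishes when the two densities coincide and is positive when they differ, which is what makes it usable as a loss to minimize.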

References

S. Roy, A. Basu and A. Ghosh (2021). A New Robust Scalable Singular Value Decomposition Algorithm for Video Surveillance Background Modelling. https://arxiv.org/abs/2109.10680

Examples


# A 4 x 5 matrix, decomposed with the recommended alpha = 0.3
X <- matrix(1:20, nrow = 4, ncol = 5)
rSVDdpd(X, alpha = 0.3)
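
A further sketch, under stated assumptions, shows why robustness matters: a single contaminated entry is enough to distort the ordinary singular values computed by base R's `svd()`. The outlier value of 1000 is arbitrary, and the optional robust fit assumes the function is provided by a package named `rsvddpd`; it runs only if that package is installed.

```r
# Clean matrix and a copy with a single gross outlier.
X <- matrix(1:20, nrow = 4, ncol = 5)
X_out <- X
X_out[1, 1] <- 1000  # hypothetical contamination, chosen for illustration

# Ordinary (non-robust) singular values from base R.
d_clean <- svd(X)$d
d_out <- svd(X_out)$d

# The leading singular value of the contaminated matrix is driven by the outlier.
print(round(d_clean, 2))
print(round(d_out, 2))

# Optional robust fit; the package name 'rsvddpd' is an assumption.
if (requireNamespace("rsvddpd", quietly = TRUE)) {
  print(rsvddpd::rSVDdpd(X_out, alpha = 0.3)$d)
}
```

Comparing the robust singular values with `d_clean` and `d_out` gives a quick sense of how much the chosen `alpha` discounts the contaminated entry.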
