Matrix (version 1.6-5)

rankMatrix: Rank of a Matrix

Description

Compute ‘the’ matrix rank, a well-defined functional in theory(*), somewhat ambiguous in practice. We provide several methods, the default corresponding to Matlab's definition.

(*) The rank of a $$n \times m$$ matrix $$A$$, $$rk(A)$$, is the maximal number of linearly independent columns (or rows); hence $$rk(A) \le min(n,m)$$.

Usage

rankMatrix(x, tol = NULL,
method = c("tolNorm2", "qr.R", "qrLINPACK", "qr",
sval = svd(x, 0, 0)$d, warn.t = TRUE, warn.qr = TRUE)qr2rankMatrix(qr, tol = NULL, isBqr = is.qr(qr), do.warn = TRUE) Value If x is a matrix of all 0 (or of zero dimension), the rank is zero; otherwise, typically a positive integer in 1:min(dim(x)) with attributes detailing the method used. There are rare cases where the sparse $$QR$$ decomposition “fails” in so far as the diagonal entries of $$R$$, the $$d_i$$ (see above), end with non-finite, typically NaN entries. Then, a warning is signalled (unless warn.qr / do.warn is not true) and NA (specifically, NA_integer_) is returned. Arguments x numeric matrix, of dimension $$n \times m$$, say. tol nonnegative number specifying a (relative, “scalefree”) tolerance for testing of “practically zero” with specific meaning depending on method; by default, max(dim(x)) * .Machine$double.eps is according to Matlab's default (for its only method which is our method="tolNorm2").

method

a character string specifying the computational method for the rank, can be abbreviated:

"tolNorm2":

the number of singular values >= tol * max(sval);

"qrLINPACK":

for a dense matrix, this is the rank of qr(x, tol, LAPACK=FALSE) (which is qr(...)$rank); This ("qr*", dense) version used to be the recommended way to compute a matrix rank for a while in the past. For sparse x, this is equivalent to "qr.R". "qr.R": this is the rank of triangular matrix $$R$$, where qr() uses LAPACK or a "sparseQR" method (see qr-methods) to compute the decomposition $$QR$$. The rank of $$R$$ is then defined as the number of “non-zero” diagonal entries $$d_i$$ of $$R$$, and “non-zero”s fulfill $$|d_i| \ge \mathrm{tol}\cdot\max(|d_i|)$$. "qr": is for back compatibility; for dense x, it corresponds to "qrLINPACK", whereas for sparse x, it uses "qr.R". For all the "qr*" methods, singular values sval are not used, which may be crucially important for a large sparse matrix x, as in that case, when sval is not specified, the default, computing svd() currently coerces x to a dense matrix. "useGrad": considering the “gradient” of the (decreasing) singular values, the index of the smallest gap. "maybeGrad": choosing method "useGrad" only when that seems reasonable; otherwise using "tolNorm2". %% FIXME say more sval numeric vector of non-increasing singular values of x; typically unspecified and computed from x when needed, i.e., unless method = "qr". warn.t logical indicating if rankMatrix() should warn when it needs t(x) instead of x. Currently, for method = "qr" only, gives a warning by default because the caller often could have passed t(x) directly, more efficiently. warn.qr in the $$QR$$ cases (i.e., if method starts with "qr"), rankMatrix() calls qr2rankMarix(.., do.warn = warn.qr), see below. qr an R object resulting from qr(x,..), i.e., typically inheriting from class "qr" or "sparseQR". isBqr logical indicating if qr is resulting from base qr(). (Otherwise, it is typically from Matrix package sparse qr.) do.warn logical; if true, warn about non-finite diagonal entries in the $$R$$ matrix of the $$QR$$ decomposition. Do not change lightly! Author Martin Maechler; for the "*Grad" methods building on suggestions by Ravi Varadhan. Details qr2rankMatrix() is typically called from rankMatrix() for the "qr"* methods, but can be used directly - much more efficiently in case the qr-decomposition is available anyway. See Also qr, svd. Examples Run this code  library(stats, pos = "package:base", verbose = FALSE) rankMatrix(cbind(1, 0, 1:3)) # 2 (meths <- eval(formals(rankMatrix)$method))

## a "border" case:
H12 <- Hilbert(12)
rankMatrix(H12, tol = 1e-20) # 12;  but  11  with default method & tol.
sapply(meths, function(.m.) rankMatrix(H12, method = .m.))
##       11     11         12   12       11        11
## The meaning of 'tol' for method="qrLINPACK" and *dense* x is not entirely "scale free"
rMQL <- function(ex, M) rankMatrix(M, method="qrLINPACK",tol = 10^-ex)
rMQR <- function(ex, M) rankMatrix(M, method="qr.R",     tol = 10^-ex)
sapply(5:15, rMQL, M = H12) # result is platform dependent
##  7  7  8 10 10 11 11 11 12 12 12  {x86_64}
sapply(5:15, rMQL, M = 1000 * H12) # not identical unfortunately
##  7  7  8 10 11 11 12 12 12 12 12
sapply(5:15, rMQR, M = H12)
##  5  6  7  8  8  9  9 10 10 11 11
sapply(5:15, rMQR, M = 1000 * H12) # the *same*
# \dontshow{
(r12 <- sapply(5:15, rMQR, M = H12))
stopifnot(identical(r12, sapply(5:15, rMQR, M = H12 / 100)),
identical(r12, sapply(5:15, rMQR, M = H12 * 1e5)))

rM1 <- function(ex, M) rankMatrix(M, tol = 10^-ex)
(r12 <- sapply(5:15, rM1, M = H12))
stopifnot(identical(r12, sapply(5:15, rM1, M = H12 / 100)),
identical(r12, sapply(5:15, rM1, M = H12 * 1e5)))
# }

## "sparse" case:
M15 <- kronecker(diag(x=c(100,1,10)), Hilbert(5))
sapply(meths, function(.m.) rankMatrix(M15, method = .m.))
#--> all 15, but 'useGrad' has 14.
sapply(meths, function(.m.) rankMatrix(M15, method = .m., tol = 1e-7)) # all 14

## "large" sparse
n <- 250000; p <- 33; nnz <- 10000
L <- sparseMatrix(i = sample.int(n, nnz, replace=TRUE),
j = sample.int(p, nnz, replace=TRUE),
x = rnorm(nnz))
(st1 <- system.time(r1 <- rankMatrix(L)))                # warning+ ~1.5 sec (2013)
(st2 <- system.time(r2 <- rankMatrix(L, method = "qr"))) # considerably faster!
r1[[1]] == print(r2[[1]]) ## -->  ( 33  TRUE )
# \dontshow{
stopifnot(r1[[1]] == 33, 33 == r2[[1]])
if(interactive() || nzchar(Sys.getenv("R_MATRIX_CHECK_EXTRA")))
stopifnot(st2[[1]] < 0.2) # seeing 0.03 (on ~ 2010-hardware; R 3.0.2)
# }
## another sparse-"qr" one, which failed'' till 2013-11-23:
set.seed(42)
f1 <- factor(sample(50, 1000, replace=TRUE))
f2 <- factor(sample(50, 1000, replace=TRUE))
f3 <- factor(sample(50, 1000, replace=TRUE))
D <- t(do.call(rbind, lapply(list(f1,f2,f3), as, 'sparseMatrix')))
dim(D); nnzero(D) ## 1000 x 150 // 3000 non-zeros (= 2%)
stopifnot(rankMatrix(D,           method='qr') == 148,
rankMatrix(crossprod(D),method='qr') == 148)

## zero matrix has rank 0 :
stopifnot(sapply(meths, function(.m.)
rankMatrix(matrix(0, 2, 2), method = .m.)) == 0)


Run the code above in your browser using DataCamp Workspace