BigDataStatMeth (version 1.0.3)

bdEigen_hdf5: Eigenvalue Decomposition for HDF5-Stored Matrices using Spectra

Description

Computes the eigenvalue decomposition of a large matrix stored in an HDF5 file using the Spectra library. Results are consistent with those of the RSpectra package, and both symmetric and non-symmetric matrices are supported.

Usage

bdEigen_hdf5(
  filename,
  group = NULL,
  dataset = NULL,
  k = NULL,
  which = NULL,
  ncv = NULL,
  bcenter = NULL,
  bscale = NULL,
  tolerance = NULL,
  max_iter = NULL,
  compute_vectors = NULL,
  overwrite = NULL,
  threads = NULL
)

Value

List with components:

fn

Character string with the HDF5 filename

values

Character string with the full dataset path (group/dataset) to the real part of the eigenvalues

vectors

Character string with the full dataset path (group/dataset) to the real part of the eigenvectors

values_imag

Character string with the full dataset path to the eigenvalues (imaginary part), or NULL if all eigenvalues are real

vectors_imag

Character string with the full dataset path to the eigenvectors (imaginary part), or NULL if all eigenvectors are real

is_symmetric

Logical indicating whether the matrix was detected as symmetric
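Because the real and imaginary parts are stored as separate datasets, they have to be recombined after reading if complex values are needed. The following is a minimal base R sketch of that step; `combine_parts` is a hypothetical helper, not part of BigDataStatMeth, and the plain vectors stand in for data that would normally be read from the HDF5 file.

```r
# Hypothetical helper (not part of BigDataStatMeth): rebuild values from a
# real part and an optional imaginary part. `im = NULL` mirrors the case in
# which values_imag / vectors_imag is NULL because everything is real.
combine_parts <- function(re, im = NULL) {
  re <- as.numeric(re)
  if (is.null(im)) {
    return(re)  # purely real result, returned as a numeric vector
  }
  complex(real = re, imaginary = as.numeric(im))
}

# Stand-in data (illustrative only, not output of bdEigen_hdf5)
real_part <- c(1.5, 0.2, 0.2)
imag_part <- c(0.0, 0.7, -0.7)

vals_real    <- combine_parts(real_part)             # numeric vector
vals_complex <- combine_parts(real_part, imag_part)  # complex vector
```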

Arguments

filename

Character string. Path to the HDF5 file containing the input matrix.

group

Character string. Path to the group containing the input dataset.

dataset

Character string. Name of the input dataset to decompose.

k

Integer. Number of eigenvalues to compute (default = 6, following Spectra convention).

which

Character string. Which eigenvalues to compute (default = "LM"):

  • "LM": Largest magnitude

  • "SM": Smallest magnitude

  • "LR": Largest real part (non-symmetric matrices)

  • "SR": Smallest real part (non-symmetric matrices)

  • "LI": Largest imaginary part (non-symmetric matrices)

  • "SI": Smallest imaginary part (non-symmetric matrices)

  • "LA": Largest algebraic (symmetric matrices)

  • "SA": Smallest algebraic (symmetric matrices)
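The difference between the magnitude-based and algebraic criteria is easiest to see on a matrix with a negative eigenvalue. The sketch below uses base R's dense `eigen()` on a tiny symmetric matrix purely to illustrate the selection rules; it does not call Spectra, and the matrix is an arbitrary example.

```r
# Small symmetric matrix with eigenvalues -5, 1 and 3
A <- diag(c(-5, 1, 3))
vals <- eigen(A, symmetric = TRUE)$values

k <- 2

# "LM": largest magnitude, ranks by absolute value
sel_lm <- vals[order(abs(vals), decreasing = TRUE)][seq_len(k)]  # -5 and 3

# "SM": smallest magnitude
sel_sm <- vals[order(abs(vals))][seq_len(k)]                     # 1 and 3

# "LA": largest algebraic, ranks by signed value
sel_la <- sort(vals, decreasing = TRUE)[seq_len(k)]              # 3 and 1

# "SA": smallest algebraic
sel_sa <- sort(vals)[seq_len(k)]                                 # -5 and 1
```

Note that "LM" and "LA" only coincide when the largest eigenvalues in absolute value are also positive.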

ncv

Integer. Number of Arnoldi vectors. The default of 0 means the value is auto-selected as max(2*k+1, 20).
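As a quick illustration of the auto-selection rule stated above, the formula can be evaluated directly. `default_ncv` is a hypothetical name used only for this sketch; the package computes the value internally.

```r
# Hypothetical helper reproducing the documented rule max(2*k+1, 20)
default_ncv <- function(k) {
  max(2L * k + 1L, 20L)
}

ncv_k6  <- default_ncv(6)   # 2*6+1 = 13, so the floor of 20 applies
ncv_k15 <- default_ncv(15)  # 2*15+1 = 31, which exceeds 20
```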

bcenter

Logical. If TRUE, centers the data by subtracting column means (default = FALSE).

bscale

Logical. If TRUE, scales the centered columns by their standard deviations (default = FALSE).
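The effect of these two options can be previewed on an in-memory matrix with base R's `scale()`. This is only a sketch of the transformation described above: it assumes the sample standard deviation (denominator n - 1), which is what `scale()` and `sd()` use, and the documentation here does not state which convention the package applies.

```r
set.seed(1)
X <- matrix(rnorm(20, mean = 5, sd = 2), nrow = 5, ncol = 4)

# Analogue of bcenter = TRUE, bscale = FALSE: subtract column means
X_centered <- scale(X, center = TRUE, scale = FALSE)

# Analogue of bcenter = TRUE, bscale = TRUE: also divide each centered
# column by its (sample) standard deviation
X_scaled <- scale(X, center = TRUE, scale = TRUE)

col_means <- colMeans(X_centered)    # all numerically zero
col_sds   <- apply(X_scaled, 2, sd)  # all equal to 1
```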

tolerance

Numeric. Convergence tolerance for Spectra algorithms (default = 1e-10).

max_iter

Integer. Maximum number of iterations for Spectra algorithms (default = 1000).

compute_vectors

Logical. If TRUE (default), computes both eigenvalues and eigenvectors; if FALSE, only the eigenvalues are computed.

overwrite

Logical. If TRUE, allows overwriting existing results (default = FALSE).

threads

Integer. Number of threads for parallel computation (default = NULL, uses available cores).

Details

This function uses the Spectra library (same as RSpectra) for eigenvalue computation, ensuring consistent results. Key features include:

  • Automatic detection of symmetric vs non-symmetric matrices

  • Support for both real and complex eigenvalues/eigenvectors

  • Memory-efficient block-based processing for large matrices

  • Parallel processing support

  • Various eigenvalue selection criteria

  • Consistent interface with RSpectra::eigs()

The implementation automatically:

  • Detects matrix symmetry and uses appropriate solver (SymEigsSolver vs GenEigsSolver)

  • Handles complex eigenvalues for non-symmetric matrices

  • Saves imaginary parts separately when non-zero

  • Produces results that agree with RSpectra::eigs() up to numerical tolerance
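The symmetric versus non-symmetric distinction, and the reason imaginary parts may need to be stored, can be illustrated with base R alone. The sketch below uses `isSymmetric()` and the dense `eigen()` solver as stand-ins; it does not reproduce the package's internal detection logic or call Spectra.

```r
# A symmetric matrix: all eigenvalues are real
S <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

# A non-symmetric matrix (rotation by 90 degrees): eigenvalues are +i and -i
R <- matrix(c(0, -1,
              1,  0), nrow = 2, byrow = TRUE)

sym_S <- isSymmetric(S)  # TRUE
sym_R <- isSymmetric(R)  # FALSE

ev_S <- eigen(S)$values  # real: 3 and 1
ev_R <- eigen(R)$values  # complex conjugate pair

# Split the complex result into separate real and imaginary parts,
# analogous to storing them as two datasets
ev_R_real <- Re(ev_R)
ev_R_imag <- Im(ev_R)
```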

References

  • Qiu, Y., & Mei, J. (2022). RSpectra: Solvers for Large-Scale Eigenvalue and SVD Problems.

  • Li, R. (2021). Spectra: C++ Library For Large Scale Eigenvalue Problems.

See Also

  • bdSVD_hdf5 for Singular Value Decomposition

  • bdPCA_hdf5 for Principal Component Analysis

  • RSpectra::eigs for the R equivalent function

Examples

if (FALSE) {
library(BigDataStatMeth)
library(rhdf5)
library(RSpectra)

# Create a sample matrix (can be non-symmetric)
set.seed(123)
A <- matrix(rnorm(2500), 50, 50)

fn <- "test_eigen.hdf5"
bdCreate_hdf5_matrix_file(filename = fn, object = A, group = "data", dataset = "matrix")

# Compute eigendecomposition with BigDataStatMeth
res <- bdEigen_hdf5(fn, "data", "matrix", k = 6, which = "LM")

# Compare with RSpectra (should give same results)
rspectra_result <- eigs(A, k = 6, which = "LM")

# Extract results from HDF5
eigenvals_bd <- h5read(res$fn, res$values)
eigenvecs_bd <- h5read(res$fn, res$vectors)

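# Compare eigenvalues (should agree up to numerical tolerance);
# as.numeric() drops the dim attribute of the array returned by h5read()
all.equal(as.numeric(eigenvals_bd), Re(rspectra_result$values), tolerance = 1e-12)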

# For non-symmetric matrices, check imaginary parts
if (!is.null(res$values_imag)) {
  eigenvals_imag <- h5read(res$fn, res$values_imag)
  all.equal(as.numeric(eigenvals_imag), Im(rspectra_result$values), tolerance = 1e-12)
}

# Remove file
if (file.exists(fn)) {
  file.remove(fn)
}
}
