
BigDataStatMeth (version 1.0.3)

bdDiag_subtract_hdf5: Subtract Diagonal Elements from HDF5 Matrices or Vectors

Description

Performs diagonal subtraction (A minus B) between two datasets stored in HDF5 format. The function detects whether each input is a matrix, in which case its diagonal is extracted, or a vector, which is used directly, and then subtracts the two as vectors. Because only the diagonal elements are processed, it is reported to be roughly 50-250x faster than full-matrix operations for diagonal computations.

Usage

bdDiag_subtract_hdf5(
  filename,
  group,
  A,
  B,
  groupB = NULL,
  target = NULL,
  outgroup = NULL,
  outdataset = NULL,
  paral = NULL,
  threads = NULL,
  overwrite = NULL
)

Value

List with components:

fn

Character string with the HDF5 filename

ds

Character string with the full dataset path to the diagonal subtraction result (group/dataset)
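
As a rough illustration of working with this return value, the sketch below splits ds into its group and dataset parts using base R path helpers. The list here is a hand-built stand-in with hypothetical values, not actual function output; it assumes only the documented fn/ds shape.

```r
# Stand-in for the documented return value (hypothetical values, built by hand)
result <- list(fn = "test.hdf5", ds = "results/diagonal_diff")

# ds is documented as "group/dataset", so base R path helpers can split it
out_group   <- dirname(result$ds)
out_dataset <- basename(result$ds)
```

Splitting the path this way may be convenient when a later call expects the group and dataset name as separate arguments.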

Arguments

filename

String. Path to the HDF5 file containing the datasets.

group

String. Group path containing the first dataset (A, minuend).

A

String. Name of the first dataset (minuend).

B

String. Name of the second dataset (subtrahend).

groupB

Optional string. Group path containing dataset B. If NULL, uses same group as A.

target

Optional string. Where to write the result: "A", "B", or "new". Default is "new".

outgroup

Optional string. Output group path. Default is "OUTPUT".

outdataset

Optional string. Output dataset name. Default is "A_-_B" with .diag suffix if appropriate.

paral

Optional logical. Whether to use parallel processing. Default is FALSE.

threads

Optional integer. Number of threads to use when paral is TRUE. If NULL, uses the maximum number of available threads.

overwrite

Optional logical. Whether to overwrite existing datasets. Default is FALSE.

Details

This function provides flexible diagonal subtraction with automatic optimization:

  • Operation modes:

    • Matrix - Matrix: Extract diagonals → vector subtraction → save as vector

    • Matrix - Vector: Extract diagonal → vector subtraction → save as vector

    • Vector - Vector: Direct vector subtraction (most efficient)

  • Performance features:

    • Uses optimized vector operations for maximum efficiency

    • Automatic type detection and dimension validation

    • Memory-efficient processing for large datasets

    • Parallel processing support for improved performance

  • Validation checks:

    • Matrix inputs must be square (N×N)

    • Vector inputs must have compatible dimensions

    • Automatic dimension matching between operands
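
The operation modes and validation checks above can be mimicked in memory with base R. The following is a minimal sketch, not the package's implementation: as_diag_vector and diag_subtract_sketch are hypothetical helpers, they work on ordinary R objects rather than HDF5 datasets, and treating a one-row or one-column matrix as a vector is an assumption made for the sketch.

```r
# Hypothetical in-memory helpers; NOT part of BigDataStatMeth.

# Reduce an operand to the vector that takes part in the subtraction.
as_diag_vector <- function(x) {
  if (is.matrix(x)) {
    # Assumption: a 1 x N or N x 1 matrix is treated as a stored vector
    if (min(dim(x)) == 1) return(as.vector(x))
    # Matrix inputs must be square (N x N)
    if (nrow(x) != ncol(x)) stop("matrix input must be square")
    return(diag(x))
  }
  as.vector(x)
}

# Subtract the diagonal (or vector) of b from that of a.
diag_subtract_sketch <- function(a, b) {
  da <- as_diag_vector(a)
  db <- as_diag_vector(b)
  # Operands must have matching lengths after diagonal extraction
  if (length(da) != length(db)) stop("operands have incompatible lengths")
  da - db
}

set.seed(123)
A <- matrix(rnorm(16), 4, 4)
B <- matrix(rnorm(16), 4, 4)
v <- diag(B)

mm <- diag_subtract_sketch(A, B)        # Matrix - Matrix
mv <- diag_subtract_sketch(A, v)        # Matrix - Vector
vv <- diag_subtract_sketch(diag(A), v)  # Vector - Vector
```

All three calls give the same length-4 vector, since each mode reduces to a subtraction of two vectors once the diagonals are extracted.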

Examples

if (FALSE) {
library(BigDataStatMeth)

# Create test matrices
N <- 1000
set.seed(123)
A <- matrix(rnorm(N*N), N, N)
B <- matrix(rnorm(N*N), N, N)

# Save to HDF5
bdCreate_hdf5_matrix("test.hdf5", A, "data", "matrixA",
                     overwriteFile = TRUE)
bdCreate_hdf5_matrix("test.hdf5", B, "data", "matrixB",
                     overwriteFile = FALSE)

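# Subtract diagonals: diag(matrixA) - diag(matrixB), written to results/diagonal_diff
result <- bdDiag_subtract_hdf5("test.hdf5", "data", "matrixA", "matrixB",
                               outgroup = "results",
                               outdataset = "diagonal_diff",
                               paral = TRUE)

# result$fn holds the HDF5 filename, result$ds the full dataset path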
}
