BigDataStatMeth (version 1.0.3)

bdblockSubstract_hdf5: HDF5 dataset subtraction

Description

Performs optimized block-wise subtraction between two datasets stored in HDF5 format. Supports both matrix-matrix and matrix-vector operations with memory-efficient block processing.

Usage

bdblockSubstract_hdf5(
  filename,
  group,
  A,
  B,
  groupB = NULL,
  block_size = NULL,
  paral = NULL,
  threads = NULL,
  outgroup = NULL,
  outdataset = NULL,
  overwrite = NULL
)

Value

A list containing the location of the subtraction result:

fn

Character string. Path to the HDF5 file containing the result

ds

Character string. Full dataset path to the subtraction result (A - B) within the HDF5 file

Arguments

filename

String indicating the HDF5 file path

group

String indicating the group containing matrix A

A

String specifying the dataset name for matrix A

B

String specifying the dataset name for matrix B

groupB

Optional string indicating the group containing matrix B. If NULL, the same group as A is used

block_size

Optional integer specifying block size for processing. If NULL, automatically determined based on matrix dimensions

paral

Optional logical indicating whether to use parallel processing. Default is FALSE

threads

Optional integer specifying number of threads for parallel processing. If NULL, uses maximum available threads

outgroup

Optional string specifying output group. Default is "OUTPUT"

outdataset

Optional string specifying output dataset name. Default is "A_-_B"

overwrite

Optional logical indicating whether to overwrite existing datasets. Default is FALSE

Details

The function implements optimized subtraction through:

Operation modes:

  • Matrix-matrix subtraction (A - B)

  • Matrix-vector subtraction

  • Vector-matrix subtraction

Block processing:

  • Automatic block size selection

  • Memory-efficient operations

  • Parallel computation support

Block size optimization based on:

  • Matrix dimensions

  • Available memory

  • Operation type (matrix/vector)

Error handling:

  • Dimension validation

  • Resource management

  • Exception handling
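
The block-processing idea listed above can be illustrated with a minimal in-memory sketch in base R. This is not the package's implementation: block_subtract is a hypothetical helper, it works on ordinary R matrices rather than HDF5 datasets, and it assumes blocks are taken as groups of consecutive rows. It only shows that subtracting block by block gives the same result as A - B while touching a limited number of rows at a time.

```r
# Hypothetical helper (not part of BigDataStatMeth): subtract B from A
# one row block at a time, so only `block_size` rows are handled per step.
block_subtract <- function(A, B, block_size = 2L) {
  # Dimension validation: both inputs must be non-empty matrices of equal shape
  stopifnot(is.matrix(A), is.matrix(B),
            identical(dim(A), dim(B)),
            nrow(A) > 0, block_size >= 1)

  out <- matrix(0, nrow(A), ncol(A))

  # First row index of each block: 1, 1 + block_size, ...
  starts <- seq(1L, nrow(A), by = block_size)

  for (s in starts) {
    # The last block may be shorter than block_size
    rows <- s:min(s + block_size - 1L, nrow(A))
    out[rows, ] <- A[rows, , drop = FALSE] - B[rows, , drop = FALSE]
  }
  out
}

set.seed(555)
A <- matrix(rnorm(35), nrow = 7, ncol = 5)
B <- matrix(rnorm(35), nrow = 7, ncol = 5)

# 7 rows with block_size = 3 gives blocks of 3, 3 and 1 rows
res <- block_subtract(A, B, block_size = 3L)
all.equal(res, A - B)
```

With on-disk data, each block would be read from the file, subtracted, and written back before the next block is loaded, which is what keeps memory use bounded by the block size rather than by the full matrix.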

Examples

if (FALSE) {
library(BigDataStatMeth)

# Create test matrices
N <- 1500
M <- 1500
set.seed(555)
a <- matrix(rnorm(N*M), N, M)
b <- matrix(rnorm(N*M), N, M)

# Save to HDF5
bdCreate_hdf5_matrix("test.hdf5", a, "data", "A",
                     overwriteFile = TRUE)
bdCreate_hdf5_matrix("test.hdf5", b, "data", "B",
                     overwriteFile = FALSE)

# Perform subtraction
bdblockSubstract_hdf5("test.hdf5", "data", "A", "B",
                      outgroup = "results",
                      outdataset = "diff",
                      block_size = 1024,
                      paral = TRUE)
}
