BigDataStatMeth (version 1.0.3)

bdBind_hdf5_datasets: Bind matrices by rows or columns

Description

This function merges existing matrices within an HDF5 data file either by combining their rows (stacking vertically) or columns (joining horizontally). It provides functionality similar to R's rbind and cbind operations.
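As a point of comparison, the two main binding modes correspond to the following base R operations on in-memory matrices. This is only an illustrative sketch: it uses small matrices held in memory, not HDF5 datasets, and does not call BigDataStatMeth.

```r
# In-memory analogue of row and column binding, using base R only.
a <- matrix(1:12, nrow = 4, ncol = 3)
b <- matrix(13:24, nrow = 4, ncol = 3)

by_rows <- rbind(a, b)  # stack vertically (compare with "bindRows")
by_cols <- cbind(a, b)  # join horizontally (compare with "bindCols")

dim(by_rows)  # 8 3
dim(by_cols)  # 4 6
```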

Usage

bdBind_hdf5_datasets(
  filename,
  group,
  datasets,
  outgroup,
  outdataset,
  func,
  overwrite = FALSE
)

Value

A list containing the location of the combined dataset:

fn

Character string. Path to the HDF5 file containing the result

ds

Character string. Full dataset path to the bound/combined dataset within the HDF5 file

Arguments

filename

Character array indicating the name of the HDF5 file that contains the datasets to bind

group

Character array indicating the input group containing the datasets

datasets

Character array specifying the input datasets to bind

outgroup

Character array indicating the output group for the merged dataset. If NULL, output is stored in the same input group

outdataset

Character array specifying the name for the new merged dataset

func

Character array specifying the binding operation:

  • "bindRows": Merge datasets by rows (vertical stacking)

  • "bindCols": Merge datasets by columns (horizontal joining)

  • "bindRowsbyIndex": Merge datasets by rows using an index

overwrite

Logical indicating whether to overwrite an existing dataset. Defaults to FALSE

Details

The function performs dimension validation before binding:

  • For row binding: All datasets must have the same number of columns

  • For column binding: All datasets must have the same number of rows
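The two dimension rules above can be sketched in base R as follows. The helper `check_bind_dims` is hypothetical and is not part of BigDataStatMeth; it only illustrates the rule on in-memory matrices and makes no claim about how the package performs the check internally.

```r
# Hypothetical helper (not part of BigDataStatMeth): reports whether a list
# of matrices has compatible dimensions for the requested binding mode.
check_bind_dims <- function(mats, func = c("bindRows", "bindCols")) {
  func <- match.arg(func)
  # Row binding compares column counts; column binding compares row counts.
  sizes <- vapply(mats, if (func == "bindRows") ncol else nrow, integer(1))
  length(unique(sizes)) == 1L
}

a <- matrix(1:12, 4, 3)
b <- matrix(1:6, 2, 3)

ok_rows <- check_bind_dims(list(a, b), "bindRows")  # TRUE: both have 3 columns
ok_cols <- check_bind_dims(list(a, b), "bindCols")  # FALSE: 4 rows vs 2 rows
```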

Memory efficiency is achieved through:

  • Block-wise reading and writing

  • Minimal data copying

  • Proper resource cleanup
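To give a rough idea of what block-wise binding means, the sketch below stacks matrices by rows while copying only a few rows at a time into a preallocated result. It is an assumption-laden illustration in base R: the function `bind_rows_blockwise` and its `block_rows` parameter are hypothetical, everything stays in memory, and the package's actual HDF5 implementation may differ.

```r
# Hypothetical illustration (base R, in memory): copy each input into a
# preallocated result a fixed number of rows at a time.
bind_rows_blockwise <- function(mats, block_rows = 2L) {
  n_cols <- ncol(mats[[1]])
  # Row binding requires every input to have the same number of columns.
  stopifnot(all(vapply(mats, ncol, integer(1)) == n_cols))
  total_rows <- sum(vapply(mats, nrow, integer(1)))
  out <- matrix(NA, nrow = total_rows, ncol = n_cols)
  offset <- 0L  # number of result rows already filled
  for (m in mats) {
    if (nrow(m) == 0L) next
    for (start in seq(1L, nrow(m), by = block_rows)) {
      end <- min(start + block_rows - 1L, nrow(m))
      # Only rows start..end of the current input are touched in this step.
      out[offset + start:end, ] <- m[start:end, , drop = FALSE]
    }
    offset <- offset + nrow(m)
  }
  out
}

a <- matrix(1:12, 4, 3)
b <- matrix(13:24, 4, 3)
res <- bind_rows_blockwise(list(a, b))
all(res == rbind(a, b))  # TRUE
```

With on-disk data, the same pattern keeps at most one block per step in memory, which is the point of reading and writing block-wise.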

Examples

if (FALSE) {
library(BigDataStatMeth)

# Create test matrices
a <- matrix(1:12, 4, 3)
b <- matrix(13:24, 4, 3)

# Save to HDF5
bdCreate_hdf5_matrix("test.hdf5", a, "data", "A")
bdCreate_hdf5_matrix("test.hdf5", b, "data", "B")

# Bind by rows
bdBind_hdf5_datasets("test.hdf5", "data", 
                     c("A", "B"),
                     "results", "combined",
                     "bindRows")
}
