BigDataStatMeth (version 1.0.3)

bdapply_Function_hdf5: Apply function to different datasets inside a group

Description

This function provides a unified interface for applying various mathematical operations to HDF5 datasets. It supports both single-dataset operations and operations between multiple datasets.

Usage

bdapply_Function_hdf5(
  filename,
  group,
  datasets,
  outgroup,
  func,
  b_group = NULL,
  b_datasets = NULL,
  overwrite = FALSE,
  transp_dataset = FALSE,
  transp_bdataset = FALSE,
  fullMatrix = FALSE,
  byrows = FALSE,
  threads = 2L
)

Value

Modifies the HDF5 file in place, adding the computed results under outgroup.

Arguments

filename

Character array, indicating the name of the HDF5 file that contains the input datasets and receives the results

group

Character array, indicating the input group that contains the datasets to be processed

datasets

Character array, indicating the input datasets to be used

outgroup

Character array, indicating the group where the results will be saved. If NULL, the output datasets are stored in the input group

func

Character array, function to be applied:
- "QR": QR decomposition via bdQR()
- "CrossProd": Cross product via bdCrossprod()
- "tCrossProd": Transposed cross product via bdtCrossprod()
- "invChol": Inverse via Cholesky decomposition
- "blockmult": Matrix multiplication
- "CrossProd_double": Cross product with two matrices
- "tCrossProd_double": Transposed cross product with two matrices
- "solve": Matrix equation solving
- "sdmean": Standard deviation and mean computation
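As a rough orientation, the single-matrix operations listed above have familiar in-memory analogues in base R. The sketch below uses only base R on a small matrix and does not call BigDataStatMeth; it assumes the HDF5-backed versions compute the same mathematical quantities, which this page does not state explicitly. The names A, cp, tcp and so on are illustrative only.

```r
# Base R analogues of the operations named in `func` (illustrative only;
# the package works on HDF5 datasets rather than in-memory matrices).
set.seed(1)
A <- matrix(rnorm(20), nrow = 5, ncol = 4)

# "CrossProd" / "tCrossProd": t(A) %*% A and A %*% t(A)
cp  <- crossprod(A)    # 4 x 4
tcp <- tcrossprod(A)   # 5 x 5

# "QR": factor A into an orthonormal Q and an upper-triangular R
qr_A <- qr(A)
Q <- qr.Q(qr_A)
R <- qr.R(qr_A)

# "invChol": inverse of a symmetric positive-definite matrix via Cholesky
inv_chol <- chol2inv(chol(cp))

# "solve": solve the linear system cp %*% x = b for x
b <- rnorm(4)
x <- solve(cp, b)
```

Working through a small in-memory case like this can help when checking the results that the function writes to the output group.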

b_group

Optional character array indicating the input group for secondary datasets (used in two-matrix operations)

b_datasets

Optional character array indicating the secondary datasets for two-matrix operations

overwrite

Optional boolean. If true, overwrites existing results

transp_dataset

Optional boolean. If true, transposes first dataset

transp_bdataset

Optional boolean. If true, transposes second dataset

fullMatrix

Optional boolean for Cholesky operations. If true, stores complete matrix; if false, stores only lower triangular
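The difference between the two storage modes can be pictured with a small base R sketch. It does not call BigDataStatMeth, and it assumes that the "lower triangular" form simply leaves the upper triangle empty (here filled with zeros); the on-disk layout used by the package is not described on this page.

```r
# Illustrative only: full versus lower-triangular storage of a symmetric inverse.
S <- matrix(c(4, 2,
              2, 3), nrow = 2, byrow = TRUE)   # symmetric positive definite

inv_full <- chol2inv(chol(S))                  # complete inverse

# Keep only the lower triangle (assumption: upper triangle left as zeros)
inv_lower <- inv_full
inv_lower[upper.tri(inv_lower)] <- 0

# Because the inverse is symmetric, the full matrix can be rebuilt from it
restored <- inv_lower + t(inv_lower) - diag(diag(inv_lower))
```

Storing only one triangle roughly halves the space needed for a large symmetric result, at the cost of a reconstruction step when the full matrix is required.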

byrows

Optional boolean for statistical operations. If true, computes by rows; if false, by columns
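To make the row/column distinction concrete, here is a base R sketch of the two directions for a mean and standard deviation computation. It does not call BigDataStatMeth, and it assumes the sample standard deviation (denominator n - 1, as in base R's sd()); the page does not say which denominator the package uses.

```r
# Illustrative only: mean and standard deviation by columns versus by rows.
M <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)   # rows are (1, 3, 5) and (2, 4, 6)

# Analogue of byrows = FALSE: one value per column
col_mean <- colMeans(M)        # 1.5 3.5 5.5
col_sd   <- apply(M, 2, sd)

# Analogue of byrows = TRUE: one value per row
row_mean <- rowMeans(M)        # 3 4
row_sd   <- apply(M, 1, sd)    # 2 2
```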

threads

Optional integer specifying number of threads for parallel processing

Details

For the two-matrix operations (blockmult, CrossProd_double, tCrossProd_double), the datasets and b_datasets vectors must have the same length. The operation is applied pairwise: the i-th entry of b_datasets is the second operand for the i-th entry of datasets. For example, if datasets = c("A1", "A2", "A3") and b_datasets = c("B1", "B2", "B3"), a blockmult call computes A1 %*% B1, A2 %*% B2, and A3 %*% B3.
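The pairing rule described above can be mimicked in memory with base R. This is a minimal sketch that does not call BigDataStatMeth; the lists A and B and the matrix sizes are hypothetical stand-ins for the HDF5 datasets.

```r
# Illustrative only: pair the i-th first operand with the i-th second operand.
set.seed(42)
A <- list(A1 = matrix(rnorm(6), nrow = 2, ncol = 3),
          A2 = matrix(rnorm(6), nrow = 2, ncol = 3),
          A3 = matrix(rnorm(6), nrow = 2, ncol = 3))
B <- list(B1 = matrix(rnorm(6), nrow = 3, ncol = 2),
          B2 = matrix(rnorm(6), nrow = 3, ncol = 2),
          B3 = matrix(rnorm(6), nrow = 3, ncol = 2))

# Both vectors of operands must have the same length
stopifnot(length(A) == length(B))

# Map() walks the two lists in parallel: A1 %*% B1, A2 %*% B2, A3 %*% B3
products <- Map(function(a, b) a %*% b, A, B)
```

Note that no cross combinations such as A1 %*% B2 are formed; only operands at matching positions are multiplied.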

Examples

if (FALSE) {
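library(BigDataStatMeth)

# Illustrative input matrices (dimensions chosen arbitrarily)
set.seed(123)
Y <- matrix(rnorm(100), nrow = 10, ncol = 10)
X <- matrix(rnorm(100), nrow = 10, ncol = 10)
Z <- matrix(rnorm(100), nrow = 10, ncol = 10)

# Write each matrix as a dataset in group "data"; the first call creates the file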
bdCreate_hdf5_matrix(filename = "test_temp.hdf5", 
                    object = Y, group = "data", dataset = "Y",
                    transp = FALSE,
                    overwriteFile = TRUE, overwriteDataset = TRUE, 
                    unlimited = FALSE)

bdCreate_hdf5_matrix(filename = "test_temp.hdf5", 
                    object = X,  group = "data",  dataset = "X",
                    transp = FALSE,
                    overwriteFile = FALSE, overwriteDataset = TRUE, 
                    unlimited = FALSE)

bdCreate_hdf5_matrix(filename = "test_temp.hdf5",
                    object = Z,  group = "data",  dataset = "Z",
                    transp = FALSE,
                    overwriteFile = FALSE, overwriteDataset = TRUE,
                    unlimited = FALSE)

dsets <- bdgetDatasetsList_hdf5("test_temp.hdf5", group = "data")
dsets

# Apply the QR decomposition to every dataset in "data"; results go to group "QR"
bdapply_Function_hdf5(filename = "test_temp.hdf5",
                     group = "data",datasets = dsets,
                     outgroup = "QR",func = "QR",
                     overwrite = TRUE)
}