BigDataStatMeth (version 1.0.3)

bdImportTextFile_hdf5: Import Text File to HDF5

Description

Converts a text file (e.g., CSV, TSV) to HDF5 format, providing efficient storage and access capabilities.

Usage

bdImportTextFile_hdf5(
  filename,
  outputfile,
  outGroup,
  outDataset,
  sep = NULL,
  header = FALSE,
  rownames = FALSE,
  overwrite = FALSE,
  paral = NULL,
  threads = NULL,
  overwriteFile = NULL
)

Value

List with components:

fn

Character string with the HDF5 filename

ds

Character string with the full dataset path to the imported data (group/dataset)

ds_rows

Character string with the full dataset path to the row names

ds_cols

Character string with the full dataset path to the column names
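
The returned list can be captured and its components passed on to later steps. The following is a minimal sketch, assuming the package is installed; the temporary file names and the group and dataset names ("data", "matrix1") are illustrative choices, not values required by the function.

```r
# Illustrative temporary paths (assumptions, not required names)
csv_file <- tempfile(fileext = ".csv")
h5_file  <- tempfile(fileext = ".hdf5")

# Write a 10 x 10 numeric matrix with a header row and no row names
set.seed(42)
write.csv(matrix(rnorm(100), 10, 10), csv_file, row.names = FALSE)
input_dim <- dim(read.csv(csv_file))  # dimensions of the data just written

# Run the import only when BigDataStatMeth is available
if (requireNamespace("BigDataStatMeth", quietly = TRUE)) {
  res <- BigDataStatMeth::bdImportTextFile_hdf5(
    filename = csv_file, outputfile = h5_file,
    outGroup = "data", outDataset = "matrix1",
    sep = ",", header = TRUE, overwriteFile = TRUE
  )
  # Components documented in the Value section
  cat("HDF5 file:   ", res$fn, "\n")
  cat("Dataset:     ", res$ds, "\n")
  cat("Row names:   ", res$ds_rows, "\n")
  cat("Column names:", res$ds_cols, "\n")
}

# Remove the temporary files
unlink(c(csv_file, h5_file))
```

Keeping the result in a variable avoids rebuilding the group/dataset path by hand when the imported data is referenced later.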

Arguments

filename

Character string. Path to the input text file.

outputfile

Character string. Path to the output HDF5 file.

outGroup

Character string. Name of the group to create in the HDF5 file.

outDataset

Character string. Name of the dataset to create.

sep

Character string (optional). Field separator. The signature default is NULL; when no separator is supplied, a tab ("\t") is used.

header

Logical (optional). Whether the first row contains column names. Default is FALSE.

rownames

Logical (optional). Whether the first column contains row names. Default is FALSE.

overwrite

Logical (optional). Whether to overwrite an existing dataset. Default is FALSE.

paral

Logical (optional). Whether to use parallel processing.

threads

Integer (optional). Number of threads for parallel processing.

overwriteFile

Logical (optional). Whether to overwrite an existing HDF5 file.
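
As an illustration of how the input-format arguments fit together, the sketch below imports a tab-separated file and states sep, header, and rownames explicitly. It assumes the package is installed; the file paths, the group name "tsv_data", and the dataset name "m" are hypothetical.

```r
# Illustrative temporary paths (assumptions)
tsv_file <- tempfile(fileext = ".tsv")
h5_file  <- tempfile(fileext = ".hdf5")

# 4 x 5 matrix with column names only; written tab-separated, unquoted
set.seed(1)
m <- matrix(rnorm(20), nrow = 4, ncol = 5,
            dimnames = list(NULL, paste0("c", 1:5)))
write.table(m, tsv_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = TRUE)

# Sanity check on the input: the header line should hold 5 fields
header_line <- readLines(tsv_file, n = 1)
n_fields <- length(strsplit(header_line, "\t", fixed = TRUE)[[1]])

# Run the import only when BigDataStatMeth is available
if (requireNamespace("BigDataStatMeth", quietly = TRUE)) {
  BigDataStatMeth::bdImportTextFile_hdf5(
    filename = tsv_file, outputfile = h5_file,
    outGroup = "tsv_data", outDataset = "m",
    sep = "\t",        # matches the separator used by write.table above
    header = TRUE,     # first row holds c1..c5
    rownames = FALSE,  # no row-name column was written
    overwriteFile = TRUE
  )
}

# Remove the temporary files
unlink(c(tsv_file, h5_file))
```

Passing sep explicitly keeps the call unambiguous even when the file happens to use the tab separator.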

Details

This function provides flexible text file import capabilities with support for:

  • Input format options:

    • Custom field separators

    • Header row handling

    • Row names handling

  • Processing options:

    • Parallel processing

    • Memory-efficient import

    • Configurable thread count

  • File handling:

    • Safe file operations

    • Overwrite protection

    • Comprehensive error handling

Parallel processing is intended for large input files: it is requested with paral, and the number of threads is set with threads.
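
A parallel import might look like the following sketch. It assumes the package is installed; the thread count of 2, the file paths, and the group and dataset names are arbitrary illustrative choices, and any speed-up will depend on file size and hardware.

```r
# Illustrative temporary paths (assumptions)
txt_file <- tempfile(fileext = ".txt")
h5_file  <- tempfile(fileext = ".hdf5")

# A larger tab-separated file: 1000 rows x 50 columns, no header, no row names
set.seed(1)
big <- matrix(rnorm(1000 * 50), nrow = 1000, ncol = 50)
write.table(big, txt_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
n_lines <- length(readLines(txt_file))  # one line per matrix row

# Run the import only when BigDataStatMeth is available
if (requireNamespace("BigDataStatMeth", quietly = TRUE)) {
  BigDataStatMeth::bdImportTextFile_hdf5(
    filename = txt_file, outputfile = h5_file,
    outGroup = "data", outDataset = "big",
    sep = "\t", header = FALSE,
    paral = TRUE,   # request parallel processing
    threads = 2L,   # illustrative thread count
    overwriteFile = TRUE
  )
}

# Remove the temporary files
unlink(c(txt_file, h5_file))
```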

References

  • The HDF Group. (2000-2010). HDF5 User's Guide.

See Also

  • bdCreate_hdf5_matrix for creating HDF5 matrices directly

Examples

if (FALSE) {
library(BigDataStatMeth)

# Create a test CSV file
data <- matrix(rnorm(100), 10, 10)
write.csv(data, "test.csv", row.names = FALSE)

# Import to HDF5
bdImportTextFile_hdf5(
  filename = "test.csv",
  outputfile = "output.hdf5",
  outGroup = "data",
  outDataset = "matrix1",
  sep = ",",
  header = TRUE,
  overwriteFile = TRUE
)

# Cleanup
unlink(c("test.csv", "output.hdf5"))
}
