Learn R Programming

h5vc (version 2.6.3)

h5readBlock: h5readBlock

Description

A simple access function for extracting a single block of data from a tally file, use h5dapply for applying functions on multiple blocks / extracting multiple blocks form a tally file.

Usage

h5readBlock( filename, group, names, dims, range, samples = NULL, sampleDimMap = .sampleDimMap, verbose = FALSE )

Arguments

filename
The name of a tally file to process
group
The name of a group in that tally file
names
The names of the datasets to extract, e.g. c("Counts","Coverages") - optional (defaults to all datasets)
dims
The dimension in which the block shall be extracted for each dataset in the same order as names, these should correspond to compatible dimensions between the datsets. - optional (defaults to the genomic position dimension)
range
The range along the specified dimensions which should be extracted
samples
Character vector of sample names - must match contents of sampleData stored in the tallyFile
sampleDimMap
A list mapping dataset names to their respective sample dimensions - default provides values for "Counts", "Coverages", "Deletions" and "Reference"
verbose
Boolean flag that controls the amount of messages being printed by h5dapply

Value

  • A list with one entry per dataset and an additional slot h5dapplyInfo containing auxiliary information.

Details

This function extracts a block along the dimensions specified in dims (default: genomic position) from the datasets specified in names and returns it. The block is defined by the parameter range.

The function returns a list with one slot for each dataset specified in the names argument to containing the array corresponding to the specified block in the given dataset. Furthemore the slot h5dapplyInfo is reserved and contains another list with the following content:

Blockstart is an integer specifying the starting position of the current block (in the dimension specified by the dims argument to h5dapply)

Blockend is an integer specifying the end position of the current block (in the dimension specified by the dims argument to h5dapply)

Datasets Contains a data.frame as it is returned by h5ls listing all datasets present in the other slots of data with their group, name, dimensions, number of dimensions (DimCount) and the dimension that is used for splitting into blocks (PosDim)

Group contains the name of the group as specified by the group argument to h5dapply

Examples

Run this code
library(h5vc) # loading the library
tallyFile <- system.file( "extdata", "example.tally.hfs5", package = "h5vcData" )
data <- h5readBlock( #extracting coverage, deletions and reference using h5dreadBlock
  filename = tallyFile,
  group = "/ExampleStudy/16",
  names = c( "Coverages", "Deletions", "Reference" ),
  range = c(29000000,29010000),
  verbose = TRUE
)
str(data)
sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
#Subsetting by Sample
sampleData <- sampleData[sampleData$Patient == "Patient8",]
data <- h5readBlock( #extracting coverage, deletions and reference using h5dreadBlock
  filename = tallyFile,
  group = "/ExampleStudy/16",
  names = c( "Coverages", "Deletions", "Reference" ),
  range = c(29000000,29010000),
  samples = sampleData$Sample,
  verbose = TRUE
)
str(data)

Run the code above in your browser using DataLab