Learn R Programming

h5vc (version 2.6.3)

getSampleData: Reading and writing sample data from / to a tally file

Description

These functions allow reading and writing of sample data to the HDF5-based tally files. The sample data is stored as group attribute.

Usage

getSampleData( filename, group )
setSampleData( filename, group, sampleData, largeAttributes = FALSE, stringSize = 64 )

Arguments

filename
The name of a tally file
group
The name of a group within that tally file, e.g. /ExampleStudy/22
sampleData
A data.frame with k rows (one for each sample) and columns Type, Column and (SampleGroup or Patient. Additional column will be added as well but are not required.)
largeAttributes
HDF5 limits the size of attributes to 64KB, if you have many samples setting this flag will write the attributes in a separate dataset instead. getSampleData is aware of this and automatically chooses the dataset-stored attributes if they are present
stringSize
Maximum length for string attributes (number of characters) - default of 64 characters should be fine for most cases; This has to be specified since we do not support variable length strings as of now.

Value

  • sampledataA data.frame with k rows (one for each sample) and columns Type, Column and (SampleGroup or Patient).

Details

The returned data.frame contains information about the sample ids, sample columns in the sample dimension of the dataset. The type of sample must be one of c("Case","Control") to be used with the provided SNV calling function. Additional relevant per-sample information may be stored here.

Note that the following columns are required in the sample data where the rows represent samples in the cohort:

Sample: the sample id of the corresponding sample

Column: the index within the genomic position dimension of the corresponding sample, be aware that getSampleData and setSampleData automatically add / remove 1 from this value since internally the tally files store the dimension 0-based whereas within R we count 1-based.

Patient the patient id of the corresponding sample

Type the type of sample

Examples

Run this code
# loading library and example data
  library(h5vc)
  tallyFile <- system.file( "extdata", "example.tally.hfs5", package = "h5vcData" )
  sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
  sampleData
  # modify  the sample data
  sampleData$AnotherColumn <- paste( sampleData$Patient, "Modified" )
  # write to tallyFile
  setSampleData( tallyFile, "/ExampleStudy/16", sampleData )
  # re-load and check if it worked
  sampleData <- getSampleData( tallyFile, "/ExampleStudy/16" )
  sampleData

Run the code above in your browser using DataLab