Package: h5vc
Type: Package
Version: 1.0.4
Date: 2013-10-11
License: GPL (>= 3)
This package is designed to facilitate the analysis of genomics data
through tallies stored in an HDF5 file.
Within an HDF5 file the tally is simply a table of
bases by genomic positions that lists, for each position, the count of
each base observed as a mismatch in the sample.
Strand and sample are additional dimensions of this array, which leads to
a 4D array called 'Counts'. The total coverage is stored in a separate
array of 3 dimensions (Sample x Strand x Genomic Position) called
'Coverages'; there is also a 3-dimensional 'Deletions' array and a 1D vector
encoding the reference base ('Reference'). These 4 arrays are stored as
datasets within an HDF5 tally file, in which the group structure
encodes the organisational levels of 'Study' and
'Chromosome'. For details on the layout of HDF5 files visit
http://www.hdfgroup.org; a short description is given in the
vignettes.

Creating these HDF5 tally files can be accomplished from within R
or through a Python script that generates a tally file from a set of
.bam files. The workflow is described in the vignettes
h5vc.creating.tallies and h5vc.creating.tallies.within.R.
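
The layout described above can be pictured with a minimal sketch in plain Python, where nested dictionaries stand in for the HDF5 groups and nested lists for the datasets. It does not use the package's own functions or read a real tally file. The study and chromosome names, the array sizes, the axis order of 'Counts', the axes of 'Deletions' (assumed here to match 'Coverages') and the integer encoding of 'Reference' are all assumptions made for illustration.

```python
# Illustrative sketch only: plain Python containers mimic the tally layout.
# Sizes, names, axis order of 'Counts' and the 'Reference' encoding are
# assumptions, not taken from the package.

BASES = ["A", "C", "G", "T"]
N_SAMPLES, N_STRANDS, N_POSITIONS = 2, 2, 5


def zeros(*dims):
    """Build a nested list of zeros with the given dimensions."""
    if len(dims) == 1:
        return [0] * dims[0]
    return [zeros(*dims[1:]) for _ in range(dims[0])]


def shape(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def make_tally():
    """Create the four datasets that belong to one chromosome."""
    return {
        # Base x Sample x Strand x Genomic Position (axis order assumed)
        "Counts": zeros(len(BASES), N_SAMPLES, N_STRANDS, N_POSITIONS),
        # Sample x Strand x Genomic Position
        "Coverages": zeros(N_SAMPLES, N_STRANDS, N_POSITIONS),
        # Assumed to share the axes of 'Coverages'
        "Deletions": zeros(N_SAMPLES, N_STRANDS, N_POSITIONS),
        # One entry per position; here an index into BASES (assumed encoding)
        "Reference": zeros(N_POSITIONS),
    }


# Group structure: study -> chromosome -> datasets (hypothetical names)
tally_file = {"ExampleStudy": {"chr1": make_tally()}}
group = tally_file["ExampleStudy"]["chr1"]

# Record one mismatch: base 'T' in sample 0 on strand 0 at position 3,
# with a total coverage of 10 reads at that position.
group["Counts"][BASES.index("T")][0][0][3] += 1
group["Coverages"][0][0][3] += 10

for name, data in group.items():
    print(name, shape(data))
```

Looking up a value then follows the same path as in the file: pick the study, then the chromosome, then the dataset, and finally index by base, sample, strand and position.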