rtracklayer (version 1.32.1)

BigWigFile-class: BigWig Import and Export

Description

These functions support the import and export of the UCSC BigWig format, a compressed, binary form of WIG/BEDGraph with a spatial index and precomputed summaries. These functions do not work on Windows.

Usage

"import"(con, format, text, selection = BigWigSelection(which, ...), which = con, asRle = FALSE, as = c("GRanges", "RleList", "NumericList"), ...) import.bw(con, ...)
"export"(object, con, format, ...) "export"(object, con, format, dataFormat = c("auto", "variableStep", "fixedStep", "bedGraph"), compress = TRUE, fixedSummaries = FALSE) export.bw(object, con, ...)

Arguments

con
A path, URL or BigWigFile object. Connections are not supported. For the functions ending in .bw, the file format is indicated by the function name. For the export and import methods, the format must be indicated another way. If con is a path, or URL, either the file extension or the format argument needs to be “bigWig” or “bw”.
object
The object to export, should be an RleList, IntegerList, NumericList, GRanges or something coercible to a GRanges.
format
If not missing, should be “bigWig” or “bw” (case insensitive).
text
Not supported.
as
Specifies the class of the return object. Default is GRanges, which has one range per range in the file, and a score column holding the value for each range. For NumericList, one numeric vector is returned for each range in the selection argument. For RleList, there is one Rle per sequence, and that Rle spans the entire sequence.
asRle
Deprecated. Use as instead. If TRUE, the BigWig file is assumed to contain contiguous ranges that define a run-length encoding of a vector (like coverage), and a RleList is returned.
selection
A BigWigSelection object indicating the ranges to load.
which
A range data structure coercible to RangesList, like a GRanges, or a BigWigFile. Only the intervals in the file overlapping the given ranges are returned. By default, the value is the BigWigFile itself. Its Seqinfo object is extracted and coerced to a RangesList that represents the entirety of the file.
dataFormat
Probably best left to “auto”. Exists only for historical reasons.
compress
If TRUE, compress the data. No reason to change this.
fixedSummaries
If TRUE, compute summaries at fixed resolutions corresponding to the default zoom levels in the Ensembl genome browser (with some extrapolation): 30X, 65X, 130X, 260X, 450X, 648X, 950X, 1296X, 4800X, 19200X. Otherwise, the resolutions are dynamically determined by an algorithm that computes an initial summary size by initializing to 10X the size of the smallest feature and doubling the size as needed until the size of the summary is less than half that of the data (or there are no further gains). It then computes up to 10 more levels of summary, quadrupling the size each time, until the summaries start to exceed the sequence size.
...
Arguments to pass down to methods to other methods. For import, the flow eventually reaches the BigWigFile method on import.

Value

A GRanges (default), RleList or NumericList. GRanges return ranges with non-zero score values in a score metadata column. The length of the NumericList is the same length as the selection argument (one list element per range). The return order in the NumericList matches the order of the BigWigSelection object.

<code>BigWigFile</code> objects

A BigWigFile object, an extension of RTLFile is a reference to a BigWig file. To cast a path, URL or connection to a BigWigFile, pass it to the BigWigFile constructor. BigWig files are more complex than most track files, and there are a number of methods on BigWigFile for accessing the additional information:
seqinfo(x): Gets the Seqinfo object indicating the lengths of the sequences for the intervals in the file. No circularity or genome information is available.
summary(ranges = as(seqinfo(object), "GenomicRanges"), size = 1L, type = c("mean", "min", "max", "coverage", "sd"), defaultValue = NA_real_), ...: Aggregates the intervals in the file that fall into ranges, which should be something coercible to GRanges. The aggregation essentially compresses each sequence to a length of size. The algorithm is specified by type; available algorithms include the mean, min, max, coverage (percent sequence covered by at least one feature), and standard deviation. When a window contains no features, defaultValue is assumed. The result is an RleList, with an element for each element in ranges. The driving use case for this is visualization of coverage when the screen space is small compared to the viewed portion of the sequence. The operation is very fast, as it leverages cached multi-level summaries present in every BigWig file. If a summary statistic is not available / cannot be computed for a given range a warning is thrown and the defaultValue NA_real_ is returned.

<code>BigWigFileList</code> objects

A BigWigFileList() provides a convenient way of managing a list of BigWigFile instances.

See Also

wigToBigWig for converting a WIG file to BigWig.

Examples

Run this code
if (.Platform$OS.type != "windows") {
  test_path <- system.file("tests", package = "rtracklayer")
  test_bw <- file.path(test_path, "test.bw")

  ## GRanges
  ## Returns ranges with non-zero scores.
  gr <- import(test_bw)
  gr 

  which <- GRanges(c("chr2", "chr2"), IRanges(c(1, 300), c(400, 1000)))
  import(test_bw, which = which)

  ## RleList
  ## Scores returned as an RleList is equivalent to the coverage.
  ## Best option when 'which' or 'selection' contain many small ranges.
  mini <- narrow(unlist(tile(which, 50)), 2)
  rle <- import(test_bw, which = mini, as = "RleList")
  rle 

  ## NumericList
  ## The 'which' is stored as metadata:
  track <- import(test_bw, which = which, as = "NumericList")
  metadata(track)

## Not run: 
#   test_bw_out <- file.path(tempdir(), "test_out.bw")
#   export(test, test_bw_out)
# ## End(Not run)

  bwf <- BigWigFile(test_bw)
  track <- import(bwf)

  seqinfo(bwf)  

  summary(bwf) # for each sequence, average all values into one
  summary(bwf, range(head(track))) # just average the first few features
  summary(bwf, size = seqlengths(bwf) / 10) # 10X reduction
  summary(bwf, type = "min") # min instead of mean
}

Run the code above in your browser using DataLab