In-memory Compression and Decompression
In-memory compression or decompression for raw vectors.
memCompress(from, type = c("gzip", "bzip2", "xz", "none"))
memDecompress(from, type = c("unknown", "gzip", "bzip2", "xz", "none"), asChar = FALSE)
- from: A raw vector. For memCompress, a character vector will be converted to a raw vector with character strings separated by "\n".
- type: character string, the type of compression. May be abbreviated to a single letter; defaults to the first of the alternatives.
- asChar: logical: should the result be converted to a character string?
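As a small sketch of the conversion described for from (the sample strings are made up for illustration), a character vector should round-trip to the same bytes as its elements joined by "\n":

```r
## Illustrative input, not taken from the help page
x <- c("first line", "second line")

z <- memCompress(x, "gzip")        # character input is converted to raw first
back <- memDecompress(z, "gzip")   # raw vector by default

## The decompressed bytes are the strings joined by newlines
joined <- charToRaw(paste(x, collapse = "\n"))
stopifnot(is.raw(z), identical(back, joined))

## 'type' may be abbreviated to a single letter
stopifnot(identical(memCompress(x, "g"), z))
```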
type = "none" passes the input through unchanged, but may be useful if type is a variable.
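One way type = "none" can be handy when type is a variable is sketched below. The helpers pack and unpack are hypothetical names introduced for this illustration and are not part of base R:

```r
## Hypothetical helpers: compression is switched on or off by a flag,
## so the same call works in both cases.
pack <- function(x, compress = TRUE) {
    type <- if (compress) "gzip" else "none"
    memCompress(x, type)
}
unpack <- function(z, compress = TRUE) {
    type <- if (compress) "gzip" else "none"
    memDecompress(z, type)
}

x <- charToRaw("some bytes to store")
stopifnot(identical(pack(x, FALSE), x))               # passed through unchanged
stopifnot(identical(unpack(pack(x, TRUE), TRUE), x))  # gzip round trip
```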
type = "unknown" attempts to detect the type of compression applied (if any): this will always succeed for bzip2 compression, and will succeed for other forms if there is a suitable header. It will auto-detect the magic header ("\x1f\x8b") added to files by the gzip program (and to files written by gzfile), but memCompress does not add such a header.
bzip2 compression always adds a header ("BZh").
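The header behaviour can be checked by looking at the leading bytes of the compressed output. This is a rough sketch on made-up input; it assumes only that a bzip2 stream begins with the bytes "BZh" and that the gzip program's magic header is 1f 8b:

```r
## Illustrative, highly repetitive input
x <- charToRaw(strrep("abcdefghij", 100))

bz <- memCompress(x, "bzip2")
gz <- memCompress(x, "gzip")

## bzip2 output starts with "BZh", which is why detection always succeeds
stopifnot(identical(rawToChar(bz[1:3]), "BZh"))

## memCompress(type = "gzip") does not add the gzip program's magic header
stopifnot(!identical(gz[1:2], as.raw(c(0x1f, 0x8b))))

## So bzip2 can be auto-detected, while gzip is best given explicitly
stopifnot(identical(memDecompress(bz), x))
stopifnot(identical(memDecompress(gz, "gzip"), x))
```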
Compressing with type = "xz" is equivalent to compressing a file with xz -9e (including adding the magic header): decompression should cope with the contents of any file compressed with xz version 4.999 and some versions of lzma. There are other versions, in particular raw streams, that are not currently handled.
All the types of compression can expand the input: for "gzip" and "bzip2" the maximum expansion is known and so memCompress can always allocate sufficient space. For "xz" it is possible (but extremely unlikely) that compression will fail if the output would have been too large.
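Expansion is easy to observe on very small inputs, where the fixed overhead of each format outweighs any saving. A minimal sketch, using an arbitrary 10-byte vector chosen for illustration:

```r
## Ten distinct bytes: too short for any scheme to gain anything
tiny <- as.raw(1:10)

## Compressed length for each type (names are taken from the type vector)
sizes <- sapply(c("gzip", "bzip2", "xz", "none"),
                function(type) length(memCompress(tiny, type)))
sizes

## Every real compressor makes this input longer; "none" leaves it as is
stopifnot(all(sizes[c("gzip", "bzip2", "xz")] > length(tiny)),
          sizes[["none"]] == length(tiny))
```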
A raw vector or a character string (if asChar = TRUE).
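The two forms of the return value can be compared directly. In this sketch the input strings are illustrative only:

```r
z <- memCompress(c("alpha", "beta"), "gzip")

r <- memDecompress(z, "gzip")                 # default: raw vector
s <- memDecompress(z, "gzip", asChar = TRUE)  # a single character string

stopifnot(is.raw(r), is.character(s), length(s) == 1L)
stopifnot(identical(s, rawToChar(r)))

## Split on "\n" to recover the original elements
strsplit(s, "\n")[[1]]
```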
https://en.wikipedia.org/wiki/Data_compression for background on data compression, http://zlib.net/, https://en.wikipedia.org/wiki/Gzip, http://www.bzip.org/, https://en.wikipedia.org/wiki/Bzip2, http://tukaani.org/xz/ and https://en.wikipedia.org/wiki/Xz for references about the particular schemes used.
txt <- readLines(file.path(R.home("doc"), "COPYING"))
sum(nchar(txt))
txt.gz <- memCompress(txt, "g")
length(txt.gz)
## memDecompress returns one string: split on "\n" and take the first element
txt2 <- strsplit(memDecompress(txt.gz, "g", asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt2))
txt.bz2 <- memCompress(txt, "b")
length(txt.bz2)
## can auto-detect bzip2:
txt3 <- strsplit(memDecompress(txt.bz2, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt3))
## xz compression is only worthwhile for large objects
txt.xz <- memCompress(txt, "x")
length(txt.xz)
txt3 <- strsplit(memDecompress(txt.xz, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt3))