bitsqueezr (version 0.1.1)

squeeze_bits: Change insignificant bits of numeric values for improved compressibility

Description

Change insignificant bits of numeric values to zero or one, increasing the compressibility of files containing the values. Insignificant bits can be "trimmed" (set to zero), "padded" (set to one), or "groomed" (element-wise alternation between trimming and padding). These schemes are discussed in Zender, Charles (2016), "Statistically-accurate precision-preserving quantization with compression, evaluated in the netCDF operators", Geoscientific Model Development 9(9). The file size reduction depends on the level of quantization and on the compression algorithm used.
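To make the three schemes concrete, here is a minimal base-R sketch of what overwriting the low mantissa bits of a double looks like. It is an illustration only, not the package's implementation: the helper `set_low_bits()` and its `nbits` argument are hypothetical, and the sketch takes the number of bits directly rather than deriving it from a digit count.

```r
# Hypothetical helper (not part of bitsqueezr): overwrite the lowest `nbits`
# mantissa bits of a single double with zeros ("trim") or ones ("pad").
set_low_bits <- function(x, nbits, method = c("trim", "pad")) {
  method <- match.arg(method)
  stopifnot(length(x) == 1, nbits >= 0, nbits <= 52)
  # Big-endian byte order: bytes[8] holds the least significant mantissa bits.
  bytes <- as.integer(writeBin(as.double(x), raw(), size = 8, endian = "big"))
  remaining <- nbits
  i <- 8
  while (remaining > 0) {
    k <- min(remaining, 8)          # number of bits to overwrite in this byte
    mask <- as.integer(2^k - 1)     # the k lowest bits set
    bytes[i] <- if (method == "trim") {
      bitwAnd(bytes[i], bitwXor(mask, 255L))  # clear the masked bits
    } else {
      bitwOr(bytes[i], mask)                  # set the masked bits
    }
    remaining <- remaining - k
    i <- i - 1
  }
  readBin(as.raw(bytes), "double", size = 8, endian = "big")
}

trimmed <- set_low_bits(pi, 30, "trim")  # slightly below pi
padded  <- set_low_bits(pi, 30, "pad")   # slightly above pi

# "Grooming" alternates trim and pad from one element to the next.
x <- c(pi, exp(1), sqrt(2), 1/3)
groomed <- mapply(set_low_bits, x, 30, c("trim", "pad"))
```

For positive values, trimming can only lower a value and padding can only raise it, so alternating the two across a vector lets the errors partly offset each other.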

Usage

squeeze_bits(x, digits, method = 'trim', decimal = FALSE)

Arguments

x

a numeric vector

digits

number of digits to preserve

method

'trim' sets insignificant bits to zero, 'pad' sets insignificant bits to one, and 'groom' alternates between 'trim' and 'pad'

decimal

if TRUE, digits will be interpreted to refer to decimal digits rather than significant digits.
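The difference between significant digits and decimal digits can be seen with base R's `signif()` and `round()`. This is only an analogy for how the digit count is interpreted: these functions round in decimal, whereas squeeze_bits changes binary bits, so the values they return are not what squeeze_bits would return.

```r
# Significant digits count from the first non-zero digit;
# decimal digits count from the decimal point.
big   <- 123.456
small <- 0.00123456

signif(big, 2)    # 120     (2 significant digits)
round(big, 2)     # 123.46  (2 decimal digits)

signif(small, 2)  # 0.0012  (2 significant digits)
round(small, 2)   # 0       (2 decimal digits)
```

For values of very different magnitudes, a fixed number of significant digits keeps the relative error roughly constant, while a fixed number of decimal digits keeps the absolute error roughly constant.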

Examples

# NOT RUN {
# Check file size reduction when retaining 6 significant digits
x <- runif(100)
raw <- tempfile(fileext='.rds')
quantized <- tempfile(fileext='.rds')

saveRDS(x, raw, compress='xz')
saveRDS(squeeze_bits(x, 6, method='trim'), quantized, compress='xz')

file.size(quantized) / file.size(raw)
# 0.6776316 (the exact ratio varies between runs because x is random)

# Display binary representation of pi with various levels of trimming
for (d in 1:15) {
  cat(bits_as_string(squeeze_bits(pi, d, method='trim')), '\n')
}

# }