reproducible (version 0.1.3)

.robustDigest: Create reproducible digests of objects in R

Description

Not all aspects of R objects are captured by current hashing tools in R (e.g. digest::digest, fastdigest::fastdigest, knitr caching, archivist::cache). This is mostly because many objects have "transient" features (e.g., functions have environments) or "disk-backed" features. This function accommodates these cases and uses fastdigest internally. Since the goal of reproducibility is to have tools that are not session-specific, this function attempts to strip all session-specific information so that the fastdigest works between sessions and operating systems. Although it is tested under many conditions and object types, there are bound to be others that don't work correctly.

Usage

.robustDigest(object, objects, compareRasterFileLength = 1e+06,
  algo = "xxhash64", digestPathContent = FALSE, classOptions = list())

# S4 method for ANY
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

# S4 method for cluster
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

# S4 method for `function`
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

# S4 method for expression
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

# S4 method for character
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

# S4 method for Path
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

# S4 method for environment
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

# S4 method for list
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

# S4 method for Raster
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

# S4 method for Spatial
.robustDigest(object, compareRasterFileLength, algo, digestPathContent, classOptions)

Arguments

object

an object to digest.

objects

Optional character vector indicating which objects are to be considered when computing the digest. This is only relevant if the object being passed is an environment, a list, or a similar container.

compareRasterFileLength

Numeric. Optional. When there are Rasters that have file-backed storage, this is passed to the length argument of digest when determining whether the Raster file is already in the database. Note: digest is used for file-backed Rasters. Default 1e6. Passed to .prepareFileBackedRaster.

algo

The algorithm to be used; currently available choices are md5, sha1, crc32, sha256, sha512, xxhash32, xxhash64 and murmur32. The default, as shown under Usage, is xxhash64.

digestPathContent

Logical. Should arguments that are of class Path (see examples below) have their name digested (FALSE; default), or their file contents (TRUE).

classOptions

Optional list. This is passed into the .robustDigest methods for specific classes. It should contain only options that the relevant .robustDigest method knows how to handle.

Value

A hash (i.e., a digest) of the object passed in.

Classes

Raster* objects have the potential for disk-backed storage. If such an object in the R session is cached using archivist::cache, only the header component will be assessed for caching, so objects like this require more work. Also, because Raster* objects have a built-in representation for data located on disk, this format is maintained if the raster is already file-backed; i.e., to create .tif or .grd backed rasters, use writeRaster first, then Cache. The .tif or .grd file will be copied to the "raster" subdirectory of the cacheRepo. The RAM representation (as an R object) will still be in the usual gallery/ directory. In-memory raster objects will remain as binary .RData files.

Functions (which are contained within environments) are converted to a text representation via a call to format(FUN).
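
As a rough illustration of why a text representation helps, the sketch below uses only base R and is not the package's implementation. It uses deparse() as a stand-in for the text conversion described above, and make_adder is a hypothetical helper introduced only for this example. Two closures with the same code live in different environments, yet their text is the same.

```r
# Hypothetical helper: returns a closure that carries its own environment.
make_adder <- function(n) {
  function(x) x + n
}

f1 <- make_adder(1)
f2 <- make_adder(1)

# The enclosing environments are distinct objects, so anything that
# depends on the environment would see f1 and f2 as different.
same_env <- identical(environment(f1), environment(f2))

# The text of the two functions is the same, so a hash computed from
# the text would not depend on the environment.
same_text <- identical(deparse(f1), deparse(f2))

print(same_env)   # FALSE
print(same_text)  # TRUE
```

Hashing the text rather than the closure itself is what would let two sessions agree on the digest of the same function definition.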

Objects contained within a list or environment are recursively hashed using fastdigest, while removing all references to environments.
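
The idea of dropping the environment reference and keeping only the contents can be sketched in base R as follows. This is an assumption-laden illustration, not the package's code: env_contents is a hypothetical helper, and no hashing is done here; the point is only that the extracted contents compare equal when the environments themselves do not.

```r
# Hypothetical helper: pull the objects out of an environment into a
# named list with a stable (sorted) order, leaving the environment behind.
env_contents <- function(env) {
  nms <- sort(ls(env, all.names = TRUE))
  mget(nms, envir = env)
}

e1 <- new.env()
e2 <- new.env()

# Same objects, assigned in a different order.
assign("a", 1, envir = e1)
assign("b", "x", envir = e1)
assign("b", "x", envir = e2)
assign("a", 1, envir = e2)

# Two environments are only identical if they are the same object.
same_env <- identical(e1, e2)

# Their contents, once extracted and ordered by name, are identical.
same_contents <- identical(env_contents(e1), env_contents(e2))

print(same_env)       # FALSE
print(same_contents)  # TRUE
```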

Character strings are first assessed with dir.exists and file.exists to check for paths. If they are found to be paths, then the path is hashed with only its filename via basename(filename).
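
The difference between comparing a path by name and by content can be sketched with base R alone. This is a minimal illustration under stated assumptions, not the package's implementation: the directory and file names are made up, and tools::md5sum() stands in for whatever hashing is applied to file contents when digestPathContent = TRUE.

```r
# Two files with the same name but different directories and contents.
dir_a <- file.path(tempdir(), "robust_digest_a")
dir_b <- file.path(tempdir(), "robust_digest_b")
dir.create(dir_a, showWarnings = FALSE)
dir.create(dir_b, showWarnings = FALSE)

path_a <- file.path(dir_a, "data.txt")
path_b <- file.path(dir_b, "data.txt")
writeLines("first version", path_a)
writeLines("second version", path_b)

# Both strings refer to existing files, so they would be treated as paths.
both_exist <- all(file.exists(c(path_a, path_b)))

# Name-based comparison: only the file name is considered.
same_name <- identical(basename(path_a), basename(path_b))

# Content-based comparison: tools::md5sum() hashes the file contents.
same_content <- identical(unname(tools::md5sum(path_a)),
                          unname(tools::md5sum(path_b)))

print(both_exist)    # TRUE
print(same_name)     # TRUE
print(same_content)  # FALSE
```

Comparing by name keeps the digest stable when a project directory moves between machines, at the cost of not noticing when the file's contents change.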

See Also

cache.

fastdigest.