storr (version 1.0.1)

storr: Object cache

Description

Create an object cache; a "storr". A storr is a simple key-value store where the actual content is stored in a content-addressable way (so that duplicate objects are only stored once) and with a caching layer so that repeated lookups are fast even if the underlying storage driver is slow.

Usage

storr(driver, default_namespace = "objects")

Arguments

driver
A driver object
default_namespace
Default namespace to store objects in. By default "objects" is used, but changing this can be useful when two different storr objects point at the same underlying storage and should store things in different namespaces.

Methods

destroy
Totally destroys the storr by telling the driver to destroy all the data and then deleting the driver. This will remove all data and cannot be undone. Usage: destroy()
flush_cache
Flush the temporary cache of objects that accumulates as the storr is used. Should not need to be called often. Usage: flush_cache()
set
Set a key to a value. Usage: set(key, value, namespace = self$default_namespace, use_cache = TRUE) Arguments:
  • key: The key name. Can be any string.
  • value: Any R object to store. The object will generally be serialised (this is not actually true for the environment storr), so only objects that would usually be expected to survive a saveRDS/readRDS roundtrip will work. This excludes Rcpp module objects, external pointers, etc., but any "normal" R object will work fine.
  • namespace: An optional namespace. By default the default namespace that the storr was created with will be used (by default that is "objects"). Different namespaces allow different types of objects to be stored without risk of names colliding. Use of namespaces is optional, but if used they must be a string.
  • use_cache: Use the internal cache to avoid reading or writing to the underlying storage if the data has already been seen (i.e., we have seen the hash of the object before).
Value: Invisibly, the hash of the saved object.
set_by_value
Like set but saves the object with a key that is the same as the hash of the object. Equivalent to $set(digest::digest(value), value). Usage: set_by_value(value, namespace = self$default_namespace, use_cache = TRUE) Arguments:
  • value: An R object to save, with the same limitations as set.
  • namespace: Optional namespace to save the key into.
  • use_cache: Use the internal cache to avoid reading or writing to the underlying storage if the data has already been seen (i.e., we have seen the hash of the object before).
get
Retrieve an object from the storr. If the requested value is not found then a KeyError will be raised (an R error, but can be caught with tryCatch; see the "storr" vignette). Usage: get(key, namespace = self$default_namespace, use_cache = TRUE) Arguments:
  • key: The name of the key to get.
  • namespace: Optional namespace to look for the key within.
  • use_cache: Use the internal cache to avoid reading or writing to the underlying storage if the data has already been seen (i.e., we have seen the hash of the object before).
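The KeyError behaviour of get described above can be handled with tryCatch. The following is a minimal sketch, assuming the condition class is named "KeyError" as stated in the description; the helper safe_get and its default argument are hypothetical names introduced here for illustration and are not part of storr.

```r
library(storr)

st <- storr(driver_environment())
st$set("present", 42)

## Hypothetical helper (not part of storr): return `default` instead of
## raising an error when the key does not exist.
safe_get <- function(st, key, default = NULL) {
  tryCatch(st$get(key), KeyError = function(e) default)
}

found <- safe_get(st, "present")
fallback <- safe_get(st, "absent", default = NA)
```

Catching only KeyError, rather than every error, lets unrelated failures (for example, a broken driver) still surface.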
get_hash
Retrieve the hash of an object stored in the storr (rather than the object itself). Usage: get_hash(key, namespace = self$default_namespace) Arguments:
  • key: The name of the key to get.
  • namespace: Optional namespace to look for the key within.
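As a small illustration of how set, get and get_hash relate, the sketch below stores the same object under two keys and compares the hashes. It uses only the methods documented on this page; the variable names are arbitrary.

```r
library(storr)

st <- storr(driver_environment())

## set() invisibly returns the hash of the stored object
h1 <- st$set("a", mtcars)
h2 <- st$set("b", mtcars)

## Both keys point at the same content, so the hashes agree
same_hash <- identical(h1, h2)

## get_hash() looks the hash up without fetching the object itself
hash_a <- st$get_hash("a")

## Two keys, but only one stored object
n_keys <- length(st$list())
n_objects <- length(st$list_hashes())
```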
del
Delete an object from the storr. Usage: del(key, namespace = self$default_namespace) Arguments:
  • key: The name of the key
  • namespace: The namespace of the key.
Value: TRUE if an object was deleted, FALSE otherwise.
clear
Clear a storr. This function might be slow as it will iterate over each key. Future versions of storr might allow drivers to implement a bulk clear method that will allow faster clearing. Usage: clear(namespace = self$default_namespace) Arguments:
  • namespace: A namespace, to clear a single namespace, or NULL to clear all namespaces.
exists
Test if a key exists within a namespace. Usage: exists(key, namespace = self$default_namespace) Arguments:
  • key: The name of the key
  • namespace: The namespace of the key.
exists_object
Test if an object with a given hash exists within the storr. Usage: exists_object(hash) Arguments:
  • hash: Hash to test
gc
Garbage collect the storr. Because keys do not directly map to objects, but instead map to hashes which map to objects, it is possible that hash/object pairs can persist with nothing pointing at them. Running gc will remove these objects from the storr. Usage: gc()
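The orphaned-object situation described for gc can be reproduced in a few lines. This is a sketch using the in-memory driver; it relies on del removing only the key and on exists_object reporting what the driver holds.

```r
library(storr)

st <- storr(driver_environment())

## Store an object, then delete the only key that points at it
h <- st$set("tmp", 1:10)
st$del("tmp")

## The key is gone, but the hash/object pair is still stored
key_exists <- st$exists("tmp")
orphan_before <- st$exists_object(h)

## gc() removes objects that no key refers to
st$gc()
orphan_after <- st$exists_object(h)
```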
get_value
Get the content of an object given its hash. Usage: get_value(hash, use_cache = TRUE) Arguments:
  • hash: The hash of the object to retrieve.
  • use_cache: Use the internal cache to avoid reading or writing to the underlying storage if the data has already been seen (i.e., we have seen the hash of the object before).
Value: The object if it is present; otherwise a HashError is thrown.
set_value
Add an object value, but don't add a key. You will not need to use this very often, but it is used internally. Usage: set_value(value, use_cache = TRUE) Arguments:
  • value: An R object to set.
  • use_cache: Use the internal cache to avoid reading or writing to the underlying storage if the data has already been seen (i.e., we have seen the hash of the object before).
Value: Invisibly, the hash of the object.
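The value-level methods set_value and get_value work with hashes only and never touch keys. A minimal sketch, assuming the error condition class is named "HashError" as stated above; the string "not-a-real-hash" is just a placeholder that does not correspond to any stored object.

```r
library(storr)

st <- storr(driver_environment())

## Store a value without creating a key; the hash is returned invisibly
h <- st$set_value(c(1, 2, 3))

## Retrieve the value by its hash
value <- st$get_value(h)

## No key was created, but the object exists
n_keys <- length(st$list())
has_object <- st$exists_object(h)

## Asking for an unknown hash raises a HashError, which can be caught
res <- tryCatch(st$get_value("not-a-real-hash"),
                HashError = function(e) "no such hash")
```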
list
List all keys stored in a namespace. Usage: list(namespace = self$default_namespace) Arguments:
  • namespace: The namespace to list keys within.
Value: A sorted character vector (possibly zero-length).
list_hashes
List all hashes stored in the storr. Usage: list_hashes() Value: A sorted character vector (possibly zero-length).
list_namespaces
List all namespaces known to the database. Usage: list_namespaces() Value: A sorted character vector (possibly zero-length).
import
Import R objects from a list, environment or another storr. Usage: import(src, list = NULL, namespace = self$default_namespace) Arguments:
  • src: Object to import objects from; can be a list, environment or another storr.
  • list: Names of objects to import (or NULL to import all objects in src). If given, it must be a character vector. If named, the names of the character vector will be the names of the objects as created in the storr.
  • namespace: Namespace to get objects from, and to put objects into.
export
Export objects from the storr into something else. Usage: export(dest, list = NULL, namespace = self$default_namespace) Arguments:
  • dest: A target destination to export objects to; can be a list, environment, or another storr. Use list() to export to a brand new list, or use as.list(object) for a shorthand.
  • list: Names of objects to export, with the same rules as list in $import.
  • namespace: Namespace to get objects from, and to put objects into.
Value: Invisibly, dest, which allows use of e <- st$export(new.env()).
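The import and export methods can be combined to move objects between a storr and ordinary R containers. The sketch below follows the argument descriptions above (a list or environment as destination or source, and a named character vector to rename on import); the key names are arbitrary.

```r
library(storr)

st <- storr(driver_environment())
st$set("a", 1:3)
st$set("b", letters)

## Export everything in the default namespace into a new list
lst <- st$export(list())

## Export only "a" into a new environment
e <- st$export(new.env(), list = "a")

## Import into a second, independent storr
st2 <- storr(driver_environment())
st2$import(lst)

## A named character vector renames on import: "a" is stored as "a_copy"
st2$import(e, list = c(a_copy = "a"))

keys <- st2$list()
```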
archive_export
Export objects from the storr into a special "archive" storr, which is a storr_rds with name mangling turned on (which encodes keys with base64 so that they do not violate filesystem naming conventions). Usage: archive_export(path, names = NULL, namespace = self$default_namespace) Arguments:
  • path: Path to create the storr at; can exist already.
  • names: As for $export
  • namespace: Namespace to get objects from.
archive_import
Inverse of archive_export; import objects from a storr that was created by archive_export. Usage: archive_import(path, names = NULL, namespace = self$default_namespace) Arguments:
  • path: Path of the exported storr.
  • names: As for $import
  • namespace: Namespace to import objects into.

Details

To create a storr you need to provide a "driver" object. There are three in the package: driver_environment for ephemeral in-memory storage, driver_rds for serialised storage to disk and driver_redis_api which stores data in Redis but requires packages that are not on CRAN to function (RedisAPI and one of rrlite or redux). New drivers are relatively easy to add -- see the "drivers" vignette (vignette("drivers", package="storr")).

There are also convenience functions (e.g., storr_environment and storr_rds) that create a storr and its driver in one step and may be simpler to use than this function.
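As a rough sketch of the options above, the following creates storrs with the in-memory and rds drivers, both explicitly and through the convenience functions. The temporary directory is only used for illustration; any writable path should work.

```r
library(storr)

## Ephemeral in-memory storage, built explicitly and via the shortcut
st_mem <- storr(driver_environment())
st_env <- storr_environment()

## Serialised storage on disk, in a temporary directory
path <- tempfile("storr_")
st_disk <- storr(driver_rds(path))
st_disk$set("x", 1:10)

## A second storr opened on the same path sees the stored data
st_disk2 <- storr_rds(path)
from_disk <- st_disk2$get("x")

## Remove the on-disk data again; this cannot be undone
st_disk$destroy()
```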

Once a storr has been made it provides a number of methods. Because storr uses R6 (R6Class) objects, each method is accessed by using $ on a storr object (see the examples). The methods are described in the "Methods" section.

The default_namespace affects all methods of the storr object that refer to namespaces; if a namespace is not given, then the action (get, set, del, list, import, export) will affect the default_namespace. By default this is "objects".
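To make the effect of namespaces concrete, the sketch below stores two different values under the same key in two namespaces, then opens a second storr on the same in-memory storage with a different default_namespace. The namespace name "other" is arbitrary.

```r
library(storr)

st <- storr(driver_environment())

## The same key can hold different values in different namespaces
st$set("k", "default value")
st$set("k", "other value", namespace = "other")

a <- st$get("k")                       # uses default namespace "objects"
b <- st$get("k", namespace = "other")

## A second storr sharing the same storage, defaulting to "other"
st_other <- storr(driver_environment(st$driver$envir),
                  default_namespace = "other")
b2 <- st_other$get("k")
```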

Examples

st <- storr(driver_environment())
## Set "mykey" to hold the mtcars dataset:
st$set("mykey", mtcars)
## and get the object:
st$get("mykey")
## List known keys:
st$list()
## List hashes
st$list_hashes()
## List keys in another namespace:
st$list("namespace2")
## We can store things in other namespaces:
st$set("x", mtcars, "namespace2")
st$set("y", mtcars, "namespace2")
st$list("namespace2")
## Duplicate data do not cause duplicate storage: despite having three
## keys we only have one bit of data:
st$list_hashes()
st$del("mykey")

## Storr objects can be created that have a default namespace that is
## not "objects" by using the default_namespace argument (this
## one also points at the same memory as the first storr).
st2 <- storr(driver_environment(st$driver$envir),
             default_namespace="namespace2")
## All functions now use "namespace2" as the default namespace:
st2$list()
st2$del("x")
st2$del("y")