reproducible options
These provide top-level, powerful settings for a comprehensive reproducible workflow. To see defaults, run reproducibleOptions(). See Details below.
reproducibleOptions()
The following options are likely not needed by a user.
cloudChecksumsFilename | file.path(dirname(.reproducibleTempCacheDir), "checksums.rds") | Used in cloudCache |
length | Inf | Used in Cache, specifically in the internal calls to CacheDigest. This is passed to digest::digest. Mostly this would be changed from the default Inf if the digesting is taking too long. Use this with caution, as some objects will have MANY NA values in their first MANY elements, so a truncated digest may not distinguish them |
useragent | "http://github.com/PredictiveEcology/reproducible" | User agent for downloads using this package. |
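The caution attached to the length option can be illustrated directly with digest::digest. The following is a minimal sketch, assuming the digest package is installed; the vectors x and y are hypothetical, and the call goes to digest::digest itself rather than through Cache or CacheDigest, whose internal handling may differ.

```r
# Two vectors that share a long leading run of NA values and differ
# only in their final element (hypothetical data for illustration).
x <- c(rep(NA_real_, 1e5), 1)
y <- c(rep(NA_real_, 1e5), 2)

if (requireNamespace("digest", quietly = TRUE)) {
  # Default (length = Inf): the whole serialized object is hashed.
  full_equal <- identical(digest::digest(x), digest::digest(y))

  # Truncated: only the first 100 bytes of the serialized object are hashed,
  # so the differing final element is never seen.
  trunc_equal <- identical(
    digest::digest(x, length = 100),
    digest::digest(y, length = 100)
  )

  print(c(full_equal = full_equal, trunc_equal = trunc_equal))
}
```

A finite length trades robustness for speed: objects that differ only beyond the truncation point become indistinguishable to the digest.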
Below are options that can be set with options("reproducible.xxx" = newValue), where xxx is one of the values below and newValue is a new value to give the option. These options can also be placed in the user's .Rprofile file so they persist between sessions.
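As a minimal sketch of this pattern, the snippet below sets two of the options for the current session using only base R. The chosen values are purely illustrative, and the reproducible package does not need to be loaded for options() to store them.

```r
# options() returns the previous values invisibly, so they can be restored later.
old <- options(
  reproducible.useCache = FALSE,  # illustrative value: skip the Cache machinery
  reproducible.verbose = TRUE     # illustrative value: print Cache summaries
)

# Read back a single option.
use_cache_now <- getOption("reproducible.useCache")
print(use_cache_now)

# Restore whatever was set before (options that were unset become unset again).
options(old)
```

To make a setting persist between sessions, the same options(...) call could be placed in the user's .Rprofile file.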
The following options are likely of interest to most users
OPTION | DEFAULT VALUE | DESCRIPTION |
ask | TRUE | Used in clearCache and keepCache |
cachePath | .reproducibleTempCacheDir | Used in Cache and many others. The default path for repositories if not passed as an argument. |
destinationPath | NULL | Used in prepInputs, preProcess. Can be set globally here. |
futurePlan | FALSE | On Linux OSs, Cache and cloudCache have some functionality that uses the future package. The default is not to use it, as it is experimental. It may, however, be very effective in speeding up some things, specifically uploading cached elements via googledrive in cloudCache |
inputPaths | NULL | Used in prepInputs, preProcess. If set to a path, this will cause these functions to save their downloaded and preprocessed file to this location, with a hardlink (via file.link) to the file created in the destinationPath. This can be used so that individual projects that use common data sets can maintain modularity (by placing downloaded objects in their destinationPath), but also minimize re-downloading the same (perhaps large) file over and over for each project. Because the files are hardlinks, there is no extra space taken up by the apparently duplicated files |
inputPathsRecursive | FALSE | Used in prepInputs, preProcess. Should the reproducible.inputPaths be searched recursively for existence of a file? |
overwrite | FALSE | Used in prepInputs, preProcess, downloadFile, and postProcess. |
quick | FALSE | Used in Cache. This will cause Cache to use file.size(file) instead of digest::digest(file). Less robust to changes, but faster. NOTE: this will only affect objects on disk. |
showSimilar | FALSE | Passed to Cache. |
useCache | TRUE | Used in Cache. If FALSE, then the entire Cache machinery is skipped and the functions are run as if there was no Cache occurring. Can also take 2 other values: 'overwrite' and 'devMode'. 'overwrite' will cause no recovery of objects from the cache repository; only new ones will be created. If the hash is identical to a previous one, then this will overwrite the previous one. 'devMode' will function like normal Cache, except that it will use the userTags to determine if a previous function has been run. If the userTags are identical, but the digest value is different, the old value will be deleted from the cache repository and this new value will be added. This addresses a common situation during the development stage: functions are changing frequently, so any entry in the cache repository will be stale following changes to functions, i.e., it will likely never be relevant again. This will therefore keep the cache repository clean of stale objects. If there is ambiguity in the userTags, i.e., they do not uniquely identify a single entry in the cacheRepo, then this option will default back to the non-devMode behaviour to avoid deleting objects. This, therefore, is most useful if the user is using unique values for userTags |
useCloud | FALSE | Passed to Cache. |
useMemoise | TRUE | Used in Cache. If TRUE, recovery of cached elements from the cacheRepo will use memoise::memoise. This means that the 3rd time running a function will be much faster than the 1st (create cache entry) or 2nd (recover from the SQLite database on disk). NOTE: memoised values are removed when the R session is restarted. This option will use more RAM and so may need to be turned off if RAM is limiting. clearCache of any sort will cause all memoising to be 'forgotten' (memoise::forget) |
useNewDigestAlgorithm | TRUE | Using the new algorithm means that previous cache repositories will be defunct. This new algorithm will make Cache less sensitive to minor but irrelevant changes (like changing the order of arguments) and will work successfully across operating systems (especially relevant for the new 'cloudCache' function) |
verbose | FALSE | If set to TRUE, then every Cache call will show a summary of the objects being cached, their object.size, the time it took to digest them, and the time it took to run the call and either save the result to the cache repository or load the cached copy from the repository. This may help in diagnosing problems that occur. |
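The trade-off described for the quick option (file size instead of a content digest) can be seen with base R alone. This is a minimal sketch, not the package's internal code: tools::md5sum is used here as a stand-in for digest::digest(file), and the temporary file and its contents are hypothetical.

```r
# Write a small file and record its size and a content hash.
f <- tempfile(fileext = ".txt")
writeLines("abc", f)
size_before <- file.size(f)
hash_before <- unname(tools::md5sum(f))

# Change the content without changing the number of bytes.
writeLines("abd", f)
size_after <- file.size(f)
hash_after <- unname(tools::md5sum(f))

# A size-based check cannot see this edit; a content hash can.
same_size <- identical(size_before, size_after)
same_hash <- identical(hash_before, hash_after)
print(c(same_size = same_size, same_hash = same_hash))

unlink(f)
```

This is why the size-based check is described as less robust: an edit that preserves the file size goes unnoticed, whereas a content digest detects it at the cost of reading the whole file.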
OPTION | DEFAULT VALUE |