datadr (version 0.8.4)

drFilter: Filter a 'ddo' or 'ddf' Object

Description

Filter a 'ddo' or 'ddf' object, keeping only the key-value pairs that satisfy a logical condition.

Usage

drFilter(x, filterFn, output = NULL, overwrite = FALSE, params = NULL,
  packages = NULL, control = NULL)

Arguments

x
an object of class 'ddo' or 'ddf'
filterFn
function that takes either a key-value pair (as two arguments) or just a value (as a single argument) and returns either TRUE or FALSE - if TRUE, that key-value pair will be present in the result. See examples for de
output
a "kvConnection" object indicating where the output data should reside (see localDiskConn, hdfsConn). If NULL (default), output will be
overwrite
logical; should an existing output location be overwritten? (You can also specify overwrite = "backup" to move the existing output to _bak.)
params
a named list of objects external to the input data that are needed in the distributed computation (most are picked up automatically, so this rarely needs to be specified)
packages
a vector of R package names that contain functions used in filterFn (most are picked up automatically, so this rarely needs to be specified)
control
parameters specifying how the backend should handle things (most-likely parameters to rhwatch in RHIPE) - see rhipeControl and localDiskContr

Value

  • a 'ddo' or 'ddf' object

See Also

drJoin, drLapply

Examples

library(datadr)

# Create a ddf using the iris data
bySpecies <- divide(iris, by = "Species")

# Filter using only the 'value' of the key/value pair
drFilter(bySpecies, function(v) mean(v$Sepal.Width) < 3)

# Filter using both the key and value
drFilter(bySpecies, function(k, v) k != "Species=virginica" && mean(v$Sepal.Width) < 3)
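
To make the filtering semantics concrete, here is a minimal base-R sketch that mimics what the two calls above select. It is not datadr's implementation: `filter_kv` and `keys_of` are hypothetical helpers introduced only for illustration, the key-value pairs are plain lists built with `split()`, and the one-versus-two-argument dispatch via `formals()` is an assumption about how such a filter could be written. The key strings follow the "Species=..." form used in the example above.

```r
# Base-R emulation of key-value filtering (illustrative only, not datadr code).

# Build a list of key-value pairs, one per species.
groups <- split(iris, iris$Species)
kv <- lapply(names(groups), function(nm) {
  list(key = paste0("Species=", nm), value = groups[[nm]])
})

# Hypothetical helper: keep the pairs for which filterFn returns TRUE.
# A one-argument filterFn receives the value only; a two-argument
# filterFn receives the key and the value.
filter_kv <- function(pairs, filterFn) {
  n_args <- length(formals(filterFn))
  keep <- vapply(pairs, function(p) {
    res <- if (n_args >= 2) filterFn(p$key, p$value) else filterFn(p$value)
    isTRUE(res)
  }, logical(1))
  pairs[keep]
}

# Hypothetical helper: extract the keys of the surviving pairs.
keys_of <- function(pairs) vapply(pairs, function(p) p$key, character(1))

# Value-only filter: subsets whose mean sepal width is below 3.
value_only <- filter_kv(kv, function(v) mean(v$Sepal.Width) < 3)

# Key-and-value filter: same condition, additionally excluding virginica.
key_value <- filter_kv(kv, function(k, v) {
  k != "Species=virginica" && mean(v$Sepal.Width) < 3
})

print(keys_of(value_only))
print(keys_of(key_value))
```

In this sketch the value-only filter keeps the versicolor and virginica subsets (setosa has a mean sepal width above 3), and the key-and-value filter keeps only versicolor.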
