h2o (version 3.2.0.3)

h2o.ddply: Split H2O Dataset, Apply Function, and Return Results

Description

For each subset of an H2O data set, apply a user-specified function, then combine the results. This is an experimental feature.
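The split/apply/combine pattern described above can be sketched in base R on a local data frame. This is only an illustration of the idea and an assumption about how to read the description: it uses the built-in iris data frame and base functions (split, vapply), not an H2OFrame or any H2O call, and the names groups, means, and res_local are hypothetical.

```r
# Base-R sketch of split/apply/combine on the built-in iris data frame.
# Illustration only: this runs locally on a data.frame, not on an H2OFrame.

# Split: one data frame per level of the grouping column
groups <- split(iris, iris$Species)

# Apply: mean of the first column (Sepal.Length) within each subset
means <- vapply(groups,
                function(df) sum(df[, 1], na.rm = TRUE) / nrow(df),
                numeric(1))

# Combine: one row per group, with the per-group result
res_local <- data.frame(Species = names(means),
                        mean_sepal_len = unname(means))
print(res_local)
```

h2o.ddply follows the same shape, with the grouping column passed as .variables and the per-subset function passed as .fun.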

Usage

h2o.ddply(.data, .variables, .fun = NULL, ..., .progress = "none")

Arguments

.data
An H2OFrame object to be processed.
.variables
Variables to split .data by, either the indices or names of a set of columns.
.fun
Function to apply to each subset grouping.
.progress
Name of the progress bar to use. Currently unimplemented.
...
Additional arguments passed on to .fun. Currently unimplemented.

Value

  • Returns an H2OFrame object containing the results from the split/apply operation, arranged

See Also

ddply for the plyr library implementation.

Examples

library(h2o)
localH2O <- h2o.init()

# Import iris dataset to H2O
irisPath <- system.file("extdata", "iris_wheader.csv", package = "h2o")
iris.hex <- h2o.uploadFile(localH2O, path = irisPath, destination_frame = "iris.hex")
# Define a function that returns the mean of the first column (sepal_len)
fun <- function(df) { sum(df[, 1], na.rm = TRUE) / nrow(df) }
# Apply the function to each group, grouping by flower class.
# h2o.ddply is used because iris.hex is an H2OFrame object.
res <- h2o.ddply(iris.hex, "class", fun)
head(res)
