h2o (version 3.2.0.3)

h2o.group_by: Group and Apply by Column

Description

Performs a group-by-and-apply operation on an H2OFrame, similar to ddply from the plyr package.

Usage

h2o.group_by(data, by, ..., order.by = NULL,
  gb.control = list(na.methods = NULL, col.names = NULL))

Arguments

data
an H2OFrame object.
by
a list of column names to group by.
order.by
a vector of column names or indices specifying how to order the group-by result.
gb.control
a list specifying how to handle NA values in the dataset and how to name the output columns. See Details for more help.
...
any supported aggregate function.
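
As a rough illustration of how the arguments above fit together, the following sketch groups R's built-in iris data by Species. It assumes a local H2O cluster can be started, and it assumes that aggregates are passed through ... as calls on quoted column names (nrow("Species"), mean("Sepal.Length")); that calling form is not spelled out on this page, so treat it as an assumption rather than a definitive usage.

```r
library(h2o)

# Assumption: a local H2O cluster can be started on this machine.
h2o.init()

# Upload R's built-in iris data as an H2OFrame (illustrative data only).
iris_hex <- as.h2o(iris)

# One group per Species; count rows and average Sepal.Length per group.
# Assumption: aggregates are written as calls on quoted column names.
grouped <- h2o.group_by(
  data = iris_hex,
  by = "Species",
  nrow("Species"),
  mean("Sepal.Length"),
  gb.control = list(na.methods = "rm")
)

print(grouped)
```

Only one na.methods entry is supplied here for two aggregates; according to Details, the remaining entry is padded automatically.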

Value

  • Returns a new H2OFrame object with columns equivalent to the number of groups created

Details

In the case of na.methods within gb.control, there are three possible settings. "all" will include NAs in the computation of functions. "rm" will completely remove all NA fields. "ignore" will remove NAs from the numerator but keep the rows for computational purposes. If a list smaller than the number of column groups is supplied, the list will be padded with "ignore".
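
The difference between the three settings is easiest to see on a mean. The sketch below uses plain R, not h2o, and the helper group_mean is hypothetical; it reflects one reading of the description above, in which "ignore" drops NAs from the sum but still counts those rows in the denominator.

```r
# Hypothetical helper, not part of h2o: mimics the three na.methods
# settings for the mean of one group's values.
group_mean <- function(x, na.method = c("all", "rm", "ignore")) {
  na.method <- match.arg(na.method)
  switch(na.method,
    # NAs take part in the computation, so the result is NA
    all = sum(x) / length(x),
    # NA entries are dropped from both the sum and the count
    rm = sum(x, na.rm = TRUE) / sum(!is.na(x)),
    # NAs are skipped in the sum, but their rows are still counted
    ignore = sum(x, na.rm = TRUE) / length(x)
  )
}

x <- c(2, 4, NA, 6)
group_mean(x, "all")     # NA
group_mean(x, "rm")      # 12 / 3 = 4
group_mean(x, "ignore")  # 12 / 4 = 3
```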

Similar to na.methods, col.names will be padded with the default column names if its length is less than the number of column groups supplied.
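
The padding rule for both settings can be sketched in plain R. The helper pad_control and the default names agg1, agg2, agg3 are hypothetical placeholders; the actual default column names that h2o generates are not given on this page.

```r
# Hypothetical helper, not part of h2o: fills a short control vector
# up to n entries using the matching positions of a defaults vector.
pad_control <- function(values, n, defaults) {
  if (length(values) >= n) {
    return(values)
  }
  c(values, defaults[(length(values) + 1):n])
}

# One na.methods entry supplied for three aggregates: the rest become "ignore".
na_methods <- pad_control(c("rm"), 3, rep("ignore", 3))

# One col.names entry supplied: the rest fall back to placeholder defaults.
col_names <- pad_control(c("n_rows"), 3, c("agg1", "agg2", "agg3"))

print(na_methods)
print(col_names)
```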