
plyr (version 1.6)

daply: Split data frame, apply function, and return results in an array.

Description

For each subset of a data frame, apply a function, then combine the results into an array. daply with a function that operates column-wise is similar to aggregate.
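As a rough illustration of the comparison with aggregate, the sketch below computes the same per-group column means both ways. It assumes the built-in mtcars data set and uses colMeans as the column-wise function; neither choice comes from this page.

```r
library(plyr)

# Column-wise summary with daply: one row per value of cyl,
# one column per summarised variable.
by_cyl <- daply(mtcars, .(cyl), function(df) colMeans(df[, c("mpg", "hp")]))

# The same summary with aggregate, which returns a data frame instead.
agg <- aggregate(cbind(mpg, hp) ~ cyl, data = mtcars, FUN = mean)

by_cyl
agg
```

The numbers should agree; the main difference is the container: daply labels the groups through dimnames, while aggregate keeps the grouping variable as a column.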

Usage

daply(.data, .variables, .fun = NULL, ..., .progress = "none",
  .drop_i = TRUE, .drop_o = TRUE, .parallel = FALSE)

Arguments

.data
data frame to be processed
.variables
variables to split the data frame by, as quoted variables, a formula or a character vector
.fun
function to apply to each piece
...
other arguments passed on to .fun
.progress
name of the progress bar to use; see create_progress_bar
.drop_i
should combinations of variables that do not appear in the input data be preserved (FALSE) or dropped (TRUE, default)?
.drop_o
should extra dimensions of length 1 in the output be dropped, simplifying the output? Defaults to TRUE
.parallel
if TRUE, apply the function in parallel, using the parallel backend provided by foreach
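The two drop arguments are easy to confuse, so here is a minimal sketch of how they might be exercised. It assumes the built-in mtcars data set, which has no cars with 8 cylinders and 4 gears; mpg_range is a hypothetical helper written only for this illustration.

```r
library(plyr)

# .drop_i = TRUE (default): the function is not called for the missing
# cyl = 8, gear = 4 combination, so that cell of the output holds NA.
dropped <- daply(mtcars, .(cyl, gear), nrow)

# .drop_i = FALSE: the function is also called on the empty piece,
# so nrow reports 0 rows for that combination.
kept <- daply(mtcars, .(cyl, gear), nrow, .drop_i = FALSE)

# .drop_o controls whether output dimensions of length 1 are dropped.
# mpg_range (hypothetical helper) returns a 1 x 2 matrix for each piece.
mpg_range <- function(df) matrix(range(df$mpg), nrow = 1)
simplified <- daply(mtcars, .(cyl), mpg_range)
full <- daply(mtcars, .(cyl), mpg_range, .drop_o = FALSE)

dim(simplified)
dim(full)
```

Comparing dim(simplified) with dim(full) shows whether the length-1 dimension contributed by each piece was kept.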

Value

  • If the results are atomic with the same type and dimensionality, a vector, matrix or array; otherwise, a list-array (a list with dimensions).
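To make the atomic case concrete, the following sketch shows how the shape of the result might vary with what the function returns and how many variables are used for splitting. It assumes the built-in mtcars data set; mpg_span is a hypothetical helper, not part of plyr.

```r
library(plyr)

# One number per piece: a named vector with one element per group.
n_cars <- daply(mtcars, .(cyl), nrow)

# A length-2 vector per piece: a matrix (groups x results).
# mpg_span is a hypothetical helper for this illustration.
mpg_span <- function(df) c(min = min(df$mpg), max = max(df$mpg))
by_cyl <- daply(mtcars, .(cyl), mpg_span)

# Two splitting variables and a length-2 vector per piece: a 3-d array.
by_cyl_am <- daply(mtcars, .(cyl, am), mpg_span)

str(n_cars)
dim(by_cyl)
dim(by_cyl_am)
```

Each splitting variable contributes one dimension to the output, followed by the dimensions of the value returned for each piece.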

Input

This function splits data frames by variables.

Output

If there are no results, then this function will return a vector of length 0 (vector()).

References

Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. http://www.jstatsoft.org/v40/i01/.

See Also

Other array output: aaply, laply

Other data frame input: ddply, dlply

Examples

library(plyr)

# Number of records for each year
daply(baseball, .(year), nrow)

# Several different ways of summarising by variables that should not be
# included in the summary

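# mean() no longer works on a whole data frame in current versions of R,
# so apply it column by column with colwise()
daply(baseball[, c(2, 6:9)], .(year), colwise(mean))
daply(baseball[, 6:9], .(baseball$year), colwise(mean))
daply(baseball, .(year), function(df) colwise(mean)(df[, 6:9]))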
