Apply a function to each cell of a ragged array, that is to each (non-empty) group of values given by a unique combination of the levels of certain factors.

`tapply(X, INDEX, FUN = NULL, ..., default = NA, simplify = TRUE)`

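INDEX

a list of one or more factors, each of the same length as `X`. The elements are coerced to factors by `as.factor`.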

FUN

a function (or name of a function) to be applied, or `NULL`. In the case of functions like `+`, `%*%`, etc., the function name must be backquoted or quoted. If `FUN` is `NULL`, `tapply` returns a vector which can be used to subscript the multi-way array `tapply` normally produces.
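As a minimal sketch of the `FUN = NULL` case (the vectors `x` and `g` below are made-up illustration data, not part of the original page), the returned integer vector can be used to map each group's result back onto the elements of `X`:

```r
# Hypothetical illustration data: five values in three groups.
x <- c(1, 2, 3, 4, 5)
g <- c("a", "b", "a", "b", "c")

# With FUN = NULL, tapply() returns the cell index of each element of x.
idx <- tapply(x, g)

# Group means as a 1-d array, one entry per level of g.
means <- tapply(x, g, mean)

# Subscripting the array with idx gives each element its group's mean.
per_element <- as.vector(means[idx])
print(idx)
print(per_element)
```

This is similar in spirit to what `ave` provides, and is shown here only to illustrate the subscripting.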

...

optional arguments to `FUN`; see the Note section.

default

(only in the case of simplification to an array) the value with which the array is initialized as `array(default, dim = ..)`. Before R 3.4.0, this was hard coded to `array()`'s default `NA`. If it is `NA` (the default), the missing value of the answer type, e.g. `NA_real_`, is chosen (`as.raw(0)` for `"raw"`). In a numerical case, it may be set to `FUN(integer(0))`, e.g., to `0` or `0L` in the case of `FUN = sum`.
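A small sketch of the effect of `default`, using a made-up factor `fac` with one unused level (`"c"`); the data are assumptions for illustration only:

```r
# Hypothetical data: level "c" has no observations.
fac <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))
x <- c(1L, 2L, 3L)

# Empty cell is filled with the missing value of the answer type.
with_na <- tapply(x, fac, sum)

# Empty cell is filled with sum(integer(0)), i.e. 0L.
with_zero <- tapply(x, fac, sum, default = 0L)

print(with_na)
print(with_zero)
```

Whether `NA` or `0` is the better filler depends on whether an empty group should read as "no data" or as "a sum of nothing".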

When `FUN` is present, `tapply` calls `FUN` for each cell that has any data in it. If `FUN` returns a single atomic value for each such cell (e.g., functions `mean` or `var`) and when `simplify` is `TRUE`, `tapply` returns a multi-way array containing the values, and `NA` for the empty cells. The array has the same number of dimensions as `INDEX` has components; the number of levels in a dimension is the number of levels (`nlevels()`) in the corresponding component of `INDEX`. Note that if the return value has a class (e.g., an object of class `"Date"`) the class is discarded.
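The shape of the simplified result can be checked with a two-factor sketch. The names `f1` and `f2` and the values are hypothetical; `f2` is given an unused level `"w"` so that an entirely empty column appears:

```r
# Hypothetical data: two grouping factors of the same length as x.
x  <- c(1, 2, 3, 4)
f1 <- factor(c("a", "a", "b", "b"))
f2 <- factor(c("u", "v", "u", "u"), levels = c("u", "v", "w"))

res <- tapply(x, list(f1, f2), mean)

# One dimension per component of INDEX, sized by nlevels().
print(dim(res))
print(c(nlevels(f1), nlevels(f2)))

# Cells without data are NA; cells with data hold FUN's value.
print(res)
```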

`simplify = TRUE` always returns an array, possibly 1-dimensional.

If `FUN` does not return a single atomic value, `tapply` returns an array of mode `list` whose components are the values of the individual calls to `FUN`, i.e., the result is a list with a `dim` attribute.

When there is an array answer, its `dimnames` are named by the names of `INDEX` and are based on the levels of the grouping factors (possibly after coercion).
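To illustrate how the `dimnames` are derived, here is a sketch with a named `INDEX` list. The component names `grp` and `yr` are invented for the example, and `yr` is deliberately numeric to show the coercion to factor levels:

```r
# Hypothetical data with a named INDEX list; yr is numeric, not a factor.
x <- c(1, 2, 3, 4)
res <- tapply(x,
              list(grp = c("a", "a", "b", "b"),
                   yr  = c(2001, 2002, 2001, 2001)),
              sum)

# The dimnames carry the INDEX names and the (coerced) factor levels.
print(dimnames(res))
print(res)
```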

For a list result, the elements corresponding to empty cells are `NULL`.
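A short sketch of a list result, assuming a made-up factor `fac` with an unused level `"c"`; `range` returns two values per cell, so no simplification to an atomic array takes place:

```r
# Hypothetical data: level "c" has no observations.
fac <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))
res <- tapply(c(5, 1, 9), fac, range)

# The result is a list that also has a dim attribute.
print(is.list(res))
print(dim(res))

# Non-empty cells hold FUN's value; the empty cell is NULL.
print(res[["a"]])
print(is.null(res[["c"]]))
```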

If `FUN` is not `NULL`, it is passed to `match.fun`, and hence it can be a function or a symbol or character string naming a function.
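Because of the `match.fun` lookup, passing the function itself or its name as a string should give the same result. A minimal check, with made-up data `x` and `g`:

```r
# Hypothetical data: four values in two groups.
x <- c(4, 8, 6, 2)
g <- c("a", "a", "b", "b")

by_fun  <- tapply(x, g, mean)    # function object
by_name <- tapply(x, g, "mean")  # character string naming the function

print(identical(by_fun, by_name))
```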

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) *The New S Language*. Wadsworth & Brooks/Cole.

the convenience functions `by` and `aggregate` (using `tapply`); `apply`, `lapply` with its versions `sapply` and `mapply`.

    require(stats)
    groups <- as.factor(rbinom(32, n = 5, prob = 0.4))
    tapply(groups, groups, length) #- is almost the same as
    table(groups)

    ## contingency table from data.frame : array with named dimnames
    tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
    tapply(warpbreaks$breaks, warpbreaks[, 3, drop = FALSE], sum)

    n <- 17; fac <- factor(rep_len(1:3, n), levels = 1:5)
    table(fac)
    tapply(1:n, fac, sum)
    tapply(1:n, fac, sum, default = 0) # maybe more desirable
    tapply(1:n, fac, sum, simplify = FALSE)
    tapply(1:n, fac, range)
    tapply(1:n, fac, quantile)
    tapply(1:n, fac, length) ## NA's
    tapply(1:n, fac, length, default = 0) # == table(fac)

    ## example of ... argument: find quarterly means
    tapply(presidents, cycle(presidents), mean, na.rm = TRUE)

    ind <- list(c(1, 2, 2), c("A", "A", "B"))
    table(ind)
    tapply(1:3, ind) #-> the split vector
    tapply(1:3, ind, sum)

    ## Some assertions (not held by all patch proposals):
    nq <- names(quantile(1:5))
    stopifnot(
      identical(tapply(1:3, ind), c(1L, 2L, 4L)),
      identical(tapply(1:3, ind, sum),
                matrix(c(1L, 2L, NA, 3L), 2,
                       dimnames = list(c("1", "2"), c("A", "B")))),
      identical(tapply(1:n, fac, quantile)[-1],
                array(list(`2` = structure(c(2, 5.75, 9.5, 13.25, 17), .Names = nq),
                           `3` = structure(c(3, 6, 9, 12, 15), .Names = nq),
                           `4` = NULL, `5` = NULL),
                      dim = 4, dimnames = list(as.character(2:5)))))