tapply
Apply a Function Over a Ragged Array
Apply a function to each cell of a ragged array, that is to each (nonempty) group of values given by a unique combination of the levels of certain factors.
Usage
tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)
Arguments
 X
 an atomic object, typically a vector.
 INDEX
 list of one or more factors, each of same length as X. The elements are coerced to factors by as.factor.
 FUN
 the function to be applied, or NULL. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted. If FUN is NULL, tapply returns a vector which can be used to subscript the multi-way array tapply normally produces.
 ...
 optional arguments to FUN: see the Note section.
 simplify
 If FALSE, tapply always returns an array of mode "list". If TRUE (the default), then if FUN always returns a scalar, tapply returns an array with the mode of the scalar.
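
The FUN = NULL case can be easier to follow with a small illustration. The sketch below uses made-up data (the names x, g, idx and sums are hypothetical, not part of the original page) and shows how the returned vector can index the array that tapply produces when a function is supplied.

```r
# Hypothetical data: four values in two groups.
x <- c(5, 7, 9, 11)
g <- c("a", "b", "a", "b")

# With FUN = NULL, tapply returns the cell index of each observation.
idx <- tapply(x, g)

# With a function, tapply returns one value per cell.
sums <- tapply(x, g, sum)

# Subscripting the cell array by idx gives each observation its group total.
per_obs_total <- sums[idx]
print(idx)
print(per_obs_total)
```

This pattern is one way to broadcast a per-group summary back to the length of X without a separate merge step.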
Value

If FUN is not NULL, it is passed to match.fun, and hence it can be a function or a symbol or character string naming a function.

When FUN is present, tapply calls FUN for each cell that has any data in it. If FUN returns a single atomic value for each such cell (e.g., functions mean or var) and when simplify is TRUE, tapply returns a multi-way array containing the values, and NA for the empty cells. The array has the same number of dimensions as INDEX has components; the number of levels in a dimension is the number of levels (nlevels()) in the corresponding component of INDEX. Note that if the return value has a class (e.g., an object of class "Date") the class is discarded.

Note that contrary to S, simplify = TRUE always returns an array, possibly 1-dimensional.

If FUN does not return a single atomic value, tapply returns an array of mode list whose components are the values of the individual calls to FUN, i.e., the result is a list with a dim attribute.

When there is an array answer, its dimnames are named by the names of INDEX and are based on the levels of the grouping factors (possibly after coercion).

For a list result, the elements corresponding to empty cells are NULL.
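
As a rough illustration of the shape of the result, the following sketch uses two made-up factors (x, f1, f2, totals and as_list are hypothetical names). It assumes one factor has an unused level, so that some cells are empty, and compares the simplified array with the list-mode array.

```r
# Hypothetical data: f1 has an unused level "c", so some cells are empty.
x  <- c(10, 20, 30, 40)
f1 <- factor(c("a", "a", "b", "b"), levels = c("a", "b", "c"))
f2 <- factor(c("u", "v", "u", "u"))

# Simplified result: a 3 x 2 numeric array, NA where a cell has no data.
totals <- tapply(x, list(row = f1, col = f2), sum)
print(dim(totals))
print(names(dimnames(totals)))
print(totals)

# List-mode result: same dimensions, NULL where a cell has no data.
as_list <- tapply(x, list(row = f1, col = f2), sum, simplify = FALSE)
print(mode(as_list))
print(is.null(as_list[["c", "u"]]))
```

The dimnames take their names from the names given in the INDEX list, which is why naming the list components can make the output easier to read.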
Note
Optional arguments to FUN supplied by the ... argument are not divided into cells. It is therefore inappropriate for FUN to expect additional arguments with the same length as X.
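
A minimal sketch of this pitfall, using made-up data (x, w, g, naive and by_index are hypothetical names): a weights vector passed through ... reaches every cell at full length, so a per-group weighted mean fails. One possible workaround, shown here as an assumption rather than a recommended idiom, is to apply the function over row indices and subset both vectors inside it.

```r
# Hypothetical data: values, per-observation weights, and a grouping factor.
x <- c(1, 2, 3, 4)
w <- c(1, 1, 1, 3)
g <- factor(c("a", "a", "b", "b"))

# Passing w through ... hands all four weights to every cell, so
# weighted.mean() sees two values but four weights and signals an error.
naive <- tryCatch(
  tapply(x, g, weighted.mean, w = w),
  error = function(e) conditionMessage(e)
)
print(naive)

# Workaround sketch: split the indices, then subset x and w together.
by_index <- tapply(seq_along(x), g, function(i) weighted.mean(x[i], w[i]))
print(by_index)
```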
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
See Also
the convenience functions by and aggregate (using tapply); apply, lapply with its versions sapply and mapply.
Examples
library(base)
require(stats)
groups <- as.factor(rbinom(32, n = 5, prob = 0.4))
tapply(groups, groups, length) # is almost the same as
table(groups)
## contingency table from data.frame : array with named dimnames
tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
tapply(warpbreaks$breaks, warpbreaks[, 3, drop = FALSE], sum)
n <- 17; fac <- factor(rep(1:3, length = n), levels = 1:5)
table(fac)
tapply(1:n, fac, sum)
tapply(1:n, fac, sum, simplify = FALSE)
tapply(1:n, fac, range)
tapply(1:n, fac, quantile)
## example of ... argument: find quarterly means
tapply(presidents, cycle(presidents), mean, na.rm = TRUE)
ind <- list(c(1, 2, 2), c("A", "A", "B"))
table(ind)
tapply(1:3, ind) #-> the split vector
tapply(1:3, ind, sum)
## Some assertions (not held by all patch proposals):
nq <- names(quantile(1:5))
stopifnot(
  identical(tapply(1:3, ind), c(1L, 2L, 4L)),
  identical(tapply(1:3, ind, sum),
            matrix(c(1L, 2L, NA, 3L), 2, dimnames = list(c("1", "2"), c("A", "B")))),
  identical(tapply(1:n, fac, quantile)[-1],
            array(list(`2` = structure(c(2, 5.75, 9.5, 13.25, 17), .Names = nq),
                       `3` = structure(c(3, 6, 9, 12, 15), .Names = nq),
                       `4` = NULL, `5` = NULL), dim=4, dimnames=list(as.character(2:5)))))
Community examples
This example is originally given in [An Introduction to R](https://cran.r-project.org/doc/manuals/r-release/R-intro.html).

```{r}
statef <- c("tas", "sa", "qld", "nsw", "nsw", "nt", "wa", "wa",
            "qld", "vic", "nsw", "vic", "qld", "qld", "sa", "tas",
            "sa", "nt", "wa", "vic", "qld", "nsw", "nsw", "wa",
            "sa", "act", "nsw", "vic", "vic", "act")
incomes <- c(60, 49, 40, 61, 64, 60, 59, 54, 62, 69, 70, 42, 56,
             61, 61, 61, 58, 51, 48, 65, 49, 49, 41, 48, 52, 46,
             59, 46, 58, 43)
(incmeans <- tapply(incomes, statef, mean))
```