Cross Tabulation and Table Creation
table uses the cross-classifying factors to build a contingency
table of the counts at each combination of factor levels.
table(...,
      exclude = if (useNA == "no") c(NA, NaN),
      useNA = c("no", "ifany", "always"),
      dnn = list.names(...), deparse.level = 1)

as.table(x, ...)
is.table(x)

# S3 method for class 'table'
as.data.frame(x, row.names = NULL, ...,
              responseName = "Freq", stringsAsFactors = TRUE,
              sep = "", base = list(LETTERS))
- ...
one or more objects which can be interpreted as factors (including character strings), or a list (or data frame) whose components can be so interpreted. (For as.table, arguments passed to specific methods; for as.data.frame, unused.)
- exclude
levels to remove for all factors in the ... arguments. If it does not contain NA and useNA is not specified, it implies useNA = "ifany". See ‘Details’ for its interpretation for non-factor arguments.
- useNA
whether to include NA values in the table. See ‘Details’. Can be abbreviated.
- dnn
the names to be given to the dimensions in the result (the dimnames names).
- deparse.level
controls how the default dnn is constructed. See ‘Details’.
- x
an arbitrary R object, or an object inheriting from class "table" for the as.data.frame method. Note that as.data.frame.table(x, *) may be called explicitly for non-table x for “reshaping” arrays.
- row.names
a character vector giving the row names for the data frame.
- responseName
the name to be used for the column of table entries, usually counts.
- stringsAsFactors
logical: should the classifying factors be returned as factors (the default) or character vectors?
- sep, base
passed to provideDimnames.
If the argument dnn is not supplied, the internal function list.names is called to compute the ‘dimname names’. If the arguments in ... are named, those names are used. For the remaining arguments, deparse.level = 0 gives an empty name, deparse.level = 1 uses the supplied argument if it is a symbol, and deparse.level = 2 will deparse the argument.
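As a minimal sketch of how the dimnames names are chosen, the following compares named arguments, bare symbols, and an explicit dnn. The vectors smoker and sex are hypothetical data invented for this illustration.

```r
# Hypothetical vectors used only for illustration.
smoker <- c("yes", "no", "no", "yes", "no")
sex    <- c("F", "M", "F", "M", "F")

# Named arguments: the argument names become the dimnames names.
t1 <- table(Smoker = smoker, Sex = sex)
names(dimnames(t1))

# Unnamed symbols with the default deparse.level = 1: the symbols are used.
t2 <- table(smoker, sex)
names(dimnames(t2))

# An explicit dnn replaces the computed names.
t3 <- table(smoker, sex, dnn = c("smokes", "gender"))
names(dimnames(t3))
```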
Only when exclude is specified (i.e., not by default) and non-empty, will table potentially drop levels of factor arguments.
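The effect on factor levels can be seen on a small factor with an unused level. This is an illustrative sketch; the factor f is hypothetical.

```r
# A hypothetical factor with an unused level "D".
f <- factor(c("A", "B", "A", "C"), levels = c("A", "B", "C", "D"))

# exclude not specified: all levels are kept, "D" with a count of zero.
t_default <- table(f)
names(t_default)

# exclude specified: the level "B" is dropped from the table,
# while the unused level "D" is still reported.
t_noB <- table(f, exclude = "B")
names(t_noB)
```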
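useNA controls if the table includes counts of NA values: the allowed values correspond to never ("no"), only if the count is positive ("ifany") and even for zero counts ("always"). Note the somewhat “pathological” case of two different kinds of NAs which are treated differently, depending on both useNA and exclude; see d.patho in the ‘Examples’ below.
Both exclude and useNA operate on an “all or none” basis. If you want to control the dimensions of a multiway table separately, modify each argument using factor or addNA.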
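The three useNA settings can be compared on two short character vectors, one with a missing value and one without. The vectors x and y are hypothetical and serve only as a sketch.

```r
x <- c("a", "b", NA, "a")   # contains one NA
y <- c("a", "b", "b", "a")  # contains no NA

t_no     <- table(x)                    # default "no": the NA is not counted
t_ifany  <- table(x, useNA = "ifany")   # an NA entry appears, since its count is positive
t_none   <- table(y, useNA = "ifany")   # no NA entry, since there is nothing to count
t_always <- table(y, useNA = "always")  # an NA entry is reported even with a zero count

length(t_no); length(t_ifany); length(t_none); length(t_always)
```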
Non-factor arguments a are coerced via factor(a, exclude=exclude). Since R 3.4.0, care is taken not to count the excluded values (where they were included in the NA count, previously).
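A short sketch of this coercion on numeric input follows. The vectors a and a_na are hypothetical, and the comment on the last call assumes R 3.4.0 or later.

```r
a <- c(1, 2, 2, 3)                  # hypothetical numeric data, no NA
lev <- levels(factor(a, exclude = 2))  # the coercion step: level "2" is removed
t_excl <- table(a, exclude = 2)     # the two 2s are left out of the table

a_na <- c(1, 2, 2, NA, 3)           # the same data plus one genuine NA
# exclude does not contain NA and useNA is not given, so useNA = "ifany" is implied.
# Assuming R >= 3.4.0, the excluded 2s are not added to the NA count.
t_na <- table(a_na, exclude = 2)
sum(t_na)
```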
The summary method for class "table" (used for objects created by table or xtabs) gives basic information and performs a chi-squared test for independence of factors (note that the function chisq.test currently only handles 2-d tables).
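As an illustration, the summary method can be applied to UCBAdmissions, a three-way table shipped with R. The component names used below are those of the list returned by the summary method; this is a sketch, not a recommendation for analysing these data.

```r
# UCBAdmissions is a 3-d contingency table from the datasets package.
s <- summary(UCBAdmissions)

class(s)      # the summary object has its own class and print method
s$n.vars      # number of classifying factors
s$n.cases     # total number of cases in the table
s$statistic   # chi-squared statistic for independence of all factors
s$p.value     # corresponding p-value
```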
table() returns a contingency table, an object of class "table", an array of integer values. Note that unlike S the result is always an array, a 1D array if one factor is given.
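The structure of the result for a single factor can be inspected directly. The input vector here is hypothetical.

```r
tt <- table(c("x", "y", "x"))

class(tt)      # "table"
is.array(tt)   # the result is an array even for one factor
dim(tt)        # a single dimension, one entry per level
typeof(tt)     # counts are stored as integers
```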
as.table and is.table coerce to and test for contingency table, respectively.
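A minimal sketch of the coercion, using a hypothetical matrix of counts without dimnames:

```r
m <- matrix(c(12, 5, 7, 9), nrow = 2)  # hypothetical counts, no dimnames

is.table(m)        # a plain matrix is not a table
tb <- as.table(m)  # coerce; missing dimnames are filled in from LETTERS
is.table(tb)
dimnames(tb)
```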
The as.data.frame method for objects inheriting from class "table" can be used to convert the array-based representation of a contingency table to a data frame containing the classifying factors and the corresponding entries (the latter as component named by responseName). This is the inverse of xtabs.
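The round trip can be sketched on a small two-way table with a non-default responseName. The data and the column name n are hypothetical.

```r
# A small two-way table built from hypothetical data.
tab <- table(g = c("a", "b", "a"), h = c("u", "u", "v"))

# One row per cell; the counts go into the column named by responseName.
df <- as.data.frame(tab, responseName = "n")
names(df)
nrow(df)

# xtabs() (from package stats) rebuilds the table from the data frame.
back <- xtabs(n ~ g + h, data = df)
all(back == tab)
```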
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
tabulate is the underlying function and allows finer control.
xtabs for cross tabulation of data frames with a formula interface.
require(stats) # for rpois and xtabs
## Simple frequency distribution
table(rpois(100, 5))
## Check the design:
with(warpbreaks, table(wool, tension))
table(state.division, state.region)

# simple two-way contingency table
with(airquality, table(cut(Temp, quantile(Temp)), Month))

a <- letters[1:3]
table(a, sample(a))                    # dnn is c("a", "")
table(a, sample(a), deparse.level = 0) # dnn is c("", "")
table(a, sample(a), deparse.level = 2) # dnn is c("a", "sample(a)")

## xtabs() <-> as.data.frame.table() :
UCBAdmissions ## already a contingency table
DF <- as.data.frame(UCBAdmissions)
class(tab <- xtabs(Freq ~ ., DF)) # xtabs & table
## tab *is* "the same" as the original table:
all(tab == UCBAdmissions)
all.equal(dimnames(tab), dimnames(UCBAdmissions))

a <- rep(c(NA, 1/0:3), 10)
table(a)                 # does not report NA's
table(a, exclude = NULL) # reports NA's
b <- factor(rep(c("A","B","C"), 10))
table(b)
table(b, exclude = "B")
d <- factor(rep(c("A","B","C"), 10), levels = c("A","B","C","D","E"))
table(d, exclude = "B")
print(table(b, d), zero.print = ".")

## NA counting:
is.na(d) <- 3:4
d. <- addNA(d)
d.[1:7]
table(d.) # ", exclude = NULL" is not needed
## i.e., if you want to count the NA's of 'd', use
table(d, useNA = "ifany")

## "pathological" case:
d.patho <- addNA(c(1,NA,1:2,1:3))[-7]; is.na(d.patho) <- 3:4
d.patho
## just 3 consecutive NA's ? --- well, have *two* kinds of NAs here :
as.integer(d.patho) # 1 4 NA NA 1 2
##
## In R >= 3.4.0, table() allows to differentiate:
table(d.patho)                   # counts the "unusual" NA
table(d.patho, useNA = "ifany")  # counts all three
table(d.patho, exclude = NULL)   # (ditto)
table(d.patho, exclude = NA)     # counts none

## Two-way tables with NA counts. The 3rd variant is absurd, but shows
## something that cannot be done using exclude or useNA.
with(airquality,
     table(OzHi = Ozone > 80, Month, useNA = "ifany"))
with(airquality,
     table(OzHi = Ozone > 80, Month, useNA = "always"))
with(airquality,
     table(OzHi = Ozone > 80, addNA(Month)))