popEpi (version 0.2.1)

ltable: Tabulate counts and other functions with multiple factors into a long-format table

Description

ltable makes use of data.table capabilities to tabulate frequencies or arbitrary functions of given variables into a long format data.table/data.frame.

Usage

ltable(data, by.vars = "sex", expr = list(obs = .N), subset = NULL,
  use.levels = TRUE, na.rm = FALSE, robust = TRUE)

Arguments

data
individual-level or aggregated data
by.vars
names of variables that are used for categorization, as a character vector, e.g. c('sex','agegroup')
expr
an object or a list of objects, where each object is a function of a variable (see Details)
subset
a logical condition; data is limited accordingly before evaluating expr
use.levels
logical; if TRUE, uses the factor levels of the given variables where present; use this if you want e.g. counts for levels that are defined in a factor variable but have zero observations
na.rm
logical; if TRUE, drops rows from the table that have an NA value in any by.vars column
robust
logical; if TRUE, runs the outputted data's by.vars columns through robust_values before outputting

Details

Returns expr for each unique combination of given by.vars.

By default makes use of any and all levels present for each variable in by.vars. This is useful, because even if a subset of the data does not contain observations for e.g. a specific age group, those age groups are nevertheless presented in the resulting table; e.g. with the default expr = list(obs = .N) all age group levels are represented by a row and can have obs = 0.

The function differs from the vanilla table by giving a long format table of values regardless of the number of by.vars given. Make use of e.g. cast_simple if data needs to be presented in a wide format (e.g. a two-way table).

The rows of the long-format table are effectively cross-products of the levels of each variable in by.vars, e.g. with by.vars = c("sex", "area") all levels of area are repeated for both levels of sex in the table.

The expr allows the user to apply any function(s) on all levels defined by by.vars. Here are some examples:
  • .N or list(.N) is a function used inside a data.table to calculate counts in each group
  • list(obs = .N), same as above but user assigned variable name
  • list(sum(obs), sum(pyrs), mean(dg_age)), multiple objects in a list
  • list(obs = sum(obs), pyrs = sum(pyrs)), same as above with user defined var names
If use.levels = FALSE, no levels information will be used. This means that if e.g. the agegroup variable is a factor and has 18 levels defined, but only 15 levels are present in the data, no rows for the missing levels will be shown in the table. na.rm simply drops any rows from the resulting table where any of the by.vars values was NA.
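
The cross-product behaviour described above can be illustrated with a minimal base R sketch. It does not call ltable or use popEpi; the toy data frame df and the names long and long_present are made up for illustration, and table() with as.data.frame() only loosely mimics the idea of a long-format table that keeps empty factor levels.

```r
# Toy data (hypothetical): 'agegroup' has three levels, but only two occur.
df <- data.frame(
  sex = factor(c("m", "f", "f", "m"), levels = c("f", "m")),
  agegroup = factor(c("0-44", "0-44", "45-59", "45-59"),
                    levels = c("0-44", "45-59", "60+"))
)

# table() counts every combination of factor levels, including empty ones;
# as.data.frame() turns the cross-tabulation into one row per combination.
long <- as.data.frame(table(df), responseName = "obs")

# Dropping unused levels first loosely mimics the idea behind
# use.levels = FALSE: levels absent from the data get no rows.
long_present <- as.data.frame(table(droplevels(df)), responseName = "obs")

print(long)          # 2 sexes x 3 age groups; the "60+" rows have obs = 0
print(long_present)  # 2 sexes x 2 age groups
```

In the first table every level of agegroup is repeated for both levels of sex, which is the layout the Details section describes for by.vars = c("sex", "area").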

See Also

table, cast_simple, melt

Examples

library(popEpi)
library(data.table)  # provides copy()
sr <- copy(sire)
sr$agegroup <- cut(sr$dg_age, breaks=c(0,45,60,75,85,Inf))
## counts by default
ltable(sr, "agegroup")

## any expression can be given
ltable(sr, "agegroup", list(mage = mean(dg_age)))
ltable(sr, "agegroup", list(mage = mean(dg_age), vage = var(dg_age)))

## also returns levels where there are zero rows (expressions as NA)
ltable(sr, "agegroup", list(obs = .N, minage = min(dg_age), maxage = max(dg_age)),
       subset = dg_age < 85)
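
The last example above relies on empty levels still getting a row. As a rough base R analogue (a sketch that does not use ltable; the vector age and the result res are hypothetical toy objects), tapply() behaves similarly: it returns one value per factor level and fills levels without observations with NA.

```r
# Hypothetical ages; the two highest age groups end up empty.
age <- c(30, 50, 52, 70)
agegroup <- cut(age, breaks = c(0, 45, 60, 75, 85, Inf))

# One count and one mean per factor level; empty levels give 0 and NA.
n <- as.vector(table(agegroup))
mean_age <- as.vector(tapply(age, agegroup, mean))

res <- data.frame(agegroup = levels(agegroup), obs = n, mage = mean_age)
print(res)
```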
