tabfreq: Create Frequency Table

Description

Creates an I-by-J frequency table comparing the distribution of y across levels of x.

Usage

tabfreq(formula = NULL, data = NULL, x = NULL, y = NULL,
  columns = c("xgroups", "p"), cell = "counts",
  parenth = "col.percent", sep.char = ", ", test = "chi.fisher",
  xlevels = NULL, yname = NULL, ylevels = NULL,
  compress.binary = FALSE, yname.row = TRUE, indent.spaces = 3,
  text.label = NULL, quantiles = NULL, quantile.vals = FALSE,
  latex = TRUE, decimals = 1, formatp.list = NULL,
  n.headings = FALSE, print.html = FALSE,
  html.filename = "table1.html")

Arguments

formula

Formula of the form y ~ x, e.g. Sex ~ Group.

data

Data frame containing variables named in formula.

x

Vector indicating group membership for columns of I-by-J table.

y

Vector indicating group membership for rows of I-by-J table.

columns

Character vector specifying what columns to include. Choices for each element are "n" for total sample size, "overall" for overall distribution of y, "xgroups" for distributions of y for each x group, "test" for test statistic, and "p" for p-value.

cell

Character string specifying what statistic to display in cells. Choices are "counts", "tot.percent", "col.percent", and "row.percent".

parenth

Character string specifying what statistic to display in parentheses. Choices are "none", "se", "ci", "counts", "tot.percent", "col.percent", and "row.percent".
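To make the cell and parenth choices concrete, the following base R sketch computes the four statistics by hand and pastes two of them into the "n (percent)" layout. It does not call tabfreq; the toy vectors are made up for illustration, and the formatting step is only an approximation of what the defaults cell = "counts" and parenth = "col.percent" describe.

```r
# Toy data (hypothetical; not the package's tabdata)
x <- c("Control", "Control", "Control", "Treatment",
       "Treatment", "Treatment", "Treatment", "Control")
y <- c("Female", "Male", "Male", "Female",
       "Female", "Male", "Female", "Male")

# Rows are levels of y, columns are levels of x
counts      <- table(y, x)
tot.percent <- 100 * prop.table(counts)              # cell / grand total
col.percent <- 100 * prop.table(counts, margin = 2)  # cell / column total
row.percent <- 100 * prop.table(counts, margin = 1)  # cell / row total

# Combine counts and column percents as "n (percent)" with 1 decimal place
cells <- matrix(sprintf("%d (%.1f)", as.vector(counts), as.vector(col.percent)),
                nrow = nrow(counts), dimnames = dimnames(counts))
print(cells)
```

Column percents sum to 100 within each x group, which is usually the comparison of interest when x defines the groups.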

sep.char

Character string with separator to place between lower and upper bound of confidence intervals. Typically "-" or ", ".

test

Character string specifying which test for association between x and y should be used. Choices are "chi.fisher" for Pearson's chi-squared test if its assumptions are met, otherwise Fisher's exact test; "chi"; "fisher"; "z" for z test without continuity correction; and "z.continuity" for z test with continuity correction. The last two only work if both x and y are binary.
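The idea behind "chi.fisher" can be sketched in base R. The helper below is hypothetical and not part of the package, and the rule it uses (all expected counts at least 5) is a common rule of thumb assumed here; this page does not state the exact criterion tabfreq applies. The last lines illustrate that, for a 2-by-2 table, a two-sample z test without continuity correction gives the same p-value as Pearson's chi-squared test without correction.

```r
# Hypothetical helper: chi-squared test if all expected counts reach a
# threshold (assumed rule of thumb), otherwise Fisher's exact test
chi_or_fisher <- function(tab, min.expected = 5) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  if (all(expected >= min.expected)) {
    list(test = "chi", p = chisq.test(tab, correct = FALSE)$p.value)
  } else {
    list(test = "fisher", p = fisher.test(tab)$p.value)
  }
}

small <- matrix(c(1, 3, 3, 1), nrow = 2)      # every expected count is 2
large <- matrix(c(30, 20, 15, 35), nrow = 2)  # every expected count is >= 22.5

res.small <- chi_or_fisher(small)
res.large <- chi_or_fisher(large)

# Two-sample test of proportions without continuity correction on the same
# 2-by-2 table (30/50 vs. 15/50 in the first row of y)
p.z <- prop.test(x = c(30, 15), n = c(50, 50), correct = FALSE)$p.value
```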

xlevels

Character vector with labels for the levels of x, used in column headings.

yname

Character string with a label for the y variable.

ylevels

Character vector with labels for the levels of y. Note that levels of y are listed in the order that they appear when you run table(y, x).
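Because ylevels labels are matched by position, it can help to check the row order of table(y, x) first. A small base R illustration with made-up vectors: character vectors are sorted, while factors keep their level order.

```r
# Made-up vectors for illustration
y <- c("Male", "Female", "Male", "Female")
x <- c("A", "A", "B", "B")

# Character y: rows come out in sorted order
order.char <- rownames(table(y, x))

# Factor y: rows follow the factor's level order
y.f <- factor(y, levels = c("Male", "Female"))
order.factor <- rownames(table(y.f, x))

print(order.char)
print(order.factor)
```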

compress.binary

Logical value for whether to compress binary y variable to a single row, excluding the first level rather than showing both.

yname.row

Logical value for whether to include a row displaying the name of the y variable and indent the factor levels.

indent.spaces

Integer value specifying how many spaces to indent factor levels. Only used if yname.row = TRUE.

text.label

Character string with text to put after the y variable name, identifying what cell values and parentheses represent.

quantiles

Numeric value. If specified, table compares y across quantiles of x created on the fly.

quantile.vals

Logical value for whether labels for x quantiles should show quantile number and corresponding range, e.g. Q1 [0.00, 0.25), rather than just the quantile number.
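As a rough picture of what creating quantile groups "on the fly" could look like, the sketch below bins a continuous x into four groups with base R. This is an assumption about the general approach, not the package's internal code; the simulated data, the use of cut, and the left-closed intervals (chosen to resemble the Q1 [0.00, 0.25) label format) are all illustrative.

```r
set.seed(1)
x <- rnorm(100)  # simulated continuous variable

# Cut points at the 0th, 25th, 50th, 75th, and 100th percentiles
breaks <- quantile(x, probs = seq(0, 1, length.out = 5))

# Left-closed intervals; include.lowest = TRUE keeps the maximum in the top group
groups.range <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)
groups.num   <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE,
                    labels = paste0("Q", 1:4))

print(table(groups.num))
print(levels(groups.range))
```

The first set of labels shows each group's range and the second only the quantile number, which mirrors the distinction that quantile.vals controls.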

latex

Logical value for whether to format table so it is ready for printing in LaTeX via xtable or kable.

decimals

Numeric value specifying number of decimal places for numbers other than p-values.

formatp.list

List of arguments to pass to formatp.

n.headings

Logical value for whether to display group sample sizes in parentheses in column headings.

print.html

Logical value for whether to write a .html file with the table to the current working directory.

html.filename

Character string specifying the name of the .html file that gets written if print.html = TRUE.

Value

Data frame which you can print in R (e.g. with xtable's xtable or knitr's kable) or export to Word, Excel, or some other program. To export the table, set print.html = TRUE. This will result in a .html file being written to your current working directory, which you can open and copy/paste into your document.

Examples


# NOT RUN {
# Compare sex distribution by group
(freqtable1 <- tabfreq(Sex ~ Group, data = tabdata))

# Same as previous, but specifying input vectors rather than formula
(freqtable2 <- tabfreq(x = tabdata$Group, y = tabdata$Sex))

# Same as previous, but showing male row only and percent (SE) rather than n
# (percent)
(freqtable3 <- tabfreq(Sex ~ Group, data = tabdata,
                       cell = "col.percent", parenth = "se",
                       compress.binary = TRUE))

# Create single table comparing sex and race in control vs. treatment group.
# Drop missing observations first.
tabdata2 <- subset(tabdata, ! is.na(Sex) & ! is.na(Race))
(freqtable4 <- rbind(tabfreq(Sex ~ Group, data = tabdata2),
                     tabfreq(Race ~ Group, data = tabdata2)))

# Same as previous, but using tabmulti for convenience
#(freqtable5 <- tabmulti(data = tabdata2, xvarname = "Group",
#                        yvarnames = c("Sex", "Race")))


# }
