furniture (version 1.4.1)

table1: Table 1 for Simple and Stratified Descriptive Statistics

Description

Produces a descriptive table, optionally stratified by a categorical variable, providing means/frequencies and standard deviations/percentages. The output is formatted for easy transfer into an academic article or report. It can be used within a pipe (see library(magrittr)).

Usage

table1(.data, ..., all = FALSE, splitby = NULL, row_wise = FALSE,
       splitby_labels = NULL, medians = NULL, test = FALSE,
       test_type = "default", simple = FALSE, condense = FALSE,
       piping = FALSE, rounding = 2, rounding_perc = 1, var_names = NULL,
       format_output = "pvalues", output_type = "text",
       format_number = FALSE, NAkeep = FALSE, m_label = "Missing",
       booktabs = TRUE, caption = NULL, align = NULL, export = NULL)

Arguments

.data
the data.frame that is to be summarized
...
variables in the data set that are to be summarized; unquoted names separated by commas (e.g. age, gender, race) or indices. Indices must be given as a single vector (e.g. c(1:5, 8, 9:20) instead of 1:5, 8, 9:20). Currently, indices and unquoted names cannot be mixed in the same call.
all
logical; if set to TRUE all variables in the dataset are used. If there is a stratifying variable then that is the only variable excluded.
splitby
the categorical variable to stratify by, in formula form (e.g., splitby = ~gender) or quoted (e.g., splitby = "gender"); the variable must have at least one level
row_wise
how to calculate percentages for factor variables when splitby != NULL: if FALSE calculates percentages by variable within groups; if TRUE calculates percentages across groups for one level of the factor variable.
splitby_labels
allows for custom labels of the splitby levels; must match the number of levels of the splitby variable
medians
a vector or list of continuous variables for which medians and the 25th and 75th percentiles (first and third quartiles) should be produced
test
logical; if set to TRUE then the appropriate bivariate tests of significance are performed if splitby has more than 1 level
test_type
has two options: "default" performs the default tests of significance only; "or" also gives unadjusted odds ratios based on logistic regression (only use if splitby has 2 levels)
simple
logical; if set to TRUE then only percentages are shown for categorical variables.
condense
logical; if set to TRUE then the means and SDs of continuous variables are shown on the same line as the variable name, and dichotomous variables show counts and percentages for the reference category only
piping
if TRUE then the table is printed and the original data is passed on. It is very useful in piping situations where one wants the table but wants it to be part of a larger pipe.
rounding
the number of digits after the decimal for means and SD's; default is 2
rounding_perc
the number of digits after the decimal for percentages; default is 1
var_names
custom variable names to be printed in the table
format_output
has three options (with partial matching): 1) "full" provides the table with the type of test, test statistic, and the p-value for each variable; 2) "pvalues" provides the table with the p-values; and 3) "stars" provides the table with stars indicating significance. Only "pvalues" works when simple and condense are set to TRUE.
output_type
default is "text"; the other options are all format options in the kable() function in knitr (e.g., latex, html, markdown, pandoc) as well as "text2" which adds a line below the header in the table.
format_number
default is FALSE; if TRUE, then the numbers are formatted with commas (e.g., 20,000 instead of 20000)
NAkeep
when set to TRUE it also shows how many missing values are in the data for each categorical variable being summarized
m_label
when NAkeep = TRUE this provides a label for the missing values in the table
booktabs
when output_type != "text"; option is passed to knitr::kable
caption
when output_type != "text"; option is passed to knitr::kable
align
when output_type != "text"; option is passed to knitr::kable
export
character; when given, the table is exported to a CSV file in a folder named "table1" in the working directory, using the given string as the file name (e.g., "myfile" will save to "myfile.csv")
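
To make the row_wise distinction concrete, here is a minimal base R sketch. It does not call table1() and is not the package's internal code; the toy vectors z and a are made up for illustration, and prop.table() simply stands in for the two ways of computing percentages described under row_wise.

```r
# Hypothetical toy data: z is the factor being summarized,
# a plays the role of the splitby variable
z <- factor(c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1))
a <- factor(c(1, 1, 1, 2, 1, 1, 2, 2, 2, 2))

# Cross-tabulation: rows are levels of z, columns are groups of a
counts <- table(z, a)

# Analogue of row_wise = FALSE: percentages within each group,
# so every column sums to 100
within_groups <- round(100 * prop.table(counts, margin = 2), 1)

# Analogue of row_wise = TRUE: percentages across groups for each
# level of z, so every row sums to 100
across_groups <- round(100 * prop.table(counts, margin = 1), 1)

print(within_groups)
print(across_groups)
```

The rounding to one decimal place mirrors the documented default of rounding_perc = 1.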

Value

A table with the number of observations, means/frequencies and standard deviations/percentages is returned. The object is a table1 class object with a print method. It can be printed in LaTeX form.

Examples

## Fictitious Data ##
library(furniture)
library(tidyverse)

x  <- runif(1000)
y  <- rnorm(1000)
z  <- factor(sample(c(0,1), 1000, replace=TRUE))
a  <- factor(sample(c(1,2), 1000, replace=TRUE))
df <- data.frame(x, y, z, a)

## Simple
table1(df, x, y, z, a)

## Stratified
## both below are the same
table1(df, x, y, z,
       splitby = ~ a)
table1(df, x, y, z,
       splitby = "a")

## With Piping
df %>%
  table1(x, y, z, 
         splitby = ~a, 
         piping = TRUE) %>%
  summarise(count = n())

## Adjust variables within function
table1(df, ifelse(x > 0, 1, 0), z,
       var_names = c("X2", "Z"))
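
As a further sketch, the calls below exercise a few more of the documented arguments (test, format_output, rounding, NAkeep and m_label) on freshly simulated data. The data frame dat, the seed, and the label "Not reported" are assumptions made for illustration; the printed numbers depend on the simulated values, so no output is shown.

```r
library(furniture)

set.seed(42)  # arbitrary seed so the simulated data are reproducible
dat <- data.frame(
  x = runif(200),
  y = rnorm(200),
  z = factor(sample(c(0, 1), 200, replace = TRUE)),
  a = factor(sample(c(1, 2), 200, replace = TRUE))
)

# Stratified table with bivariate tests; "full" also reports the
# type of test and the test statistic alongside the p-value
tab <- table1(dat, x, y, z,
              splitby = ~a,
              test = TRUE,
              format_output = "full",
              rounding = 3)
tab

# Introduce a few missing values, then show them for the
# categorical variable under a custom label
dat$z[1:5] <- NA
table1(dat, x, z,
       NAkeep = TRUE,
       m_label = "Not reported")
```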
         
