descr: Univariate Statistics for Numerical Data

Description

Calculates mean, sd, min, Q1*, median, Q3*, max, MAD, IQR*, CV, skewness*, SE.skewness*, and kurtosis* on numerical vectors. (*) Not available when using sampling weights.
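
For orientation, several of these statistics have direct base R counterparts. The sketch below uses a made-up vector and base R functions only. It is not a description of how descr() computes its results internally; in particular, the quantile method and the MAD scaling constant that descr() uses are not assumed here.

```r
# Illustrative base R equivalents for some of the listed statistics.
# The vector x is invented for this example.
x <- c(2, 4, 4, 5, 7, 9, 11)

base_stats <- c(
  mean   = mean(x),
  sd     = sd(x),
  min    = min(x),
  median = median(x),
  max    = max(x),
  mad    = mad(x),           # base R scales by 1.4826 by default
  iqr    = IQR(x),           # base R default quantile type (7)
  cv     = sd(x) / mean(x)   # coefficient of variation
)

round(base_stats, 2)
```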

Usage

descr(
  x,
  var = NULL,
  stats = st_options("descr.stats"),
  na.rm = TRUE,
  round.digits = st_options("round.digits"),
  transpose = st_options("descr.transpose"),
  order = "sort",
  style = st_options("style"),
  plain.ascii = st_options("plain.ascii"),
  justify = "r",
  headings = st_options("headings"),
  display.labels = st_options("display.labels"),
  split.tables = 100,
  weights = NA,
  rescale.weights = FALSE,
  ...
)

Arguments

x

A numerical vector or a data frame.

var

Unquoted expression referring to a specific column in x. Provides support for piped function calls (e.g. df %>% descr(some_var)).
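
A minimal sketch of the piped form, assuming the magrittr package is installed to provide %>% and using the tobacco data set and its BMI column, both of which appear in the Examples on this page. The object names res_piped and res_direct are only illustrative.

```r
library(summarytools)
library(magrittr)  # assumed to be installed; provides %>%

data("tobacco")

# Piped call: the data frame is passed as x, the column as var
res_piped <- tobacco %>% descr(BMI)

# Direct call on the same column, for comparison
res_direct <- descr(tobacco$BMI)

res_piped
```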

stats

Which stats to produce. Either “all” (default), “fivenum”, “common” (see Details), or a selection of: “mean”, “sd”, “min”, “q1”, “med”, “q3”, “max”, “mad”, “iqr”, “cv”, “skewness”, “se.skewness”, “kurtosis”, “n.valid”, and “pct.valid”. This can be set globally via st_options("descr.stats").
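
As a hedged sketch of changing this default globally: the code below assumes a summarytools version in which st_options() accepts options as named arguments (the st_options("descr.stats") getter form is the one shown in Usage). The previous value is saved and restored so the session default is left unchanged.

```r
library(summarytools)
data("exams")

# Remember the current default so it can be restored afterwards
old_stats <- st_options("descr.stats")

# Assumption: named-argument form of st_options() is available
st_options(descr.stats = "common")
descr(exams)   # now uses the "common" set without passing stats =

# Restore the previous default
st_options(descr.stats = old_stats)
```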

na.rm

Argument to be passed to statistical functions. Defaults to TRUE. Can be set globally; see st_options.

round.digits

Number of significant digits to display. Defaults to 2, and can be set globally (see st_options).

transpose

Logical. Makes variables appear as columns, and stats as rows. Defaults to FALSE. To change this default value, see st_options (option “descr.transpose”).

order

Character. One of “sort” (or simply “s”), “preserve” (or “p”), or a vector of all variable names in the desired order. Defaults to “sort”.

style

Style to be used by pander when rendering the output table; one of “simple” (default), “grid”, or “rmarkdown”. This option can be set globally; see st_options.

plain.ascii

Logical. pander argument; when TRUE, no markup characters will be used (useful when printing to console). Defaults to TRUE unless style = 'rmarkdown', in which case it will be set to FALSE automatically. To change the default value globally, see st_options.

justify

Alignment of numbers in cells; “l” for left, “c” for center, or “r” for right (default). Has no effect on HTML tables.

headings

Logical. Set to FALSE to omit heading section. Can be set globally via st_options. TRUE by default.

display.labels

Logical. Should variable / data frame labels be displayed in the title section? Default is TRUE. To change this default value globally, see st_options.

split.tables

pander argument that specifies how many characters wide a table can be. 100 by default.

weights

Vector of weights having the same length as x. NA (default) indicates that no weights are used.

rescale.weights

Logical. When set to TRUE, the total count will be the same as the unweighted x. FALSE by default.
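
The following sketch illustrates the two weighting arguments. The first part uses invented toy weights to show the idea behind rescaling (weights scaled so that they sum to the number of observations); it is an illustration of the concept, not the package's internal code. The second part assumes the tobacco data set includes a sampling-weight column named samp.wgts.

```r
library(summarytools)

# Idea behind rescaling, shown with invented weights
w <- c(0.5, 2, 1.5, 3)
w_rescaled <- w * length(w) / sum(w)
sum(w_rescaled)   # matches length(w), up to floating-point error

# Weighted statistics; starred statistics (quartiles, IQR, skewness,
# kurtosis) are not available when weights are supplied.
# Assumption: tobacco has a weights column called samp.wgts.
data("tobacco")
res_w <- descr(tobacco$BMI,
               weights = tobacco$samp.wgts,
               rescale.weights = TRUE)
res_w
```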

…

Additional arguments passed to pander.

Value

An object having classes matrix and summarytools containing the statistics, with extra attributes used by the print method.
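
A short, non-authoritative sketch of inspecting the returned object with base R tools. The exact set of attributes is not listed on this page, so the code only prints their names rather than relying on any specific one.

```r
library(summarytools)
data("exams")

res <- descr(exams, stats = c("mean", "sd", "min", "max"))

inherits(res, "summarytools")   # class check
dim(res)                        # statistics by variables (or the reverse if transposed)
names(attributes(res))          # extra attributes used when printing
```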

Examples

# NOT RUN {
data("exams")

# All stats for all numerical variables
descr(exams)

# Only common statistics
descr(exams, stats = "common")

# Arbitrary selection of statistics, transposed
descr(exams, stats = c("mean", "sd", "min", "max"), transpose = TRUE)

# Rmarkdown-ready
descr(exams, plain.ascii = FALSE, style = "rmarkdown")

# Grouped statistics
data("tobacco")
with(tobacco, stby(BMI, gender, descr))

# Grouped statistics, transposed
with(tobacco, stby(BMI, age.gr, descr, stats = "common", transpose = TRUE))

# }
# NOT RUN {
# Show in Viewer (or browser if not in RStudio)
view(descr(exams))

# Save to html file with title
print(descr(exams),
      file = "descr_exams.html",
      report.title = "Descriptive Statistics for Exam Scores",
      footnote = "<b>Schoolyear:</b> 2018-2019<br/><b>Semester:</b> Fall")

# }