sjt.frq: Summary of frequencies as HTML table

Description

Shows (multiple) frequency tables as an HTML file, or saves them to a file.

Usage

sjt.frq(data, weight.by = NULL, title.wtd.suffix = " (weighted)",
  title = NULL, value.labels = NULL, sort.frq = c("none", "asc", "desc"),
  altr.row.col = FALSE, string.val = "value", string.cnt = "N",
  string.prc = "raw %", string.vprc = "valid %",
  string.cprc = "cumulative %", string.na = "missings", emph.md = FALSE,
  emph.quart = FALSE, show.summary = TRUE, show.skew = FALSE,
  show.kurtosis = FALSE, skip.zero = "auto", ignore.strings = TRUE,
  auto.group = NULL, auto.grp.strings = TRUE, max.string.dist = 3,
  digits = 2, CSS = NULL, encoding = NULL, file = NULL,
  use.viewer = TRUE, no.output = FALSE, remove.spaces = TRUE)

Arguments

data

A vector or a data frame, for which frequencies should be printed as table.

weight.by

Vector of weights that will be applied to weight all cases. Must be a vector of the same length as the input vector. Default is NULL, so no weights are used.
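As a rough illustration of what weighting cases means for a frequency table, the following base R sketch sums the weights per category instead of counting cases. The vectors x and w are made up for this example, and whether sjt.frq computes weighted counts in exactly this way is an assumption.

```r
# Hypothetical data: five cases and one weight per case
x <- c(1, 1, 2, 2, 2)
w <- c(0.5, 1.5, 1, 1, 2)

# weight.by must have the same length as the input vector
stopifnot(length(w) == length(x))

# Unweighted counts: number of cases per value
unweighted <- table(x)

# Weighted counts: sum of weights per value, rounded to whole cases
weighted <- round(xtabs(w ~ x))

print(unweighted)
print(weighted)

# The corresponding call would be (not run here):
# sjt.frq(x, weight.by = w)
```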

title.wtd.suffix

Suffix (as string) for the title, if weight.by is specified. Default is " (weighted)" (see 'Usage'). If NULL, the title will not have a suffix when cases are weighted.

title

Table caption, as character vector.

value.labels

Character vector (or list of character vectors) with value labels of the supplied variables, which will be used to label variable values in the output.

sort.frq

Determines whether categories should be sorted according to their frequencies or not. Default is "none", so categories are not sorted by frequency. Use "asc" or "desc" to sort categories in ascending or descending order.

altr.row.col

Logical, if TRUE, alternating rows are highlighted with a light gray background color.

string.val

Character label for the very first table column containing the values (see value.labels).

string.cnt

Character label for the first table data column containing the counts. Default is "N".

string.prc

Character label for the second table data column containing the raw percentages. Default is "raw %".

string.vprc

Character label for the third table data column containing the valid percentages, i.e. the percentages of counts excluding possible missing values.

string.cprc

Character label for the last table data column containing the cumulative percentages.

string.na

Character label for the last table data row containing missing values.
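The difference between the count, raw percentage, valid percentage and cumulative percentage columns can be sketched in base R. The vector x is made up for this example, and basing the cumulative percentages on the valid percentages is an assumption about how the table is built.

```r
# Hypothetical variable with two missing values
x <- c(1, 2, 2, 3, 3, 3, NA, NA)

# Counts per value; table() drops NA by default
counts <- table(x)
n_missing <- sum(is.na(x))

# Raw percentages: relative to all cases, including missing values
raw_prc <- 100 * as.vector(counts) / length(x)

# Valid percentages: relative to non-missing cases only
valid_prc <- 100 * as.vector(counts) / sum(!is.na(x))

# Cumulative percentages (assumed to accumulate the valid percentages)
cum_prc <- cumsum(valid_prc)

frq <- data.frame(
  value = names(counts),
  N = as.vector(counts),
  raw = round(raw_prc, 2),
  valid = round(valid_prc, 2),
  cumulative = round(cum_prc, 2)
)
print(frq)
cat("missings:", n_missing, "\n")
```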

emph.md

Logical, if TRUE, the table row indicating the median value will be emphasized.

emph.quart

Logical, if TRUE, the table row indicating the lower and upper quartiles will be emphasized.

show.summary

Logical, if TRUE (default), a summary row with total and valid N as well as mean and standard deviation is shown.

show.skew

Logical, if TRUE, the variable's skewness is added to the summary. The skewness is retrieved from the describe-function of the psych-package and indicated by a lower case Greek gamma.

show.kurtosis

Logical, if TRUE, the variable's kurtosis is added to the summary. The kurtosis is retrieved from the describe-function of the psych-package and indicated by a lower case Greek omega.

skip.zero

Logical, if TRUE, rows with zero counts are not printed (e.g. if a variable has values or levels 1 to 8, and values 4 to 6 have no counts, these values are omitted from the table). Use FALSE to also print zero counts, or use "auto" (default) to detect whether printing them makes sense (e.g. for a variable "age" with values from 10 to 100, where at least 25 percent of all possible values have no counts, zero counts are skipped automatically).
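The "auto" rule described above can be approximated in base R as follows. This is only a sketch of the stated 25 percent criterion. The vector age is made up, and the exact check used inside sjt.frq may differ.

```r
# Hypothetical "age" variable with a wide range and many unused values
age <- c(10, 12, 12, 15, 40, 41, 100)

# All values the variable could take between its minimum and maximum
possible <- seq(min(age), max(age))

# Counts for every possible value, including those with zero counts
counts <- table(factor(age, levels = possible))

# Share of possible values without any counts
share_empty <- mean(counts == 0)

# Assumed rule: skip zero-count rows if at least 25 percent are empty
skip_zero <- share_empty >= 0.25

cat("possible values:", length(possible), "\n")
cat("share without counts:", round(share_empty, 3), "\n")
cat("skip zero-count rows:", skip_zero, "\n")
```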

ignore.strings

Logical, if TRUE (default), character vectors / string variables will be removed from data before frequency tables are computed.

auto.group

Numeric value, indicating the minimum number of unique values in the count variable at which automatic grouping into smaller units is done (see group_var). Default value for auto.group is NULL, i.e. auto-grouping is off. See group_var for examples on grouping.

auto.grp.strings

Logical, if TRUE (default), string values in character vectors (string variables) are automatically grouped based on their similarity. The similarity is estimated with the stringdist-package. You can specify the maximum allowed distance via the max.string.dist argument. This argument only applies if ignore.strings is FALSE.

max.string.dist

Numeric, the allowed distance of string values in a character vector, which indicates when two string values are merged because they are considered as close enough. See auto.grp.strings.
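To give an idea of what a string distance of 3 means, the sketch below compares a few made-up labels. It uses utils::adist() from base R (generalized Levenshtein distance) as a stand-in, whereas sjt.frq relies on the stringdist-package, so the distances it uses may differ. Treating "distance less than or equal to max.string.dist" as the merge criterion is an assumption.

```r
# Hypothetical string values, two of which are spelling variants
labels <- c("colour", "color", "red")
max_dist <- 3  # same value as the default of max.string.dist

# Pairwise edit distances between all labels (base R stand-in)
d <- utils::adist(labels)
dimnames(d) <- list(labels, labels)
print(d)

# Pairs that would count as "close enough" under the assumed rule
close_enough <- d <= max_dist
print(close_enough)
```

Here "colour" and "color" differ by a single character and would be candidates for merging, while "red" is too far from both.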

digits

Numeric, number of digits after the decimal point when rounding estimates and values.

CSS

A list with user-defined style-sheet-definitions, according to the official CSS syntax. For more details, see the package vignette, or 'Details' in sjt.frq.

encoding

String, indicating the charset encoding used for variable and value labels. Default is NULL, so encoding will be auto-detected depending on your platform (e.g., "UTF-8" for Unix and "Windows-1252" for Windows OS). Change encoding if specific chars are not properly displayed (e.g. German umlauts).

file

Destination file, if the output should be saved as file. If NULL (default), the output will be saved as a temporary file and opened either in the IDE's viewer pane or the default web browser.

use.viewer

Logical, if TRUE, the HTML table is shown in the IDE's viewer pane. If FALSE or no viewer available, the HTML table is opened in a web browser.

no.output

Logical, if TRUE, the html-output is neither opened in a browser nor shown in the viewer pane and not even saved to file. This option is useful when the html output should be used in knitr documents. The html output can be accessed via the return value.

remove.spaces

Logical, if TRUE, leading spaces are removed from all lines in the final string that contains the html-data. Use this if you want to remove the indentation of html-tags. The html-source may look less pretty, but it may help when exporting html-tables to office tools.

Value

Invisibly returns

the web page style sheet (page.style),
each frequency table as web page content (page.content.list),
the complete html-output (output.complete) and
the html-table with inline-css for use with knitr (knitr)

for further use.

Details

How do I use the CSS-argument?

With the CSS-argument, the visual appearance of the tables can be modified. To get an overview of all style-sheet-classnames that are used in this function, see return value page.style for details. Arguments for this list have the following syntax:

each argument name is the class-name with a "css."-prefix and
each style-definition must end with a semicolon

You can add style information to the default styles by using a + (plus-sign) as initial character for the argument attributes. Examples:

css.table = 'border:2px solid red;' for a solid 2-pixel table border in red.
css.summary = 'font-weight:bold;' for a bold fontweight in the summary row.
css.lasttablerow = 'border-bottom: 1px dotted blue;' for a blue dotted border of the last table row.
css.colnames = '+color:green;' to add green color formatting to column names.
css.arc = 'color:blue;' for a blue text color in every second row.
css.caption = '+color:red;' to add red font-color to the default table caption style.

See further examples in this package-vignette.
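The two syntax rules above can be checked before passing a list to the CSS argument. The following base R sketch builds such a list from the examples in this section and tests each entry. The variable names my_css, ends_with_semicolon and adds_to_default are made up for this illustration.

```r
# Style definitions taken from the examples above
my_css <- list(
  css.table    = "border:2px solid red;",
  css.colnames = "+color:green;",
  css.caption  = "+color:red;"
)

# Rule 1: every argument name carries the "css." prefix
stopifnot(all(startsWith(names(my_css), "css.")))

# Rule 2: every style definition ends with a semicolon
ends_with_semicolon <- vapply(
  my_css, function(s) grepl(";\\s*$", s), logical(1)
)

# Entries starting with "+" are added to the default style,
# all other entries replace it
adds_to_default <- vapply(
  my_css, function(s) startsWith(s, "+"), logical(1)
)

print(ends_with_semicolon)
print(adds_to_default)

# The list would then be passed on like this (not run here):
# sjt.frq(efc$e42dep, CSS = my_css)
```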

Examples

# NOT RUN {
# load sample data
library(sjmisc)
data(efc)

# show frequencies of "e42dep" in RStudio Viewer Pane
# or default web browser
sjt.frq(efc$e42dep)

# show frequency table of "e42dep" with a title and value labels
sjt.frq(efc$e42dep, title = "Dependency",
        value.labels = c("independent", "slightly dependent",
                         "moderately dependent", "severely dependent"))

# show frequencies of e42dep, e16sex and c172code in one HTML file
# and show table in RStudio Viewer Pane or default web browser
# Note that value.labels of multiple variables have to be
# list-objects
sjt.frq(data.frame(efc$e42dep, efc$e16sex, efc$c172code),
        title = c("Dependency", "Gender", "Education"),
        value.labels = list(c("independent", "slightly dependent",
                              "moderately dependent", "severely dependent"),
                            c("male", "female"), c("low", "mid", "high")))

# auto-detection of labels
sjt.frq(data.frame(efc$e42dep, efc$e16sex, efc$c172code))

# show a variable with a larger range of values, including zero-counts,
# and emphasize median and quartiles
sjt.frq(efc$neg_c_7, emph.md = TRUE, emph.quart = TRUE)

# sort frequencies
sjt.frq(efc$e42dep, sort.frq = "desc")

# User defined style sheet
sjt.frq(efc$e42dep,
        CSS = list(css.table = "border: 2px solid;",
                   css.tdata = "border: 1px solid;",
                   css.firsttablecol = "color:#003399; font-weight:bold;"))
# }