
tidysummary (version 0.1.0)

add_var: Prepare variables for add_summary

Description

This function prepares a dataset for statistical analysis by classifying variables as continuous or categorical. It automatically checks normality, equality of variances, and expected-frequency assumptions.
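As an illustration of the expected-frequency assumption mentioned in the description, the base R sketch below computes expected cell counts for a contingency table. The threshold of 5 is a common rule of thumb and an assumption here; this page does not state which criterion tidysummary applies, and the sketch is not the package's implementation.

```r
# Contingency table of a categorical variable by group (built-in mtcars data).
tab <- table(mtcars$cyl, mtcars$gear)

# chisq.test() returns the expected counts under independence.
# suppressWarnings() hides any small-count warning from the test itself.
expected <- suppressWarnings(chisq.test(tab))$expected

# Assumed rule of thumb: every expected count should be at least 5.
assumption_met <- all(expected >= 5)
```

When the assumption fails, an exact test such as `fisher.test()` is a typical alternative.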

Usage

add_var(data, var = NULL, group = "group", norm = "auto", center = "median")

Value

A modified data frame with an attribute 'add_var' containing a list of categorized variables and their properties:

  • var: List of categorized variables:

    • valid: All valid variable names after checks

    • continuous: Sublist of continuous variables (further divided by normality/equal variance)

    • categorical: Sublist of categorical variables (further divided by ordered/expected frequency)

  • group: Grouping variable name

  • overall_n: Total number of observations

  • group_n: Observation counts per group

  • group_nlevels: Number of groups

  • group_levels: Group level names

  • norm: Normality check method used
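The returned attribute can be inspected with base R's `attr()`. The following is a minimal sketch that assumes tidysummary is installed and that the list elements are named exactly as in the list above; the variable names `prepared` and `info` are illustrative only.

```r
info <- NULL  # stays NULL when tidysummary is not installed

if (requireNamespace("tidysummary", quietly = TRUE)) {
  # norm = TRUE tells add_var to assume normality, which should avoid
  # the interactive 'ask' path (assumption based on the Arguments section).
  prepared <- tidysummary::add_var(
    iris,
    var = c("Sepal.Length", "Sepal.Width"),
    group = "Species",
    norm = TRUE
  )
  info <- attr(prepared, "add_var")  # the list described under Value
  str(info$var$continuous)           # sublist of continuous variables
  print(info$group_n)                # observation counts per group
}
```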

Arguments

data

A data frame containing the variables to analyze, with variables in columns and observations in rows.

var

A character vector of variable names to include. If NULL (the default), all columns except the group column are used.

group

A character string specifying the grouping variable in data. Defaults to 'group'.

norm

Control parameter for normality tests. Accepts:

  • 'auto' (default): Decide automatically based on p-values; behaves like 'ask' when n > 1000

  • 'ask': Show p-values, draw Q-Q plots, and prompt for a decision

  • TRUE/'true': Always assume the data are normally distributed

  • FALSE/'false': Always assume the data are not normally distributed
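To make the idea of a p-value-based decision concrete, here is a base R sketch of a per-group normality check. Using the Shapiro-Wilk test and a 0.05 cutoff are both assumptions; this page does not say which test or threshold the package uses for 'auto'.

```r
# Shapiro-Wilk p-value for Sepal.Length within each Species group.
p_by_group <- tapply(
  iris$Sepal.Length,
  iris$Species,
  function(x) shapiro.test(x)$p.value
)

# Assumed decision rule: treat the variable as normal only if
# no group rejects normality at the 0.05 level.
looks_normal <- all(p_by_group > 0.05)
```

With large samples such tests flag even minor departures from normality, which may be why visual inspection of Q-Q plots is requested when n > 1000.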

center

A character string specifying the center to use in Levene's test for equality of variances. Default is 'median', which is more robust than the mean.
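The role of the center can be seen in a hand-rolled version of the test: a Levene-type test is a one-way ANOVA on absolute deviations from each group's center, and the median-centered variant is often called the Brown-Forsythe test. The function `levene_sketch` below is a hypothetical base R illustration, not the code tidysummary runs.

```r
# Levene-type test: one-way ANOVA on absolute deviations from a group center.
levene_sketch <- function(x, g, center = median) {
  g <- as.factor(g)
  dev <- abs(x - ave(x, g, FUN = center))  # |x - center of its own group|
  fit <- anova(lm(dev ~ g))                # ANOVA table for the deviations
  c(F = fit[["F value"]][1], p = fit[["Pr(>F)"]][1])
}

# Compare median-centered and mean-centered versions on the iris data.
res_median <- levene_sketch(iris$Sepal.Length, iris$Species, center = median)
res_mean   <- levene_sketch(iris$Sepal.Length, iris$Species, center = mean)
```

The median is less affected by outliers and skewed data than the mean, which is the robustness the default relies on.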

Examples

data <- add_var(iris, var = c("Sepal.Length", "Species"), group = "Species")
