add_var: Prepare variables for add_summary

Description

This function prepares a data set for statistical analysis by classifying variables as continuous or categorical. It automatically checks normality, equality of variances, and expected cell frequencies.
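To make the expected-frequency check concrete, here is a minimal base R sketch. It is not the code add_var runs: the helper name expected_ok and the cutoff of 5 (a common rule of thumb for chi-squared tests) are assumptions, since this page does not state the threshold used.

```r
# Hypothetical helper (not part of the package): TRUE when every expected
# cell count in the x-by-g contingency table reaches min_expected.
expected_ok <- function(x, g, min_expected = 5) {
  tab <- table(x, g)
  # Expected count under independence: row total * column total / grand total
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  all(expected >= min_expected)
}

# mtcars ships with R; cyl and am are treated as categorical here
small_cells <- expected_ok(mtcars$cyl, mtcars$am)

# A balanced 2 x 2 layout with 10 observations per cell
balanced <- expected_ok(rep(c("a", "b"), 20), rep(c("x", "y"), each = 20))
```

A variable that fails a check of this kind would typically be routed to an exact test rather than a chi-squared test.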

Usage

add_var(data, var = NULL, group = "group", norm = "auto", center = "median")

Value

A modified data frame with an attribute 'add_var' containing a list of categorized variables and their properties:

var: List of categorized variables:
- valid: All valid variable names after checks
- continuous: Sublist of continuous variables (further divided by normality/equal variance)
- categorical: Sublist of categorical variables (further divided by ordered/expected frequency)
group: Grouping variable name
overall_n: Total number of observations
group_n: Observation counts per group
group_nlevels: Number of groups
group_levels: Group level names
norm: Normality check method used

Arguments

data

A data frame containing the variables to analyze, with variables in columns and observations in rows.

var

A character vector of variable names to include. If NULL (the default), all columns except the group column are used.

group

A character string naming the grouping variable in data. Defaults to 'group'.

norm

Control parameter for normality tests. Accepts:

'auto' (default): Decides automatically based on p-values; behaves like 'ask' when n > 1000
'ask': Shows p-values, draws Q-Q plots, and prompts for a decision
TRUE/'true': Always assumes the data are normally distributed
FALSE/'false': Always assumes the data are not normally distributed
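As an illustration of a p-value-based decision like the one 'auto' describes, the sketch below runs a Shapiro-Wilk test within each group using base R. This page does not say which normality test add_var applies, so the choice of shapiro.test, the helper name normality_by_group, and the 0.05 cutoff are all assumptions.

```r
# Hypothetical helper (not part of the package): Shapiro-Wilk p-value for
# each group, plus a flag that is TRUE only if no group rejects normality.
normality_by_group <- function(x, g, alpha = 0.05) {
  p <- tapply(x, g, function(v) shapiro.test(v)$p.value)
  list(p_values = p, all_normal = all(p > alpha))
}

norm_res <- normality_by_group(iris$Sepal.Length, iris$Species)
print(norm_res$p_values)
print(norm_res$all_normal)
```

With large samples, tests of this kind flag even trivial departures from normality, which is one plausible reason to fall back to visual inspection when n > 1000.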

center

A character string specifying the center to use in Levene's test for equality of variances. Default is 'median', which is more robust than the mean.
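The role of center can be seen in a minimal base R sketch of a Levene-type test: each value is replaced by its absolute deviation from its group's center, and a one-way ANOVA is run on those deviations. The helper name levene_sketch is hypothetical, and this is not necessarily how add_var computes the test internally.

```r
# Hypothetical helper (not part of the package): Levene-type test with a
# selectable center. center = "median" gives the Brown-Forsythe variant.
levene_sketch <- function(x, g, center = c("median", "mean")) {
  center <- match.arg(center)
  g <- factor(g)
  center_fun <- if (center == "median") median else mean
  # Absolute deviation of each observation from its own group's center
  dev <- abs(x - ave(x, g, FUN = center_fun))
  # One-way ANOVA on the deviations yields the F statistic and p-value
  tab <- anova(lm(dev ~ g))
  c(F = tab[1, "F value"], p = tab[1, "Pr(>F)"])
}

lev_median <- levene_sketch(iris$Sepal.Length, iris$Species, center = "median")
lev_mean <- levene_sketch(iris$Sepal.Length, iris$Species, center = "mean")
print(lev_median)
```

Centering on the median makes the deviations less sensitive to skewness and outliers within a group, which is why it is the more robust choice.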

Examples

data <- add_var(iris, var = c("Sepal.Length", "Species"), group = "Species")
