This function prepares a dataset for statistical analysis by classifying its variables as continuous or categorical. It automatically checks normality, equality of variances, and expected-frequency assumptions.
add_var(data, var = NULL, group = "group", norm = "auto", center = "median")
A modified data frame with an attribute 'add_var' containing a list of categorized variables and their properties:
var
: List of categorized variables:
  valid
  : All valid variable names after checks
  continuous
  : Sublist of continuous variables (further divided by normality/equal variance)
  categorical
  : Sublist of categorical variables (further divided by ordered/expected frequency)
group
: Grouping variable name
overall_n
: Total number of observations
group_n
: Observation counts per group
group_nlevels
: Number of groups
group_levels
: Group level names
norm
: Normality check method used
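The expected-frequency split for categorical variables can be illustrated in base R. This is a minimal sketch, not the function's internal code: the helper name `expected_ok` and the rule of thumb it applies (every expected cell count at least 5) are assumptions made for illustration.

```r
# Hypothetical helper: TRUE when every expected cell count under
# independence is at least `min_expected` (a common rule of thumb).
expected_ok <- function(x, g, min_expected = 5) {
  tab <- table(x, g)
  # Expected count per cell = row total * column total / grand total
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  all(expected >= min_expected)
}

# Transmission type (am) by cylinder count (cyl) in the built-in mtcars data
small_cells <- expected_ok(mtcars$am, mtcars$cyl)
```

A variable that fails such a check would typically be routed to an exact test rather than a chi-squared test.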
data
: A data frame containing the variables to analyze, with variables in columns and observations in rows.
var
: A character vector of variable names to include. If NULL (the default), all columns except the group column are used.
group
: A character string specifying the grouping variable in data. Defaults to 'group'.
norm
: Control parameter for normality tests. Accepts:
  'auto'
  : (default) Decides automatically based on p-values; behaves like 'ask' when n > 1000
  'ask'
  : Shows p-values and QQ plots, then prompts for a decision
  TRUE / 'true'
  : Always assumes the data are normally distributed
  FALSE / 'false'
  : Always assumes the data are not normally distributed
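A p-value-based decision of the kind described for 'auto' might look like the following base R sketch. The page does not say which normality test is used, so Shapiro-Wilk, the 5% threshold, and the helper name `normality_by_group` are all assumptions made for illustration.

```r
# Hypothetical helper: Shapiro-Wilk p-value for `x` within each level of `g`.
# shapiro.test() requires between 3 and 5000 non-missing values per group.
normality_by_group <- function(x, g) {
  vapply(split(x, g), function(v) shapiro.test(v)$p.value, numeric(1))
}

p <- normality_by_group(iris$Sepal.Length, iris$Species)

# Treat the variable as normal only if no group rejects at the 5% level
is_normal <- all(p > 0.05)
```

With very large samples such tests reject even trivial departures from normality, which is one plausible reason to fall back on visual inspection of QQ plots.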
center
: A character string specifying the center to use in Levene's test for equality of variances. Default is 'median', which is more robust than the mean.
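The role of the center can be seen in a base R sketch of a Levene-type test, computed as a one-way ANOVA on absolute deviations from each group's center. The helper name `levene_sketch` is hypothetical and this is not necessarily how add_var computes the test. With the median as center, the procedure is the Brown-Forsythe variant.

```r
# Hypothetical helper: Levene-type test as a one-way ANOVA on the
# absolute deviations from each group's center.
levene_sketch <- function(x, g, center = median) {
  g <- as.factor(g)
  # ave() applies `center` within each group and returns one value per row
  dev <- abs(x - ave(x, g, FUN = center))
  fit <- anova(lm(dev ~ g))
  c(F = fit[["F value"]][1], p = fit[["Pr(>F)"]][1])
}

res_median <- levene_sketch(iris$Sepal.Length, iris$Species)
res_mean   <- levene_sketch(iris$Sepal.Length, iris$Species, center = mean)
```

Comparing the two results shows how much the choice of center changes the test statistic for a given variable.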
data <- add_var(iris, var = c("Sepal.Length", "Species"), group = "Species")