mantar (version 0.2.0)

ordered_suggest: Heuristic procedure for identifying ordered categorical variables

Description

Suggests which variables in a data set may be treated as ordered categorical based on their number of unique categories and the amount of available information for estimating the network structure. This function provides a preliminary, non-binding recommendation and should be interpreted as a beta-level heuristic.

Usage

ordered_suggest(data, max_categories = 7)

Value

A logical vector of length ncol(data) indicating, for each variable, whether it is recommended to be treated as ordered (TRUE) or continuous (FALSE). Additionally, a message is printed to the console summarizing the recommendation in terms of which correlation methods to use.
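
As a rough illustration of how the returned vector might be used, the sketch below splits a data frame's column names into ordered and continuous sets. The data frame `dat` and the vector `is_ordered` are made-up stand-ins for illustration only; `is_ordered` is written by hand and is not actual output of ordered_suggest().

```r
# Toy data; `is_ordered` is a hand-written stand-in for the logical vector
# that ordered_suggest() would return (assumption, not real package output).
dat <- data.frame(
  item1 = c(1, 2, 3, 2, 1),
  item2 = c(0.4, 1.7, 2.2, 3.9, 0.8),
  item3 = c(1, 1, 2, 3, 3)
)
is_ordered <- c(TRUE, FALSE, TRUE)

# One element per column, so the vector can index the column names directly.
ordered_vars    <- names(dat)[is_ordered]
continuous_vars <- names(dat)[!is_ordered]

print(ordered_vars)
print(continuous_vars)
```

Because the vector has one element per column of data, it lines up with names(data) without any further matching.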

Arguments

data

Raw data matrix or data frame containing the variables to be included in the network. May include missing values.

max_categories

Maximum number of categories a variable may have to be treated as ordered (default: 7).

Details

While polychoric correlations are generally more appropriate for ordered categorical data [@foldness.2022mantar], they may encounter estimation problems if the number of available observations is small relative to the number of estimated parameters [see, e.g., @johal.2023mantar]. Our preliminary simulations suggest that in such cases Pearson correlations may introduce less bias, an effect that becomes even more pronounced when data are missing.

This helper function provides a recommendation on which variables to treat as ordered. In general, variables with more than max_categories categories are recommended to be treated as continuous, whereas for variables with fewer categories the procedure evaluates whether the amount of available information is too limited to justify polychoric estimation, in which case Pearson correlations are recommended instead. This procedure is only a helper, is still under early development, and may be refined in future versions.
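
The first step described above, comparing each variable's number of categories with max_categories, can be sketched in base R as follows. This is only an illustrative approximation, not the package's implementation: the helper `count_categories()` and the toy data are hypothetical, and the second step (checking whether enough information is available for polychoric estimation) is not reproduced because its exact rule is not specified here.

```r
# Hypothetical helper: number of distinct non-missing values per column.
count_categories <- function(data) {
  data <- as.data.frame(data)
  vapply(data, function(x) length(unique(x[!is.na(x)])), integer(1))
}

# Toy data: one 5-point item with some missing values, one continuous score.
set.seed(1)
toy <- data.frame(
  likert = sample(1:5, 50, replace = TRUE),
  score  = rnorm(50)
)
toy$likert[c(3, 10)] <- NA

max_categories <- 7
n_cat <- count_categories(toy)

# Variables with more than max_categories categories would be treated as
# continuous; the rest are only *candidates* for ordered treatment.
candidate_ordered <- n_cat <= max_categories
print(candidate_ordered)
```

Missing values are dropped before counting so that NA is not mistaken for an additional category.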

References

Examples

# Suggest ordered variables in a data set with mixed variable types
# (400 observations for 8 variables)
ordered_suggest(data = mantar_dummy_full_mix, max_categories = 7)