While polychoric correlations are generally more appropriate for ordered categorical
data (@foldness.2022mantar), they may encounter estimation problems if
the number of available observations is small relative to the number of estimated
parameters (see, e.g., @johal.2023mantar). Our preliminary simulations
suggest that in such cases Pearson correlations may introduce less bias, an effect
that becomes even more pronounced when data are missing.
This helper function provides a recommendation on which variables to treat as
ordered. In general, variables with more than max_categories categories are
recommended to be treated as continuous, whereas for variables with fewer categories
the procedure evaluates whether the amount of available information is too limited
to justify polychoric estimation, in which case Pearson correlations are recommended
instead. This procedure is intended only as a heuristic aid, is at an early stage of
development, and may be refined in future versions.
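
The two-step rule described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the function's actual implementation. The name recommend_ordered, the default of 7 for max_categories, and in particular the "too little information" criterion (at least min_obs_per_threshold observed values per estimated threshold) are assumptions made for this example; the text above does not specify the real criterion.

```python
from typing import Dict, Optional, Sequence


def recommend_ordered(
    columns: Dict[str, Sequence[Optional[int]]],
    max_categories: int = 7,          # placeholder default, an assumption
    min_obs_per_threshold: int = 10,  # hypothetical information criterion
) -> Dict[str, bool]:
    """Map each variable name to True if treating it as ordered is suggested."""
    recommendation: Dict[str, bool] = {}
    for name, values in columns.items():
        # Missing entries (None) carry no information and are dropped.
        observed = [v for v in values if v is not None]
        n_categories = len(set(observed))

        # Step 1: variables with many categories are treated as continuous.
        if n_categories > max_categories:
            recommendation[name] = False
            continue

        # Step 2 (assumed rule): a variable with k categories has k - 1
        # thresholds; require enough observed values per threshold.
        n_thresholds = max(n_categories - 1, 1)
        enough_data = len(observed) >= min_obs_per_threshold * n_thresholds
        recommendation[name] = enough_data
    return recommendation


example = {
    "many_levels": list(range(10)) * 20,          # 10 categories, 200 values
    "likert_full": [1, 2, 3, 4, 5] * 40,          # 5 categories, 200 values
    "likert_sparse": [1, 2, 3, 4, 5, None] * 3,   # 5 categories, 15 observed
}
print(recommend_ordered(example, max_categories=7))
# {'many_levels': False, 'likert_full': True, 'likert_sparse': False}
```

In this sketch, a result of False means the variable would be handled with Pearson correlations, either because it has more than max_categories categories or because too few observed values remain after dropping missing entries.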