ci_cramersv: Confidence Interval for the Population Cramer's V

Description

This function calculates confidence intervals for the population Cramer's V. By default, a parametric approach based on the non-centrality parameter (ncp) of the chi-squared distribution is used. Alternatively, bootstrap confidence intervals are available; these are likewise obtained by bootstrapping confidence intervals for the ncp.

Usage

ci_cramersv(
  x,
  probs = c(0.025, 0.975),
  type = c("chi-squared", "bootstrap"),
  boot_type = c("bca", "perc", "norm", "basic"),
  R = 9999,
  seed = NULL,
  test_adjustment = TRUE,
  ...
)

Arguments

x

The result of stats::chisq.test, a matrix/table of counts, or a data.frame with exactly two columns representing the two variables.

probs

Error probabilities. The default c(0.025, 0.975) gives a symmetric 95% confidence interval.

type

Type of confidence interval. One of "chi-squared" (default) or "bootstrap".

boot_type

Type of bootstrap confidence interval ("bca", "perc", "norm", "basic"). Only used for type = "bootstrap".

R

The number of bootstrap resamples. Only used for type = "bootstrap".

seed

An integer random seed. Only used for type = "bootstrap".

test_adjustment

Adjustment to allow for test of association, see Details. The default is TRUE.

...

Further arguments passed to resample::CI.boot_type.

Value

A list with class cint containing these components:

parameter: The parameter in question.
interval: The confidence interval for the parameter.
estimate: The estimate for the parameter.
probs: A vector of error probabilities.
type: The type of the interval.
info: An additional description text for the interval.

Details

A positive lower (1 - alpha) * 100% confidence limit for the ncp goes hand in hand with a significant association test at level alpha. To allow the same test approach with Cramer's V, the lower bound for Cramer's V is also rounded down to 0 whenever the lower bound for the ncp is 0. Without this slightly conservative adjustment, the lower limit for V would always be positive since ci for V = sqrt((ci for ncp + df)/(n (k - 1))), where k is the smaller number of levels in the two variables (see Smithson for this formula). Use test_adjustment = FALSE to switch off this behaviour. Note that this is also a reason to bootstrap V via the ncp instead of bootstrapping V directly.

Bootstrap confidence intervals are calculated by the package "boot", see references. The default bootstrap type is "bca" (bias-corrected accelerated) as it enjoys the property of being second order accurate as well as transformation respecting (see Efron, p. 188).

Note that no continuity correction is applied for 2x2 tables. Furthermore, large chi-squared test statistics might give unreliable results with type = "chi-squared" (see ?pchisq).
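The mapping from an ncp interval to an interval for V, including the test adjustment, can be sketched in base R. This is only an illustration under stated assumptions and not the package's internal code: the helper ncp_limit() and all variable names are hypothetical, and the ncp limits are found by numerically inverting stats::pchisq with stats::uniroot over an ad hoc search interval.

```r
# Illustrative sketch in base R; not the package's internal implementation.
tab <- table(mtcars$am, mtcars$vs)

# Pearson chi-squared test without continuity correction
test <- suppressWarnings(stats::chisq.test(tab, correct = FALSE))
stat <- unname(test$statistic)  # observed chi-squared statistic
df <- unname(test$parameter)    # degrees of freedom
n <- sum(tab)                   # number of observations
k <- min(dim(tab))              # smaller number of levels of the two variables

# Hypothetical helper: solve pchisq(stat, df, ncp) = p for ncp.
# Returns 0 if no positive solution exists.
ncp_limit <- function(p) {
  f <- function(ncp) stats::pchisq(stat, df = df, ncp = ncp) - p
  if (f(0) <= 0) {
    return(0)
  }
  # Search interval is an ad hoc assumption for this sketch
  stats::uniroot(f, interval = c(0, 10 * stat + 100))$root
}

probs <- c(0.025, 0.975)

# pchisq decreases in ncp, so the lower limit uses the upper probability
ncp_ci <- c(ncp_limit(probs[2]), ncp_limit(probs[1]))

# Map the ncp limits to limits for Cramer's V
v_ci <- sqrt((ncp_ci + df) / (n * (k - 1)))

# Test adjustment: a lower ncp limit of 0 gives a lower V limit of 0
if (ncp_ci[1] == 0) {
  v_ci[1] <- 0
}

# Point estimate of Cramer's V
v_hat <- sqrt(stat / (n * (k - 1)))

print(v_hat)
print(v_ci)
```

Dropping the final if block corresponds to the unadjusted behaviour, in which the lower limit for V is at least sqrt(df / (n (k - 1))) and therefore always positive.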

References

Smithson, M. (2003). Confidence intervals. Series: Quantitative Applications in the Social Sciences. New York, NY: Sage Publications.
Efron, B. and Tibshirani, R. J. (1994). An Introduction to the Bootstrap. Chapman & Hall/CRC.
Canty, A. and Ripley, B. (2019). boot: Bootstrap R (S-Plus) Functions.

Examples

# NOT RUN {
ir <- iris
ir$PL <- ir$Petal.Width > 1
ci_cramersv(ir[, c("Species", "PL")])
ci_cramersv(ir[, c("Species", "PL")], type = "bootstrap", R = 999)
ci_cramersv(ir[, c("Species", "PL")], probs = c(0.05, 1))
ci_cramersv(mtcars[c("am", "vs")])
ci_cramersv(mtcars[c("am", "vs")], test_adjustment = FALSE)
# }
