More information regarding Confidence (Compatibility) Intervals and how they are computed in effectsize.
Unless stated otherwise, confidence (compatibility) intervals (CIs) are
estimated using the noncentrality parameter method (also called the "pivot
method"). This method finds the noncentrality parameter ("ncp") of a
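noncentral t, F, or χ² distribution that places the observed t, F, or χ²
test statistic at the desired probability point of the distribution. These
bounds on the ncp are then converted into the effect size metric to obtain a
confidence interval for the effect size (Steiger, 2004).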
For additional details on estimation and troubleshooting, see effectsize_CIs.
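The search described above can be illustrated with a minimal base R sketch. It is not the code effectsize itself runs: the values of t_obs and df, the helper name ncp_bound, and the use of uniroot() with the search interval c(-10, 10) are all assumptions made for illustration.

```r
# Illustrative values (assumed, not taken from any real analysis)
t_obs <- 2.0
df    <- 50
alpha <- 0.05

# P(T <= t_obs | ncp) decreases as ncp grows, so each confidence bound on the
# ncp is the root of a one-dimensional equation.
# ncp_bound() is a hypothetical helper, not an effectsize function.
ncp_bound <- function(prob) {
  uniroot(
    function(ncp) pt(t_obs, df = df, ncp = ncp) - prob,
    interval = c(-10, 10),
    tol = 1e-10
  )$root
}

ncp_low  <- ncp_bound(1 - alpha / 2) # t_obs is the .975 quantile of this distribution
ncp_high <- ncp_bound(alpha / 2)     # t_obs is the .025 quantile of this distribution

c(lower = ncp_low, upper = ncp_high)
```

The two ncp bounds would then be converted into the effect size metric of interest to give the CI for the effect size.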
"Confidence intervals on measures of effect size convey all the information
in a hypothesis test, and more." (Steiger, 2004). Confidence (compatibility)
intervals and p values are complementary summaries of parameter uncertainty
given the observed data. A dichotomous hypothesis test could be performed
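with either a CI or a p value. The 100(1 - α)% confidence interval contains
all of the parameter values for which p > α for the current data and model.
For example, a 95% confidence interval contains all of the values for which
p > .05.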
Note that a confidence interval including 0 does not indicate that the null
(no effect) is true. Rather, it suggests that the observed data together with
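the model and its assumptions do not provide clear evidence against
a parameter value of 0 (same as with any other value in the interval), with
the level of this evidence defined by the chosen α level (Rafi & Greenland,
2020; Schweder & Hjort, 2016; Xie & Singh, 2013). To infer no effect,
additional judgments about which parameter values are "close enough" to 0 to
be negligible are needed ("equivalence testing"; Bauer & Kieser, 1996).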
Some effect sizes are directionless--they do have a minimum value that would
be interpreted as "no effect", but they cannot cross it. For example, a null
value of Kendall's W is 0, indicating no difference between
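groups, but it can never have a negative value. The same goes for
U2 and Overlap: the null value of U2 is 0.5, but it can never be smaller
than 0.5; an Overlap of 1 means full overlap (no difference), but it cannot
be larger than 1.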
When bootstrapping CIs for such effect sizes, the bounds of the CIs will
never cross (and often will never cover) the null. Therefore, these CIs
should not be used for statistical inference.
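A small simulation can make this concrete. The sketch below is an illustration under stated assumptions, not effectsize code: the data are simulated with no true difference between columns, kendalls_w() is a hypothetical helper implementing the textbook formula for Kendall's W without ties, and the percentile bootstrap resamples rows (raters).

```r
set.seed(1)

m <- 20 # raters (rows)
n <- 4  # items being ranked (columns)
x <- matrix(rnorm(m * n), nrow = m) # simulated data: the null is true

# Hypothetical helper: Kendall's W for a complete design without ties
kendalls_w <- function(mat) {
  r <- t(apply(mat, 1, rank))               # rank the items within each rater
  rank_sums <- colSums(r)                   # total rank per item
  s <- sum((rank_sums - mean(rank_sums))^2) # spread of the rank totals
  12 * s / (nrow(mat)^2 * (ncol(mat)^3 - ncol(mat)))
}

# Percentile bootstrap: resample raters with replacement
boot_w <- replicate(
  1000,
  kendalls_w(x[sample(m, replace = TRUE), , drop = FALSE])
)
ci <- quantile(boot_w, c(0.025, 0.975))

range(boot_w) # every resampled W lies in [0, 1]
ci            # so the interval cannot extend below the null value of 0
```

Because W is a scaled sum of squares, each bootstrap replicate is non-negative, and the lower percentile bound can only sit at or above 0 even though the data were generated under the null.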
Typically, CIs are constructed as two-tailed intervals, with an equal
proportion of the cumulative probability distribution above and below the
interval. CIs can also be constructed as one-sided intervals,
giving only a lower bound or upper bound. This is analogous to computing a
1-tailed p value or conducting a 1-tailed hypothesis test.
Significance tests conducted using CIs (whether a value is inside the interval)
and using p values (whether p < alpha for that value) are only guaranteed
to agree when both are constructed using the same number of sides/tails.
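The agreement between sides and tails can be checked numerically. The following base R sketch uses assumed illustrative values (t_obs, df) and a hypothetical helper p_one_tailed(); it works on the ncp of a noncentral t distribution and is not how effectsize is implemented.

```r
t_obs <- 2.0 # illustrative observed statistic (assumed)
df    <- 50
alpha <- 0.05

# Lower confidence bound on the ncp for a given lower-tail probability
lower_bound <- function(prob) {
  uniroot(
    function(ncp) pt(t_obs, df = df, ncp = ncp) - prob,
    interval = c(-10, 10),
    tol = 1e-10
  )$root
}

lower_two_sided <- lower_bound(1 - alpha / 2) # lower bound of a 2-sided 95% CI
lower_one_sided <- lower_bound(1 - alpha)     # lower bound of a 1-sided 95% CI

# 1-tailed p value for H0: ncp = ncp0 against the alternative ncp > ncp0
p_one_tailed <- function(ncp0) {
  pt(t_obs, df = df, ncp = ncp0, lower.tail = FALSE)
}

p_one_tailed(lower_one_sided) # equals alpha: the 1-sided CI matches the 1-tailed test
p_one_tailed(lower_two_sided) # equals alpha / 2: the 2-sided CI is stricter than the 1-tailed test
```

Values between the two lower bounds are rejected by the 1-tailed test yet remain inside the 2-sided interval, which is exactly the disagreement that arises when sides and tails are mixed.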
Most effect sizes are not bounded by zero (e.g., r, d, g), and as such
are generally tested using 2-tailed tests and 2-sided CIs.
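Some effect sizes are strictly positive--they do have a minimum value, of 0.
For example, η² and other variance-accounted-for effect sizes range from 0
to 1. These are generally tested using 1-tailed tests of whether the
estimated effect size is larger than the hypothesized null value (e.g., 0).
For a CI to yield the same significance decision, it must then be a 1-sided
CI that estimates only a lower bound. This is the default CI computed by
effectsize for these effect sizes, where alternative = "greater" is set.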
This lower bound interval indicates the smallest effect size that is not
significantly different from the observed effect size. That is, it is the
minimum effect size compatible with the observed data, background model
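assumptions, and α level. This type of interval does not indicate a maximum
effect size value; anything up to the maximum possible value of the effect
size (e.g., 1) is in the interval.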
One-sided CIs can also be used to test against a maximum effect size value
(e.g., is η² significantly smaller than some value?) by setting
alternative = "less". This estimates a CI with only an
upper bound; anything from the minimum possible value of the effect size
(e.g., 0) up to this upper bound is in the interval.
We can also obtain a 2-sided interval by setting alternative = "two.sided".
These intervals can be interpreted in the same way as other 2-sided
intervals, such as those for r, d, or g.
An alternative approach to aligning significance tests using CIs and 1-tailed
p values that can often be found in the literature is to construct a
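2-sided CI at a lower confidence level (e.g., 100(1 - 2α)% = 90% for
α = .05). This estimates the lower and upper bounds of the above 1-sided
intervals simultaneously. Be aware that such an interval does not give 95%
coverage for the underlying effect size parameter; for that, construct a
95% 2-sided CI.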
library(effectsize)

data("hardlyworking")
fit <- lm(salary ~ n_comps, data = hardlyworking)
eta_squared(fit) # default, ci = 0.95, alternative = "greater"
#> For one-way between subjects designs, partial eta squared is equivalent to eta squared.
#> Returning eta squared.
#> # Effect Size for ANOVA
#>
#> Parameter | Eta2 | 95% CI
#> -------------------------------
#> n_comps | 0.19 | [0.14, 1.00]
#>
#> - One-sided CIs: upper bound fixed at [1.00].
eta_squared(fit, alternative = "less") # Test if eta squared is smaller than some value
#> For one-way between subjects designs, partial eta squared is equivalent to eta squared.
#> Returning eta squared.
#> # Effect Size for ANOVA
#>
#> Parameter | Eta2 | 95% CI
#> -------------------------------
#> n_comps | 0.19 | [0.00, 0.24]
#>
#> - One-sided CIs: lower bound fixed at [0.00].
eta_squared(fit, alternative = "two.sided") # 2-sided bounds for alpha = .05
#> For one-way between subjects designs, partial eta squared is equivalent to eta squared.
#> Returning eta squared.
#> # Effect Size for ANOVA
#>
#> Parameter | Eta2 | 95% CI
#> -------------------------------
#> n_comps | 0.19 | [0.14, 0.25]
eta_squared(fit, ci = 0.9, alternative = "two.sided") # both 1-sided bounds for alpha = .05
#> For one-way between subjects designs, partial eta squared is equivalent to eta squared.
#> Returning eta squared.
#> # Effect Size for ANOVA
#>
#> Parameter | Eta2 | 90% CI
#> -------------------------------
#> n_comps | 0.19 | [0.14, 0.24]
For very large sample sizes or effect sizes, the width of the CI can be smaller than the tolerance of the optimizer, resulting in CIs of width 0. This can also result in the estimated CIs excluding the point estimate.
For example:
t_to_d(80, df_error = 4555555)
#> d | 95% CI
#> -------------------
#> 0.07 | [0.08, 0.08]
In these cases, consider an alternative optimizer, or an alternative method for computing CIs, such as the bootstrap.
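As one possible illustration of the bootstrap alternative, here is a minimal percentile-bootstrap sketch in base R. The simulated data, the helper name cohens_d_raw(), and the choice of 2000 resamples are assumptions made for this example; it does not reproduce the large-sample case above and is not the procedure effectsize uses.

```r
set.seed(42)

# Simulated two-group data (assumed for illustration)
g1 <- rnorm(50, mean = 0.5)
g2 <- rnorm(50, mean = 0)

# Hypothetical helper: Cohen's d with a pooled standard deviation
cohens_d_raw <- function(a, b) {
  pooled_sd <- sqrt(
    ((length(a) - 1) * var(a) + (length(b) - 1) * var(b)) /
      (length(a) + length(b) - 2)
  )
  (mean(a) - mean(b)) / pooled_sd
}

d_obs <- cohens_d_raw(g1, g2)

# Percentile bootstrap: resample each group independently, with replacement
boot_d <- replicate(
  2000,
  cohens_d_raw(sample(g1, replace = TRUE), sample(g2, replace = TRUE))
)
ci <- quantile(boot_d, c(0.025, 0.975))

d_obs
ci
```

Because the interval is read directly from the resampled estimates, it involves no root-finding step and so is not limited by an optimizer's tolerance, at the cost of extra computation.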
Bauer, P., & Kieser, M. (1996). A unifying approach for confidence intervals and testing of equivalence and difference. Biometrika, 83(4), 934–937. doi:10.1093/biomet/83.4.934
Rafi, Z., & Greenland, S. (2020). Semantic and cognitive tools to aid statistical science: Replace confidence and significance by compatibility and surprise. BMC Medical Research Methodology, 20(1), Article 244. doi:10.1186/s12874-020-01105-9
Schweder, T., & Hjort, N. L. (2016). Confidence, likelihood, probability: Statistical inference with confidence distributions. Cambridge University Press. doi:10.1017/CBO9781139046671
Steiger, J. H. (2004). Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. Psychological Methods, 9(2), 164–182. doi:10.1037/1082-989x.9.2.164
Xie, M., & Singh, K. (2013). Confidence distribution, the frequentist distribution estimator of a parameter: A review. International Statistical Review, 81(1), 3–39. doi:10.1111/insr.12000