Welch's t-test results, including Cohen's d with confidence interval (CI), Bayes factor (BF), and area under the receiver operating characteristic curve (AUC). For the nonparametric version, Wilcoxon test results are given instead (Mann–Whitney U test, also known as the "Wilcoxon rank-sum test", for independent samples; Wilcoxon signed-rank test for paired samples), including the nonparametric "location difference estimate" (see stats::wilcox.test), along with corresponding rank-based BFs as per van Doorn et al., 2020.
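The underlying frequentist tests can be sketched with base R alone. This is only an illustration of the tests named above, using the example vectors from this page; it is not the internal code of t_neat, and the object names (welch, mwu, wsr) are chosen here for illustration.

```r
# Example data from this page
v1 <- c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 <- c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

# Welch's t-test: unequal variances (var.equal = FALSE is the stats::t.test default)
welch <- t.test(v1, v2, paired = FALSE, var.equal = FALSE)

# Mann-Whitney U (Wilcoxon rank-sum) test with the location difference estimate;
# exact = FALSE requests the normal approximation because the data contain ties
mwu <- wilcox.test(v1, v2, paired = FALSE, conf.int = TRUE, exact = FALSE)

# Wilcoxon signed-rank test for paired samples
wsr <- wilcox.test(v1, v2, paired = TRUE, conf.int = TRUE, exact = FALSE)

print(welch$statistic)  # t value
print(welch$parameter)  # Welch-Satterthwaite degrees of freedom
print(mwu$estimate)     # nonparametric location difference estimate
print(wsr$estimate)     # (pseudo)median of the paired differences
```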
t_neat(
  var1,
  var2,
  pair = FALSE,
  nonparametric = FALSE,
  greater = NULL,
  norm_tests = "latent",
  norm_plots = FALSE,
  ci = NULL,
  bf_added = FALSE,
  bf_rscale = sqrt(0.5),
  bf_sample = 1000,
  auc_added = FALSE,
  cutoff = NULL,
  r_added = TRUE,
  for_table = FALSE,
  test_title = NULL,
  round_descr = 2,
  round_auc = 3,
  auc_greater = "1",
  cv_rep = FALSE,
  cv_fold = 10,
  hush = FALSE,
  plots = FALSE,
  rug_size = 4,
  aspect_ratio = 1,
  y_label = "density estimate",
  x_label = "\nvalues",
  factor_name = NULL,
  var_names = c("1", "2"),
  reverse = FALSE
)
Prints t-test statistics (including Cohen's d with CI, BF, and AUC, as specified via the corresponding parameters) in APA style. Furthermore, when assigned, returns a list that contains a named vector 'stats' with the following elements: t (t value), p (p value), d (Cohen's d), bf (Bayes factor), auc (AUC), accuracy (overall accuracy using the optimal classification threshold), and youden (Youden's index: specificity + sensitivity - 1). The latter three are NULL when auc_added is FALSE. When auc_added is TRUE, the list has two or three additional elements. One is 'roc_obj', a roc object that can be used e.g. with the roc_neat function. Another is 'best_thresholds', which contains the best threshold value(s) for classification, along with the corresponding specificity and sensitivity. The third, 'cv_results', contains the results, if any, of the cross-validation of TPRs and TNRs (means per repetition). Finally, if plots is TRUE (or NULL), the plot is displayed and also returned as a ggplot object named t_plot.
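As an illustration of Youden's index (specificity + sensitivity - 1) and of choosing a best threshold, the following base R sketch evaluates candidate thresholds on the height values used in the Details on this page. The helper youden_table is hypothetical and not part of the package; taking midpoints between neighbouring observed values as candidate thresholds is an assumption of this sketch.

```r
# Hypothetical helper: Youden's index for each candidate threshold.
# Values >= threshold are classified as "positive"; pos is the positive group.
youden_table <- function(pos, neg) {
  values <- sort(unique(c(pos, neg)))
  # candidate thresholds: midpoints between neighbouring observed values
  thresholds <- (head(values, -1) + tail(values, -1)) / 2
  tpr <- sapply(thresholds, function(th) mean(pos >= th))  # sensitivity
  tnr <- sapply(thresholds, function(th) mean(neg < th))   # specificity
  data.frame(threshold = thresholds, tpr = tpr, tnr = tnr,
             youden = tpr + tnr - 1)
}

pos <- c(176, 182, 179, 165)  # group expected to have greater values
neg <- c(169, 175, 167, 164)

tab <- youden_table(pos, neg)
best <- tab[tab$youden == max(tab$youden), ]
print(best)  # threshold 175.5: tpr 0.75, tnr 1, youden 0.75
```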
var1: Numeric vector; values of the first variable.
var2: Numeric vector; values of the second variable.
pair: Logical. If TRUE, all tests (t, BF, AUC) are conducted for paired samples. If FALSE (default), they are conducted for independent samples.
nonparametric: Logical (FALSE by default). If TRUE, uses nonparametric (rank-based, "Wilcoxon") tests instead of t-tests (including BFs; see Notes).
greater: NULL or string (or number); optionally specifies one-sided tests (t and BF): either "1" (var1 mean expected to be greater than var2 mean) or "2" (var2 mean expected to be greater than var1 mean). If NULL (default), the test is two-sided.
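The effect of a one-sided test can be sketched with stats::t.test. The mapping of "1" and "2" onto alternative = "greater" and alternative = "less" is an assumption made here for illustration, based on the description of the directions above; it is not taken from the package source.

```r
v1 <- c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 <- c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

# Two-sided paired test (corresponds to greater = NULL)
two_sided <- t.test(v1, v2, paired = TRUE)

# Assumed equivalent of greater = "1": var1 mean expected to be greater
one_sided_1 <- t.test(v1, v2, paired = TRUE, alternative = "greater")

# Assumed equivalent of greater = "2": var2 mean expected to be greater
one_sided_2 <- t.test(v1, v2, paired = TRUE, alternative = "less")

# When the observed difference is in the expected direction,
# the one-sided p value is half of the two-sided p value
print(c(two_sided = two_sided$p.value,
        greater_1 = one_sided_1$p.value,
        greater_2 = one_sided_2$p.value))
```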
norm_tests: Normality tests. Any or all of the following character inputs are accepted (as a single string or a character vector; case-insensitive): "W" (Shapiro-Wilk), "K2" (D'Agostino), "A2" (Anderson-Darling), "JB" (Jarque-Bera); see Notes. Two other options are "all" (same as TRUE; selects all four tests at once) and "latent" (default; prints all tests only if nonparametric is set to FALSE and any of the four tests gives a p value below .05). Each normality test is performed on the differences between the two variables in case of paired samples, or on each of the two variables in case of unpaired samples. Set to "none" to disable (i.e., not to perform any normality tests).
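Which values a normality test is applied to can be sketched with the Shapiro-Wilk test from base R (stats::shapiro.test). This covers only the "W" option; the other three tests require additional packages and are omitted. The flag any_below_05 is a hypothetical name that mimics the "latent" rule described above.

```r
v1 <- c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 <- c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

# Paired samples: test the pairwise differences
sw_paired <- shapiro.test(v1 - v2)

# Unpaired samples: test each variable separately
sw_unpaired <- lapply(list(var1 = v1, var2 = v2), shapiro.test)
p_unpaired <- sapply(sw_unpaired, function(res) res$p.value)

# "latent"-style rule: report only if any test gives p < .05
any_below_05 <- any(p_unpaired < 0.05)

print(sw_paired$p.value)
print(p_unpaired)
print(any_below_05)
```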
norm_plots: If TRUE, displays density, histogram, and Q-Q plots (and scatter plots for paired tests) for each of the two variables (and for the differences between pairwise observations, in case of paired samples).
ci: Numeric; confidence level for returned CIs for Cohen's d and AUC.
bf_added: Logical (FALSE by default). If TRUE, the Bayes factor is calculated and displayed.
bf_rscale: The scale of the prior distribution (sqrt(0.5), i.e. approximately 0.707, by default).
bf_sample: Number of samples used to estimate the Bayes factor (1000 by default). More samples (e.g. 10000) take longer but give a more stable BF.
auc_added: Logical (FALSE by default). If TRUE, the AUC is calculated and displayed. Includes the true positive rate (TPR, i.e. sensitivity) and the true negative rate (TNR, i.e. specificity), using an optimal value (i.e. threshold) that provides maximal TPR and TNR. These values may be cross-validated: see cv_rep. (Note that what is designated as "positive" or "negative" depends on the scenario: this function always treats var1 as positive and var2 as negative. If your scenario or preference differs, you can simply switch the names or values when reporting the results.)
cutoff: Numeric. Custom cutoff value for the AUC TPR and TNR, also depicted in the plot. If multiple values are given, the first is used for calculations, but all are depicted in the plot.
r_added: Logical. If TRUE (default), the Pearson correlation is calculated and displayed in case of paired comparison.
for_table: Logical. If TRUE, omits the confidence level display from the printed text.
test_title: NULL or string. If not NULL, simply displayed in printing, preceding the statistics. (Useful e.g. to distinguish several different comparisons inside a function or a for loop.)
round_descr: Number of decimal places to round the descriptive statistics (means and SDs) to.
round_auc: Number of decimal places to round the AUC and its CI to.
auc_greater: String (or number); specifies which variable is expected to have greater values for 'cases' as opposed to 'controls': "1" (default; var1 expected to be greater for 'cases' than var2) or "2" (var2 expected to be greater for 'cases' than var1). Not to be confused with one-sided tests; see Details.
cv_rep: FALSE (default), TRUE, or numeric. If TRUE or numeric, a cross-validation is performed for the calculation of TPRs and TNRs. A numeric value specifies the number of repetitions, while TRUE defaults to 100 repetitions. In each repetition, the data is divided into k random parts ("folds"; see cv_fold), and the optimal accuracy is obtained k times from a training set of k-1 folds (var1 and var2 truncated to equal length, if needed, in each case within each repetition), and the TPR and TNR are calculated from the remaining test set (different each time).
cv_fold: Numeric. The number of folds into which the data is divided for cross-validation (default: 10).
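The repeated k-fold procedure can be sketched in base R as follows. This is a simplified illustration under several assumptions, not the package's implementation: the helpers best_cutoff and cv_tpr_tnr are hypothetical, the threshold is chosen by Youden's index on the training folds, var1 is treated as the "positive" group with larger values, and both variables share the same fold assignment after truncation to equal length.

```r
# Hypothetical helper: cutoff that maximizes Youden's index on training data
best_cutoff <- function(pos, neg) {
  values <- sort(unique(c(pos, neg)))
  cand <- (head(values, -1) + tail(values, -1)) / 2  # midpoints as candidates
  j <- sapply(cand, function(th) mean(pos >= th) + mean(neg < th) - 1)
  cand[which.max(j)]
}

# Hypothetical helper: cross-validated TPR and TNR, averaged over repetitions
cv_tpr_tnr <- function(var1, var2, cv_rep = 100, cv_fold = 10) {
  n <- min(length(var1), length(var2))
  per_rep <- replicate(cv_rep, {
    x1 <- sample(var1)[seq_len(n)]  # shuffle and truncate to equal length
    x2 <- sample(var2)[seq_len(n)]
    fold <- sample(rep_len(seq_len(cv_fold), n))  # random fold assignment
    res <- sapply(seq_len(cv_fold), function(k) {
      test <- fold == k
      th <- best_cutoff(x1[!test], x2[!test])  # fit on the k-1 training folds
      c(tpr = mean(x1[test] >= th), tnr = mean(x2[test] < th))
    })
    rowMeans(res, na.rm = TRUE)  # means per repetition
  })
  rowMeans(per_rep)  # overall means across repetitions
}

v1 <- c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 <- c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

set.seed(1)
cv_res <- cv_tpr_tnr(v1, v2, cv_rep = 20, cv_fold = 5)
print(cv_res)
```

Because the threshold is always derived from data that excludes the test fold, the resulting rates are typically lower than those obtained from the full sample.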
hush: Logical. If TRUE, prevents printing any details to console.
plots: Logical (or NULL). If TRUE, creates a combined density plot (i.e., Gaussian kernel density estimates) from the two variables. Includes dashed vertical lines to indicate the means of each of the two variables. If nonparametric is set to TRUE, medians are used for these dashed lines instead of means. When auc_added is TRUE (and the AUC is at least .5), the best threshold value for classification (maximal differentiation accuracy using Youden's index) is added to the plot as a solid vertical line. (In case of multiple best thresholds with identical overall accuracy, all are added.) If NULL, same as TRUE except that a histogram is added to the background.
rug_size: Numeric (4 by default): size of the rug ticks below the density plot. Set to 0 (zero) to omit rug plotting.
aspect_ratio: Aspect ratio of the plots: 1 (1/1) by default. (Set to NULL for dynamic aspect ratio.)
y_label: String or NULL; the label for the y axis. (Default: "density estimate".)
x_label: String or NULL; the label for the x axis. (Default: "\nvalues", i.e. "values" preceded by a line break.)
factor_name: String or NULL; factor legend title. (Default: NULL.)
var_names: A vector of two strings; the variable names to be displayed in the legend. (Default: c("1", "2").)
reverse: Logical. If TRUE, reverses the order of variable names displayed in the legend.
The Bayes factor (BF) supporting the null hypothesis is denoted as BF01, while that supporting the alternative hypothesis is denoted as BF10. When the BF is smaller than 1 (i.e., supports the null hypothesis), the reciprocal is calculated (hence, BF10 = BF, but BF01 = 1/BF). When the BF is greater than or equal to 10000, scientific (exponential) form is reported for readability. (The original full BF number is available in the returned named vector as bf.)
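The reporting rule can be sketched as a small formatting function. The function name format_bf and the choice of two decimal places are assumptions for illustration; the exact rounding used in the printed output is not specified here.

```r
# Hypothetical helper: label and format a Bayes factor given as BF10
format_bf <- function(bf) {
  if (bf < 1) {
    label <- "BF01"   # evidence for the null hypothesis: report the reciprocal
    value <- 1 / bf
  } else {
    label <- "BF10"   # evidence for the alternative hypothesis
    value <- bf
  }
  shown <- if (value >= 10000) {
    formatC(value, format = "e", digits = 2)  # scientific form for large values
  } else {
    formatC(value, format = "f", digits = 2)
  }
  paste0(label, " = ", shown)
}

print(format_bf(0.25))    # "BF01 = 4.00"
print(format_bf(3.5))     # "BF10 = 3.50"
print(format_bf(123456))  # "BF10 = 1.23e+05"
```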
For simplicity, Cohen's d is reported for nonparametric tests too; however, you may want to consider reporting alternative effect sizes in this case.
The original pROC::auc function, by default, always returns an AUC greater than (or equal to) .5, assuming that the prediction based on values in the expected direction works correctly at least at chance level. This, however, may be confusing. Consider an example where we measure the heights of persons in a specific small sample and expect that greater height predicts masculine gender. The results are, say, 169, 175, 167, 164 (cm) for one gender, and 176, 182, 179, 165 for the other. If the expectation is correct (the second, greater values are for males), the AUC is .812. However, if in this particular population females are actually taller than males, the AUC is in fact .188. To keep things clear, the t_neat function always makes an assumption about which variable is expected to be greater for correct classification ("1" by default, i.e., var1; specify auc_greater = "2" if var2 is expected to be greater). For this example, if the first (smaller) values are given as var1 for females, and the second (larger) values are given as var2 for males, we have to specify auc_greater = "2" to indicate the expectation of larger values for males. (Or, more simply, just pass the values expected to be larger as var1.)
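The two AUC values in this example can be checked with a few lines of base R. The sketch uses the standard pairwise definition of the AUC (the proportion of pairs in which the value from the group expected to be greater exceeds the value from the other group, with ties counted as one half); the function name auc_directional is hypothetical, and this is not how pROC computes the value internally.

```r
# Hypothetical helper: AUC under the expectation that `greater` has larger values
auc_directional <- function(greater, smaller) {
  cmp <- outer(greater, smaller, "-")     # all pairwise differences
  mean((cmp > 0) + 0.5 * (cmp == 0))      # wins count 1, ties count 0.5
}

group_a <- c(169, 175, 167, 164)
group_b <- c(176, 182, 179, 165)

auc_b_greater <- auc_directional(group_b, group_a)  # 13/16 = 0.8125
auc_a_greater <- auc_directional(group_a, group_b)  #  3/16 = 0.1875

print(c(b_expected_greater = auc_b_greater,
        a_expected_greater = auc_a_greater))
```

The two values sum to 1, which is why fixing the expected direction in advance matters for interpreting an AUC below .5.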
Delacre, M., Lakens, D., & Leys, C. (2017). Why psychologists should by default use Welch's t-test instead of Student's t-test. International Review of Social Psychology, 30(1). https://doi.org/10.5334/irsp.82
Kelley, K. (2007). Methods for the behavioral, educational, and social sciences: An R package. Behavior Research Methods, 39(4), 979–984. https://doi.org/10.3758/BF03192993
Lakens, D. (2015). The perfect t-test (version 1.0.0). Retrieved from https://github.com/Lakens/perfect-t-test. https://doi.org/10.5281/zenodo.17603
Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J. C., & Muller, M. (2011). pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics, 12(1), 77. https://doi.org/10.1186/1471-2105-12-77
van Doorn, J., Ly, A., Marsman, M., & Wagenmakers, E.-J. (2020). Bayesian rank-based hypothesis testing for the rank sum test, the signed rank test, and Spearman's rho. Journal of Applied Statistics, 1–23. https://doi.org/10.1080/02664763.2019.1709053
Yap, B. W., & Sim, C. H. (2011). Comparisons of various types of normality tests. Journal of Statistical Computation and Simulation, 81(12), 2141–2155. https://doi.org/10.1080/00949655.2010.520163
corr_neat, roc_neat
# assign two variables (numeric vectors)
v1 = c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 = c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

t_neat(v1, v2)  # prints results as independent samples
t_neat(v1, v2, pair = TRUE)  # as paired samples (r added by default)
t_neat(v1, v2, pair = TRUE, greater = "1")  # one-sided
t_neat(v1, v2, pair = TRUE, auc_added = TRUE)  # AUC included

# print results and assign returned list
results = t_neat(v1, v2, pair = TRUE)
results$stats['bf']  # get precise BF value