statcheck (version 1.5.0)

statcheck: Extract statistics and recompute p-values

Description

statcheck extracts Null Hypothesis Significance Testing (NHST) results from strings and returns the extracted values, the reported p-values, and the recomputed p-values.

Usage

statcheck(
  texts,
  stat = c("t", "F", "cor", "chisq", "Z", "Q"),
  OneTailedTests = FALSE,
  alpha = 0.05,
  pEqualAlphaSig = TRUE,
  pZeroError = TRUE,
  OneTailedTxt = FALSE,
  AllPValues = FALSE,
  messages = TRUE
)

Value

A data frame containing for each extracted statistic:

source

Name of the file from which the statistic was extracted

test_type

Character indicating the statistic that is extracted

df1

First degree of freedom (if applicable)

df2

Second degree of freedom

test_comp

Reported comparison sign of the test statistic (=, <, or >). When importing from PDF, this sign is often not converted properly

test_value

Reported value of the statistic

p_comp

Reported comparison sign of the p-value (=, <, or >). When importing from PDF, this sign might not be converted properly

reported_p

The reported p-value, or NA if the reported value was n.s.

computed_p

The recomputed p-value

raw

Raw string of the statistical reference that is extracted

error

The recomputed p-value is not congruent with the reported p-value

decision_error

The reported result is significant whereas the recomputed result is not, or vice versa.

one_tailed_in_txt

Logical. Does the text contain the string "sided", "tailed", and/or "directional"?

apa_factor

What proportion of all detected p-values was part of a fully APA reported result?

Arguments

texts

A vector of strings.

stat

Specify which test types you want to extract. "t" to extract t-values, "F" to extract F-values, "cor" to extract correlations, "chisq" to extract \(\chi^2\) values, "Z" to extract Z-values, and "Q" to extract Q-values. Using c() you can specify multiple tests. Defaults to all tests.

OneTailedTests

Logical. Do you want to assume that all reported tests are one-tailed (TRUE) or two-tailed (FALSE, default)?

alpha

Assumed level of significance in the scanned texts. Defaults to .05.

pEqualAlphaSig

Logical. If TRUE (default), statcheck counts p <= alpha as significant; if FALSE, statcheck counts only p < alpha as significant.

pZeroError

Logical. If TRUE, statcheck counts p = .000 as an error, because a p-value is never exactly zero and should be reported as < .001. If FALSE, statcheck does not automatically count p = .000 as an error.

OneTailedTxt

Logical. If TRUE, statcheck searches the text for "one-sided", "one-tailed", and "directional" to identify the possible use of one-sided tests. If one or more of these strings is found in the text AND the result would have been correct if it were a one-sided test, the result is assumed to be one-sided and is counted as correct.

AllPValues

Logical. If TRUE, the output will be a data frame with all detected p-values, including those that were not part of a full result in APA format.

messages

Logical. If TRUE, statcheck will print a progress bar while it's extracting statistics from text.
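As a rough illustration of what alpha, pEqualAlphaSig, and OneTailedTests control, the base R sketch below shows the two significance rules and the relation between a two-tailed and a one-tailed p-value. The helper is_significant() and the numbers used are hypothetical and are not part of statcheck.

```r
# Hypothetical helper (not a statcheck function): apply the significance rule
# that pEqualAlphaSig switches between.
is_significant <- function(p, alpha = 0.05, pEqualAlphaSig = TRUE) {
  if (pEqualAlphaSig) p <= alpha else p < alpha
}

is_significant(0.05)                          # TRUE:  p <= alpha counts as significant
is_significant(0.05, pEqualAlphaSig = FALSE)  # FALSE: only p < alpha counts

# Two-tailed versus one-tailed p-value for an illustrative t(50) = 1.70.
# For an effect in the predicted direction, the one-tailed p is half the two-tailed p.
p_two <- 2 * pt(abs(1.70), df = 50, lower.tail = FALSE)
p_one <- p_two / 2
```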

Details

statcheck roughly works in three steps.

1. Scan text for statistical results

statcheck uses regular expressions to recognize statistical results from t-tests, F-tests, \(\chi^2\)-tests, Z-tests, Q-tests, and correlations. statcheck can only recognize these results if they are reported exactly according to the APA guidelines:

  • t(df) = value, p = value

  • F(df1, df2) = value, p = value

  • r(df) = value, p = value

  • \(\chi^2\) (df, N = value) = value, p = value (N is optional)

  • Z = value, p = value

  • Q(df) = value, p = value (statcheck can distinguish between Q, Qw / Q-within, and Qb / Q-between)

statcheck takes into account that test statistics and p values may be exactly (=) or inexactly (< or >) reported. Different spacing has also been taken into account.
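The scanning step can be pictured with a much simplified base R sketch for the t-test format only. The pattern below is an illustrative assumption, not the regular expression statcheck actually uses, and the sample sentence is made up.

```r
# Made-up sentence containing one APA-style t-test result
txt <- "The effect was significant, t(28) = 2.20, p = .036, as expected."

# Simplified pattern for "t(df) = value, p = value"; it allows flexible spacing
# and exact (=) or inexact (< or >) comparison signs.
pattern <- paste0(
  "t\\s*\\(\\s*(\\d+\\.?\\d*)\\s*\\)",   # t(df)
  "\\s*([<>=])\\s*(-?\\d*\\.?\\d+)",     # comparison sign and test value
  "\\s*,\\s*p\\s*([<>=])\\s*(\\d?\\.\\d+)" # comparison sign and p-value
)

# regexec() returns the full match followed by each capture group
m <- regmatches(txt, regexec(pattern, txt, perl = TRUE))[[1]]

extracted <- data.frame(
  raw        = m[1],
  df         = as.numeric(m[2]),
  test_comp  = m[3],
  test_value = as.numeric(m[4]),
  p_comp     = m[5],
  reported_p = as.numeric(m[6]),
  stringsAsFactors = FALSE
)
```

The column names mirror some of the fields listed under Value, but the single df column is a simplification made for this sketch.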

2. Recompute p-value

statcheck uses the reported test statistic and degrees of freedom to recompute the p-value. By default, the recomputed p-value is two-sided.
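The recomputation can be approximated with base R distribution functions, as in the sketch below. The test values are illustrative, and the conversion of a correlation to a t-value is a standard textbook approach assumed here, not one confirmed by this page.

```r
# Two-sided p-values recomputed from illustrative reported statistics
p_t     <- 2 * pt(abs(2.20), df = 28, lower.tail = FALSE)   # t(28) = 2.20
p_F     <- pf(4.84, df1 = 1, df2 = 28, lower.tail = FALSE)  # F(1, 28) = 4.84
p_chisq <- pchisq(3.84, df = 1, lower.tail = FALSE)         # chi-square(1) = 3.84
p_Z     <- 2 * pnorm(abs(1.96), lower.tail = FALSE)         # Z = 1.96

# Correlation r(28) = .40: convert r to a t-value with the same df (assumption),
# then compute the two-sided p-value as for a t-test.
r  <- 0.40
df <- 28
t_from_r <- r * sqrt(df) / sqrt(1 - r^2)
p_r <- 2 * pt(abs(t_from_r), df = df, lower.tail = FALSE)
```

Because F(1, df) equals the square of t(df), p_t and p_F agree here, which is a quick sanity check on the two calls.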

3. Compare reported and recomputed p-value

This comparison takes into account how the results were reported, e.g., p < .05 is treated differently than p = .05. Incongruent p values are marked as an error. If the reported result is significant and the recomputed result is not, or vice versa, the result is marked as a decision_error.
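A minimal sketch of such a comparison is given below. check_p() is a hypothetical helper written for this illustration. It ignores rounding of the test statistic, assumes three reported decimals, and treats inexactly reported p-values crudely, so statcheck's actual decision rules may differ.

```r
# Hypothetical helper (not a statcheck function): compare a reported p-value
# with a recomputed one, taking the reported comparison sign into account.
check_p <- function(reported_p, p_comp, computed_p, alpha = 0.05) {
  error <- switch(p_comp,
    "=" = abs(round(computed_p, 3) - reported_p) > 1e-8,  # assumes 3 decimals
    "<" = computed_p >= reported_p,
    ">" = computed_p <= reported_p,
    stop("unknown comparison sign")
  )
  # Crude simplification: "p > x" is read as non-significant unless x < alpha
  reported_sig <- if (p_comp == ">") reported_p < alpha else reported_p <= alpha
  computed_sig <- computed_p <= alpha
  c(error = error, decision_error = error && reported_sig != computed_sig)
}

# t(100) = 1 reported with p < .001: the recomputed p-value is far larger,
# so both flags are raised.
p1   <- 2 * pt(1, df = 100, lower.tail = FALSE)
res1 <- check_p(0.001, "<", p1)

# A p-value reported as the correctly rounded recomputed value raises no flag.
p2   <- 2 * pt(2.20, df = 28, lower.tail = FALSE)
res2 <- check_p(round(p2, 3), "=", p2)
```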

Correct rounding is taken into account. For instance, a reported t-value of 2.35 could correspond to an actual value of 2.345 to 2.354 with a range of p-values that can slightly deviate from the recomputed p-value. statcheck will not count cases like this as errors.
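The effect of rounding can be sketched in base R by recomputing the p-value at both ends of the interval a rounded test statistic could represent. The degrees of freedom (100) are an illustrative assumption, and the interval is taken as 2.345 up to, but not including, 2.355.

```r
# A reported t = 2.35 (two decimals) may stand for any value in [2.345, 2.355)
t_reported <- 2.35
decimals   <- 2
half_step  <- 0.5 * 10^(-decimals)

t_low  <- t_reported - half_step
t_high <- t_reported + half_step

# A larger |t| gives a smaller p, so the lower t bound yields the upper p bound
p_upper <- 2 * pt(t_low,  df = 100, lower.tail = FALSE)
p_lower <- 2 * pt(t_high, df = 100, lower.tail = FALSE)

# p-value recomputed from the reported value itself, for comparison
p_mid <- 2 * pt(t_reported, df = 100, lower.tail = FALSE)

c(p_lower = p_lower, p_mid = p_mid, p_upper = p_upper)
```

Under this sketch, a reported p-value anywhere between p_lower and p_upper would be compatible with the reported test statistic.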

Note that when statcheck flags an error or decision_error, it implicitly assumes that the p-value is the inconsistent value, but it could just as well be the case that the test statistic or degrees of freedom contain a reporting error. statcheck merely detects whether a set of numbers are consistent with each other.

See Also

For more details, see the online manual.

Examples

library(statcheck)
# The reported p-value is deliberately inconsistent with t(100) = 1
txt <- "blablabla the effect was very significant (t(100)=1, p < 0.001)"
statcheck(txt)
