perform_regression_test: Perform a test on the slope coefficient of a univariate linear regression

Description

This function performs a bootstrap regression test for given data X, Y. The null hypothesis corresponds to a slope coefficient of zero, versus the alternative hypothesis of a non-zero slope coefficient. The following bootstrap resampling schemes are available: an independence/null bootstrap "indep", a non-parametric bootstrap "NP", a residual bootstrap "res_bs", a fixed design bootstrap "fixed_design_bs", a fixed design null bootstrap "fixed_design_bs_Hnull", and a hybrid null bootstrap "hybrid_null_bs". The function returns the corresponding p-values, the true test statistic, the bootstrapped test statistics, and the estimated slope. The default (and valid) method implemented in this function is the null bootstrap, together with the equivalent test statistic. Via the bootstrapOptions argument, the user can specify other bootstrap resampling schemes and test statistics.
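
The default scheme (independence bootstrap with the equivalent test statistic) can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helper name indep_bootstrap_sketch is hypothetical, and the p-value rule used here (the proportion of bootstrap statistics at least as large as the true statistic) is an assumption made for illustration.

```r
# Hypothetical helper, NOT part of the package: a base-R sketch of the
# independence (null) bootstrap with the equivalent test statistic.
indep_bootstrap_sketch <- function(X, Y, nBootstrap = 100) {
  n <- length(X)
  # OLS slope of the univariate regression of y on x
  slope <- function(x, y) cov(x, y) / var(x)

  # True test statistic: sqrt(n) * |b_hat|
  true_stat <- sqrt(n) * abs(slope(X, Y))

  boot_stats <- replicate(nBootstrap, {
    # Resample X and Y independently of each other, so that the
    # bootstrap sample is generated under the null (no dependence).
    X_st <- sample(X, n, replace = TRUE)
    Y_st <- sample(Y, n, replace = TRUE)
    # Equivalent statistic: sqrt(n) * |b_hat^*|
    sqrt(n) * abs(slope(X_st, Y_st))
  })

  # Assumed p-value rule: share of bootstrap statistics >= true statistic
  list(true_stat  = true_stat,
       boot_stats = boot_stats,
       pval       = mean(boot_stats >= true_stat))
}

set.seed(1)
X_data <- rnorm(100)
Y_data <- X_data + rnorm(100)   # data generated under H1
res <- indep_bootstrap_sketch(X_data, Y_data, nBootstrap = 200)
res$pval
```

Because X and Y are resampled separately, each bootstrap sample satisfies the null hypothesis by construction, which is why the bootstrap statistic needs no centering in this scheme.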

Usage

perform_regression_test(
  X,
  Y,
  nBootstrap = 100,
  show_progress = TRUE,
  bootstrapOptions = NULL
)

Value

A class object with components

pvals_df a dataframe of p-values and bootstrapped test statistics:

These are the p-values for the combinations of bootstrap resampling schemes and test statistics (centered and equivalent).

It also contains the vectors of bootstrap test statistics for each of the combinations.
true_stat a named vector of size 1 containing the true test statistic.
nBootstrap Number of bootstrap repetitions.
data named list of the used input data, i.e. X and Y.
nameMethod string for the name of the method used.
beta numeric value of the estimated slope of the regression model.

Arguments

X

numeric univariate input vector containing the independent variable

Y

numeric univariate input vector containing the dependent variable

nBootstrap

numeric value giving the number of bootstrap resamples

show_progress

logical value indicating whether to show a progress bar

bootstrapOptions

This can be one of

NULL This uses the default options type_boot = "indep", type_stat = "eq".
a list with at most 2 elements, named
- type_boot type of bootstrap resampling scheme. It must be one of
  - "indep" for the independence bootstrap (i.e. under the null). This is the default.
  - "NP" for the non-parametric bootstrap (i.e. n out of n bootstrap).
  - "res_bs" for the residual bootstrap.
  - "hybrid_null_bs" for the hybrid null bootstrap.
  - "fixed_design_bs" for the fixed design bootstrap.
  - "fixed_design_bs_Hnull" for the fixed design null bootstrap.
- type_stat type of test statistic to be used. It must be one of
  - "eq" for the equivalent test statistic \( T_n^* = \sqrt{n} | \hat{b}^* | \). This is the default.
  - "cent" for the centered test statistic \( T_n^* = \sqrt{n} | \hat{b}^* - \hat{b} | \).
  For each type_boot there is only one valid choice of type_stat to be made. If type_stat is not specified, the valid choice is automatically used.
"all" This gives test results for all theoretically valid combinations of bootstrap resampling schemes and test statistics.
"all and also invalid" This gives test results for all possible combinations of bootstrap resampling schemes and test statistics, including invalid ones.

A warning is raised if the given combination of type_boot and type_stat is theoretically invalid.
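
To make the difference between the two test statistics concrete, the following base-R sketch pairs the non-parametric bootstrap with the centered statistic. The pairing is an assumption: the text above states only that each type_boot has one valid type_stat, not which one. The helper name np_bootstrap_sketch and the p-value rule are likewise hypothetical and do not reflect the package's internals.

```r
# Hypothetical helper, NOT part of the package: a base-R sketch of the
# non-parametric (n out of n) bootstrap with the centered test statistic.
np_bootstrap_sketch <- function(X, Y, nBootstrap = 100) {
  n <- length(X)
  # OLS slope of the univariate regression of y on x
  slope <- function(x, y) cov(x, y) / var(x)

  b_hat <- slope(X, Y)
  # True test statistic: sqrt(n) * |b_hat|
  true_stat <- sqrt(n) * abs(b_hat)

  boot_stats <- replicate(nBootstrap, {
    # Resample (X, Y) pairs jointly; this keeps their dependence, so the
    # bootstrap sample is NOT generated under the null.
    idx <- sample.int(n, n, replace = TRUE)
    # Centered statistic: sqrt(n) * |b_hat^* - b_hat|
    sqrt(n) * abs(slope(X[idx], Y[idx]) - b_hat)
  })

  # Assumed p-value rule: share of bootstrap statistics >= true statistic
  list(true_stat  = true_stat,
       beta       = b_hat,
       boot_stats = boot_stats,
       pval       = mean(boot_stats >= true_stat))
}

set.seed(2)
X_data <- rnorm(100)
Y_data <- rnorm(100)            # data generated under H0 (b = 0)
res_np <- np_bootstrap_sketch(X_data, Y_data, nBootstrap = 200)
res_np$pval
```

Here the bootstrap slope fluctuates around the estimated slope rather than around zero, so subtracting the estimated slope is what makes the bootstrap distribution mimic the null distribution of the test statistic.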

References

Derumigny, A., Galanis, M., Schipper, W., & van der Vaart, A. (2025). Bootstrapping not under the null? ArXiv preprint, doi:10.48550/arXiv.2512.10546

Examples

n <- 100

# Under H1
X_data <- rnorm(n)
Y_data <- X_data + rnorm(n)   # Y = X + epsilon
result <- perform_regression_test(X_data, Y_data, nBootstrap = 100,
                        bootstrapOptions =  list(type_boot = "indep",
                                                 type_stat = "eq"))
print(result)
plot(result)

# Under H0
X_data <- rnorm(n)
Y_data <- 0 * X_data + rnorm(n)   # Y = epsilon (as b = 0 under H0)
result <- perform_regression_test(X_data, Y_data, nBootstrap = 100)
print(result)
plot(result)