Learn R Programming

PoSIAdjRSquared (version 0.1.0)

postselp_value_specified_interval: Post-selection p-value specified interval

Description

This function returns a p-value for the test whether the regression coefficient equals tn_mu (e.g. 0) with a two-sided alternative. The p-value is valid given the model selection, because it conditions on the specified intervals of the OLS estimator where the regression coefficient actually gets selected. The intervals contained in the object "z_interval" can be obtained from the function "solve_selection_event".

Usage

postselp_value_specified_interval(z_interval, etaj, etajTy, tn_mu, tn_sigma)

Value

p_value

The p-value for a two-sided test which is valid after model selection

Arguments

z_interval

The intervals of type "list" where the OLS estimator gets selected: can be obtained from function "solve_selection_event"

etaj

Vector of type "matrix" and dimension nx1: useful in orthogonal decomposition of y (see Lemma 1 for details)

etajTy

The OLS estimator of the j'th selected coefficient in the selected model of type "matrix" and dimension 1x1

tn_mu

Integer for the mean of the truncated sampling distribution of the test statistic under the null hypothesis: for example, if you want to test beta_j=0, specify 0 for the mean

tn_sigma

Integer for the variance of the truncated sampling distribution of the test statistic

References

Pirenne, S. and Claeskens, G. (2024). Exact Post-Selection Inference for Adjusted R Squared.

Examples

Run this code
  # Generate data
  n <- 100
  Data <- datagen.norm(seed = 7, n, p = 10, rho = 0, beta_vec = c(1,0.5,0,0.5,0,0,0,0,0,0))
  X <- Data$X
  y <- Data$y

  # Select model
  result <- fit_all_subset_linear_models(y, X, intercept=FALSE)
  phat <- result$phat
  X_M_phat <- result$X_M_phat
  k <- result$k
  R_M_phat <- result$R_M_phat
  kappa_M_phat <- result$kappa_M_phat
  R_M_k <- result$R_M_k
  kappa_M_k <- result$kappa_M_k

  # Estimate Sigma from residuals of full model
  full_model <- lm(y ~ 0 + X)
  sigma_hat <- sd(resid(full_model))
  Sigma <- diag(n)*(sigma_hat)^2

  # Construct test statistic
  Construct_test <- construct_test_statistic(j = 5, X_M_phat, y, phat, Sigma, intercept=FALSE)
  a <- Construct_test$a
  b <- Construct_test$b
  etaj <- Construct_test$etaj
  etajTy <- Construct_test$etajTy

  # Solve selection event
  Solve <- solve_selection_event(a,b,R_M_k,kappa_M_k,R_M_phat,kappa_M_phat,k)
  z_interval <- Solve$z_interval

  # Post-selection inference for beta_j=0
  tn_sigma <- sqrt((t(etaj)%*%Sigma)%*%etaj)
  postselp_value_specified_interval(z_interval, etaj, etajTy, tn_mu = 0, tn_sigma)

Run the code above in your browser using DataLab