wqs_pt: WQS permutation test

Description

wqs_pt takes a gwqs object as input and runs the permutation test (Day et al. 2022) to obtain a p-value for the WQS coefficient.

Usage

wqs_pt(
  model,
  niter = 200,
  boots = NULL,
  b1_pos = TRUE,
  b_constr = FALSE,
  rs = FALSE,
  plan_strategy = "multicore",
  seed = NULL,
  nworkers = NULL,
  ...
)

Value

wqs_pt returns an object of class "wqs_pt", which contains:

perm_test: List containing: (1) pval: permutation test p-value, (2) (linear WQS regression only) testbeta1: reference WQS coefficient beta1 value, (3) (linear WQS regression only) betas: Vector of beta values from each permutation test run, (4) (WQS GLM only) testpval: test reference p-value, (5) (WQS GLM only) permpvals: p-values from the null models.
gwqs_main: Main gWQS object (same as model input).
gwqs_perm: Permutation test reference gWQS object (NULL if model family != "gaussian" or if the same number of bootstraps is used in the permutation test WQS regression runs as in the main run).

Arguments

model: A gwqs object as generated from the gWQS package.
niter: Number of permutation test iterations.
boots: Number of bootstrap samples for each permutation test WQS regression iteration. If boots is not specified, then we will use the same bootstrap count for each permutation test WQS regression iteration as was specified in the main WQS regression run.
b1_pos: A logical value that indicates whether beta values should be positive (TRUE) or negative (FALSE).
b_constr: Logical value that determines whether to apply positive or negative constraints in the optimization function for the weight optimization. Note that this won't guarantee that the iterated b1 values in the weight optimization are only positive (if b1_pos = TRUE) or only negative (if b1_pos = FALSE) as seen in the bres matrix output by the gwqs models (i.e., column bres$b1), but it does substantially increase the probability that those b1 values will be constrained to be either positive or negative. This defaults to FALSE.
rs: A logical value indicating whether random subset implementation should be performed.
plan_strategy: Evaluation strategy for the plan function. You can choose among "sequential", "transparent", "multisession", "multicore", and "cluster". This defaults to "multicore". See the future::plan documentation for full details.
seed: (optional) Random seed for the permutation test WQS reference run. This should be the same random seed as used for the main WQS regression run. This seed will be saved in the "gwqs_perm" object as gwqs_perm$seed. This defaults to NULL.
nworkers: (optional) If the plan_strategy is not "sequential", this argument defines the number of parallel processes to use, which can be critical when using a high-performance computing (HPC) cluster. This should be an integer value. The default behavior for gWQS::gwqs is to use all detected cores on a machine, but for many HPC use scenarios, this would call in cores that have not been allotted by the HPC scheduler, resulting in the submitted job being halted. For example, if one has requested 14 cores on a 28-core HPC queue, one would want to set nworkers = 14. If nworkers was greater than 14 in that case, the HPC job would be terminated. This argument defaults to NULL, in which case length(future::availableWorkers()) will be used to determine the number of parallel processes to use.
...: (optional) Additional arguments to pass to the gWQS::gwqs function.
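
As a rough sketch of how a value for nworkers might be chosen on a scheduler-managed cluster, the snippet below reads the core allocation from an environment variable and falls back to the detected core count. The helper pick_nworkers is hypothetical (it is not part of wqspt or gWQS), and the use of the SLURM variable SLURM_CPUS_PER_TASK is an assumption about the scheduler; other schedulers expose the allocation differently.

```r
# Hypothetical helper (not part of wqspt or gWQS): pick a worker count that
# respects a scheduler allocation, falling back to the detected core count.
pick_nworkers <- function(env_var = "SLURM_CPUS_PER_TASK") {
  # Unset or non-numeric values become NA
  allotted <- suppressWarnings(as.integer(Sys.getenv(env_var, unset = NA)))
  if (!is.na(allotted) && allotted > 0L) {
    return(allotted)
  }
  # parallel::detectCores() can return NA on some platforms
  detected <- parallel::detectCores()
  if (is.na(detected)) 1L else as.integer(detected)
}

nworkers <- pick_nworkers()

# The value could then be passed along, for example:
# perm_test_res <- wqs_pt(wqs_main, niter = 200, nworkers = nworkers)
```

Setting nworkers explicitly in this way keeps the number of parallel processes within the cores granted to the job, which is the situation the nworkers argument is meant to address.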

Details

To use wqs_pt, we first need to run an initial WQS regression while setting validation = 0. We will use this gwqs object as the model argument for the wqs_pt function. Note that the permutation test has so far only been validated for linear WQS regression (i.e., family = "gaussian") and logistic WQS regression (i.e., family = binomial(link = "logit")), though the permutation test algorithm should also work for all WQS GLMs. Therefore, this function accepts gwqs objects made with the following families: "gaussian" or gaussian(link = "identity"), "binomial" or binomial() with any accepted link function (e.g., "logit" or "probit"), "poisson" or poisson(link = "log"), "negbin" for negative binomial, and "quasipoisson" or quasipoisson(link = "log"). This function cannot currently accommodate gwqs objects made with the "multinomial" family, and it is not currently able to accommodate stratified weights or WQS interaction terms (e.g., y ~ wqs * sex).

The argument boots is the number of bootstraps for the WQS regression run in each permutation test iteration. Note that we may elect a bootstrap count boots lower than that specified in the model object for the sake of efficiency. If boots is not specified, then we will use the same bootstrap count in the permutation test WQS regression runs as that specified in the model argument.

The arguments b1_pos and rs should be consistent with the inputs chosen in the model object. The seed should ideally match the seed set in the model object, though this is not required.
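
To illustrate the general idea behind a permutation test for a regression coefficient, here is a minimal base R sketch. It is a simplified illustration and not the wqs_pt algorithm: ordinary least squares stands in for the WQS regression, the data are simulated, and the one-sided p-value rule (matching b1_pos = TRUE) is an assumption. The names testbeta1, betas, and pval are chosen only to mirror the fields of perm_test described under Value.

```r
# Simplified illustration only: lm() stands in for the WQS regression, and the
# one-sided p-value rule below is an assumption, not the package's exact rule.
set.seed(16)

n     <- 200
x     <- rnorm(n)             # stand-in for the WQS index
y     <- 0.3 * x + rnorm(n)   # simulated outcome with a positive effect
niter <- 200                  # number of permutation iterations

# Reference fit on the observed data (analogous to perm_test$testbeta1)
testbeta1 <- unname(coef(lm(y ~ x))["x"])

# Null fits: shuffle the outcome, refit, and keep the slope
# (analogous to perm_test$betas)
betas <- vapply(seq_len(niter), function(i) {
  y_perm <- sample(y)
  unname(coef(lm(y_perm ~ x))["x"])
}, numeric(1))

# Share of null slopes at least as large as the reference slope
pval <- mean(betas >= testbeta1)
```

Shuffling the outcome breaks any association with the predictor, so the permuted slopes approximate the distribution of the coefficient under the null hypothesis. More permutations give a finer-grained p-value at the cost of more model fits.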

References

Day, D. B., Sathyanarayana, S., LeWinn, K. Z., Karr, C. J., Mason, W. A., & Szpiro, A. A. (2022). A permutation test-based approach to strengthening inference on the effects of environmental mixtures: comparison between single index analytic methods. Environmental Health Perspectives, 130(8).

Day, D. B., Collett, B. R., Barrett, E. S., Bush, N. R., Swan, S. H., Nguyen, R. H., ... & Sathyanarayana, S. (2021). Phthalate mixtures in pregnancy, autistic traits, and adverse childhood behavioral outcomes. Environment International, 147, 106330.

Loftus, C. T., Bush, N. R., Day, D. B., Ni, Y., Tylavsky, F. A., Karr, C. J., ... & LeWinn, K. Z. (2021). Exposure to prenatal phthalate mixtures and neurodevelopment in the Conditions Affecting Neurocognitive Development and Learning in Early childhood (CANDLE) study. Environment International, 150, 106409.

Examples

library(gWQS)

# mixture names
# Only using the first 5 of the original 34 exposures for this quick example
PCBs <- names(wqs_data)[1:5]

# create reference wqs object with 3 bootstraps
wqs_main <- gwqs(yLBX ~ wqs, mix_name = PCBs, data = wqs_data, q = 10, 
                 validation = 0, b = 3, b1_pos = TRUE, b_constr = FALSE,
                 plan_strategy = "multicore", family = "gaussian", seed = 16)
# Note: We recommend b = 1000 bootstraps for the main WQS regression. This
# example uses fewer bootstraps to serve as a shorter test run.

# run the permutation test
perm_test_res <- wqs_pt(wqs_main, niter = 2, b1_pos = TRUE)

# Note: The default value of niter = 200 is the recommended parameter value. 
# This example has a lower niter in order to serve as a shorter test run.