wqspt (version 1.0.2)

wqs_full_perm: Full wrapper WQS permutation test

Description

wqs_full_perm is a wrapper function that runs the full workflow: it fits a Weighted Quantile Sum (WQS) regression and then runs the permutation test to determine the significance of the WQS coefficient.

Usage

wqs_full_perm(
  formula,
  data,
  mix_name,
  q = 10,
  b_main = 1000,
  b_perm = 200,
  b1_pos = TRUE,
  b_constr = FALSE,
  rs = FALSE,
  niter = 200,
  seed = NULL,
  family = "gaussian",
  plan_strategy = "multicore",
  stop_if_nonsig = FALSE,
  stop_thresh = 0.05,
  nworkers = NULL,
  ...
)

Value

wqs_full_perm returns an object of class wqs_perm, which contains three sublists:

perm_test

List containing:

  • pval: permutation test p-value

  • (linear regression only) testbeta1: reference WQS regression coefficient beta1 value

  • (linear regression only) betas: Vector of beta values from each permutation test run

  • (logistic regression only) testpval: test reference p-value

  • (logistic regression only) permpvals: p-values from the null models

gwqs_main

Main gWQS object (same as model input). This will now include an additional object "seed" that returns the seed used for this main WQS regression.

gwqs_perm

Permutation test reference gWQS object (NULL if the model family is not "gaussian" or if the permutation test WQS regression runs use the same number of bootstraps as the main run).
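As a rough illustration of the structure described above, the sketch below builds a mock list with the documented perm_test fields for the linear regression case and computes a one-sided permutation p-value from it. The numbers are simulated, and the p-value formula (the share of permuted betas at least as large as the reference beta) is a generic permutation-test convention assumed here for illustration; it is not confirmed to be the exact calculation wqspt performs.

```r
set.seed(1)

# Simulated stand-ins for the documented fields (not real wqspt output)
betas <- rnorm(200, mean = 0, sd = 0.1)  # one beta1 per permutation iteration
testbeta1 <- 0.25                         # reference WQS coefficient

# Generic one-sided permutation p-value (assumed convention, b1_pos = TRUE case)
pval <- mean(betas >= testbeta1)

# Mock object mirroring the three documented sublists
mock_res <- list(
  perm_test = list(pval = pval, testbeta1 = testbeta1, betas = betas),
  gwqs_main = NULL,  # would hold the main gWQS object
  gwqs_perm = NULL   # would hold the reference gWQS object, when present
)

mock_res$perm_test$pval
length(mock_res$perm_test$betas)
```

With a real result, the same accessors (for example res$perm_test$pval) would apply to the object returned by wqs_full_perm.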

Arguments

formula

An object of class formula. The wqs term must be included in the formula (e.g., y ~ wqs + ...).

data

The data.frame to be used in the WQS regression run. This can be of class data.frame or it can be a tibble from the tidyverse.

mix_name

A vector with the mixture column names.

q

An integer giving the number of quantiles into which the mixture variables are split.
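To make the role of q concrete, here is a minimal base-R sketch that scores a continuous exposure into q quantile bins numbered 0 to q - 1. The helper name quantile_score is hypothetical, and the binning shown is a common convention assumed for illustration; gWQS may handle ties and edge cases differently.

```r
# Hypothetical helper: convert a numeric exposure into quantile scores
quantile_score <- function(x, q = 10) {
  # Cut points at the 0, 1/q, ..., 1 quantiles; unique() guards against ties
  breaks <- unique(quantile(x, probs = seq(0, 1, length.out = q + 1),
                            na.rm = TRUE))
  # Integer bin index starting at 0
  as.integer(cut(x, breaks = breaks, include.lowest = TRUE)) - 1L
}

set.seed(42)
exposure <- rlnorm(500)              # simulated skewed exposure
scores <- quantile_score(exposure, q = 10)
table(scores)                        # roughly equal counts per bin
```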

b_main

The number of bootstraps for the main WQS regression run.

b_perm

The number of bootstraps for the iterated permutation test WQS regression runs and the reference WQS regression run (only for linear WQS regression and only when b_main != b_perm).

b1_pos

A logical value that indicates whether beta values should be positive (TRUE) or negative (FALSE).

b_constr

Logical value that determines whether to apply positive or negative constraints in the optimization function for the weight optimization. Note that this won't guarantee that the iterated b1 values in the weight optimization are only positive (if b1_pos = TRUE) or only negative (if b1_pos = FALSE) as seen in the bres matrix output by the gwqs models (i.e., column bres$b1), but it does substantially increase the probability that those b1 values will be constrained to be either positive or negative. This defaults to FALSE.

rs

A logical value indicating whether random subset implementation should be performed.

niter

Number of permutation test iterations.

seed

An integer to fix the seed. This will only impact the initial WQS regression run and not the permutation test iterations. The default setting is NULL, which means no seed is used for the initial WQS regression. The seed will be saved in the "gwqs_main" object as "gwqs_main$seed".

family

A description of the error distribution and link function to be used in the model. This can be a character string naming a family function (e.g., "binomial") or a family object (e.g., binomial(link="logit")). Currently validated families include gaussian(link="identity") for linear regression, binomial() with any accepted link function (e.g., "logit" or "probit"), poisson(link = "log"), quasipoisson(link = "log"), or "negbin" for negative binomial. The "multinomial" family is not yet supported.
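The two accepted forms of the family argument can be illustrated with base R alone. The helper resolve_family below is hypothetical and only shows how a character name and a family object can end up as the same kind of object; it does not cover "negbin", which is not a stats family constructor, and it is not taken from the wqspt source.

```r
# Hypothetical helper: accept either a family name or a family object
resolve_family <- function(family) {
  if (is.character(family)) {
    # Look up the stats family constructor by name and call it
    # with its default link
    family <- get(family, mode = "function",
                  envir = asNamespace("stats"))()
  }
  stopifnot(inherits(family, "family"))
  family
}

fam_a <- resolve_family("binomial")                  # character string
fam_b <- resolve_family(binomial(link = "probit"))   # family object
fam_c <- resolve_family(poisson(link = "log"))

c(fam_a$family, fam_a$link)
c(fam_b$family, fam_b$link)
```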

plan_strategy

Evaluation strategy for the plan function. You can choose among "sequential", "multisession", "multicore", and "cluster". This defaults to "multicore". See the future::plan documentation for full details.

stop_if_nonsig

If TRUE, the function will not proceed with the permutation test if the main WQS regression run produces a nonsignificant p-value.

stop_thresh

Numeric p-value threshold required in order to proceed with the permutation test, if stop_if_nonsig = TRUE.
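The interaction between stop_if_nonsig and stop_thresh amounts to a simple gate before the permutation test. The sketch below uses a hypothetical helper, should_run_perm_test, and assumes a strict "less than" comparison against the threshold; the exact comparison used inside wqspt is not stated here.

```r
# Hypothetical helper illustrating the documented early-stop rule
should_run_perm_test <- function(main_pval, stop_if_nonsig = FALSE,
                                 stop_thresh = 0.05) {
  # With stop_if_nonsig = FALSE the permutation test always runs
  if (!stop_if_nonsig) return(TRUE)
  # Assumed comparison: proceed only when the main-run p-value
  # is below the threshold
  main_pval < stop_thresh
}

should_run_perm_test(0.20)                         # gate disabled
should_run_perm_test(0.20, stop_if_nonsig = TRUE)  # nonsignificant main run
should_run_perm_test(0.01, stop_if_nonsig = TRUE)  # significant main run
```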

nworkers

(optional) If the plan_strategy is not "sequential", this argument defines the number of parallel processes to use, which can be critical when using a high-performance computing (HPC) cluster. This should be an integer value. The default behavior for gWQS::gwqs is to use all detected cores on a machine, but for many HPC use scenarios, this would call in cores that have not been allotted by the HPC scheduler, resulting in the submitted job being halted. For example, if one has requested 14 cores on a 28-core HPC queue, one would want to set nworkers = 14. If nworkers were greater than 14 in that case, the HPC job would be terminated. This argument defaults to NULL, in which case length(future::availableWorkers()) will be used to determine the number of parallel processes to use.
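The default described above can be sketched as follows. The helper resolve_nworkers is hypothetical; returning 1 for a sequential plan is an assumption made for illustration, and the NULL branch needs the future package to be installed.

```r
# Hypothetical helper mirroring the documented default for nworkers
resolve_nworkers <- function(nworkers = NULL, plan_strategy = "multicore") {
  # Assumption: a sequential plan uses a single process, so nworkers
  # is ignored
  if (identical(plan_strategy, "sequential")) return(1L)
  if (is.null(nworkers)) {
    # Documented fallback; requires the future package
    nworkers <- length(future::availableWorkers())
  }
  as.integer(nworkers)
}

# 14 cores requested from the scheduler on a 28-core node
resolve_nworkers(14)
resolve_nworkers(NULL, plan_strategy = "sequential")
```

Passing the scheduler's allotment explicitly, as in the first call, avoids depending on how many cores are detected on the node.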

...

(optional) Additional arguments to pass to the gWQS::gwqs function.

Examples

library(gWQS)   # provides the wqs_data example dataset
library(wqspt)  # provides wqs_full_perm

# mixture names
PCBs <- names(wqs_data)[1:5] 
# Only using 1st 5 of the original 34 exposures for this quick example

# quick example with only 3 bootstraps each WQS regression iteration, and 
# only 2 iterations

perm_test_res <- wqs_full_perm(formula = yLBX ~ wqs, data = wqs_data, 
                                mix_name = PCBs, q = 10, b_main = 3, 
                                b_perm = 3, b1_pos = TRUE, b_constr = FALSE, 
                                niter = 2, seed = 16, 
                                plan_strategy = "multicore", 
                                stop_if_nonsig = FALSE)

# Note: The default values of b_main = 1000, b_perm = 200, and niter = 200 
# are the recommended parameter values. This example has a lower b_main, 
# b_perm, and niter in order to serve as a shorter test run. 
 
