perm.test: Permutation Test for Conditional Independence

Description

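Performs a permutation test for conditional independence by comparing an observed test statistic against a null distribution built from permutations.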

Usage

perm.test(
  formula,
  data,
  p = 0.5,
  nperm = 160,
  subsample = 1,
  metric = "RMSE",
  method = "rf",
  nrounds = 600,
  mtry = NULL,
  parametric = FALSE,
  tail = NA,
  robust = TRUE,
  metricfunc = NULL,
  mlfunc = NULL,
  nthread = 1,
  progress = TRUE,
  k = 15,
  center = TRUE,
  scale = TRUE,
  eps = 1e-15,
  positive = NULL,
  kernel = "optimal",
  distance = 2,
  ...
)

Value

An object of class 'CCI' containing the null distribution, observed test statistic, p-values, the machine learning model used, and the data.

Arguments

formula: Model formula or DAGitty object specifying the relationship between dependent and independent variables.
data: A data frame containing the variables specified in the formula.
p: Proportion of data to use for training the model. Default is 0.5.
nperm: Number of permutations to perform. Default is 160.
subsample: The proportion of the data to be used. Default is 1 (no subsampling).
metric: Performance metric to use: "RMSE", "Kappa" or "LogLoss". Default is "RMSE".
method: The machine learning method to use for the learner. Supported methods include "rf", "xgboost", "KNN" and "svm". Default is "rf".
nrounds: Number of rounds (trees) for methods "xgboost" and "rf". Default is 600.
mtry: Number of candidate variables considered for splitting at each node for method "rf". Default is NULL (square root of the number of variables).
parametric: Logical. If TRUE, a parametric p-value is calculated instead of an empirical p-value. Default is FALSE.
tail: Specifies whether the test is one-tailed ("left" or "right") or two-tailed. Default is NA.
robust: Logical. If TRUE, uses a robust method for permutation. Default is TRUE.
metricfunc: An optional custom function to calculate the performance metric based on the model's predictions. Default is NULL.
mlfunc: An optional custom machine learning function to use instead of the predefined methods. Default is NULL.
nthread: Integer. The number of threads to use for parallel processing with methods "rf" and "xgboost". Default is 1.
progress: Logical. If TRUE, a progress bar is displayed during the permutation process. Default is TRUE.
k: Integer. The number of nearest neighbors for the "KNN" method. Default is 15.
center: Logical. If TRUE, the data is centered before model fitting. Default is TRUE.
scale: Logical. If TRUE, the data is scaled before model fitting. Default is TRUE.
eps: Numeric. A small value added to avoid division by zero. Default is 1e-15.
positive: Character vector. Specifies which levels of a factor variable should be treated as positive class in classification tasks. Default is NULL.
kernel: Character string specifying the kernel type for the "KNN" method. Possible choices are "rectangular" (standard unweighted KNN), "triangular", "epanechnikov" (or beta(2,2)), "biweight" (or beta(3,3)), "triweight" (or beta(4,4)), "cos", "inv", "gaussian" and "optimal". Default is "optimal".
distance: Numeric. Parameter of Minkowski distance for the "KNN" method. Default is 2.
...: Additional arguments to pass to the machine learning model fitting function.
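
To make the roles of p, nperm, tail and parametric more concrete, the following is a minimal base-R sketch of a generic prediction-based permutation test. It is not the implementation used by perm.test: lm() stands in for the learner, the helper holdout_rmse() is hypothetical, and the normal approximation used for the parametric p-value is an assumption.

```r
# Generic sketch only; NOT the internals of perm.test.
# lm() replaces the machine learning method, and holdout_rmse() is a hypothetical helper.
set.seed(123)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                  x4 = rnorm(n), y = rnorm(n))

# Fit on a random fraction p of the rows and return the RMSE on the remaining rows.
holdout_rmse <- function(data, p = 0.5) {
  train_idx <- sample(nrow(data), size = floor(p * nrow(data)))
  fit <- lm(y ~ x1 + x2 + x3 + x4, data = data[train_idx, ])
  test <- data[-train_idx, ]
  sqrt(mean((test$y - predict(fit, newdata = test))^2))
}

# Observed statistic: RMSE with x1 left intact.
observed <- holdout_rmse(dat, p = 0.5)

# Null distribution: RMSE after shuffling x1, which breaks any link between x1 and y.
nperm <- 25
null_dist <- replicate(nperm, {
  permuted <- dat
  permuted$x1 <- sample(permuted$x1)
  holdout_rmse(permuted, p = 0.5)
})

# Left-tailed empirical p-value: a smaller RMSE than the null suggests x1 is informative.
p_empirical <- (sum(null_dist <= observed) + 1) / (nperm + 1)

# Parametric alternative (assumed here to be a normal approximation of the null).
p_parametric <- pnorm(observed, mean = mean(null_dist), sd = sd(null_dist))

print(c(empirical = p_empirical, parametric = p_parametric))
```

With so few permutations the empirical p-value is coarse (it can only take values in steps of 1 / (nperm + 1)), which is one reason a larger nperm or a parametric approximation may be preferred.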

Examples

set.seed(123)
dat <- data.frame(x1 = rnorm(100),
                  x2 = rnorm(100),
                  x3 = rnorm(100),
                  x4 = rnorm(100),
                  y = rnorm(100))
perm.test(y ~ x1 | x2 + x3 + x4, data = dat, nperm = 25)
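
A variation on the example above stores the result and inspects it. This sketch uses only arguments listed under Arguments; the package name CCI is an assumption inferred from the class name of the returned object, and the call also assumes the backend for method = "rf" is installed. The code is skipped if the package is not available.

```r
set.seed(123)
dat <- data.frame(x1 = rnorm(100),
                  x2 = rnorm(100),
                  x3 = rnorm(100),
                  x4 = rnorm(100),
                  y = rnorm(100))

# Assumption: perm.test is exported by a package named "CCI".
if (requireNamespace("CCI", quietly = TRUE)) {
  res <- CCI::perm.test(y ~ x1 | x2 + x3 + x4,
                        data = dat,
                        nperm = 25,
                        tail = "left",       # one-tailed test
                        parametric = TRUE,   # parametric instead of empirical p-value
                        progress = FALSE)    # suppress the progress bar
  print(inherits(res, "CCI"))  # the Value section states the class is 'CCI'
  str(res, max.level = 1)      # list the top-level components without assuming their names
}
```

Using str() avoids relying on specific component names, which are not documented on this page.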
