A variable selection algorithm based on the directed dependence coefficient (didec).
mfoci(
X,
Y,
pre.selected = NULL,
perm = FALSE,
perm.method = c("decreasing"),
autostop = TRUE
)
A data.frame listing the selected variables.
X: A numeric matrix or data.frame/data.table containing the predictor vector X.
Y: A numeric matrix or data.frame/data.table containing the response vector Y.
pre.selected: An integer vector for indexing pre-selected predictor variables from X.
perm: A logical. If TRUE, a version of didec is computed that takes into account the permutations (specified by perm.method) of the response variables.
perm.method: An optional character string specifying a method in didec for permuting the response variables. This must be one of the strings "sample", "increasing", "decreasing" (default) or "full". The version "full" is invariant with respect to permutations of the response variables.
autostop: A logical. If TRUE, the algorithm stops at the first non-increasing value of didec.
Sebastian Fuchs, Jonathan Ansari, Yuping Wang
mfoci is a forward feature selection algorithm for multiple-outcome data that employs the directed dependence coefficient (didec) at each step.
mfoci is proved to be consistent in the sense that the subset of predictor variables selected via mfoci is sufficient with high probability.
If autostop == TRUE, the algorithm stops at the first non-increasing value of didec, thereby selecting a subset of variables.
Otherwise, all predictor variables are ordered according to their predictive strength measured by didec.
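The forward-selection logic described above can be sketched in base R. This is only an illustration of the greedy step and the autostop rule, not the package's implementation: forward_select and r2_score are hypothetical names, and r2_score is a placeholder score (mean R^2 of linear fits) standing in for didec, which is not reimplemented here.

```r
# Greedy forward selection with an optional automatic stop.
# `score_fun` is a hypothetical stand-in for didec: it takes predictor and
# response data and returns one number, larger meaning stronger dependence.
forward_select <- function(X, Y, score_fun, autostop = TRUE) {
  X <- as.data.frame(X)
  Y <- as.data.frame(Y)
  selected <- integer(0)
  scores <- numeric(0)
  remaining <- seq_len(ncol(X))
  best_so_far <- -Inf
  while (length(remaining) > 0) {
    # Score each remaining candidate when added to the current selection.
    cand <- vapply(
      remaining,
      function(j) score_fun(X[, c(selected, j), drop = FALSE], Y),
      numeric(1)
    )
    k <- which.max(cand)
    # With autostop, stop at the first non-increasing score.
    if (autostop && cand[k] <= best_so_far) break
    selected <- c(selected, remaining[k])
    scores <- c(scores, cand[k])
    best_so_far <- cand[k]
    remaining <- remaining[-k]
  }
  data.frame(variable = names(X)[selected], score = scores)
}

# Placeholder score (NOT didec): mean R^2 of linear fits across responses.
r2_score <- function(Xs, Y) {
  mean(vapply(Y, function(y) summary(lm(y ~ ., data = Xs))$r.squared,
              numeric(1)))
}

# Simulated data: both responses depend on x1 only.
set.seed(1)
n <- 200
X_sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
Y_sim <- data.frame(y1 = 2 * X_sim$x1 + rnorm(n, sd = 0.1),
                    y2 = X_sim$x1 + rnorm(n, sd = 0.1))

# autostop = FALSE orders all predictors instead of selecting a subset.
res_all <- forward_select(X_sim, Y_sim, r2_score, autostop = FALSE)
```

Because R^2 practically never decreases when a predictor is added, the placeholder score rarely triggers the stop; a dependence measure such as didec is what makes the autostop rule meaningful in practice.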
J. Ansari, S. Fuchs, A simple extension of Azadkia & Chatterjee's rank correlation to multi-response vectors, Available at https://arxiv.org/abs/2212.01621, 2024.
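library(didec)
data("bioclimatic")
# Predictors: columns 9 to 12 of the data set
X <- bioclimatic[, 9:12]
# Responses: columns 1 and 8 of the data set
Y <- bioclimatic[, c(1, 8)]
# Forward selection with the 1st and 3rd columns of X pre-selected
mfoci(X, Y, pre.selected = c(1, 3))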
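The remaining arguments can be varied in the same way. The calls below are a sketch using only the arguments listed in the usage section; the object names ordering and sel_full are illustrative, and no output is shown because it depends on the data and the installed package version.

```r
library(didec)
data("bioclimatic")
X <- bioclimatic[, 9:12]
Y <- bioclimatic[, c(1, 8)]

# Order all predictors instead of stopping at the first
# non-increasing value of didec.
ordering <- mfoci(X, Y, autostop = FALSE)

# Use the permutation-aware version of didec; "full" is the method
# documented as invariant to permutations of the response variables.
sel_full <- mfoci(X, Y, perm = TRUE, perm.method = "full")
```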