hetcorFS (version 1.0.1)

UFS: Unsupervised Feature Selection

Description

Performs unsupervised feature selection for mixed-type data. Two algorithms are available, cell-wise and row-wise (see the method argument); both are based on the heterogeneous correlation matrix.

Usage

UFS(
  data = NULL,
  alpha = 0.05,
  missing = FALSE,
  pv_adj = "none",
  smooth.tol = 10^-12,
  method = "c"
)

Value

A list with the following elements:

rearranged.data.set

The original data frame, rearranged so that the numerical features come first

selected.features

A data frame of the selected features

feature.indices

The indices of the selected features from the original data frame

original.corr.matrix

The \(p\) by \(p\) extended correlation matrix of all the input features

corr.matrix

The \(d\) by \(d\) extended correlation matrix of the selected features

original.p.value.matrix

The \(p\) by \(p\) matrix of p-values for all the input features

p.value.matrix

The \(d\) by \(d\) matrix of p-values for the selected features

Arguments

data

A data frame. Values of type 'numeric' or 'integer' are treated as numerical, factors as ordinal categorical.

alpha

Significance level to be used for testing, default = 0.05.

missing

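Handling of missing values. With the default, FALSE, pairwise complete observations are used; set to TRUE for complete deletion, that is, to drop incomplete observations.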

pv_adj

Correction method for the p-values, "none" by default. For the available options see p.adjust.

smooth.tol

Minimum acceptable eigenvalue for the smoothing of the correlation matrix, default = 10^-12.

method

Algorithm used: "c" (cell-wise) by default, "r" (row-wise) as the alternative.
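
The argument conventions above can be explored with base R before calling UFS. The sketch below is only an illustration under stated assumptions: the data frame toy is made up, and the calls to cor() and eigen() are base-R analogues of the missing and smooth.tol options, not the computations that UFS performs internally.

```r
# Toy mixed-type data frame (made up for illustration; not part of hetcorFS)
toy <- data.frame(
  height = c(1.62, 1.75, NA, 1.80, 1.68, 1.71),
  visits = c(3L, 1L, 4L, 2L, NA, 5L),
  weight = c(60, NA, 72, 80, 65, 70),
  grade  = factor(c("low", "mid", "high", "mid", "low", "high"),
                  levels = c("low", "mid", "high"))
)

# 'numeric' and 'integer' columns count as numerical, factors as ordinal
col_class <- vapply(toy, function(x) class(x)[1], character(1))
is_numerical <- col_class %in% c("numeric", "integer")
print(col_class)

# The order of the factor levels is the ordinal order, so set it explicitly
print(levels(toy$grade))

# Base-R analogues of the two missing-value conventions (numeric columns only)
num <- toy[, is_numerical]
r_pairwise <- cor(num, use = "pairwise.complete.obs")  # in the spirit of missing = FALSE
r_complete <- cor(num, use = "complete.obs")           # in the spirit of missing = TRUE

# Values accepted by p.adjust(), and therefore candidates for pv_adj
print(p.adjust.methods)
p_raw <- c(0.01, 0.02, 0.03, 0.04)
p_bh  <- p.adjust(p_raw, method = "BH")

# Smallest eigenvalue of a correlation matrix, compared with the default
# smooth.tol; pairwise-complete matrices are not guaranteed to be
# positive semidefinite, which is why a tolerance like this can matter
min_eig <- min(eigen(r_pairwise, symmetric = TRUE, only.values = TRUE)$values)
print(min_eig >= 10^-12)
```

Setting the factor levels explicitly matters because a factor built from character values otherwise orders its levels alphabetically, which is rarely the intended ordinal order.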

References

Tortora C., Madhvani S., Punzo A. (2025). Designing unsupervised mixed-type feature selection techniques using the heterogeneous correlation matrix. International Statistical Review. https://doi.org/10.1111/insr.70016

Examples

data(ESI)  # load the data
data <- ESI[, -c(1, 3, 4, 6, 9)]  # remove categorical features
res <- UFS(data)

# show the names of the selected features
colnames(res$selected.features)
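
A further sketch, using only the arguments and return elements documented above, shows a call with non-default settings. It assumes that the hetcorFS package and its ESI data set are installed; the object names esi_sub and res_r are chosen here for illustration, and no output is shown because it depends on the data.

```r
# Assumes the hetcorFS package (with its ESI data set) is installed
if (requireNamespace("hetcorFS", quietly = TRUE)) {
  library(hetcorFS)
  data(ESI)
  esi_sub <- ESI[, -c(1, 3, 4, 6, 9)]  # same subset as in the example above

  # Row-wise algorithm with Benjamini-Hochberg adjusted p-values
  res_r <- UFS(esi_sub, alpha = 0.05, pv_adj = "BH", method = "r")

  # Selected features, by name and by position in the original data frame
  print(colnames(res_r$selected.features))
  print(res_r$feature.indices)

  # Correlation matrices: p x p before selection, d x d after
  print(dim(res_r$original.corr.matrix))
  print(dim(res_r$corr.matrix))
}
```

Comparing the selected column names with those from the default call (method = "c") is a quick way to see how much the choice of algorithm matters for a given data set.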