Performs unsupervised feature selection for mixed-type data. Both available algorithms, cell-wise and row-wise (see the method argument), are based on the heterogeneous correlation matrix.
UFS(data = NULL, alpha = 0.05, missing = FALSE, pv_adj = "none", smooth.tol = 10^-12, method = "c")
A list with the following elements:
Original data frame with numerical features first
A data frame of the selected features
The indices of the selected features from the original data frame
The \(p\) by \(p\) extended correlation matrix of all the inputted features
The \(d\) by \(d\) extended correlation matrix of the selected features
The \(p\) by \(p\) p-values matrix of all the inputted features
The \(d\) by \(d\) p-values matrix of the selected features
data: A data frame. Values of type 'numeric' or 'integer' are treated as numerical, factors as ordinal categorical.
alpha: Significance level to be used for testing, default = 0.05.
missing: Handling of missing values. Pairwise-complete observations are used by default (FALSE); set to TRUE for complete deletion.
pv_adj: Correction method for the p-values, "none" by default. For options see p.adjust.
smooth.tol: Minimum acceptable eigenvalue for the smoothing, default = 10^-12.
method: Algorithm used. "c" (cell-wise) by default, "r" (row-wise) as the alternative.
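As a rough illustration of what the pv_adj and smooth.tol arguments refer to, the base R sketch below applies p.adjust to a vector of made-up p-values and then floors the eigenvalues of a made-up, non-positive-definite correlation matrix. The numbers, the object names (p_raw, R, R_smooth) and the eigenvalue-flooring step are assumptions chosen for illustration; the smoothing that UFS performs internally may differ.

```r
# Hypothetical p-values (illustrative only, not produced by UFS)
p_raw <- c(0.001, 0.012, 0.020, 0.045, 0.200)

# Any name in p.adjust.methods is a valid choice for the correction
p_bh   <- p.adjust(p_raw, method = "BH")
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Number of p-values below the significance level alpha = 0.05
sum(p_raw < 0.05)
sum(p_bh < 0.05)
sum(p_bonf < 0.05)

# Hypothetical correlation-like matrix with one negative eigenvalue
R <- matrix(c( 1.0, 0.9, -0.9,
               0.9, 1.0,  0.9,
              -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)
e <- eigen(R, symmetric = TRUE)
min(e$values)  # negative, so R is not positive semi-definite

# One common smoothing approach (an assumption here): raise eigenvalues
# below the tolerance up to the tolerance, rebuild, and rescale to unit diagonal
tol <- 10^-12
vals <- pmax(e$values, tol)
R_smooth <- cov2cor(e$vectors %*% diag(vals) %*% t(e$vectors))
min(eigen(R_smooth, symmetric = TRUE)$values)
```

A stricter correction such as "bonferroni" leaves fewer pairs significant at the same alpha, which in turn can change which features are selected.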
Tortora C., Madhvani S., Punzo A. (2025). Designing unsupervised mixed-type feature selection techniques using the heterogeneous correlation matrix. International Statistical Review. https://doi.org/10.1111/insr.70016
data(ESI)  # loading the data
data = ESI[, -c(1, 3, 4, 6, 9)]  # removing categorical features
res = UFS(data)
# visualize selected features
colnames(res$selected.features)