`Dirac()` computes the simplest kernel for categorical data from a matrix or
data.frame of dimension NxD, where N>1 and D>0. Samples
should be in the rows and features in the columns. When there is a single feature,
`Dirac()` returns 1 if the category (or class, or level) is the same in
two given samples, and 0 otherwise. When D>1, the results for the
D features are combined by a sum, a mean, or a weighted mean.
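
The comparison described above can be illustrated in base R. This is a minimal sketch and not the package's implementation: `dirac_one()` is a hypothetical helper, and the toy data are made up for the example.

```r
# Hypothetical helper (not part of the package): Dirac kernel for one feature.
dirac_one <- function(x) {
  x <- as.character(x)              # treat factor levels and strings alike
  # outer() compares every pair of samples; TRUE/FALSE is turned into 1/0
  outer(x, x, FUN = "==") * 1
}

# Single feature (D = 1): 1 where two samples share the category, 0 otherwise
colour <- c("red", "blue", "red", "green")
K_colour <- dirac_one(colour)

# Two features (D = 2): compute one matrix per feature, then average them
X <- data.frame(colour = colour, size = c("S", "S", "L", "S"))
K_list <- lapply(X, dirac_one)
K_mean <- Reduce(`+`, K_list) / ncol(X)
```

In this toy data set, samples 1 and 3 share the colour but not the size, so their entry in `K_mean` is 0.5, while every sample compared with itself gives 1.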
Value
Kernel matrix (dimension: NxN), or a list with the kernel matrix and the
feature space.
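
The exact structure of the returned feature space is not described here. As a sketch under that caveat, one feature space consistent with the Dirac kernel is the indicator (one-hot) encoding of the categories, since the inner product of two indicator vectors is 1 when the categories match and 0 otherwise. The object names below are illustrative only.

```r
colour <- factor(c("red", "blue", "red", "green"))

# Assumed feature space: one indicator column per level (one-hot encoding)
Phi <- outer(as.character(colour), levels(colour), FUN = "==") * 1
colnames(Phi) <- levels(colour)

# Inner products of the indicator rows
K_from_phi <- tcrossprod(Phi)       # same as Phi %*% t(Phi)

# Direct pairwise comparison of categories
K_direct <- outer(as.character(colour), as.character(colour), FUN = "==") * 1

same <- isTRUE(all.equal(K_from_phi, K_direct, check.attributes = FALSE))
```

Here `same` confirms that the two constructions agree for this toy vector.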
Arguments
X
Matrix (class "character") or data.frame (columns of class "character" or "factor").
The elements in X are assumed to be categorical in nature.
comp
When D>1, this argument indicates how the variables
of the dataset are combined. Options are: "mean", "sum" and "weighted". (Defaults: "mean")
"sum" gives the same importance to all variables, and returns an
unnormalized kernel matrix.
"mean" gives the same importance to all variables, and returns a
normalized kernel matrix (all its elements range between 0 and 1).
"weighted" weights each variable according to the `coeff` parameter, and returns a
normalized kernel matrix.
coeff
(optional) A vector of weights with length D, one per variable. Used when comp = "weighted".
feat_space
If FALSE, only the kernel matrix is returned. Otherwise,
the feature space is also returned. (Defaults: FALSE).
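
The three `comp` options can be sketched in base R as follows. `combine_dirac()` is a hypothetical function written for this illustration, not the package code. In particular, rescaling `coeff` so that the weights sum to 1 is an assumption: it is one way to obtain a normalized matrix, and the package may handle the weights differently.

```r
# Hypothetical sketch: combine per-feature Dirac matrices by sum, mean, or weights.
combine_dirac <- function(X, comp = c("mean", "sum", "weighted"), coeff = NULL) {
  comp <- match.arg(comp)
  X <- as.matrix(X)                 # factor or character columns become character
  D <- ncol(X)
  w <- switch(comp,
    sum = rep(1, D),                # unnormalized: entries range from 0 to D
    mean = rep(1 / D, D),           # normalized: entries range from 0 to 1
    weighted = {
      stopifnot(length(coeff) == D, all(coeff >= 0), sum(coeff) > 0)
      coeff / sum(coeff)            # assumption: weights rescaled to sum to 1
    }
  )
  K <- matrix(0, nrow(X), nrow(X))
  for (j in seq_len(D)) {
    xj <- unname(X[, j])
    K <- K + w[j] * (outer(xj, xj, FUN = "==") * 1)
  }
  K
}

X <- data.frame(colour = c("red", "blue", "red"),
                size = c("S", "S", "L"),
                stringsAsFactors = TRUE)
K_sum <- combine_dirac(X, "sum")
K_mean <- combine_dirac(X, "mean")
K_w <- combine_dirac(X, "weighted", coeff = c(3, 1))
```

With these toy data the diagonal of `K_sum` equals D = 2, while `K_mean` and `K_w` have a diagonal of 1. Giving `colour` three times the weight of `size` raises the similarity of samples 1 and 3 (same colour) above that of samples 1 and 2 (same size).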
References
Belanche, L. A., and Villegas, M. A. (2013).
Kernel functions for categorical variables with application to problems in the life sciences.
Artificial Intelligence Research and Development (pp. 171-180). IOS Press.