rdetools (version 1.0)

rde: Relevant Dimension Estimation (RDE)

Description

The function estimates the relevant dimension of a data set in kernel feature space. By default, it does this by fitting a two-component model (TCM), but estimation by leave-one-out cross-validation (LOO-CV) is also available. The function can also calculate a denoised version of the labels and estimate the noise level in the data set.
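To make the two-component idea concrete, the following base-R sketch follows the approach described in the reference paper: project the labels onto the eigenvectors of the kernel matrix, then, for each candidate dimension d, model the first d coefficients and the remaining ones as zero-mean Gaussians with separate variances and pick the d with the smallest negative log-likelihood. The helper name tcm_sketch, the toy data and the kernel width are assumptions made for illustration; this is not the package's implementation, and rde may differ in details.

```r
# Hypothetical helper (not part of rdetools): minimal two-component-model fit.
tcm_sketch <- function(K, y, dim_rest = 0.5) {
  n <- length(y)
  eig <- eigen(K, symmetric = TRUE)            # eigenvalues in decreasing order
  z <- as.vector(crossprod(eig$vectors, y))    # kernel PCA coefficients U^T y
  # restrict the search to the leading dimensions; keep at least one
  # coefficient in the second (noise) component
  dmax <- min(n - 1, max(1, floor(dim_rest * n)))
  nll <- sapply(seq_len(dmax), function(d) {
    s1 <- mean(z[1:d]^2)                       # variance of the leading part
    s2 <- mean(z[(d + 1):n]^2)                 # variance of the remaining part
    d / n * log(s1) + (n - d) / n * log(s2)    # negative log-likelihood up to constants
  })
  list(rd = which.min(nll), err = nll, kpc = z)
}

# Toy data: noisy sine curve and an RBF kernel (width chosen arbitrarily)
set.seed(1)
x <- seq(-4, 4, length.out = 60)
y <- sin(x) + rnorm(60, sd = 0.1)
K <- exp(-outer(x, x, "-")^2 / 2)

fit <- tcm_sketch(K, y)
fit$rd          # position of the minimum of fit$err
```

The dim_rest argument of the sketch mirrors the argument of the same name in rde: only the leading share of the dimensions is searched.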

Usage

rde(K, y, est_y = FALSE, alldim = FALSE, est_noise = FALSE, regression = FALSE, nmse = TRUE, dim_rest = 0.5, tcm = TRUE)

Arguments

K
kernel matrix of the inputs (e.g. rbf kernel matrix)
y
label vector which contains the label for each data point
est_y
set this to TRUE if you want a denoised version of the labels
alldim
if this is TRUE, denoised labels are calculated for all dimensions (instead of only for the relevant dimension)
est_noise
set this to TRUE if you want an estimated noise level
regression
only relevant if one of est_y, alldim, est_noise is TRUE. Set this to TRUE to force the function to treat the data as a regression problem. If you leave this FALSE, the function tries to determine by itself whether this is a classification or a regression problem.
nmse
only relevant if est_noise is TRUE and the function is treating the data as a regression problem. If you leave this TRUE, the normalized mean squared error is used for estimating the noise level; otherwise the conventional mean squared error is used.
dim_rest
percentage of leading dimensions, given as a fraction, to which the search for the relevant dimension is restricted. This is needed because of numerical instabilities. 0.5 should be a good choice in most cases (and is also the default value)
tcm
TRUE by default; indicates whether RDE is done by the TCM algorithm (TRUE) or by the LOO-CV algorithm (FALSE)

Value

rd
estimated relevant dimension
err
LOO-CV error (LOO-CV algorithm) or negative log-likelihood value (TCM algorithm) for each dimension; the position of the minimum is the relevant dimension
yh
only returned if est_y, alldim or est_noise is TRUE, contains the denoised labels
Yh
only returned if alldim is TRUE, matrix with denoised labels for each dimension in each column
noise
only returned if est_noise is TRUE, contains the estimated noise level
kpc
kernel pca coefficients
eigvec
eigenvectors of the kernel matrix
eigval
eigenvalues of the kernel matrix
tcm
TRUE if TCM algorithm was used, otherwise (LOO-CV algorithm) FALSE
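The relationship between eigvec, kpc, yh and Yh can be illustrated in base R. The sketch below assumes, following the reference paper, that the kernel PCA coefficients are the projections of the labels onto the eigenvectors of the kernel matrix and that denoised labels for dimension d are reconstructed from the first d coefficients. The toy data, the kernel width and the value d = 5 are assumptions for illustration only; the values returned by rde may be computed differently (for example, classification labels may be post-processed).

```r
# Toy regression data and an RBF kernel matrix (width chosen arbitrarily)
set.seed(42)
x <- seq(-3, 3, length.out = 50)
y <- sin(x) + rnorm(50, sd = 0.2)
K <- exp(-outer(x, x, "-")^2 / 2)

eig <- eigen(K, symmetric = TRUE)
U <- eig$vectors                          # plays the role of eigvec
kpc <- as.vector(crossprod(U, y))         # kernel PCA coefficients U^T y

d <- 5                                    # assumed relevant dimension
# denoised labels: projection of y onto the first d eigenvectors
yh <- as.vector(U[, 1:d, drop = FALSE] %*% kpc[1:d])

# one column of denoised labels per dimension (here only the first 10)
Yh <- sapply(1:10, function(k) U[, 1:k, drop = FALSE] %*% kpc[1:k])

mean((y - yh)^2)                          # residual left after denoising
```

Because the eigenvectors are orthonormal, using all dimensions reproduces y exactly, so truncating at a small d is what removes the noise.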

Details

If est_noise or alldim is TRUE, a denoised version of the labels for the relevant dimension is returned even if est_y is FALSE (e.g. if you want both denoised labels and a noise estimate, it is enough to set est_noise to TRUE).
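A short check of this behaviour, assuming the rdetools package is installed and using only functions that appear elsewhere on this page:

```r
library(rdetools)

d <- sincdata(100, 0.1)              # generate sinc data
K <- rbfkernel(d$X)                  # calculate rbf kernel matrix

# est_y is left at its default (FALSE); only the noise estimate is requested
r <- rde(K, d$y, est_noise = TRUE)

is.null(r$yh)                        # denoised labels should still be present
r$noise                              # estimated noise level
```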

References

M. L. Braun, J. M. Buhmann, K. R. Mueller (2008) On Relevant Dimensions in Kernel Feature Spaces

See Also

rde_loocv, rde_tcm, estnoise, isregression, rbfkernel, polykernel, drawkpc

Examples

library(rdetools) # load the package that provides rde, sincdata, rbfkernel, drawkpc
## example with sinc data using tcm algorithm
d <- sincdata(100, 0.1) # generate sinc data
K <- rbfkernel(d$X) # calculate rbf kernel matrix
# rde, return also denoised labels and noise, fit tcm
r <- rde(K, d$y, est_y = TRUE, est_noise = TRUE)
r$rd # estimated relevant dimension
r$noise # estimated noise
drawkpc(r) # draw kernel pca coefficients

## example with sinc data using loo-cv algorithm
d <- sincdata(100, 0.1) # generate sinc data
K <- rbfkernel(d$X) # calculate rbf kernel matrix
# rde, return also denoised labels and noise
r <- rde(K, d$y, est_y = TRUE, est_noise = TRUE, tcm = FALSE)
r$rd # estimated relevant dimension
r$noise # estimated noise
drawkpc(r) # draw kernel pca coefficients
