Note: a newer version (1.1.3) of this package is available.

Rdimtools

Rdimtools is an R package for dimension reduction (DR), including feature selection and manifold learning, and for intrinsic dimension estimation (IDE). We aim to build one of the most comprehensive toolboxes available online; the current version delivers 143 DR algorithms and 17 IDE methods.

The philosophy is simple: the more we have at hand, the better we can play.

Our logo reflects the foundational nature of multivariate data analysis: like the blind men examining an elephant, we wrangle the data to grasp what it looks like, using the partial information that each algorithm provides.

Installation

You can install a release version from CRAN:

install.packages("Rdimtools")

or the development version from GitHub:

## install.packages("devtools")
devtools::install_github("kisungyou/Rdimtools")

Minimal Example : Dimension Reduction

Here is an example of dimension reduction on the famous iris dataset. Principal Component Analysis (do.pca), Laplacian Score (do.lscore), and Diffusion Maps (do.dm) are compared; they represent three families of algorithms: linear reduction, feature selection, and nonlinear reduction, respectively.

# load the library
library(Rdimtools)

# load the data
X   = as.matrix(iris[,1:4])
lab = as.factor(iris[,5])

# run 3 algorithms mentioned above
mypca = do.pca(X, ndim=2)
mylap = do.lscore(X, ndim=2)
mydfm = do.dm(X, ndim=2, bandwidth=10)

# visualize
par(mfrow=c(1,3))
plot(mypca$Y, pch=19, col=lab, xlab="axis 1", ylab="axis 2", main="PCA")
plot(mylap$Y, pch=19, col=lab, xlab="axis 1", ylab="axis 2", main="Laplacian Score")
plot(mydfm$Y, pch=19, col=lab, xlab="axis 1", ylab="axis 2", main="Diffusion Maps")
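
To make the difference between a linear projection and feature selection concrete, here is a minimal base-R sketch that does not call Rdimtools at all. It uses prcomp for PCA (assuming centering only, no scaling) and a hypothetical helper, laplacian_score, written for illustration with an assumed k-nearest-neighbor graph and heat-kernel weights. The defaults k = 5 and bandwidth = 1 are assumptions, so the result is not guaranteed to match do.pca or do.lscore.

```r
X <- as.matrix(datasets::iris[, 1:4])

# Linear projection: each new axis is a weighted mix of all four features.
pca   <- prcomp(X, center = TRUE, scale. = FALSE)
Y_pca <- pca$x[, 1:2]

# Hypothetical helper (not part of Rdimtools): Laplacian Score per feature.
laplacian_score <- function(X, k = 5, bandwidth = 1) {
  n  <- nrow(X)
  D2 <- as.matrix(dist(X))^2          # squared Euclidean distances
  diag(D2) <- Inf                     # a point is not its own neighbor
  A <- matrix(FALSE, n, n)
  for (i in seq_len(n)) A[i, order(D2[i, ])[seq_len(k)]] <- TRUE
  A   <- A | t(A)                     # symmetrize the k-NN graph
  S   <- exp(-D2 / bandwidth) * A     # heat-kernel weights on graph edges
  deg <- rowSums(S)                   # weighted degree of each point
  apply(X, 2, function(f) {
    f   <- f - sum(f * deg) / sum(deg)          # remove degree-weighted mean
    num <- sum(S * outer(f, f, "-")^2) / 2      # equals f' L f with L = D - S
    num / sum(f^2 * deg)                        # divide by f' D f
  })
}

# Feature selection: keep the best-scoring original columns unchanged.
scores <- laplacian_score(X)
keep   <- order(scores)[1:2]          # smaller score = locality better preserved
Y_sel  <- X[, keep]
```

Y_pca contains new coordinates built from all features, whereas Y_sel is simply two of the original columns, which is why the Laplacian Score panel plots raw measurements.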

Minimal Example : Dimension Estimation

The Swiss Roll is a classic example of a 2-dimensional manifold embedded in ℝ³ and one of 11 famous model-based samples available from the aux.gensamples() function. Given the ground truth that d = 2, let’s apply several methods for intrinsic dimension estimation.

# generate sample data
set.seed(100)
roll = aux.gensamples(dname="swiss")

# we will compare 5 methods (out of 17 methods from version 1.0.0)
vecd = rep(0,5)
vecd[1] = est.Ustat(roll)$estdim       # convergence rate of U-statistic on manifold
vecd[2] = est.correlation(roll)$estdim # correlation dimension
vecd[3] = est.made(roll)$estdim        # manifold-adaptive dimension estimation
vecd[4] = est.mle1(roll)$estdim        # MLE with Poisson process
vecd[5] = est.twonn(roll)$estdim       # minimal neighborhood information

# let's visualize
plot(1:5, vecd, type="b", ylim=c(1.5,2.5), 
     main="true dimension is d=2",
     xaxt="n",xlab="",ylab="estimated dimension")
xtick = seq(1,5,by=1)
axis(side=1, at=xtick, labels = FALSE)
text(x=xtick,  par("usr")[3], 
     labels = c("Ustat","correlation","made","mle1","twonn"), pos=1, xpd = TRUE)

We can observe that all 5 methods we tested estimated the intrinsic dimension around d = 2. It should be noted that the estimated dimension may not be integer-valued due to characteristics of each method.
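
To illustrate why such estimates are real-valued, the following base-R sketch implements the maximum-likelihood form of the two-nearest-neighbor idea: the ratio of second to first nearest-neighbor distances is modeled as Pareto-distributed with shape d. The helper twonn_mle and the hand-written Swiss Roll generator are hypothetical, written for illustration only; they are not the code behind est.twonn or aux.gensamples, whose fitting procedures and parameters may differ.

```r
# Hypothetical helper (not part of Rdimtools): TwoNN-style ML estimate.
twonn_mle <- function(X) {
  D <- as.matrix(dist(X))
  diag(D) <- Inf                                  # exclude self-distances
  # distances to the first and second nearest neighbors of every point
  r  <- t(apply(D, 1, function(row) sort(row)[1:2]))
  mu <- r[, 2] / r[, 1]
  mu <- mu[is.finite(mu) & mu > 1]                # drop ties and duplicates
  length(mu) / sum(log(mu))                       # MLE of the Pareto shape d
}

# Assumed Swiss Roll parametrization: a 2-d sheet rolled up in 3-d space.
set.seed(1)
n     <- 1000
theta <- 1.5 * pi * (1 + 2 * runif(n))
roll3 <- cbind(theta * cos(theta), 21 * runif(n), theta * sin(theta))

d_hat <- twonn_mle(roll3)
print(d_hat)
```

Because the estimate is a ratio of a count to a sum of log-distances, it is a real number near the true dimension rather than an integer; rounding is left to the user.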

Acknowledgements

The logo icon is made by Freepik from www.flaticon.com. The rotating Swiss Roll image is taken from Dinoj Surendran’s website.

Monthly Downloads: 10,252
Version: 1.0.9
License: MIT + file LICENSE
Maintainer: Kisung You
Last Published: February 4th, 2022

Functions in Rdimtools (1.0.9)

aux.kernelcov: Build a centered kernel matrix K
est.nearneighbor1: Intrinsic Dimension Estimation with Near-Neighbor Information
aux.shortestpath: Find shortest path using Floyd-Warshall algorithm
aux.preprocess: Preprocessing the data
aux.gensamples: Generate model-based samples
est.nearneighbor2: Near-Neighbor Information with Bias Correction
est.mindkl: MiNDkl
aux.pkgstat: Show the number of functions for Rdimtools.
est.incisingball: Intrinsic Dimension Estimation with Incising Ball
est.made: Manifold-Adaptive Dimension Estimation
do.procrustes: Feature Selection using PCA and Procrustes Analysis
do.spufs: Structure Preserving Unsupervised Feature Selection
do.fscore: Fisher Score
do.enet: Elastic Net Regularization
do.rsr: Regularized Self-Representation
do.udfs: Unsupervised Discriminative Features Selection
est.Ustat: ID Estimation with Convergence Rate of U-statistic on Manifold
est.clustering: Intrinsic Dimension Estimation via Clustering
do.lsdf: Locality Sensitive Discriminant Feature
do.ugfs: Unsupervised Graph-based Feature Selection
do.bpca: Bayesian Principal Component Analysis
do.cca: Canonical Correlation Analysis
do.lsls: Locality Sensitive Laplacian Score
est.mindml: MINDml
do.uwdfs: Uncorrelated Worst-Case Discriminative Feature Selection
est.danco: Intrinsic Dimensionality Estimation with DANCo
est.boxcount: Box-counting Dimension
est.correlation: Correlation Dimension
do.dspp: Discriminative Sparsity Preserving Projection
do.isoproj: Isometric Projection
do.ldp: Locally Discriminating Projection
do.elde: Exponential Local Discriminant Embedding
do.kmvp: Kernel-Weighted Maximum Variance Projection
do.lea: Locally Linear Embedded Eigenspace Analysis
est.mle1: Maximum Likelihood Estimation with Poisson Process
do.lasso: Least Absolute Shrinkage and Selection Operator
aux.graphnbd: Construct Nearest-Neighborhood Graph
est.mle2: Maximum Likelihood Estimation with Poisson Process and Bias Correction
do.lscore: Laplacian Score
do.lpp: Locality Preserving Projection
est.twonn: Intrinsic Dimension Estimation by a Minimal Neighborhood Information
do.cscore: Constraint Score
est.gdistnn: Intrinsic Dimension Estimation based on Manifold Assumption and Graph Distance
do.ldakm: Combination of LDA and K-means
do.mifs: Mutual Information for Selecting Features
do.lfda: Local Fisher Discriminant Analysis
do.mfa: Marginal Fisher Analysis
do.lde: Local Discriminant Embedding
do.llp: Local Learning Projections
do.lqmi: Linear Quadratic Mutual Information
do.msd: Maximum Scatter Difference
do.specs: Supervised Spectral Feature Selection
do.specu: Unsupervised Spectral Feature Selection
do.mlie: Maximal Local Interclass Embedding
est.packing: Intrinsic Dimension Estimation using Packing Numbers
do.mvp: Maximum Variance Projection
do.ppca: Probabilistic Principal Component Analysis
do.rlda: Regularized Linear Discriminant Analysis
est.pcathr: PCA Thresholding with Accumulated Variance
do.anmm: Average Neighborhood Margin Maximization
do.elpp2: Enhanced Locality Preserving Projection (2013)
do.asi: Adaptive Subspace Iteration
do.pls: Partial Least Squares
do.pflpp: Parameter-Free Locality Preserving Projection
do.cscoreg: Constraint Score using Spectral Graph
do.disr: Diversity-Induced Self-Representation
do.lpe: Locality Pursuit Embedding
do.lpca2006: Locally Principal Component Analysis by Yang et al. (2006)
do.nrsr: Non-convex Regularized Self-Representation
do.dm: Diffusion Maps
do.slpp: Supervised Locality Preserving Projection
do.crda: Curvilinear Distance Analysis
do.slpe: Supervised Locality Pursuit Embedding
do.dppca: Dual Probabilistic Principal Component Analysis
do.lspe: Locality and Similarity Preserving Embedding
do.mcfs: Multi-Cluster Feature Selection
do.wdfs: Worst-Case Discriminative Feature Selection
iris: Load Iris data
do.lsda: Locality Sensitive Discriminant Analysis
do.lsir: Localized Sliced Inverse Regression
do.cisomap: Conformal Isometric Feature Mapping
do.kmfa: Kernel Marginal Fisher Analysis
do.kmmc: Kernel Maximum Margin Criterion
do.crca: Curvilinear Component Analysis
do.cnpe: Complete Neighborhood Preserving Embedding
do.olda: Orthogonal Linear Discriminant Analysis
do.onpp: Orthogonal Neighborhood Preserving Projections
do.olpp: Orthogonal Locality Preserving Projection
do.odp: Orthogonal Discriminant Projection
do.adr: Adaptive Dimension Reduction
do.crp: Collaborative Representation-based Projection
do.sda: Semi-Supervised Discriminant Analysis
do.save: Sliced Average Variance Estimation
do.mmds: Metric Multidimensional Scaling
do.splapeig: Supervised Laplacian Eigenmaps
do.ltsa: Local Tangent Space Alignment
do.dve: Distinguishing Variance Embedding
do.dagdne: Double-Adjacency Graphs-based Discriminant Neighborhood Embedding
do.ammc: Adaptive Maximum Margin Criterion
do.dne: Discriminant Neighborhood Embedding
do.kudp: Kernel-Weighted Unsupervised Discriminant Projection
do.extlpp: Extended Locality Preserving Projection
do.eslpp: Extended Supervised Locality Preserving Projection
do.fssem: Feature Subset Selection using Expectation-Maximization
do.fa: Exploratory Factor Analysis
do.lisomap: Landmark Isometric Feature Mapping
do.lapeig: Laplacian Eigenmaps
do.rpca: Robust Principal Component Analysis
do.plp: Piecewise Laplacian-based Projection (PLP)
do.ree: Robust Euclidean Embedding
do.sammon: Sammon Mapping
do.lpfda: Locality Preserving Fisher Discriminant Analysis
do.mmsd: Multiple Maximum Scatter Difference
do.lpmip: Locality-Preserved Maximum Information Projection
do.ica: Independent Component Analysis
do.lltsa: Linear Local Tangent Space Alignment
do.lmds: Landmark Multidimensional Scaling
do.spmds: Spectral Multidimensional Scaling
do.kpca: Kernel Principal Component Analysis
do.ispe: Isometric Stochastic Proximity Embedding
do.mve: Minimum Volume Embedding
do.isomap: Isometric Feature Mapping
do.sdlpp: Sample-Dependent Locality Preserving Projection
do.mvu: Maximum Variance Unfolding / Semidefinite Embedding
do.modp: Modified Orthogonal Discriminant Projection
do.sir: Sliced Inverse Regression
do.kqmi: Kernel Quadratic Mutual Information
do.bmds: Bayesian Multidimensional Scaling
do.mmc: Maximum Margin Criterion
do.tsne: t-distributed Stochastic Neighbor Embedding
do.idmap: Interactive Document Map
do.iltsa: Improved Local Tangent Space Alignment
do.cge: Constrained Graph Embedding
oos.linproj: OOS : Linear Projection
do.nolpp: Nonnegative Orthogonal Locality Preserving Projection
do.lspp: Local Similarity Preserving Projection
do.lda: Linear Discriminant Analysis
do.mds: (Classical) Multidimensional Scaling
do.mmp: Maximum Margin Projection
do.nonpp: Nonnegative Orthogonal Neighborhood Preserving Projections
do.npca: Nonnegative Principal Component Analysis
do.pca: Principal Component Analysis
do.spc: Supervised Principal Component Analysis
do.opls: Orthogonal Partial Least Squares
do.npe: Neighborhood Preserving Embedding
do.keca: Kernel Entropy Component Analysis
do.udp: Unsupervised Discriminant Projection
do.spca: Sparse Principal Component Analysis
do.ulda: Uncorrelated Linear Discriminant Analysis
do.rndproj: Random Projection
do.rpcag: Robust Principal Component Analysis via Geometric Median
do.sammc: Semi-Supervised Adaptive Maximum Margin Criterion
do.ssldp: Semi-Supervised Locally Discriminant Projection
do.rsir: Regularized Sliced Inverse Regression
do.spp: Sparsity Preserving Projection
do.klde: Kernel Local Discriminant Embedding
do.hydra: Hyperbolic Distance Recovery and Approximation
do.fastmap: FastMap
do.klsda: Kernel Locality Sensitive Discriminant Analysis
do.klfda: Kernel Local Fisher Discriminant Analysis
do.lle: Locally Linear Embedding
do.llle: Local Linear Laplacian Eigenmaps
usps: Load USPS handwritten digits data
do.lamp: Local Affine Multidimensional Projection
do.nnp: Nearest Neighbor Projection
do.ksda: Kernel Semi-Supervised Discriminant Analysis
do.phate: Potential of Heat Diffusion for Affinity-based Transition Embedding
do.spe: Stochastic Proximity Embedding
do.sne: Stochastic Neighbor Embedding