RandPro (version 0.2.0)

dimension: Function to determine the required number of dimensions for generating the projection matrix

Description

The Johnson-Lindenstrauss (JL) lemma is the heart of random projection. The lemma states that a small set of points in a high-dimensional space can be embedded into a low-dimensional space in such a way that the distances between the points are nearly preserved. The lemma has been used in dimensionality reduction, compressed sensing, manifold learning and graph embedding. The parameter epsilon is the error tolerance level and is inversely proportional to the accuracy of the result. A higher error tolerance decreases the required number of dimensions and the computational complexity, at a marginal loss of accuracy.
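For intuition, a commonly quoted form of the JL bound gives the minimum dimension k for n samples as 4 * log(n) / (eps^2/2 - eps^3/3); whether dimension() uses exactly this constant is an assumption here, and the helper jl_min_dim below is hypothetical, but the sketch illustrates how epsilon drives the result.

# Minimal sketch of a JL-style bound (assumed form; not necessarily the exact
# constant used inside RandPro::dimension)
jl_min_dim <- function(n_samples, eps) {
  ceiling(4 * log(n_samples) / (eps^2 / 2 - eps^3 / 3))
}

jl_min_dim(1000000, 0.5)   # larger eps -> fewer dimensions required
jl_min_dim(1000000, 0.1)   # smaller eps -> more dimensions required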

Usage

dimension(sample, epsilon = 0.1)

Arguments

sample

- number of samples

epsilon

- error tolerance level with default value 0.1

Value

minimum number of dimensions required to maintain the pairwise distance between any two points with a controlled amount of error (eps)

Details

The function dimension() finds the minimum number of dimensions required to project data from a high-dimensional space to a low-dimensional space. The number of samples and the error tolerance level are passed as input arguments to dimension(). It returns the size of the random subspace that guarantees a bounded distortion introduced by the random projection.
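As a rough check of that guarantee, the sketch below is one way to exercise it in base R: it projects synthetic data down to the dimension returned by dimension() using a plain Gaussian random matrix (RandPro also provides projection-matrix generators, but base R keeps the sketch self-contained) and compares squared pairwise distances before and after. The data sizes and seed are illustrative choices, not values from the package.

library(RandPro)

set.seed(42)
n <- 50                               # number of sample points
p <- 5000                             # original (high) dimensionality
X <- matrix(rnorm(n * p), nrow = n)

# Target dimension for eps = 0.5 (ceiling in case a non-integer is returned)
k <- ceiling(dimension(n, epsilon = 0.5))

# Plain Gaussian random projection, scaled so squared distances are preserved in expectation
R <- matrix(rnorm(p * k, sd = 1 / sqrt(k)), nrow = p)
X_proj <- X %*% R

# Ratios of squared pairwise distances should fall roughly within [1 - eps, 1 + eps]
ratio <- (as.vector(dist(X_proj)) / as.vector(dist(X)))^2
summary(ratio)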

References

[1] William B. Johnson, Joram Lindenstrauss, "Extensions of Lipschitz mappings into a Hilbert space" (1984).

[2] Sanjoy Dasgupta, Anupam Gupta, "An elementary proof of a theorem of Johnson and Lindenstrauss" (2003).

See Also

Johnson-Lindenstrauss Elementary Proof

Examples

#load library
library(RandPro)

#Calculate minimum dimension using eps = 0.5 for 1000000 samples
y <- dimension(1000000, 0.5)

#Calculate minimum dimensions using different epsilon values for 103260 samples
d <- c(0.5, 0.1)
x <- dimension(103260, d)

