frequentdirections

Implementation of Frequent-Directions algorithm for efficient matrix sketching [E. Liberty, SIGKDD2013]

Installation

# Not yet onCRAN
install.packages("frequentdirections")

# Or the development version from GitHub:
install.packages("devtools")
devtools::install_github("shinichi-takayanagi/frequentdirections")

Example

Download example data

Here, we use Handwritten digits USPS dataset as sample data. In the following example, we assume that you save the above sample data into /tmp directory.

Load data

The dataset has 7291 train and 2007 test images in h5 format. The images are 16*16 grayscale pixels.

library("h5")
file <- h5file("/tmp/usps.h5")
x <- file["train/data"][]
y <- file["train/target"][]
str(x)
#>  num [1:7291, 1:256] 0 0 0 0 0 0 0 0 0 0 ...

Plot example image

Example the number 8

image(matrix(x[338,], nrow=16, byrow = FALSE))

Plot SVD

Plot the original data on the first and second singular vector plane.

x <- scale(x)
frequentdirections::plot_svd(x, y)

Matrix Sketching

l = 8 case

eps <- 10^(-8)
# 7291 x 256 -> 8 * 256 matrix
b <- frequentdirections::sketching(x, 8, eps)
frequentdirections::plot_svd(x, y, b)

l = 32 case

# 7291 x 256 -> 32 * 256 matrix
b <- frequentdirections::sketching(x, 32, eps)
frequentdirections::plot_svd(x, y, b)

l = 128 case

# 7291 x 256 -> 128 * 256 matrix
b <- frequentdirections::sketching(x, 128, eps)
frequentdirections::plot_svd(x, y, b)

This result is almost the same with the original data SVD expression.

That’s why we can think that the original data is expressed with only 128 rows.

Copy Link

Version

Down Chevron

Install

install.packages('frequentdirections')

Monthly Downloads

99

Version

0.1.0

License

MIT + file LICENSE

Issues

Pull Requests

Stars

Forks

Last Published

April 16th, 2019

Functions in frequentdirections (0.1.0)