frequentdirections

An implementation of the Frequent Directions algorithm for efficient matrix sketching [E. Liberty, SIGKDD 2013].

Installation

# Not yet on CRAN
install.packages("frequentdirections")

# Or the development version from GitHub:
install.packages("devtools")
devtools::install_github("shinichi-takayanagi/frequentdirections")

Example

Download example data

Here we use the USPS handwritten digits dataset as sample data. The following example assumes that you have saved the dataset to /tmp/usps.h5.

Load data

The dataset contains 7291 training images and 2007 test images in HDF5 (h5) format. Each image is 16 x 16 grayscale pixels, stored as one row of 256 values.

library("h5")
file <- h5file("/tmp/usps.h5")
x <- file["train/data"][]
y <- file["train/target"][]
str(x)
#>  num [1:7291, 1:256] 0 0 0 0 0 0 0 0 0 0 ...

Plot example image

For example, an image of the digit 8:

image(matrix(x[338,], nrow=16, byrow = FALSE))
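Note that base R's image() draws element [i, j] of a matrix at horizontal position i and vertical position j (counted upward), so a matrix is not displayed the way it is printed. If a digit appears rotated or flipped, a small helper like the one below may help. The name upright is hypothetical, and whether it is needed depends on how the pixels are stored in the file, which is not verified here.

```r
# Hypothetical helper: reorder a matrix so that image() shows it
# in the same orientation as it is printed (row 1 at the top).
upright <- function(m) {
  t(m[nrow(m):1, , drop = FALSE])
}

# Synthetic 16 x 16 "image": a bright bar along the top row and left column.
m <- matrix(0, nrow = 16, ncol = 16)
m[1, ] <- 1
m[, 1] <- 1

image(upright(m), col = gray.colors(2), axes = FALSE)
```

With the helper applied, the bar is drawn along the top and left edges of the plot, matching the printed matrix.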

Plot SVD

Plot the original data on the plane spanned by the first and second singular vectors.

x <- scale(x)
frequentdirections::plot_svd(x, y)
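The projection behind such a plot can be reproduced with base R. The following is a minimal sketch of the idea, not the implementation of plot_svd: it uses a small synthetic matrix and random labels as stand-ins for the USPS data, and projects the rows onto the first two right singular vectors.

```r
set.seed(42)

# Synthetic stand-ins for the scaled data matrix and the digit labels.
x_demo <- scale(matrix(rnorm(100 * 6), nrow = 100))
y_demo <- sample(0:9, 100, replace = TRUE)

# Keep only the first two right singular vectors (a 6 x 2 matrix).
s <- svd(x_demo, nu = 0, nv = 2)

# Coordinates of every row in the plane spanned by those two vectors.
proj <- x_demo %*% s$v

plot(proj, col = y_demo + 1, pch = 19,
     xlab = "1st singular vector", ylab = "2nd singular vector")
```

Each point is one row of the data, colored by its label.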

Matrix Sketching

l = 8 case

eps <- 10^(-8)
# 7291 x 256 -> 8 x 256 matrix
b <- frequentdirections::sketching(x, 8, eps)
frequentdirections::plot_svd(x, y, b)
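For intuition, the idea behind Frequent Directions can be sketched in a few lines of base R. This is an illustrative version of the algorithm described by Liberty, not the code of this package: the function name fd_sketch and the synthetic matrix are assumptions made for the example, and the eps argument of frequentdirections::sketching is not modelled. Rows are streamed into an l-row buffer; whenever the buffer is full, it is rotated with an SVD and its singular values are shrunk so that at least half of the rows become zero again.

```r
# Illustrative Frequent Directions sketch (hypothetical helper, not the package code).
fd_sketch <- function(a, l) {
  stopifnot(is.matrix(a), l >= 2, l <= ncol(a))
  b <- matrix(0, nrow = l, ncol = ncol(a))
  next_free <- 1L                       # index of the next all-zero row of b

  for (i in seq_len(nrow(a))) {
    if (next_free > l) {
      # Buffer is full: take the SVD and shrink the squared singular
      # values by the squared median singular value.
      s <- svd(b, nu = 0)
      delta <- s$d[ceiling(l / 2)]^2
      shrunk <- sqrt(pmax(s$d^2 - delta, 0))
      b <- shrunk * t(s$v)              # row k of t(v) scaled by shrunk[k]
      next_free <- sum(shrunk > 0) + 1L # non-zero rows come first
    }
    b[next_free, ] <- a[i, ]
    next_free <- next_free + 1L
  }
  b
}

set.seed(1)
a_demo <- matrix(rnorm(200 * 10), nrow = 200)  # synthetic data
b_demo <- fd_sketch(a_demo, 4)
dim(b_demo)
```

For this variant, the spectral norm of t(a) %*% a - t(b) %*% b is at most 2 * sum(a^2) / l, which is the guarantee that makes a small sketch a useful replacement for the full matrix.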

l = 32 case

# 7291 x 256 -> 32 x 256 matrix
b <- frequentdirections::sketching(x, 32, eps)
frequentdirections::plot_svd(x, y, b)

l = 128 case

# 7291 x 256 -> 128 x 256 matrix
b <- frequentdirections::sketching(x, 128, eps)
frequentdirections::plot_svd(x, y, b)

This result is almost the same as the SVD plot of the original data.

In other words, the original data can be represented well by a sketch with only 128 rows.
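One way to put a number on "almost the same" is the covariance error between the data and its sketch, relative to the squared Frobenius norm of the data. The helper below is a hypothetical illustration and not part of the package. To keep the block self-contained, it uses synthetic data and the best rank-l approximation as a stand-in sketch; with the package, b would instead come from frequentdirections::sketching.

```r
# Hypothetical helper: spectral norm of t(a) %*% a - t(b) %*% b,
# relative to the squared Frobenius norm of a.
sketch_error <- function(a, b) {
  norm(crossprod(a) - crossprod(b), type = "2") / sum(a^2)
}

set.seed(7)
a_demo <- matrix(rnorm(300 * 20), nrow = 300)  # synthetic data

# Stand-in sketch with l rows: the top l singular values times
# the corresponding right singular vectors.
l <- 5
s <- svd(a_demo, nu = 0)
b_demo <- s$d[1:l] * t(s$v[, 1:l])

sketch_error(a_demo, b_demo)
```

A value close to zero means that the sketch preserves the covariance structure of the data, which is what the SVD plots compare visually.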
