
eimpute: Efficiently IMPUTE Large Scale Incomplete Matrix

Introduction

Matrix completion is a procedure for imputing the missing elements of a matrix using the information in its observed elements. This procedure can be visualized as:

Matrix completion has attracted a lot of attention and is widely applied in:

A computationally efficient R package, eimpute, has been developed for matrix completion. In eimpute, the matrix completion problem is solved by iteratively performing low-rank approximation and data calibration, an approach that enjoys two advantages:

We compare eimpute and softImpute on synthetic datasets \(X_{m \times m}\) in which a proportion \(p\) of the observations is missing. The square matrix \(X_{m \times m}\) is generated by \(X = UV + \epsilon\), where \(U\) and \(V\) are \(m \times r\) and \(r \times m\) matrices whose entries are sampled i.i.d. from the standard normal distribution, and \(\epsilon \sim N(0, r/3)\).
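The data-generating setup described above can be sketched in base R. This is only an illustration under stated assumptions: the helper name simulate_incomplete is hypothetical and not part of eimpute, \(V\) is taken as \(r \times m\) so that \(X\) is square, \(N(0, r/3)\) is read as a normal distribution with variance \(r/3\), and the missing positions are assumed to be chosen uniformly at random.

```r
# Hypothetical helper (not part of eimpute): simulate X = UV + eps and
# hide a proportion p of its entries.
simulate_incomplete <- function(m, r, p, seed = 1) {
  set.seed(seed)
  u <- matrix(rnorm(m * r), nrow = m, ncol = r)  # m x r, standard normal
  v <- matrix(rnorm(r * m), nrow = r, ncol = m)  # r x m, standard normal
  # Assumption: N(0, r/3) denotes variance r/3, so sd = sqrt(r/3).
  eps <- matrix(rnorm(m * m, sd = sqrt(r / 3)), nrow = m, ncol = m)
  x <- u %*% v + eps

  # Assumption: missing entries are chosen uniformly at random.
  missing_idx <- sample.int(m * m, round(p * m * m))
  x_na <- x
  x_na[missing_idx] <- NA

  list(x = x, x_na = x_na, missing_idx = missing_idx)
}

sim <- simulate_incomplete(m = 100, r = 5, p = 0.3)
dim(sim$x_na)          # dimensions of the incomplete matrix
mean(is.na(sim$x_na))  # realised proportion of missing entries
```

Keeping the complete matrix sim$x alongside sim$x_na makes it possible to compute a test error on the hidden entries after imputation.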

In the high-dimensional case, the als method in softImpute is slightly faster than eimpute when the proportion of missing observations is low. As the proportion of missing observations increases, the rsvd method in eimpute outperforms softImpute in both time cost and test error. Comparing the two methods in eimpute, rsvd has a lower time cost than tsvd.

Installation

Install the stable version from CRAN:

install.packages("eimpute")

Install the development version from GitHub:

library(devtools)
install_github("Mamba413/eimpute", build_vignettes = TRUE)

Quick Example

We start with a toy example. Let us generate a small matrix with some missing values via the incomplete.generator function.

library(eimpute)
m <- 6
n <- 5
r <- 3
x_na <- incomplete.generator(m, n, r)
x_na
#>            [,1]       [,2]       [,3]      [,4]       [,5]
#> [1,] -0.8269428  1.2228586         NA        NA         NA
#> [2,] -2.2410010  4.5095165         NA        NA         NA
#> [3,]  0.4499102         NA -0.2818085 0.7718102 -0.8364048
#> [4,]         NA  1.7167365  0.9480745        NA  3.5680208
#> [5,]         NA  0.7240437         NA        NA  0.2633712
#> [6,]         NA -2.8879249         NA 1.2027552         NA

Use the eimpute function to impute the missing values.

x_impute <- eimpute(x_na, r)
x_impute[["x.imp"]]
#>            [,1]       [,2]        [,3]      [,4]       [,5]
#> [1,] -0.8269428  1.2228586  0.19035820 0.9514541  0.2994880
#> [2,] -2.2410010  4.5095165  0.39560039 0.7295574  0.4911418
#> [3,]  0.4499102 -1.2083884 -0.28180850 0.7718102 -0.8364048
#> [4,] -0.3408353  1.7167365  0.94807452 0.1835412  3.5680208
#> [5,] -0.3669454  0.7240437  0.11988844 0.3294654  0.2633712
#> [6,]  1.3875965 -2.8879249  0.01871091 1.2027552  0.4512052
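To make the idea of iterating low-rank approximation and data calibration more concrete, here is a minimal base-R sketch. It is not the eimpute implementation: the function name impute_sketch, the mean-based starting values, and the stopping rule are all assumptions made for illustration, and "data calibration" is interpreted here simply as putting the observed entries back unchanged after each low-rank step.

```r
# Hypothetical helper (not the eimpute implementation): alternate a rank-r
# truncated SVD with resetting the observed entries.
impute_sketch <- function(x_na, r, maxit = 100, thresh = 1e-5) {
  observed <- !is.na(x_na)
  z <- x_na
  z[!observed] <- mean(x_na[observed])  # crude starting values (assumption)

  for (iter in seq_len(maxit)) {
    # Low-rank approximation: keep the r leading singular triplets.
    s <- svd(z, nu = r, nv = r)
    low_rank <- s$u %*% (s$d[seq_len(r)] * t(s$v))

    # Data calibration (as interpreted here): restore the observed entries.
    z_new <- low_rank
    z_new[observed] <- x_na[observed]

    # Stop when the relative change between iterates is small.
    change <- sum((z_new - z)^2) / max(sum(z^2), .Machine$double.eps)
    z <- z_new
    if (change < thresh) break
  }
  z
}

# Small rank-2 example with roughly 20% of the entries hidden.
set.seed(1)
x_full <- matrix(rnorm(20 * 2), 20, 2) %*% matrix(rnorm(2 * 15), 2, 15)
x_miss <- x_full
x_miss[sample.int(length(x_full), 60)] <- NA
x_hat <- impute_sketch(x_miss, r = 2)
anyNA(x_hat)  # FALSE: every entry has been filled in
```

As in the eimpute output above, the observed entries of the result are identical to those of the input; only the missing positions are filled with values from the low-rank fit.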
