
nmfbin: Non-Negative Matrix Factorization for Binary Data

The nmfbin R package provides a simple Non-Negative Matrix Factorization (NMF) implementation tailored for binary data matrices. It offers a choice of initialization methods, loss functions and updating algorithms.

NMF is typically used to approximate a high-dimensional matrix by a product of lower-rank factors, where the rank k is chosen by the user. Given a non-negative matrix X of size \(m \times n\), NMF looks for two non-negative matrices W (\(m \times k\)) and H (\(k \times n\)), such that:

\[X \approx W \times H\]

In topic modelling, W is interpreted as the document-topic matrix and H as the topic-feature matrix.
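
The shapes involved can be checked with a few lines of base R. This is only a sketch of the dimensions in the factorization: the sizes are arbitrary and the random factors stand in for a fitted solution, so nothing here calls nmfbin itself.

```r
# Illustrative sizes only; these values are assumptions, not package defaults
set.seed(1)
m <- 6
n <- 5
k <- 2

# Random non-negative factors standing in for a fitted solution
W <- matrix(runif(m * k), nrow = m, ncol = k)  # m x k
H <- matrix(runif(k * n), nrow = k, ncol = n)  # k x n

# The product has the same shape as X, is non-negative, and has rank at most k
X_hat <- W %*% H
dim(X_hat)
qr(X_hat)$rank <= k
```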

Unlike most other NMF packages, nmfbin is focused on binary (Boolean) data, while keeping the number of dependencies to a minimum. For more information, see the package website (https://michalovadek.github.io/nmfbin/).

Installation

You can install the development version of nmfbin from GitHub with:

# install.packages("remotes")
remotes::install_github("michalovadek/nmfbin")

Usage

The input matrix can only contain 0s and 1s.

# Load the package
library(nmfbin)

# Create a reproducible binary matrix for demonstration
set.seed(123)
X <- matrix(sample(c(0, 1), 100, replace = TRUE), ncol = 10)

# Perform Logistic NMF
results <- nmfbin(X, k = 3, optimizer = "mur", init = "nndsvd", max_iter = 1000)
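
Because the input may only contain 0s and 1s, it can help to check a matrix before fitting. The following is a minimal base-R sketch; the helper `is_binary_matrix` is hypothetical and not part of nmfbin, and thresholding at zero is just one possible way to binarize count data.

```r
# Hypothetical helper (not part of nmfbin):
# TRUE if x is a numeric matrix with no missing values and only 0s and 1s
is_binary_matrix <- function(x) {
  is.matrix(x) && is.numeric(x) && !anyNA(x) && all(x %in% c(0, 1))
}

# A small count matrix that is not yet binary
counts <- matrix(c(0, 2, 1, 0, 5, 0), nrow = 2)
is_binary_matrix(counts)  # FALSE, because of the entries 2 and 5

# One possible binarization: mark every non-zero entry as 1
X_bin <- (counts > 0) * 1
is_binary_matrix(X_bin)   # TRUE
```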

Citation

@Manual{,
  title = {nmfbin: Non-Negative Matrix Factorization for Binary Data},
  author = {Michal Ovadek},
  year = {2023},
  note = {R package version 0.2.1},
  url = {https://michalovadek.github.io/nmfbin/},
}

Contributions

Contributions to the nmfbin package are more than welcome. Please submit pull requests or open an issue for discussion.
