dNMF
In this vignette, we consider approximating a binary (i.e., {0,1}) matrix (or a non-negative matrix) as a product of two non-negative low-rank matrices (a.k.a. factor matrices).
Test data is available from toyModel.
library("dcTensor")
## Warning: no DISPLAY variable so Tk is not available
X <- toyModel("NMF")
You will see that there are five blocks in the data matrix as follows.
image(X, main="Original Data")
Here, we consider the approximation of the binary data matrix \(X\) (\(N \times M\)) as the matrix product of \(U\) (\(N \times J\)) and \(V\) (\(M \times J\)):
\[ X \approx U V' \ \mathrm{s.t.}\ U,V \in \{0,1\} \]
This is known as binary matrix factorization (BMF). Zhang et al. (2007) developed BMF by adding binary regularization to non-negative matrix factorization (NMF; Lee and Seung 1999; Cichocki 2009).
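To make the constraint concrete, here is a minimal base R sketch of an exactly binary factorization. The matrices below are a hand-made toy with two non-overlapping blocks, not the data returned by toyModel, and the names N, M, J, U, V and X_toy are chosen only for this illustration.

```r
# Toy illustration (assumed data, not dcTensor::toyModel):
# two non-overlapping blocks encoded by binary factor matrices.
N <- 6; M <- 8; J <- 2
U <- matrix(0, nrow = N, ncol = J)
V <- matrix(0, nrow = M, ncol = J)
U[1:3, 1] <- 1  # rows 1-3 belong to block 1
U[4:6, 2] <- 1  # rows 4-6 belong to block 2
V[1:4, 1] <- 1  # columns 1-4 belong to block 1
V[5:8, 2] <- 1  # columns 5-8 belong to block 2

# The product U V' is itself a binary N x M matrix
X_toy <- U %*% t(V)
dim(X_toy)
all(X_toy %in% c(0, 1))
```

Because the blocks do not overlap, every entry of the product is 0 or 1. BMF works in the opposite direction: given the data matrix, it estimates such \(U\) and \(V\).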
In BMF, the rank parameter \(J\) (\(\leq \min(N, M)\)) must be set in advance. Other settings, such as the number of iterations (num.iter) or the factorization algorithm (algorithm), are also available. For details of the arguments of dNMF, see ?dNMF. After the calculation, dNMF returns a list of objects.
set.seed(123456)
out_BMF <- dNMF(X, Bin_U=1, Bin_V=1, J=5)
str(out_BMF, 2)
## List of 6
## $ U : num [1:100, 1:5] 0.979 0.979 0.979 0.979 0.979 ...
## $ V : num [1:300, 1:5] 0.999 0.999 0.999 0.999 0.999 ...
## $ RecError : Named num [1:101] 1.00e-09 1.34e+01 1.34e+01 1.34e+01 1.34e+01 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TrainRecError: Named num [1:101] 1.00e-09 1.34e+01 1.34e+01 1.34e+01 1.34e+01 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ RelChange : Named num [1:101] 1.00e-09 1.19e-04 1.43e-04 1.49e-04 1.48e-04 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
The reconstruction error (RecError) and the relative change (RelChange, the amount of change from the reconstruction error in the previous step) can be used to diagnose whether the calculation is converging.
layout(t(1:2))
plot(log10(out_BMF$RecError[2:101]), type="b", main="Reconstruction Error")
plot(log10(out_BMF$RelChange[2:101]), type="b", main="Relative Change")
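Besides inspecting the plots, the same diagnosis can be scripted. The following is a minimal sketch in base R; the helper has_converged and the threshold thr = 1e-4 are assumptions made for this illustration and are not part of dcTensor. It relies only on the layout shown in the str output above, where the first element of RelChange is named "offset".

```r
# Hypothetical helper (not part of dcTensor): TRUE if the last recorded
# relative change is below a user-chosen threshold.
has_converged <- function(rel_change, thr = 1e-4) {
  # Drop the initial "offset" entry, which is not an iteration result
  steps <- rel_change[names(rel_change) != "offset"]
  unname(tail(steps, 1) < thr)
}

# Made-up values with the same structure as out_BMF$RelChange
rel_change_demo <- c(offset = 1e-9, "1" = 2.6e-1, "2" = 2.4e-2,
                     "3" = 3.0e-3, "4" = 5.0e-5)
has_converged(rel_change_demo)
```

With a real result, the call would be has_converged(out_BMF$RelChange). A suitable threshold depends on the data and should be chosen by the user.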
The product of \(U\) and \(V\) shows that the original data can be well-recovered by dNMF.
recX <- out_BMF$U %*% t(out_BMF$V)
layout(t(1:2))
image(X, main="Original Data")
image(recX, main="Reconstructed Data (BMF)")
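The visual comparison can be complemented by a single number. The sketch below computes a relative Frobenius-norm error in base R. The helper rel_frobenius_error and the toy matrices are assumptions for illustration only, and this measure is not necessarily the quantity that dNMF stores in RecError.

```r
# Hypothetical helper: ||X - U V'||_F / ||X||_F
rel_frobenius_error <- function(X, U, V) {
  residual <- X - U %*% t(V)
  sqrt(sum(residual^2)) / sqrt(sum(X^2))
}

# Toy example: an exactly binary factorization with two blocks
U_demo <- matrix(c(1, 1, 0, 0,
                   0, 0, 1, 1), ncol = 2)
V_demo <- matrix(c(1, 1, 0,
                   0, 0, 1), ncol = 2)
X_demo <- U_demo %*% t(V_demo)

# Exact factors reproduce the data; shrinking U slightly increases the error
err_exact  <- rel_frobenius_error(X_demo, U_demo, V_demo)
err_shrunk <- rel_frobenius_error(X_demo, 0.979 * U_demo, V_demo)
```

For the fitted model, the corresponding call would be rel_frobenius_error(X, out_BMF$U, out_BMF$V).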
You will also notice that both \(U\) and \(V\) take values close to 0 or 1. The factor matrices are not completely binarized because binarization is achieved by regularization, not by hard thresholding.
layout(t(1:2))
hist(out_BMF$U, breaks=100)
hist(out_BMF$V, breaks=100)
Inspecting the first rows confirms this: the entries of \(U\) and \(V\) are close to, but not exactly, 0 or 1, because the regularization only pushes the values toward {0,1}.
head(out_BMF$U)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.979044 0 0 3.765852e-104 0
## [2,] 0.979044 0 0 4.277338e-106 0
## [3,] 0.979044 0 0 2.301292e-104 0
## [4,] 0.979044 0 0 1.337948e-104 0
## [5,] 0.979044 0 0 9.347010e-105 0
## [6,] 0.979044 0 0 8.698401e-105 0
head(out_BMF$V)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.9988044 1.072698e-56 0 0.0013906118 0.000000e+00
## [2,] 0.9988248 5.888980e-54 0 0.0003300704 1.735858e-209
## [3,] 0.9988038 4.861490e-57 0 0.0014216661 0.000000e+00
## [4,] 0.9988042 5.637380e-58 0 0.0014015649 0.000000e+00
## [5,] 0.9988029 1.288196e-54 0 0.0014663748 0.000000e+00
## [6,] 0.9988025 6.724228e-58 0 0.0014901051 0.000000e+00
If you want exact {0,1} values, use the round function as below:
head(round(out_BMF$U, 0))
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 0 0 0 0
## [2,] 1 0 0 0 0
## [3,] 1 0 0 0 0
## [4,] 1 0 0 0 0
## [5,] 1 0 0 0 0
## [6,] 1 0 0 0 0
head(round(out_BMF$V, 0))
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 0 0 0 0
## [2,] 1 0 0 0 0
## [3,] 1 0 0 0 0
## [4,] 1 0 0 0 0
## [5,] 1 0 0 0 0
## [6,] 1 0 0 0 0
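Before rounding, it can be useful to check how far the factors are from being binary. The following base R sketch does this for a single row. The helper binary_gap is hypothetical, and the input values are copied from the first row of head(out_BMF$U) shown above.

```r
# Hypothetical helper: largest distance of any entry from the set {0, 1}
binary_gap <- function(A) {
  max(pmin(abs(A), abs(A - 1)))
}

# First row of head(out_BMF$U) as printed above
U_row <- c(0.979044, 0, 0, 3.765852e-104, 0)

gap_before <- binary_gap(U_row)            # small but non-zero
gap_after  <- binary_gap(round(U_row, 0))  # rounding removes the gap
```

A small gap suggests that rounding changes the factors only slightly. A large gap would indicate that the regularization was too weak for the values to settle near 0 or 1.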
Next, we consider the approximation of a non-negative data matrix \(X\) (\(N \times M\)) as the matrix product of binary matrix \(U\) (\(N \times J\)) and non-negative matrix \(V\) (\(M \times J\)):
\[ X \approx U V' \ \mathrm{s.t.}\ U \in \{0,1\}, V \geq 0 \]
Here, we call this formulation semi-binary matrix factorization (SBMF). SBMF can capture discrete patterns in a non-negative matrix.
To demonstrate SBMF, we next use a non-negative matrix from the nnTensor package.
library("nnTensor")
##
## Attaching package: 'nnTensor'
## The following object is masked from 'package:dcTensor':
##
## toyModel
X2 <- nnTensor::toyModel("NMF")
You will see that there are five blocks in the data matrix as follows.
image(X2, main="Original Data")
Switching from BMF to SBMF is straightforward: set the binary regularization parameter for \(U\) (Bin_U) to a large value, without setting Bin_V, as shown below:
set.seed(123456)
out_SBMF <- dNMF(X2, Bin_U=1E+6, J=5)
str(out_SBMF, 2)
## List of 6
## $ U : num [1:100, 1:5] 0.978 0.984 0.985 0.988 0.977 ...
## $ V : num [1:300, 1:5] 98.5 100.7 100 101.5 100.6 ...
## $ RecError : Named num [1:101] 1.00e-09 2.99e+03 2.92e+03 2.83e+03 2.77e+03 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TrainRecError: Named num [1:101] 1.00e-09 2.99e+03 2.92e+03 2.83e+03 2.77e+03 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ RelChange : Named num [1:101] 1.00e-09 2.60e-01 2.44e-02 3.18e-02 2.27e-02 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
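Why a large Bin_U pushes \(U\) toward {0,1} can be pictured with a simple penalty function. The sketch below assumes a penalty of the form \((u^2 - u)^2\), which vanishes only at 0 and 1. This form and the names binary_penalty and lambda are assumptions for illustration; see ?dNMF for how the package actually defines its regularization.

```r
# Assumed illustrative penalty (not necessarily dNMF's internal form):
# zero at u = 0 and u = 1, positive everywhere else.
binary_penalty <- function(u) (u^2 - u)^2

u_grid <- c(0, 0.5, 0.978, 1)
penalty <- binary_penalty(u_grid)

# A large weight makes any deviation from {0, 1} expensive
lambda <- 1e6
weighted <- lambda * penalty
```

With this assumed form, an entry of 0.5 costs 62500 at lambda = 1e6, whereas entries at exactly 0 or 1 cost nothing.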
RecError and RelChange can be used to diagnose whether the calculation is converging.
layout(t(1:2))
plot(log10(out_SBMF$RecError[2:101]), type="b", main="Reconstruction Error")
plot(log10(out_SBMF$RelChange[2:101]), type="b", main="Relative Change")
The product of \(U\) and \(V\) shows that the original data can be well-recovered by dNMF.
recX2 <- out_SBMF$U %*% t(out_SBMF$V)
layout(t(1:2))
image(X2, main="Original Data")
image(recX2, main="Reconstructed Data (SBMF)")
You will notice that \(U\) looks binary but \(V\) does not.
layout(t(1:2))
hist(out_SBMF$U, breaks=100)
hist(out_SBMF$V, breaks=100)
Finally, we extend binary regularization to ternary regularization, so that \(U\) takes values in {0,1,2}:
\[ X \approx U V' \ \mathrm{s.t.}\ U \in \{0,1,2\}, V \geq 0 \]
where \(X\) (\(N \times M\)) is a non-negative data matrix, \(U\) (\(N \times J\)) is a ternary matrix, and \(V\) (\(M \times J\)) is a non-negative matrix.
We refer to this as semi-ternary matrix factorization (STMF). STMF is achieved by setting the ternary regularization parameter (Ter_U) to a large value, as shown below:
set.seed(123456)
out_STMF <- dNMF(X2, Ter_U=1E+6, J=5)
str(out_STMF, 2)
## List of 6
## $ U : num [1:100, 1:5] 2.02 2.02 2.02 2.02 2.02 ...
## $ V : num [1:300, 1:5] 48.7 49.8 49.5 50.2 49.8 ...
## $ RecError : Named num [1:101] 1.00e-09 2.52e+03 2.58e+03 2.59e+03 2.58e+03 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TrainRecError: Named num [1:101] 1.00e-09 2.52e+03 2.58e+03 2.59e+03 2.58e+03 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ RelChange : Named num [1:101] 1.00e-09 1.22e-01 2.12e-02 3.09e-03 2.03e-03 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
RecError and RelChange can be used to diagnose whether the calculation is converging.
layout(t(1:2))
plot(log10(out_STMF$RecError[2:101]), type="b", main="Reconstruction Error")
plot(log10(out_STMF$RelChange[2:101]), type="b", main="Relative Change")
The product of \(U\) and \(V\) shows that the original data can be well-recovered by dNMF.
recX <- out_STMF$U %*% t(out_STMF$V)
layout(t(1:2))
# STMF was fitted to X2, so compare against X2 rather than X
image(X2, main="Original Data")
image(recX, main="Reconstructed Data (STMF)")
You will notice that \(U\) looks ternary but \(V\) does not.
layout(t(1:2))
hist(out_STMF$U, breaks=100)
hist(out_STMF$V, breaks=100)
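As with BMF, the regularized \(U\) is only approximately ternary (for example, the str output above shows values of about 2.02). If exact {0,1,2} values are needed, each entry can be snapped to the nearest allowed level. The base R helper below, snap_to_levels, is a hypothetical sketch and not a dcTensor function.

```r
# Hypothetical helper: replace each entry by the nearest value in `levels`
snap_to_levels <- function(A, levels = c(0, 1, 2)) {
  idx <- vapply(A, function(a) which.min(abs(a - levels)), integer(1))
  out <- levels[idx]
  dim(out) <- dim(A)  # restore the matrix shape
  out
}

# Made-up values resembling a regularized ternary factor
U_demo <- matrix(c(2.02, 0.01, 0.98, 2.6), nrow = 2)
U_snapped <- snap_to_levels(U_demo)
```

Unlike round, this keeps values above 2.5 at 2 instead of moving them to 3. For a fitted model, the call would be snap_to_levels(out_STMF$U).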
## R version 4.2.2 (2022-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux bookworm/sid
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.21.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] nnTensor_1.1.10 dcTensor_0.99.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.9 highr_0.10 RColorBrewer_1.1-3 bslib_0.4.2
## [5] compiler_4.2.2 pillar_1.8.1 jquerylib_0.1.4 rTensor_1.4.8
## [9] viridis_0.6.2 tools_4.2.2 digest_0.6.31 dotCall64_1.0-2
## [13] jsonlite_1.8.4 evaluate_0.19 lifecycle_1.0.3 tibble_3.1.8
## [17] gtable_0.3.1 viridisLite_0.4.1 pkgconfig_2.0.3 rlang_1.0.6
## [21] cli_3.5.0 yaml_2.3.6 spam_2.9-1 xfun_0.36
## [25] fastmap_1.1.0 gridExtra_2.3 stringr_1.5.0 knitr_1.41
## [29] vctrs_0.5.1 sass_0.4.4 fields_14.1 maps_3.4.1
## [33] plot3D_1.4 grid_4.2.2 glue_1.6.2 R6_2.5.1
## [37] fansi_1.0.3 tcltk_4.2.2 rmarkdown_2.19 ggplot2_3.4.0
## [41] magrittr_2.0.3 MASS_7.3-58.1 scales_1.2.1 htmltools_0.5.4
## [45] tagcloud_0.6 misc3d_0.9-1 colorspace_2.0-3 utf8_1.2.2
## [49] stringi_1.7.8 munsell_0.5.0 cachem_1.0.6