
Method

This repository provides an implementation of MGLasso (Multiscale Graphical Lasso), an approach for estimating sparse Gaussian graphical models with an additional group-fused Lasso penalty.

MGLasso is described in the paper Inference of Multiscale Gaussian Graphical Model (Sanou, Ambroise and Robin, 2022).

To solve the MGLasso problem, we seek the regression vectors \(\beta^i\) that minimize

$J_{\lambda_1, \lambda_2}(\boldsymbol{\beta}; \mathbf{X}) = \frac{1}{2} \sum_{i=1}^{p} \left\lVert \mathbf{X}^i - \mathbf{X}^{\setminus i} \boldsymbol{\beta}^i \right\rVert_2^2 + \lambda_1 \sum_{i=1}^{p} \left\lVert \boldsymbol{\beta}^i \right\rVert_1 + \lambda_2 \sum_{i < j} \left\lVert \boldsymbol{\beta}^i - \tau_{ij}(\boldsymbol{\beta}^j) \right\rVert_2.$
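
As a sanity check on the notation, the sketch below evaluates this objective in R (an illustration, not the package's internal code); it assumes beta is a \(p \times p\) matrix whose \(i\)-th row stores \(\beta^i\) with a zero diagonal, and that \(\tau_{ij}\) swaps the \(i\)-th and \(j\)-th entries of \(\beta^j\) so the two vectors are compared on matching coordinates.

# Minimal sketch of the MGLasso objective (illustration only).
# beta: p x p matrix, row i = beta^i, zero diagonal.
mglasso_objective <- function(X, beta, lambda1, lambda2) {
  p <- ncol(X)
  # Residual sum of squares of the p nodewise regressions
  fit <- sum(sapply(1:p, function(i) {
    0.5 * sum((X[, i] - X[, -i, drop = FALSE] %*% beta[i, -i])^2)
  }))
  # Lasso penalty on all regression coefficients
  l1 <- lambda1 * sum(abs(beta))
  # Group-fused penalty between every pair of (permuted) regression vectors
  fuse <- 0
  for (i in 1:(p - 1)) {
    for (j in (i + 1):p) {
      bj <- beta[j, ]
      bj[c(i, j)] <- bj[c(j, i)]  # tau_ij: swap entries i and j of beta^j
      fuse <- fuse + sqrt(sum((beta[i, ] - bj)^2))
    }
  }
  fit + l1 + lambda2 * fuse
}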

The MGLasso package is based on the Python implementation of the CONESTA solver available in the pylearn-parsimony library.

Package requirements and installation

# Install reticulate to manage the Python dependency
install.packages('reticulate')
reticulate::install_miniconda()
# Install mglasso from CRAN, or the development version from GitHub
# install.packages('mglasso')
remotes::install_github("desanou/mglasso")
library(mglasso)
# Create a conda environment with pylearn-parsimony and point reticulate to it
install_pylearn_parsimony(envname = "rmglasso", method = "conda")
reticulate::use_condaenv("rmglasso", required = TRUE)
reticulate::py_config()

The conesta_solver module is loaded lazily on first use. See reticulate::import_from_path for details.
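
Because the Python side is loaded lazily, an import problem only surfaces at the first call to the solver. A quick optional check of the environment is sketched below; it relies only on reticulate, and assumes that parsimony is the Python import name of pylearn-parsimony.

# Optional check: is the conda environment active and pylearn-parsimony importable?
reticulate::use_condaenv("rmglasso", required = TRUE)
if (reticulate::py_module_available("parsimony")) {
  message("pylearn-parsimony is available: the CONESTA solver can be loaded.")
} else {
  message("pylearn-parsimony not found: re-run install_pylearn_parsimony().")
}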

An example of use is given below.

Illustration on a simple model

Simulate a block diagonal model

We simulate a \(3\)-block diagonal model where each block contains \(3\) variables. The intra-block correlation level is set to \(0.85\) while the correlations outside the blocks are kept to \(0\).

library(Matrix)
n = 50      # sample size
K = 3       # number of blocks
p = 9       # number of variables
rho = 0.85  # intra-block correlation
blocs <- list()

# Build K blocks of size p/K with correlation rho off the diagonal and 1 on it
for (j in 1:K) {
  bloc <- matrix(rho, nrow = p/K, ncol = p/K)
  for(i in 1:(p/K)) { bloc[i,i] <- 1 }
  blocs[[j]] <- bloc
}

mat.correlation <- Matrix::bdiag(blocs)
corrplot::corrplot(as.matrix(mat.correlation), method = "color", tl.col="black")

Simulate Gaussian data from the covariance matrix

set.seed(11)
X <- mvtnorm::rmvnorm(n, mean = rep(0,p), sigma = as.matrix(mat.correlation))
colnames(X) <- LETTERS[1:9]

Run mglasso()

We set the sparsity level \(\lambda_1\) to \(0.2\) and rescale it by the sample size \(n\).

X <- scale(X)    
res <- mglasso(X, lambda1 = 0.2*n, lambda2_start = 0.1, fuse_thresh = 1e-3, verbose = FALSE)

To perform a single optimization of the objective for fixed penalty values, call the conesta function.

temp <- mglasso::conesta(X, lam1 = 0.2*n, lam2 = 0.1)

Estimated clustering path

We plot the clustering path of the mglasso method on the first two principal component axes of \(X\). The path is drawn on the fitted values of \(X\).

library(ggplot2)
library(ggrepel)
mglasso:::plot_clusterpath(as.matrix(X), res)

Estimated adjacency matrices along the clustering path

As the fusion penalty increases from level9 to level1, we observe a progressive fusion of adjacent edges.

plot_mglasso(res)
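
To make the link between the estimated regression vectors and the displayed adjacency matrices explicit, the sketch below turns a \(p \times p\) matrix of regression coefficients into a binary adjacency matrix by thresholding and OR-rule symmetrization, as is standard in neighborhood selection. It illustrates the principle only and is not the code used by plot_mglasso.

# Illustrative sketch: threshold a p x p coefficient matrix (rows beta^i,
# zero diagonal) and symmetrize it with the OR rule to get an adjacency matrix.
beta_to_adjacency <- function(beta, thresh = 1e-3) {
  adj <- (abs(beta) > thresh) | (abs(t(beta)) > thresh)  # edge if either direction is selected
  diag(adj) <- FALSE                                     # no self-loops
  adj * 1
}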

Reference

Sanou, Edmond; Ambroise, Christophe; Robin, Geneviève (2022). Inference of Multiscale Gaussian Graphical Model. arXiv preprint. https://doi.org/10.48550/arXiv.2202.05775
