The missoNet package implements a framework for multitask learning with missing responses. It jointly estimates two components:
Regression coefficients (\(\mathbf{B}\)): Relationships between predictors and multiple responses
Conditional network (\(\Theta\)): Dependencies among responses after accounting for predictors
The conditional Gaussian model is: \[ \mathbf{Y} = \mathbf{1}\mu^T + \mathbf{X}\mathbf{B} + \mathbf{E}, \quad \mathbf{E} \sim \mathrm{MVN}(0, \Theta^{-1}) \] where:
\(\mathbf{Y} \in \mathbb{R}^{n \times q}\): Response matrix (may contain missing values)
\(\mathbf{X} \in \mathbb{R}^{n \times p}\): Predictor matrix (complete)
\(\mathbf{B} \in \mathbb{R}^{p \times q}\): Coefficient matrix
\(\Theta \in \mathbb{R}^{q \times q}\): Precision matrix (inverse covariance)
\(\mu \in \mathbb{R}^q\): Intercept vector
For theoretical details, see Zeng et al. (2025).
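To make the model concrete, the following base-R sketch draws one data set from the conditional Gaussian model above. It does not use missoNet; the dimensions, the sparse \(\mathbf{B}\), and the tridiagonal \(\Theta\) are illustrative assumptions chosen only to keep the example small.

```r
# Minimal base-R sketch of the conditional Gaussian model.
# All values below are illustrative assumptions, not package defaults.
set.seed(1)
n <- 200; p <- 5; q <- 3

X  <- matrix(rnorm(n * p), n, p)   # complete predictor matrix
B  <- matrix(0, p, q)              # sparse coefficient matrix
B[1, 1] <- 1.5
B[2, 3] <- -1
mu <- c(0.5, 0, -0.5)              # intercept vector

# Tridiagonal precision matrix: responses (1,2) and (2,3) are
# conditionally dependent, responses (1,3) are not
Theta <- diag(q)
Theta[cbind(1:(q - 1), 2:q)] <- 0.4
Theta[cbind(2:q, 1:(q - 1))] <- 0.4

# Draw E ~ MVN(0, Theta^{-1}) using the Cholesky factor of the covariance
Sigma <- solve(Theta)
E <- matrix(rnorm(n * q), n, q) %*% chol(Sigma)

# Y = 1 mu^T + X B + E
Y <- matrix(mu, n, q, byrow = TRUE) + X %*% B + E
dim(Y)
```

A zero in `Theta` (here the (1, 3) entry) means the two responses are conditionally independent given the predictors and the remaining responses, which is what the estimated network encodes.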
The package includes a flexible data generator for testing:
# Generate synthetic data
sim <- generateData(
n = 200, # Sample size
p = 50, # Number of predictors
q = 10, # Number of responses
rho = 0.1, # Missing rate (10%)
missing.type = "MCAR" # Missing completely at random
)
# Examine the data structure
str(sim, max.level = 1)
#> List of 7
#> $ X : num [1:200, 1:50] -0.424 0.84 -2.546 1.825 1.217 ...
#> $ Y : num [1:200, 1:10] -0.0884 -0.3687 2.7607 -2.1025 3.2892 ...
#> $ Z : num [1:200, 1:10] -0.0884 -0.3687 2.7607 -2.1025 3.2892 ...
#> $ Beta : num [1:50, 1:10] 0 0 0 0 0 0 0 0 0 0 ...
#> $ Theta : num [1:10, 1:10] 1 0 0 0 0 0 0 0 0 0 ...
#> $ rho : num [1:10] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
#> $ missing.type: chr "MCAR"
#> - attr(*, "class")= chr "missoNet.sim"
# Check dimensions
cat("Predictors (X):", dim(sim$X), "\n")
#> Predictors (X): 200 50
cat("Complete responses (Y):", dim(sim$Y), "\n")
#> Complete responses (Y): 200 10
cat("Observed responses (Z):", dim(sim$Z), "\n")
#> Observed responses (Z): 200 10
cat("Missing rate:", sprintf("%.1f%%", mean(is.na(sim$Z)) * 100), "\n")
#> Missing rate: 10.0%
# Fit missoNet with automatic parameter selection
fit <- missoNet(
X = sim$X,
Y = sim$Z, # Use observed responses with missing values
GoF = "BIC" # Goodness-of-fit criterion
)
#>
#> =============================================================
#> missoNet
#> =============================================================
#>
#> > Initializing model...
#>
#> --- Model Configuration -------------------------------------
#> Data dimensions: n = 200, p = 50, q = 10
#> Missing rate (avg): 10.0%
#> Selection criterion: BIC
#> Lambda grid: standard (dense)
#> Lambda grid size: 50 x 50 = 2500 models
#> -------------------------------------------------------------
#>
#> --- Optimization Progress -----------------------------------
#> Stage 1: Initializing warm starts
#> Stage 2: Grid search (sequential)
#> -------------------------------------------------------------
#>
#> |==================================================| 100%
#>
#> -------------------------------------------------------------
#>
#> > Refitting optimal model ...
#>
#>
#> --- Optimization Results ------------------------------------
#> Optimal lambda.beta: 5.8376e-01
#> Optimal lambda.theta: 1.2260e-01
#> BIC value: 4305.2563
#> Active predictors: 31 / 50 (62.0%)
#> Network edges: 3 / 45 (6.7%)
#> -------------------------------------------------------------
#>
#> =============================================================
# Extract optimal estimates
Beta.hat <- fit$est.min$Beta
Theta.hat <- fit$est.min$Theta
mu.hat <- fit$est.min$mu
# Model summary
cat("Selected lambda.beta:", fit$est.min$lambda.beta, "\n")
#> Selected lambda.beta: 0.5837619
cat("Selected lambda.theta:", fit$est.min$lambda.theta, "\n")
#> Selected lambda.theta: 0.1225953
cat("Active predictors:", sum(rowSums(abs(Beta.hat)) > 1e-8), "/", nrow(Beta.hat), "\n")
#> Active predictors: 31 / 50
cat("Network edges:", sum(abs(Theta.hat[upper.tri(Theta.hat)]) > 1e-8),
"/", ncol(Theta.hat) * (ncol(Theta.hat)-1) / 2, "\n")
#> Network edges: 3 / 45
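The off-diagonal entries of a precision matrix can be rescaled into partial correlations, which are often easier to interpret than raw entries of \(\Theta\). The sketch below uses a small hand-written matrix rather than `Theta.hat`, so the numbers are illustrative only; the same lines could be applied to an estimated matrix.

```r
# Toy precision matrix (illustrative values, not package output)
Theta <- matrix(c( 2.0, -0.8, 0.0,
                  -0.8,  2.0, 0.5,
                   0.0,  0.5, 1.5), nrow = 3, byrow = TRUE)

# Partial correlation between responses i and j given all other responses:
# pcor_ij = -Theta_ij / sqrt(Theta_ii * Theta_jj)
d <- sqrt(diag(Theta))
pcor <- -Theta / outer(d, d)
diag(pcor) <- 1

# Edge list: upper-triangular entries whose magnitude exceeds a small threshold
edges <- which(abs(Theta) > 1e-8 & upper.tri(Theta), arr.ind = TRUE)
edges
```

Note the sign flip: a negative entry in `Theta` corresponds to a positive partial correlation.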
# Split data for demonstration
train_idx <- 1:150
test_idx <- 151:200
# Refit on training data
fit_train <- missoNet(
X = sim$X[train_idx, ],
Y = sim$Z[train_idx, ],
GoF = "BIC",
verbose = 0 # Suppress output
)
# Predict on test data
Y_pred <- predict(fit_train, newx = sim$X[test_idx, ])
# Evaluate predictions (using complete data for comparison)
mse <- mean((Y_pred - sim$Y[test_idx, ])^2)
cat("Test set MSE:", round(mse, 4), "\n")
#> Test set MSE: 1.2326
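The comparison above relies on the complete responses `sim$Y`, which exist only for simulated data. With real data, the held-out responses usually contain missing values too, so one option is to score predictions on the observed cells only. The sketch below uses stand-in matrices (`Y_true`, `Y_pred`, `Z_test` are hypothetical names) so that it runs without a fitted model.

```r
set.seed(2)
Y_true <- matrix(rnorm(50 * 4), 50, 4)                      # stand-in for held-out responses
Y_pred <- Y_true + matrix(rnorm(50 * 4, sd = 0.5), 50, 4)   # stand-in for predictions

# Mask roughly 20% of the held-out responses to mimic missing values
Z_test <- Y_true
Z_test[matrix(runif(50 * 4) < 0.2, 50, 4)] <- NA

# Overall MSE restricted to observed cells
obs <- !is.na(Z_test)
mse_obs <- mean((Y_pred[obs] - Z_test[obs])^2)

# Per-response MSE, skipping missing cells
mse_by_response <- colMeans((Y_pred - Z_test)^2, na.rm = TRUE)
```

If the missingness depends on the response values themselves, an observed-only MSE can be biased, so it is best read as a rough diagnostic.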
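missoNet handles three missing-data mechanisms: missing completely at random (MCAR), missing at random (MAR, where missingness depends on the predictors), and missing not at random (MNAR, where missingness depends on the response values themselves):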
# Generate data with different missing mechanisms
n <- 300; p <- 30; q <- 8; rho <- 0.15
sim_mcar <- generateData(n, p, q, rho, missing.type = "MCAR")
sim_mar <- generateData(n, p, q, rho, missing.type = "MAR")
sim_mnar <- generateData(n, p, q, rho, missing.type = "MNAR")
# Visualize missing patterns
par(mfrow = c(1, 3), mar = c(4, 4, 3, 1))
# MCAR pattern
image(1:q, 1:n, t(is.na(sim_mcar$Z)),
col = c("white", "darkred"),
xlab = "Response", ylab = "Observation",
main = "MCAR: Random Pattern")
# MAR pattern
image(1:q, 1:n, t(is.na(sim_mar$Z)),
col = c("white", "darkred"),
xlab = "Response", ylab = "Observation",
main = "MAR: Depends on X")
# MNAR pattern
image(1:q, 1:n, t(is.na(sim_mnar$Z)),
col = c("white", "darkred"),
xlab = "Response", ylab = "Observation",
main = "MNAR: Depends on Y")
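To illustrate how the three mechanisms differ, here is a base-R sketch that builds one missingness mask of each kind for a single response. It is not how `generateData()` works internally; the logistic form and the slope of 1.5 are assumptions made for illustration.

```r
set.seed(3)
n <- 300
x <- rnorm(n)              # a fully observed predictor
y <- 0.8 * x + rnorm(n)    # a response that will be partially masked
rho <- 0.15                # baseline missing probability

# MCAR: every entry has the same probability of being missing
miss_mcar <- runif(n) < rho

# MAR: missingness probability depends only on the observed predictor x
# (rho is the probability at x = 0; the marginal rate will differ somewhat)
p_mar <- plogis(qlogis(rho) + 1.5 * x)
miss_mar <- runif(n) < p_mar

# MNAR: missingness probability depends on the value of y itself
p_mnar <- plogis(qlogis(rho) + 1.5 * y)
miss_mnar <- runif(n) < p_mnar

# Apply one of the masks to obtain an observed response with NAs
z_mcar <- ifelse(miss_mcar, NA, y)
```

Under MNAR, large values of `y` are more likely to be hidden, so the observed values are no longer representative of the full distribution.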
# Fit with different criteria
criteria <- c("AIC", "BIC", "eBIC")
results <- list()
for (crit in criteria) {
results[[crit]] <- missoNet(
X = sim$X,
Y = sim$Z,
GoF = crit,
verbose = 0
)
}
# Compare selected models
comparison <- data.frame(
Criterion = criteria,
Lambda.Beta = sapply(results, function(x) x$est.min$lambda.beta),
Lambda.Theta = sapply(results, function(x) x$est.min$lambda.theta),
Active.Predictors = sapply(results, function(x)
sum(rowSums(abs(x$est.min$Beta)) > 1e-8)),
Network.Edges = sapply(results, function(x)
sum(abs(x$est.min$Theta[upper.tri(x$est.min$Theta)]) > 1e-8)),
GoF.Value = sapply(results, function(x) x$est.min$gof)
)
print(comparison, digits = 4)
#> Criterion Lambda.Beta Lambda.Theta Active.Predictors Network.Edges GoF.Value
#> AIC AIC 0.3770 0.007715 42 34 4021
#> BIC BIC 0.5838 0.122595 31 3 4305
#> eBIC eBIC 0.5838 0.122595 31 3 4483
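The pattern in this table (AIC selecting the densest model) matches the textbook forms of the criteria, given below for reference. Here \(\ell\) is the maximized log-likelihood, \(k\) the number of nonzero parameters, and \(n\) the sample size. These are the generic definitions; the exact likelihood and degrees-of-freedom count that missoNet uses may differ.

```latex
\[
\mathrm{AIC} = -2\ell + 2k, \qquad
\mathrm{BIC} = -2\ell + k \log n .
\]
```

Since \(\log n > 2\) once \(n \ge 8\), BIC charges more per parameter than AIC and tends to select sparser models. Extended BIC variants add a further nonnegative term that grows with the number of candidate parameters, which is consistent with the eBIC value above being larger than the BIC value for the same selected model.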
# Define custom regularization paths
lambda.beta <- 10^seq(0, -2, length.out = 15)
lambda.theta <- 10^seq(0, -2, length.out = 15)
# Fit with custom grid
fit_custom <- missoNet(
X = sim$X,
Y = sim$Z,
lambda.beta = lambda.beta,
lambda.theta = lambda.theta,
verbose = 0
)
# Grid coverage summary
cat(" Beta range: [",
sprintf("%.4f", min(fit_custom$param_set$gof.grid.beta)), ", ",
sprintf("%.4f", max(fit_custom$param_set$gof.grid.beta)), "]\n", sep = "")
#> Beta range: [0.0100, 1.0000]
cat(" Theta range: [",
sprintf("%.4f", min(fit_custom$param_set$gof.grid.theta)), ", ",
sprintf("%.4f", max(fit_custom$param_set$gof.grid.theta)), "]\n", sep = "")
#> Theta range: [0.0100, 1.0000]
cat(" Total models evaluated:", length(fit_custom$param_set$gof), "\n")
#> Total models evaluated: 225
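The count of 225 follows from crossing the two paths: each pair of penalty values is one candidate model. A short base-R check, independent of missoNet, shows the log-spaced structure of the paths and the size of the resulting grid.

```r
lambda.beta  <- 10^seq(0, -2, length.out = 15)
lambda.theta <- 10^seq(0, -2, length.out = 15)

# Consecutive values share a constant ratio (equal spacing on the log10 scale)
ratios <- lambda.beta[-1] / lambda.beta[-length(lambda.beta)]
range(ratios)

# Every (lambda.beta, lambda.theta) pair is one candidate model
grid <- expand.grid(lambda.beta = lambda.beta, lambda.theta = lambda.theta)
nrow(grid)
#> [1] 225
```

Log spacing places more grid points at small penalty values, where the fitted model typically changes fastest.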
# Create data with variable missing rates across responses
n <- 300; p <- 30; q <- 8
rho_vec <- seq(0.05, 0.30, length.out = q)
sim_var <- generateData(
n = n,
p = p,
q = q,
rho = rho_vec, # Different missing rate for each response
missing.type = "MAR"
)
# Examine missing patterns
miss_summary <- data.frame(
Response = paste0("Y", 1:q),
Target = rho_vec,
Actual = colMeans(is.na(sim_var$Z))
)
print(miss_summary, digits = 3)
#> Response Target Actual
#> 1 Y1 0.0500 0.0367
#> 2 Y2 0.0857 0.0500
#> 3 Y3 0.1214 0.1167
#> 4 Y4 0.1571 0.1633
#> 5 Y5 0.1929 0.2000
#> 6 Y6 0.2286 0.2267
#> 7 Y7 0.2643 0.2367
#> 8 Y8 0.3000 0.3033
# Fit model accounting for variable missingness
fit_var <- missoNet(
X = sim_var$X,
Y = sim_var$Z,
adaptive.search = TRUE, # Fast adaptive search
verbose = 0
)
# Visualize
plot(fit_var)
# Use penalty factors to incorporate prior information
p <- ncol(sim$X)
q <- ncol(sim$Z)
# Example: We know predictors 1-10 are important
beta.pen.factor <- matrix(1, p, q)
beta.pen.factor[1:10, ] <- 0.1 # Lighter penalty for known important predictors
# Example: We expect certain response pairs to be connected
theta.pen.factor <- matrix(1, q, q)
theta.pen.factor[1, 2] <- theta.pen.factor[2, 1] <- 0.1
theta.pen.factor[3, 4] <- theta.pen.factor[4, 3] <- 0.1
# Fit with prior information
fit_prior <- missoNet(
X = sim$X,
Y = sim$Z,
beta.pen.factor = beta.pen.factor,
theta.pen.factor = theta.pen.factor
)
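The effect of a penalty factor can be seen in a generic weighted L1 penalty, \(\lambda \sum_{j,k} w_{jk} |B_{jk}|\), where smaller weights make the corresponding entries cheaper to keep in the model. The helper `weighted_l1()` below is a hypothetical illustration of that idea, not missoNet's internal code.

```r
# Illustrative weighted L1 penalty: lambda * sum(w_jk * |B_jk|)
# (hypothetical helper, not part of missoNet)
weighted_l1 <- function(B, lambda, pen.factor = matrix(1, nrow(B), ncol(B))) {
  stopifnot(all(dim(B) == dim(pen.factor)))
  lambda * sum(pen.factor * abs(B))
}

B <- matrix(c(1, -2, 0, 3), 2, 2)   # rows = predictors, columns = responses
w <- matrix(1, 2, 2)
w[1, ] <- 0.1                       # lighter penalty on predictor 1

weighted_l1(B, lambda = 0.5)        # uniform penalty
#> [1] 3
weighted_l1(B, lambda = 0.5, w)     # predictor 1 contributes 10x less
#> [1] 2.55
```

A factor of 0 would leave an entry unpenalized under this generic form; whether missoNet accepts zero factors should be checked against its documentation.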
# Standardization is recommended (default: TRUE)
# for numerical stability and comparable penalties
fit_std <- missoNet(X = sim$X, Y = sim$Z,
standardize = TRUE,
standardize.response = TRUE)
# Without standardization (for pre-scaled data)
fit_no_std <- missoNet(X = scale(sim$X), Y = scale(sim$Z),
standardize = FALSE,
standardize.response = FALSE)
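When pre-scaling a response matrix that contains missing values, base R's `scale()` skips the `NA` entries when computing column means and standard deviations, and it stores both as attributes so that results can be mapped back to the original units. The sketch below uses a small random matrix as a stand-in for `sim$Z`.

```r
set.seed(4)
Z <- matrix(rnorm(40, mean = 5, sd = 3), 10, 4)
Z[c(2, 7), 1] <- NA                   # introduce two missing responses

Z_std <- scale(Z)                     # NAs are ignored in the mean and sd
ctr <- attr(Z_std, "scaled:center")   # column means used for centering
scl <- attr(Z_std, "scaled:scale")    # column standard deviations used for scaling

# Back-transform to the original scale
Z_back <- sweep(sweep(Z_std, 2, scl, `*`), 2, ctr, `+`)
```

Keeping `ctr` and `scl` is what makes it possible to report predictions in the original units after fitting on standardized data.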
# Adjust convergence settings based on problem difficulty and time constraints
fit_tight <- missoNet(
X = sim$X,
Y = sim$Z,
beta.tol = 1e-6, # Tighter tolerance
theta.tol = 1e-6,
beta.max.iter = 10000, # More iterations allowed
theta.max.iter = 10000
)
# For quick exploration, use looser settings
fit_quick <- missoNet(
X = sim$X,
Y = sim$Z,
beta.tol = 1e-3, # Looser tolerance
theta.tol = 1e-3,
beta.max.iter = 1000, # Fewer iterations
theta.max.iter = 1000,
adaptive.search = TRUE # Fast adaptive search
)
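The trade-off between tolerance and iteration count can be seen in a generic iterative solver. The sketch below is a plain single-response lasso fitted by coordinate descent; it is not missoNet's algorithm, and `cd_lasso()`, `tol`, and `max.iter` are hypothetical names that mirror `beta.tol` and `beta.max.iter` only by analogy.

```r
# Soft-thresholding operator used by the lasso update
soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Coordinate descent for (1 / (2n)) * ||y - X b||^2 + lambda * ||b||_1
# (illustrative solver, not part of missoNet)
cd_lasso <- function(X, y, lambda, tol = 1e-4, max.iter = 1000) {
  n <- nrow(X)
  beta <- numeric(ncol(X))
  r <- y                                  # residual for beta = 0
  xs <- colSums(X^2) / n
  for (it in seq_len(max.iter)) {
    beta_old <- beta
    for (j in seq_along(beta)) {
      r <- r + X[, j] * beta[j]           # remove coordinate j from the fit
      beta[j] <- soft(sum(X[, j] * r) / n, lambda) / xs[j]
      r <- r - X[, j] * beta[j]           # add the updated coordinate back
    }
    # Stop once the largest coefficient change falls below the tolerance
    if (max(abs(beta - beta_old)) < tol) break
  }
  list(beta = beta, iterations = it,
       converged = max(abs(beta - beta_old)) < tol)
}

set.seed(5)
X <- matrix(rnorm(100 * 5), 100, 5)
y <- 2 * X[, 1] + rnorm(100)

loose <- cd_lasso(X, y, lambda = 0.1, tol = 1e-3)
tight <- cd_lasso(X, y, lambda = 0.1, tol = 1e-8)
c(loose = loose$iterations, tight = tight$iterations)
```

A tighter tolerance can only increase the number of sweeps, so loose settings suit exploration and tight settings suit a final fit.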