dicepro - Hyperparameter Search Space Visualization

dicepro Team

2026-06-24

Note: All code chunks have eval = FALSE and are shown for illustration only. To run them interactively:

library(dicepro)
# copy-paste the chunks below into your R session

1 Overview

This vignette explains the two hyperparameter search space strategies available in dicepro and shows how to visualize the resulting \((\gamma, \lambda)\) distributions with create_gamma_lambda_plot().

The hspaceTechniqueChoose argument controls which strategy is used, both in run_experiment() and in the plot function.


2 The Two Strategies

2.1 "all" - Independent sampling

\(\lambda\) and \(\gamma\) are each drawn independently from their own log-uniform distribution:

Parameter    Distribution    Range
lambda_      Log-uniform     \([1,\; 10^8]\)
gamma        Log-uniform     \([1,\; 10^8]\)
p_prime      Log-uniform     \([10^{-6},\; 1]\)

No structural constraint links the two parameters. The resulting \((\gamma, \lambda)\) cloud fills the entire feasible rectangle uniformly on a log-log scale.
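
A log-uniform draw can be sketched in base R by sampling the exponent uniformly and exponentiating. The helper rlogunif() below is a hypothetical name used for illustration only; it is not part of dicepro, and the package may sample differently internally.

```r
# Hypothetical helper (not a dicepro function): log-uniform sampling on [min, max].
rlogunif <- function(n, min, max) {
  stopifnot(min > 0, max > min)
  # Uniform on the log10 scale, then map back to the original scale.
  10^runif(n, log10(min), log10(max))
}

set.seed(1)
n <- 200L
space_all <- data.frame(
  lambda_ = rlogunif(n, 1, 1e8),
  gamma   = rlogunif(n, 1, 1e8),
  p_prime = rlogunif(n, 1e-6, 1)
)

# Each column is drawn on its own; nothing ties lambda_ to gamma.
summary(log10(space_all))
```

Sampling the exponent rather than the value itself is what makes every decade equally likely, which is why the cloud looks uniform only on log-log axes.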

2.2 "restrictionEspace" - Linked sampling

\(\gamma\) is the base variable; \(\lambda\) is derived via:

\[\lambda = \gamma \times \lambda_\text{factor}, \quad \lambda_\text{factor} \sim \text{LogUniform}(2,\; 100)\]

Parameter        Distribution    Range
gamma            Log-uniform     \([1,\; 10^5]\)
lambda_factor    Log-uniform     \([2,\; 100]\)
p_prime          Log-uniform     \([0.1,\; 1]\)

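This guarantees \(\lambda \geq 2\gamma\) at all times. Because \(\lambda_\text{factor} \in [2,\; 100]\), the feasible region is bounded by two diagonal lines in the log-log plane: \(\lambda = 2\gamma\) (lower boundary) and \(\lambda = 100\gamma\) (upper boundary).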
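
The linked construction can be sketched in base R as follows. This is an illustration built from the ranges in the table above, not dicepro's internal code; rlogunif() is a hypothetical helper name.

```r
# Hypothetical helper (not a dicepro function): log-uniform sampling on [min, max].
rlogunif <- function(n, min, max) 10^runif(n, log10(min), log10(max))

set.seed(1)
n <- 200L
gamma         <- rlogunif(n, 1, 1e5)    # base variable
lambda_factor <- rlogunif(n, 2, 100)    # multiplicative link
lambda_       <- gamma * lambda_factor  # derived, never sampled directly

# The ratio lambda_ / gamma is the factor itself, so it stays between 2 and 100.
range(lambda_ / gamma)
all(lambda_ >= 2 * gamma & lambda_ <= 100 * gamma)
```

A consequence of these ranges is that \(\lambda\) can never exceed \(10^5 \times 100 = 10^7\), so the restricted space also covers a smaller range of \(\lambda\) than the "all" strategy.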


3 Visualizing the Search Space

create_gamma_lambda_plot() samples 200 configurations (by default) and renders them as a scatter plot on log-log axes.

3.1 "all" - Independent space

library(dicepro)

p_all <- create_gamma_lambda_plot(hspaceTechniqueChoose = "all")
p_all

The cloud fills the square \([1, 10^8]^2\) uniformly, with no structural relationship between \(\gamma\) and \(\lambda\).

3.2 "restrictionEspace" - Restricted space

p_restr <- create_gamma_lambda_plot(hspaceTechniqueChoose = "restrictionEspace")
p_restr

All points fall within the diagonal band delimited by the two dashed lines. On log-log axes, the linear relationships \(\lambda = c \cdot \gamma\) appear as parallel straight lines.
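
To see why the boundaries are parallel, take base-10 logarithms of a boundary relation \(\lambda = c \cdot \gamma\):

\[\log_{10}\lambda = \log_{10}\gamma + \log_{10} c\]

This is a line of slope 1 in the \((\log_{10}\gamma,\; \log_{10}\lambda)\) plane with intercept \(\log_{10} c\). With the factor range \([2,\; 100]\) from Section 2.2, the intercepts are \(\log_{10} 2 \approx 0.30\) and \(\log_{10} 100 = 2\), so the band has a constant vertical width of about 1.70 decades.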


4 Simulated Data

Before running the optimization, we simulate a self-consistent data set using simulation(). The function returns a list with three elements, accessed below as sim$W, sim$p, and sim$B.

run_experiment() expects a dataset list with keys $W, $P, and $B. We therefore rename $p to $P after simulation.

library(dicepro)
set.seed(2101L)

sim <- simulation(
  loi        = "gauss",
  scenario   = "hierarchical",
  nSample    = 30L,
  nGenes     = 200L,
  nCellsType = 10L,
  sigma_bio  = 0.07,
  sigma_tech = 0.07,
  seed       = 2101L
)

my_dataset <- list(
  W = sim$W,
  P = sim$p,
  B = sim$B
)

cat("W :", nrow(my_dataset$W), "genes x", ncol(my_dataset$W), "cell types\n")
cat("P :", nrow(my_dataset$P), "samples x", ncol(my_dataset$P), "cell types\n")
cat("B :", nrow(my_dataset$B), "genes x", ncol(my_dataset$B), "samples\n")
cat("Row sums of P (range):", round(range(rowSums(my_dataset$P)), 4), "\n")
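
The dimension printout above can be turned into an explicit consistency check. The sketch below assumes the layout reported by the cat() calls (W: genes x cell types, P: samples x cell types, B: genes x samples); check_dataset() is a hypothetical helper, not a dicepro function, and the toy list only stands in for my_dataset so the block runs on its own.

```r
# Hypothetical helper (not a dicepro function): check that W, P, and B agree in shape.
check_dataset <- function(ds) {
  stopifnot(all(c("W", "P", "B") %in% names(ds)))
  c(
    genes_match      = nrow(ds$W) == nrow(ds$B),  # W and B share genes (rows)
    cell_types_match = ncol(ds$W) == ncol(ds$P),  # W and P share cell types (columns)
    samples_match    = nrow(ds$P) == ncol(ds$B)   # rows of P are the columns of B
  )
}

# Toy stand-in with the same layout as my_dataset (random values, assumed shapes).
set.seed(1)
toy <- list(
  W = matrix(runif(200 * 10), nrow = 200),  # genes x cell types
  P = matrix(runif(30 * 10),  nrow = 30),   # samples x cell types
  B = matrix(runif(200 * 30), nrow = 200)   # genes x samples
)
checks <- check_dataset(toy)
checks
```

With the simulated data, calling check_dataset(my_dataset) before run_experiment() would catch a forgotten $p to $P rename early, since the key check fails first.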

5 Running the Optimization

5.1 Strategy "all" - Independent sampling

results_all <- run_experiment(
  dataset               = my_dataset,
  W_prime               = 0,
  bulkName              = "SimBulk",
  refName               = "SimRef",
  hp_max_evals          = 150L,
  algo_select           = "random",
  output_base_dir       = tempdir(),
  hspaceTechniqueChoose = "all"
)

cat("Completed trials:", nrow(results_all$trials), "\n")
head(results_all$trials[, c("lambda_", "gamma", "p_prime", "loss", "constraint")])

5.2 Strategy "restrictionEspace" - Linked sampling

results_restr <- run_experiment(
  dataset               = my_dataset,
  W_prime               = 0,
  bulkName              = "SimBulk",
  refName               = "SimRef",
  hp_max_evals          = 150L,
  algo_select           = "random",
  output_base_dir       = tempdir(),
  hspaceTechniqueChoose = "restrictionEspace"
)

cat("Completed trials:", nrow(results_restr$trials), "\n")
head(results_restr$trials[, c("lambda_", "gamma", "p_prime", "loss", "constraint")])

6 Comparing the Two Strategies

Once both runs are complete, we can report the best configuration found by each strategy and overlay the two \((\gamma, \lambda)\) distributions to compare coverage:

best_all   <- results_all$trials[which.min(results_all$trials$loss), ]
best_restr <- results_restr$trials[which.min(results_restr$trials$loss), ]

cat("--- all ---\n")
cat(sprintf("  lambda = %.3g  |  gamma = %.3g  |  loss = %.4f\n",
            best_all$lambda_, best_all$gamma, best_all$loss))

cat("--- restrictionEspace ---\n")
cat(sprintf("  lambda = %.3g  |  gamma = %.3g  |  loss = %.4f\n",
            best_restr$lambda_, best_restr$gamma, best_restr$loss))

plot(
  results_all$trials$gamma,
  results_all$trials$lambda_,
  log  = "xy",
  pch  = 19, cex = 0.5,
  col  = adjustcolor("steelblue", 0.4),
  xlab = expression(gamma), ylab = expression(lambda),
  main = "Sampled configurations: all (blue) vs restrictionEspace (orange)"
)
points(
  results_restr$trials$gamma,
  results_restr$trials$lambda_,
  pch = 19, cex = 0.5,
  col = adjustcolor("darkorange", 0.4)
)
legend("topleft",
       legend = c("all", "restrictionEspace"),
       col    = c("steelblue", "darkorange"),
       pch    = 19, pt.cex = 1.2)
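
Coverage can also be compared numerically by counting how many independently sampled points happen to land inside the restricted region. The sketch below is an illustration only: in_restricted_band() is a hypothetical helper, and trials_all is a synthetic stand-in for results_all$trials built from the ranges given in Section 2.

```r
# Hypothetical helper (not a dicepro function): is a (gamma, lambda) pair inside the
# "restrictionEspace" region, i.e. gamma in [1, 1e5] and lambda / gamma in [2, 100]?
in_restricted_band <- function(gamma, lambda) {
  ratio <- lambda / gamma
  gamma >= 1 & gamma <= 1e5 & ratio >= 2 & ratio <= 100
}

# Synthetic stand-in for results_all$trials: independent log-uniform draws on [1, 1e8].
set.seed(1)
n <- 150L
trials_all <- data.frame(
  gamma   = 10^runif(n, 0, 8),
  lambda_ = 10^runif(n, 0, 8)
)

inside <- in_restricted_band(trials_all$gamma, trials_all$lambda_)
cat(sprintf("%d of %d independent draws fall inside the restricted band\n",
            sum(inside), n))
```

With real results, the same function can be applied to results_all$trials$gamma and results_all$trials$lambda_ to see how much of the "all" budget was spent inside the region that "restrictionEspace" targets.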

7 Session Info

sessionInfo()
