KL-Optimality: designs for model discrimination

The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Background

Standard optimality criteria (D, A, I, …) assume the model is correctly specified and aim to estimate its parameters as precisely as possible. KL-Optimality addresses a different question: given two competing models, where should observations be taken to tell them apart most easily?

The criterion is based on the Kullback-Leibler divergence between the reference model \(f_1\) and the best-fitting version of the rival \(f_2\):

\[\Phi_{\text{KL}}(\xi) = \int \min_{\beta_2} \text{KL}(f_1(y \mid x) \,\|\, f_2(y \mid x, \beta_2))\, \xi(dx).\]

The design \(\xi^*\) that maximises \(\Phi_{\text{KL}}\) places observations where the two models are most distinguishable on average.

Reference: López-Fidalgo, Tommasi & Trandafir (2007). An optimal experimental design criterion for discriminating between non-normal models. J. R. Stat. Soc. B, 69(2), 231–242.

API overview

opt_des(
  "KL-Optimality",
  model        = <reference model formula>,
  parameters   = <reference parameter names>,
  par_values   = <nominal values>,
  design_space = <scalar interval or named list>,
  # rival specification:
  rival_model  = <rival formula>,       # default: same as model
  rival_params = <rival param names>,   # default: same as parameters
  rival_pars   = <starting values for internal optimisation>,
  rival_lower  = <lower bounds for rival params>,
  rival_upper  = <upper bounds for rival params>,
  # GLM family:
  family       = "Normal",   # "Poisson", "Binomial", "Gamma"
  phi          = 1,          # dispersion parameter
  # alternative: supply a custom KL function
  kl_fun       = NULL
)

The internal optimisation over rival parameters is done automatically. The optimal rival values are stored as a hidden attribute and shown by summary().

Example 1: Michaelis-Menten vs linear rival (Normal, 1D)

Context. Enzyme kinetics with one substrate. The reference model is Michaelis-Menten \(\mu(x) = V_{\max} x / (K_m + x)\) with \(V_{\max} = 2\), \(K_m = 1\). The rival is a linear rate \(\mu(x) = a x\), valid only when \(x \ll K_m\). At high concentrations MM saturates while the linear model keeps growing — that is where the models diverge most.

result_kl_mm <- opt_des(
  "KL-Optimality",
  model        = y ~ Vmax * x / (Km + x),
  parameters   = c("Vmax", "Km"),
  par_values   = c(2, 1),
  design_space = c(0.1, 5),
  rival_model  = y ~ a * x,
  rival_params = c("a"),
  rival_pars   = c(1),
  rival_lower  = c(0.2),
  rival_upper  = c(2.5),
  family       = "Normal",
  phi          = 1
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 7 done (11/s) | 626ms
ℹ The lower bound for efficiency is 99.9957589718079%
#> 
                                                   
result_kl_mm
#> Optimal design for KL-Optimality:
#>      Point    Weight
#> 1 1.127578 0.8159831
#> 2 5.000000 0.1840169

summary() shows the GLM family and the optimal rival parameters found by the internal optimisation:

summary(result_kl_mm)
#> Model:
#> y ~ Vmax * x/(Km + x)
#> Weight function:
#> function (x) 
#> 1 
#> 
#> GLM family: Normal  (phi = 1)
#> Optimal rival parameters: 0.445
#> 
#> Optimal design for KL-Optimality:
#>      Point    Weight
#> 1 1.127578 0.8159831
#> 2 5.000000 0.1840169
#> 
#> Minimum efficiency (Atwood): 99.9957589718079%
#> Criterion value:             0.155804

The sensitivity plot shows \(d_{\text{KL}}(x, \xi)\) with the Equivalence Theorem bound (dashed). Support points concentrate in the saturation region where MM and the linear rival differ most:

plot(result_kl_mm)

KL sensitivity function: MM vs linear rival.

Efficiency of a five-point uniform design relative to the KL-optimal:

design_unif <- data.frame(
  Point  = c(0.1, 1.3, 2.5, 3.8, 5.0),
  Weight = rep(1/5, 5)
)
eff_kl <- design_efficiency(design_unif, result_kl_mm)
#> ℹ The efficiency of the design is 44.1071528869909%
cat("Efficiency of uniform design:", round(eff_kl * 100, 2), "%\n")
#> Efficiency of uniform design: 44.11 %

Example 2: exponential decay, Poisson family

Context. A drug clearance study where the event count follows a Poisson process with rate \(\mu(x) = \exp(a - bx)\). The reference model has nominal decay rate \(b = 0.5\). The rival represents a faster-elimination regime (\(b \in [0.8, 1.5]\)), which could arise from drug interactions. The KL divergence for Poisson is \(\text{KL} = \mu_2 - \mu_1 - \mu_1 \log(\mu_2/\mu_1)\); since the rival decays faster, the gap is largest at long follow-up times.

result_kl_pois <- opt_des(
  "KL-Optimality",
  model        = y ~ exp(a - b * x),
  parameters   = c("a", "b"),
  par_values   = c(2, 0.5),
  design_space = c(0, 4),
  rival_pars   = c(2, 1.0),
  rival_lower  = c(1.5, 0.8),
  rival_upper  = c(2.5, 1.5),
  family       = "Poisson",
  phi          = 1
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 4 done (7.2/s) | 555ms
ℹ The lower bound for efficiency is 99.9999938849073%
#> 
                                                    
result_kl_pois
#> Optimal design for KL-Optimality:
#>   Point    Weight
#> 1     0 0.2012678
#> 2     4 0.7987322
summary(result_kl_pois)
#> Model:
#> y ~ exp(a - b * x)
#> Weight function:
#> function (x) 
#> 1 
#> 
#> GLM family: Poisson  (phi = 1)
#> Optimal rival parameters: 2.2799, 0.8
#> 
#> Optimal design for KL-Optimality:
#>   Point    Weight
#> 1     0 0.2012678
#> 2     4 0.7987322
#> 
#> Minimum efficiency (Atwood): 99.9999938849073%
#> Criterion value:             0.318553

plot(result_kl_pois)

KL sensitivity for Poisson decay model.

The optimal rival parameters found internally confirm the closest rival within the constraint:

hv <- attr(result_kl_pois, "hidden_value")
cat("Optimal rival: a =", round(hv$beta2_star[1], 3),
    " b =", round(hv$beta2_star[2], 3), "\n")
#> Optimal rival: a = 2.28  b = 0.8

Using `make_kl_fun()` for custom KL functions

For some discrimination problems the KL divergence does not reduce to the standard cumulant-generating-function form that opt_des() uses internally. make_kl_fun() automates the construction of the KL function from two model-family pairs.

Supported pairs

Combination	Formula
Same family, same \(\phi\)	Standard cumulant form
Normal vs Normal, \(\phi_1 \neq \phi_2\)	\(\frac{1}{2}\left[\log\frac{\phi_2}{\phi_1} + \frac{\phi_1}{\phi_2} + \frac{(\mu_1-\mu_2)^2}{\phi_2} - 1\right]\)
Gamma vs Gamma, different shape	Closed form with digamma/lgamma

For other pairs, supply kl_fun directly as a function (x, beta2) -> scalar.

Example 3: Normal with different variances (1D)

Context. Same exponential decay model, but now the rival has variance \(\phi_2 = 4\) (four times larger). The KL function includes a constant term \(\frac{1}{2}[\log(\phi_2/\phi_1) + \phi_1/\phi_2 - 1]\) even when the means coincide, so the design must also account for the variance difference.

kl_fn_var <- make_kl_fun(
  "Normal",
  model1      = y ~ a * exp(-b * x),
  params1     = c("a", "b"),
  par_values1 = c(1, 0.5),
  phi1        = 1,
  family2     = "Normal",
  model2      = y ~ c * exp(-d * x),
  params2     = c("c", "d"),
  phi2        = 4
)

result_kl_var <- opt_des(
  "KL-Optimality",
  model        = y ~ a * exp(-b * x),
  parameters   = c("a", "b"),
  par_values   = c(1, 0.5),
  design_space = c(0, 4),
  kl_fun       = kl_fn_var,
  rival_pars   = c(1, 1.0),
  rival_lower  = c(0.5, 0.8),
  rival_upper  = c(2.0, 1.5)
)
#> 
⠙ Calculating optimal design 10 done (4.9/s) | 2.1s

⠹ Calculating optimal design 11 done (4.8/s) | 2.3s

⠸ Calculating optimal design 12 done (4.7/s) | 2.6s

⠼ Calculating optimal design 13 done (4.6/s) | 2.8s

⠴ Calculating optimal design 14 done (4.6/s) | 3.1s

⠦ Calculating optimal design 15 done (4.6/s) | 3.3s

⠧ Calculating optimal design 16 done (4.6/s) | 3.5s

⠇ Calculating optimal design 17 done (4.6/s) | 3.7s

⠏ Calculating optimal design 18 done (4.7/s) | 3.9s

⠋ Calculating optimal design 19 done (4.6/s) | 4.2s

⠙ Calculating optimal design 20 done (4.6/s) | 4.4s

⠹ Calculating optimal design 21 done (4.6/s) | 4.5s
#> ℹ Stop condition not reached, max iterations performed
#> 
⠸ Calculating optimal design 22 done (4.7/s) | 4.7s
ℹ The lower bound for efficiency is 99.9671378206639%
#> 
                                                    
result_kl_var
#> Optimal design for KL-Optimality:
#>      Point     Weight
#> 1 0.000000 0.16826906
#> 2 1.333333 0.07912872
#> 3 1.989481 0.61240029
#> 4 2.666667 0.14020193
summary(result_kl_var)
#> Model:
#> y ~ a * exp(-b * x)
#> Weight function:
#> function (x) 
#> 1 
#> 
#> KL function: user-supplied
#> Optimal rival parameters: 1.1353, 0.8
#> 
#> Optimal design for KL-Optimality:
#>      Point     Weight
#> 1 0.000000 0.16826906
#> 2 1.333333 0.07912872
#> 3 1.989481 0.61240029
#> 4 2.666667 0.14020193
#> 
#> Minimum efficiency (Atwood): 99.9671378206639%
#> Criterion value:             0.320445

plot(result_kl_var)

KL sensitivity: Normal phi=1 vs Normal phi=4.

Example 4: two-factor MM vs linear rival (`make_kl_fun`, 2D)

make_kl_fun() works with models that use different subsets of the design variables. Here the reference uses \((x_1, x_2)\) (bisubstrate MM) while the rival depends only on \(x_1\) (single-substrate linear approximation):

kl_fn_2d <- make_kl_fun(
  "Normal",
  model1      = y ~ Vmax * x1 * x2 / ((Km1 + x1) * (Km2 + x2)),
  params1     = c("Vmax", "Km1", "Km2"),
  par_values1 = c(1, 1, 1),
  model2      = y ~ alpha * x1,
  params2     = "alpha"
)

result_kl_2d <- opt_des(
  "KL-Optimality",
  model        = y ~ Vmax * x1 * x2 / ((Km1 + x1) * (Km2 + x2)),
  parameters   = c("Vmax", "Km1", "Km2"),
  par_values   = c(1, 1, 1),
  design_space = list(x1 = c(0.1, 5), x2 = c(0.1, 5)),
  kl_fun       = kl_fn_2d,
  rival_pars   = c(0.5),
  rival_lower  = c(0.05),
  rival_upper  = c(2.0)
)
#> 
⠙ Calculating optimal design 3 done (1.3/s) | 2.3s
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠹ Calculating optimal design 6 done (2.4/s) | 2.5s
ℹ The lower bound for efficiency is 99.991707653218%
#> 
                                                   
result_kl_2d
#> Optimal design for KL-Optimality (2 factors):
#>         x1  x2    Weight
#> 1 2.062645 5.0 0.7079501
#> 2 5.000000 0.1 0.2920499

The heatmap concentrates support where the difference between the bisubstrate hyperbola and the linear rival is greatest (saturation regime in both \(x_1\) and \(x_2\)):

plot(result_kl_2d)

KL sensitivity heatmap: 2D MM vs linear rival.

Example 5: discriminating error structures — the Hill model

Reference: De la Calle-Arroyo, Leorato, Rodríguez-Aragón & Tommasi (2025). Augmented designs to choose between constant absolute and relative errors. Chemometrics and Intelligent Laboratory Systems, doi:10.1016/j.chemolab.2025.105374.

Context. Dose-response experiments with the Hill (4-parameter Emax) model:

\[\eta(x, \beta) = (E_{\text{con}} - b) \cdot \frac{(x/\text{IC}_{50})^s}{1 + (x/\text{IC}_{50})^s} + b\]

with \(E_{\text{con}} = 1.70\), \(b = 0.137\), \(\text{IC}_{50} = 111\), \(s = -1.03\) and \(x \in [0.01, 1500]\) nM. Two error structures are competing:

Relative error (reference): \(\varepsilon \sim N(0,\, \sigma_{\text{rel}}^2 \cdot \eta^2)\)
Absolute error (rival): \(\varepsilon \sim N(0,\, \sigma_{\text{abs}}^2)\)

The KL divergence minimised over the rival variance \(\sigma_{\text{abs}}^2\) is

\[\text{KL}(x,\, \sigma_{\text{abs}}^2) = \frac{1}{2}\!\left(\frac{\eta^2(x)}{\sigma_{\text{abs}}^2} - 1 + \log\frac{\sigma_{\text{abs}}^2}{\eta^2(x)}\right).\]

We supply this directly as kl_fun since it is not covered by the standard Normal-vs-Normal formula (the reference variance is \(x\)-dependent):

kl_fun_hill <- function(x, beta2) {
  sigma_sq <- beta2[1]
  eta <- (1.70 - 0.137) * (x / 111)^(-1.03) / (1 + (x / 111)^(-1.03)) + 0.137
  0.5 * (eta^2 / sigma_sq - 1 + log(sigma_sq / eta^2))
}

result_kl_hill <- opt_des(
  "KL-Optimality",
  model        = y ~ (Econ - b) * (x / IC50)^s / (1 + (x / IC50)^s) + b,
  parameters   = c("Econ", "b", "IC50", "s"),
  par_values   = c(1.70, 0.137, 111, -1.03),
  design_space = c(0.01, 1500),
  kl_fun       = kl_fun_hill,
  rival_pars   = c(1.0),
  rival_lower  = c(1e-4),
  rival_upper  = c(1e6)
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 5 done (40/s) | 127ms
ℹ The lower bound for efficiency is 99.9997190548208%
#> 
                                                   
result_kl_hill
#> Optimal design for KL-Optimality:
#>     Point   Weight
#> 1    0.01 0.233994
#> 2 1500.00 0.766006
summary(result_kl_hill)
#> Model:
#> y ~ (Econ - b) * (x/IC50)^s/(1 + (x/IC50)^s) + b
#> Weight function:
#> function (x) 
#> 1 
#> 
#> KL function: user-supplied
#> Optimal rival parameters: 0.7192
#> 
#> Optimal design for KL-Optimality:
#>     Point   Weight
#> 1    0.01 0.233994
#> 2 1500.00 0.766006
#> 
#> Minimum efficiency (Atwood): 99.9997190548208%
#> Criterion value:             0.813492

plot(result_kl_hill)

KL sensitivity for the Hill model (error structure discrimination).

The KL-optimal design places all weight at the two extremes of the design space. This matches the result reported in the paper (support at \(x = 0.01\) with weight 0.23 and \(x = 1500\) with weight 0.77). The Atwood criterion at 100 % confirms the design is exactly optimal.

The optimal rival variance recovered internally:

hv_hill <- attr(result_kl_hill, "hidden_value")
cat("Optimal rival sigma_abs^2 =", round(hv_hill$beta2_star, 4), "\n")
#> Optimal rival sigma_abs^2 = 0.7192

Note that with only two support points this design cannot estimate all four parameters of the Hill model. The paper proposes combining it with a D-optimal design via the augmentation workflow described in vignette("optedr-augment").

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.

KL-Optimality: designs for model discrimination

Background

API overview

Example 1: Michaelis-Menten vs linear rival (Normal, 1D)

Example 2: exponential decay, Poisson family

Using make_kl_fun() for custom KL functions

Supported pairs

Example 3: Normal with different variances (1D)

Example 4: two-factor MM vs linear rival (make_kl_fun, 2D)

Example 5: discriminating error structures — the Hill model

Using `make_kl_fun()` for custom KL functions

Example 4: two-factor MM vs linear rival (`make_kl_fun`, 2D)