Augmenting designs with controlled efficiency loss

The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Motivation

In practice an experiment is rarely designed from scratch. A researcher may already have data collected at certain conditions and want to add new observations to improve estimation — without discarding what has already been measured. The key question is: where can new points be placed so that the efficiency of the augmented design stays above an acceptable threshold?

optedr answers this question with two functions used in sequence:

get_augment_region() — computes the candidate region: the set of design points whose addition keeps the D-efficiency of the augmented design above a user-specified threshold delta_val.
augment_design() — adds a chosen point to the initial design and rescales the weights.

Both functions support the same optimality criteria as opt_des() and work for any number of factors.

Key parameters

Parameter	Role
`init_design`	Current design (data frame with `Point`/`Weight` in 1D, or factor columns + `Weight` in multi-D)
`alpha`	Fraction of total weight assigned to the new point after augmentation
`delta_val`	Minimum acceptable D-efficiency of the augmented design
`calc_optimal_design`	If `TRUE`, also computes the optimal design and uses it as the reference for efficiency
`new_points`	Data frame of points to add (non-interactive mode); omit for interactive mode
`par_int`	Indices of parameters of interest (Ds-Optimality only)
`n_lhs`	Number of Latin-Hypercube candidates for the region search (multi-D)

One-factor augmentation

Step 1: compute the candidate region

We start with a uniform three-point design for Antoine’s equation and look for points that keep the D-efficiency of the augmented design above 85 %.

init_des <- data.frame(
  Point  = c(30, 60, 90),
  Weight = c(1/3, 1/3, 1/3)
)

region <- get_augment_region(
  criterion           = "D-Optimality",
  init_design         = init_des,
  alpha               = 0.25,
  model               = y ~ 10^(a - b / (c + x)),
  parameters          = c("a", "b", "c"),
  par_values          = c(8.07131, 1730.63, 233.426),
  design_space        = c(1, 100),
  calc_optimal_design = FALSE,
  delta_val           = 0.85
)

print(region)
#> Augment candidate region  (delta = 0.8500)
#>   Intervals: [5.361, 100]

region$region is a data frame of candidate intervals. Each row gives a lower and upper bound on the design space where the new point can be placed.

Step 2: choose a point and augment

new_pt <- mean(region$region[1:2])

augmented <- augment_design(
  criterion           = "D-Optimality",
  init_design         = init_des,
  alpha               = 0.25,
  model               = y ~ 10^(a - b / (c + x)),
  parameters          = c("a", "b", "c"),
  par_values          = c(8.07131, 1730.63, 233.426),
  design_space        = c(1, 100),
  calc_optimal_design = FALSE,
  delta_val           = 0.85,
  new_points          = data.frame(Point = new_pt, Weight = 1)
)

print(augmented)
#>      Point Weight
#> 1 30.00000   0.25
#> 2 60.00000   0.25
#> 3 90.00000   0.25
#> 4 52.68026   0.25
cat("Sum of weights:", sum(augmented$Weight), "\n")
#> Sum of weights: 1

Comparing efficiency before and after

result_opt <- opt_des(
  "D-Optimality",
  y ~ 10^(a - b / (c + x)), c("a", "b", "c"),
  c(8.07131, 1730.63, 233.426), c(1, 100)
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (27/s) | 802ms
ℹ The lower bound for efficiency is 99.9986187558838%
#> 
                                                    

eff_before <- design_efficiency(init_des, result_opt)
#> ℹ The efficiency of the design is 38.5312233962882%
eff_after  <- design_efficiency(augmented, result_opt)
#> ℹ The efficiency of the design is 34.2933573610529%

cat("Efficiency before augmenting:", round(eff_before * 100, 2), "%\n")
#> Efficiency before augmenting: 38.53 %
cat("Efficiency after augmenting: ", round(eff_after  * 100, 2), "%\n")
#> Efficiency after augmenting:  34.29 %
cat("Gain:                        ", round((eff_after - eff_before) * 100, 2),
    "percentage points\n")
#> Gain:                         -4.24 percentage points

Using the optimal design as reference (`calc_optimal_design = TRUE`)

When calc_optimal_design = TRUE, the function internally computes the optimal design and uses it to define the efficiency threshold. This is the recommended mode when no optimal design has been computed yet:

region_opt <- get_augment_region(
  criterion           = "D-Optimality",
  init_design         = init_des,
  alpha               = 0.25,
  model               = y ~ 10^(a - b / (c + x)),
  parameters          = c("a", "b", "c"),
  par_values          = c(8.07131, 1730.63, 233.426),
  design_space        = c(1, 100),
  calc_optimal_design = TRUE,
  delta_val           = 0.85
)

Two-factor augmentation

In multi-dimensional spaces get_augment_region() samples candidate points with a Latin Hypercube (controlled by n_lhs) and returns a data frame of candidates together with their estimated efficiency gain. A heatmap of the efficiency function is displayed automatically.

Initial design and candidate region

init_2d <- data.frame(
  x1     = c(0.8, 10, 5),
  x2     = c(10, 0.8, 5),
  Weight = c(1/3, 1/3, 1/3)
)

result_2D <- opt_des(
  criterion    = "D-Optimality",
  model        = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
  parameters   = c("Vmax", "K1", "K2"),
  par_values   = c(1, 1, 1),
  design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10))
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (19/s) | 1.2s
ℹ The lower bound for efficiency is 99.9989441805938%
#> 
                                                   

region_2d <- get_augment_region(
  criterion           = "D-Optimality",
  init_design         = init_2d,
  alpha               = 0.25,
  model               = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
  parameters          = c("Vmax", "K1", "K2"),
  par_values          = c(1, 1, 1),
  design_space        = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
  calc_optimal_design = FALSE,
  delta_val           = 0.85
)

#> ℹ 1893 candidate points with efficiency >= 0.85 (from LHS sample of 2000)

region_2d$region is a data frame of sampled candidates, each with an efficiency column. Pick the candidate that maximises efficiency:

best_2d <- region_2d$region[which.max(region_2d$region$efficiency), ]

eff_antes <- suppressMessages(design_efficiency(init_2d, result_2D))

aug_2d <- augment_design(
  criterion           = "D-Optimality",
  init_design         = init_2d,
  alpha               = 0.25,
  model               = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
  parameters          = c("Vmax", "K1", "K2"),
  par_values          = c(1, 1, 1),
  design_space        = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
  calc_optimal_design = FALSE,
  delta_val           = 0.85,
  new_points          = data.frame(x1 = best_2d$x1, x2 = best_2d$x2, Weight = 1)
)

#> ℹ 1907 candidate points with efficiency >= 0.85 (from LHS sample of 2000)
#> Sample of candidate points:
#>           x1       x2 efficiency
#> 1  5.0800948 0.565028  0.9395912
#> 2  6.9302761 7.020460  1.0890899
#> 3  1.0567895 3.294382  0.9257834
#> 4  3.8720654 3.746035  0.8635142
#> 5  2.1816247 5.556157  0.8668131
#> 6  1.9701608 4.457731  0.8617009
#> 7  3.9151828 8.208373  0.9847953
#> 8  9.0526676 9.713675  1.2208120
#> 9  0.4385586 6.797933  0.9286939
#> 10 6.7020570 4.107606  0.9600131
#> 11 5.0907975 3.073397  0.8737682
#> 12 9.4098767 8.629254  1.2035761
#> 13 8.8662268 7.580767  1.1613383
#> 14 9.5557108 9.520544  1.2282312
#> 15 1.8981463 5.497884  0.8705040

eff_despues <- suppressMessages(design_efficiency(aug_2d, result_2D))

cat("Efficiency before:", round(eff_antes  * 100, 2), "%\n")
#> Efficiency before: 68.4 %
cat("Efficiency after: ", round(eff_despues * 100, 2), "%\n")
#> Efficiency after:  85.21 %
print(aug_2d)
#>          x1        x2 Weight
#> 1  0.800000 10.000000   0.25
#> 2 10.000000  0.800000   0.25
#> 3  5.000000  5.000000   0.25
#> 4  9.985942  9.905381   0.25

Three-factor augmentation

For three or more factors the candidate region is displayed as a scatter-matrix coloured by candidate/non-candidate status, with the current design shown as triangles.

init_3d <- data.frame(
  x1     = c(0.8, 10,  10,  0.8, 10),
  x2     = c(10,  0.8, 10,  10,  0.8),
  x3     = c(10,  10,  0.8, 0.8, 10),
  Weight = rep(0.2, 5)
)

region_3d <- get_augment_region(
  criterion           = "D-Optimality",
  init_design         = init_3d,
  alpha               = 0.45,
  model               = y ~ Vmax * x1 * x2 * x3 / ((K1+x1) * (K2+x2) * (K3+x3)),
  parameters          = c("Vmax", "K1", "K2", "K3"),
  par_values          = c(1, 1, 1, 1),
  design_space        = list(x1 = c(0.1, 10), x2 = c(0.1, 10), x3 = c(0.1, 10)),
  calc_optimal_design = FALSE,
  delta_val           = 0.93
)

#> ℹ 948 candidate points with efficiency >= 0.93 (from LHS sample of 2000)
cat("Number of candidate points:", nrow(region_3d$region), "\n")
#> Number of candidate points: 948
plot(region_3d$plot)

Augmenting with Ds-Optimality

When the goal is to augment while preserving estimation quality for a subset of parameters, use criterion = "Ds-Optimality" and pass par_int:

region_ds <- get_augment_region(
  criterion           = "Ds-Optimality",
  init_design         = init_2d,
  alpha               = 0.25,
  model               = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
  parameters          = c("Vmax", "K1", "K2"),
  par_values          = c(1, 1, 1),
  design_space        = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
  calc_optimal_design = FALSE,
  par_int             = c(1),
  delta_val           = 0.85,
  n_lhs               = 5000
)

#> ℹ 3493 candidate points with efficiency >= 0.85 (from LHS sample of 5000)

best_ds <- region_ds$region[which.max(region_ds$region$efficiency), ]

aug_ds <- augment_design(
  criterion           = "Ds-Optimality",
  init_design         = init_2d,
  alpha               = 0.25,
  model               = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
  parameters          = c("Vmax", "K1", "K2"),
  par_values          = c(1, 1, 1),
  design_space        = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
  calc_optimal_design = FALSE,
  par_int             = c(1),
  delta_val           = 0.85,
  new_points          = data.frame(x1 = best_ds$x1, x2 = best_ds$x2, Weight = 1),
  n_lhs               = 5000
)

#> ℹ 3483 candidate points with efficiency >= 0.85 (from LHS sample of 5000)
#> Sample of candidate points:
#>           x1        x2 efficiency
#> 1  2.8650576 9.9589864  1.0325916
#> 2  3.9641865 0.3004131  0.9416510
#> 3  4.8414839 6.6097326  1.3811784
#> 4  8.8811449 3.6413103  1.2404576
#> 5  4.9945051 3.3432554  0.8884957
#> 6  8.9748253 8.1732178  2.5510765
#> 7  5.0966627 3.8485719  0.9822186
#> 8  3.9080466 1.0166707  0.8757124
#> 9  2.4809849 0.2282649  0.9329601
#> 10 5.7111956 8.1552418  1.8190827
#> 11 0.2588158 1.3963485  0.9684463
#> 12 7.8919712 3.4886202  1.1313313
#> 13 9.6786526 6.4234792  2.2182760
#> 14 9.4158933 3.3784509  1.1810324
#> 15 8.2251447 9.7007518  2.6831869
print(aug_ds)
#>          x1        x2 Weight
#> 1  0.800000 10.000000   0.25
#> 2 10.000000  0.800000   0.25
#> 3  5.000000  5.000000   0.25
#> 4  9.890709  9.955085   0.25

Interactive mode

Omitting new_points (and delta_val) from both functions triggers an interactive session where the package plots the candidate region and asks the user to type a point. This mode is documented in ?augment_design.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.

Augmenting designs with controlled efficiency loss

Motivation

Key parameters

One-factor augmentation

Step 1: compute the candidate region

Step 2: choose a point and augment

Comparing efficiency before and after

Using the optimal design as reference (calc_optimal_design = TRUE)

Two-factor augmentation

Initial design and candidate region

Three-factor augmentation

Augmenting with Ds-Optimality

Interactive mode

Using the optimal design as reference (`calc_optimal_design = TRUE`)