The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
This vignette documents the grid_scan()
function
provided in this package which allows high-performance repeated
simulation runs over a grid of parameters and/or initial states. Here,
we take the classical LG circadian clock model and demonstrate the
following:
Workflow shown here shall be generally useful for interpretation of wet-lab experiment results as study of clock mutants typically ends up with hypotheses where certain parameter(s) of the circadian clock is/are affected.
grid_scan()
is generally useful for cases where you have
an Odin model too; this function is decoupled from specifics of
circadian clock models.
library(clockSim)
library(matrixStats)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following object is masked from 'package:matrixStats':
#>
#> count
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
Under wildtype condition the circadian clock is a limit cycle oscillator with ~24hr period. The LG model reproduces such behavior. To see this we run simulation with default setting:
model_gen <- getOdinGen()$continuous_LG
model <- model_gen$new()
sim_hours <- seq(from = 0, to = 2400, by = 1)
res <- model$run(sim_hours) |> as.data.frame()
res$time <- res$t
plot(plot_phase(res, M_T, C_N))
print(compute_period(res$M_T |> tail(n = 240), method = "lomb"))
#> period power snr p.value
#> 2.390000e+01 8.558178e-01 NA 2.142988e-100
As shown above:
While promising, however, the LG model is a complex 10-state system. We next explore different properties of the system:
10-variable makes it impossible to do a extensive grid on initial states. To see how fast our simulation runs:
With each simulation at ~10ms, we should be able to do 1) a very coarse grid on all 10 states with ~3 for each state, or 2) an extensive grid on selected states (say TIM/PER RNA):
310∗10ms∼2502∗10ms∼600sec
With the CRAN build time limitation, in this vignette we will not run the ~10min grid above but instead deal with a smaller grid on TIM/PER RNA.
First, we define the grid. We compute summary of all states in the baseline simulation and use the min/mean/max to define the grid. For each state, we scan N multiples of mean+N*spread of the baseline simulation, where spread = (max-min)/2 and N = 2,3, …, Nmax. For the CRAN grid Nmax=5.
# Compute summary
summary <- res |>
select(-t, -time) |>
apply(2, summary)
# Only keep min/mean/max
summary <- summary[c(1,4,6),]
# Add on mean+N*spread, N=2,3,...,N_max
N_max <- 3 # For larger scan increase this. CRAN=3
get_multiples <- function(s, k) {
# Extract components
min <- s[1, ]
mean <- s[2, ]
delta <- s[3, ] - min
# Create new rows using vectorized operations
multiples <- outer(k, delta) |> sweep(2, mean, "+")
attr(multiples, "original") <- s
# Return
multiples
}
summary <- get_multiples(summary, 2:N_max)
summary <- summary[,c("M_T", "M_P")] # Only RNA states
# Create grid
grid <- expand.grid(
summary |> as.data.frame(), KEEP.OUT.ATTRS = FALSE)
# User variables for initial state start with setUserInitial_
names(grid) <- paste0("setUserInitial_",names(grid))
grid_scan()
Now perform grid scan of initial states. grid_scan()
is
designed for efficient repeated model execution on parameter grid. Refer
to its help for more details, and here I summarize its design
concept:
expand.grid()
.parallel
.We want the following statistics:
?compute_rmse
and
?compute_cosine
.These statistics mean we apply the following function to each run data:
default_attractor <- model_gen$new()$run(sim_hours)
default_attractor <-
default_attractor[(length(sim_hours)-240):length(sim_hours),]
stat.fn <- function(raw_run, reference = default_attractor){
# Return code == 2 means successful integration (at least for lsoda)
succ <- attr(raw_run, "istate")[1] == 2
# Subset only the last 240 time points - should be stabilized
raw_run <- raw_run[(nrow(raw_run)-240):nrow(raw_run),]
# Compute normalized RMSE
nrmse <- compute_rmse(raw_run, reference, normalize = "range")
nrmse <- max(nrmse)
# Compute cosine similarity
cos <- compute_cosine(raw_run, reference)
cos <- min(cos)
# Return
c(converged = succ, nrmse = nrmse, cos = cos)
}
Before running, let’s see how this statistics function will slow down simulation. As shown, we add about ~10% time and ~40% memory footprint, which is minimal.
print(bench::mark(stat.fn(model$run(sim_hours))))
#> # A tibble: 1 × 13
#> expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm>
#> 1 stat.fn(mod… 22.9ms 24.7ms 39.7 1.35MB 0 20 0 504ms
#> # ℹ 4 more variables: result <list>, memory <list>, time <list>, gc <list>
print(bench::mark(model$run(sim_hours)))
#> # A tibble: 1 × 13
#> expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm>
#> 1 model$run(s… 20.1ms 22.6ms 44.1 735KB 2.10 21 1 476ms
#> # ℹ 4 more variables: result <list>, memory <list>, time <list>, gc <list>
Now let’s run the grid scan of initial states. We run
grid_scan
, and then wrangle the results into a data frame.
As below, all converged to the same attractor.
scan <-
grid_scan(model_gen, grid, apply.fn = stat.fn,
n.core = 1, custom.export = "default_attractor",
sim_hours)
process_scan <- function(){
.scanDF <- scan |> unlist(use.names = FALSE) |> matrix(ncol = 3, byrow=TRUE)
colnames(.scanDF) <- names(scan[[1]])
result <- cbind(grid, .scanDF |> as.data.frame())
summary(
result |> select(converged, nrmse, cos)
)
}
process_scan()
#> converged nrmse cos
#> Min. :1 Min. :2.178e-05 Min. :1
#> 1st Qu.:1 1st Qu.:2.674e-05 1st Qu.:1
#> Median :1 Median :3.498e-05 Median :1
#> Mean :1 Mean :4.622e-05 Mean :1
#> 3rd Qu.:1 3rd Qu.:5.446e-05 3rd Qu.:1
#> Max. :1 Max. :9.314e-05 Max. :1
Plot two phase diagrams as an example (note that I rerun
grid_scan()
with default apply_fn = identity
to get the raw data for plotting):
# Rerun scan
scan <-
grid_scan(model_gen, grid, apply.fn = identity,
n.core = 1, custom.export = "default_attractor",
sim_hours)
# Show first and last of grid
first <- grid[1,]
last <- grid[nrow(grid),]
print(first)
#> setUserInitial_M_T setUserInitial_M_P
#> 1 11.47733 6.50043
print(last)
#> setUserInitial_M_T setUserInitial_M_P
#> 4 16.11438 9.314509
plot(plot_phase(scan[[1]] |> as.data.frame(), M_T, C_N))
plot(plot_phase(scan[[nrow(grid)]] |> as.data.frame(), M_T, C_N))
How about modifying one parameter and see if that affects initial state stability with the same grid? Just add that parameter new value as a constant to the grid!
Remember that our goal is to see whether parameter change causes convergence property change - so our reference attractor should be computed with the new parameter too. As below, just as an example, doubling translation rate of TIM mRNA does not affect the behavior - all initial states still converged to the clock attractor.
new_grid <- grid
new_grid$k_sT <- 1.8 # 2X of original, check model$content()
new_model <- model_gen$new()
new_model$set_user(k_sT = 1.8)
default_attractor <- new_model$run(sim_hours)
default_attractor <-
default_attractor[(length(sim_hours)-240):length(sim_hours),]
# Then, repeat the same code above.
scan <-
grid_scan(model_gen, new_grid, apply.fn = stat.fn,
n.core = 1, custom.export = "default_attractor",
sim_hours)
process_scan()
#> converged nrmse cos
#> Min. :1 Min. :9.734e-07 Min. :1
#> 1st Qu.:1 1st Qu.:3.700e-06 1st Qu.:1
#> Median :1 Median :6.884e-06 Median :1
#> Mean :1 Mean :6.884e-06 Mean :1
#> 3rd Qu.:1 3rd Qu.:1.007e-05 3rd Qu.:1
#> Max. :1 Max. :1.279e-05 Max. :1
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.