The futurize package allows you to easily turn sequential code
into parallel code by piping the sequential code to the futurize()
function. Easy!
library(futurize)
plan(multisession)
library(parameters)
model <- lm(mpg ~ wt, data = mtcars)
fit <- bootstrap_model(model, iterations = 1000) |> futurize()
This vignette demonstrates how to use this approach to parallelize functions
from the parameters package, such as bootstrap_model() and bootstrap_parameters().
The parameters package (part of the easystats ecosystem)
provides utilities for processing and summarizing statistical models.
The bootstrap_model() function generates a distribution of model
estimates by refitting the model multiple times using bootstrapped
samples. This process can be computationally demanding, especially for
complex models or a large number of iterations. Since each bootstrap
iteration is independent, it is a perfect candidate for
parallelization.
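To make the independence argument concrete, here is a minimal hand-rolled bootstrap in base R. It is only an illustration and not how bootstrap_model() is implemented; the helper name one_replicate() is made up for this sketch.

```r
## Hand-rolled bootstrap of lm() coefficients (illustration only)
set.seed(42)
n <- nrow(mtcars)

## Hypothetical helper: refit the model on one resample of the rows
one_replicate <- function(i) {
  idx <- sample.int(n, size = n, replace = TRUE)
  coef(lm(mpg ~ wt, data = mtcars[idx, ]))
}

## No replicate depends on the result of another one,
## so the calls could run in any order or on different workers
reps <- lapply(seq_len(100), one_replicate)

## Stack the results: one row per replicate, one column per coefficient
boot_mat <- do.call(rbind, reps)
dim(boot_mat)
```

Because each call to one_replicate() uses only its own resample, the lapply() step is the part that a parallel backend can spread across workers.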
Consider a linear model where we want to obtain bootstrapped estimates of the coefficients:
library(parameters)
model <- lm(mpg ~ wt + cyl, data = mtcars)
## Generate 1000 bootstrap replicates (sequentially)
boot_dist <- bootstrap_model(model, iterations = 1000)
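As a rough sketch of what can be done with the result, the code below computes a percentile interval for one coefficient. It assumes that bootstrap_model() returns a data frame with one row per replicate and one column per model term (here a column named wt), and that the boot package, which parameters may need for bootstrapping, is installed. The iteration count is reduced to keep the example quick.

```r
library(parameters)

model <- lm(mpg ~ wt + cyl, data = mtcars)

## Fix the RNG state so the resamples are repeatable
set.seed(42)
boot_dist <- bootstrap_model(model, iterations = 200)

## Assumption: one row per bootstrap replicate, one column per coefficient
dim(boot_dist)

## 95% percentile interval for the 'wt' coefficient
ci_wt <- quantile(boot_dist$wt, probs = c(0.025, 0.975))
ci_wt
```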
To parallelize this using futurize, simply pipe the call to
futurize():
library(futurize)
library(parameters)
model <- lm(mpg ~ wt + cyl, data = mtcars)
## Generate 1000 bootstrap replicates (in parallel)
boot_dist <- bootstrap_model(model, iterations = 1000) |> futurize()
This distributes the bootstrap iterations across the available parallel workers, provided that a parallel backend has been set up, e.g.
plan(multisession)
The bootstrap_parameters() function is a higher-level wrapper that
calls bootstrap_model() and then summarizes the results. It can
also be parallelized in the same way:
library(futurize)
plan(multisession)
library(parameters)
model <- lm(mpg ~ wt + cyl, data = mtcars)
boot_params <- bootstrap_parameters(model, iterations = 1000) |> futurize()
The following parameters functions are supported by futurize():
bootstrap_model() with seed = TRUE as the default
bootstrap_parameters() with seed = TRUE as the default
For comparison, here is what it takes to parallelize bootstrap_model() using
the parallel package directly, without futurize:
library(parameters)
library(parallel)
model <- lm(mpg ~ wt + cyl, data = mtcars)
## Set up a PSOCK cluster
ncpus <- 4L
cl <- makeCluster(ncpus)
## Run bootstrapping in parallel
boot_dist <- bootstrap_model(model, iterations = 1000,
                             parallel = "snow", n_cpus = ncpus,
                             cluster = cl)
## Tear down the cluster
stopCluster(cl)
With futurize, the cluster management is handled automatically.
You just control the backend with plan().
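As a small sketch of what controlling the backend can look like, the code below switches between sequential processing and two local background sessions using plan() from the future package. The worker count of 2 is an arbitrary choice for illustration, and loading future directly is an assumption made to keep the example independent of futurize.

```r
library(future)

## Run everything sequentially, which is handy for debugging
plan(sequential)

## Use two background R sessions on the local machine
plan(multisession, workers = 2)
n_workers <- nbrOfWorkers()
n_workers

## Switch back to sequential processing; this also shuts down
## the background sessions started above
plan(sequential)
```

Code that is piped to futurize() does not need to change when the plan changes; only the plan() call differs.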