The futurize package allows you to easily turn sequential code
into parallel code by piping the sequential code to the futurize()
function. Easy!
library(futurize)
plan(multisession)
library(pls)
data(yarn)
m <- plsr(density ~ NIR, ncomp = 10, data = yarn, validation = "CV") |> futurize()
This vignette demonstrates how to use this approach to parallelize pls
functions such as mvr(), plsr(), pcr(), and crossval().
The pls package provides Partial Least Squares Regression (PLSR) and Principal Component Regression (PCR) methods. Both typically rely on cross-validation (CV) to choose the number of components. Because CV refits the model once per segment, it can be computationally intensive, which makes it an ideal candidate for parallelization.
The plsr() function is used to perform PLS regression. When
validation = "CV" is specified, it performs cross-validation.
library(pls)
data(yarn)
## Sequential evaluation
m <- plsr(density ~ NIR, ncomp = 10, data = yarn, validation = "CV")
To make it evaluate in parallel, simply pipe the call to futurize():
library(futurize)
library(pls)
data(yarn)
## Parallel evaluation
m <- plsr(density ~ NIR, ncomp = 10, data = yarn, validation = "CV") |> futurize()
The call then runs on whichever parallel backend has been set by plan(), e.g.
plan(multisession)
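As a minimal sketch of how the backend can be chosen and reset, the snippet below uses plan() and nbrOfWorkers() from the future package. The worker count of 2 is an arbitrary value picked for illustration, not a recommendation from this vignette.

```r
library(future)

# Two background R sessions on the local machine
# (the worker count is a hypothetical value for illustration)
plan(multisession, workers = 2)
n_parallel <- nbrOfWorkers()

# Switch back to sequential processing when done
plan(sequential)
n_sequential <- nbrOfWorkers()
```

Any futurized call made between the two plan() calls would run on the multisession workers; after plan(sequential), the same code runs in the current R session.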
The crossval() function can be used to perform cross-validation on
an already fitted model:
library(futurize)
plan(multisession)
library(pls)
data(yarn)
m1 <- plsr(density ~ NIR, ncomp = 10, data = yarn)
## Parallel cross-validation
m_cv <- crossval(m1, segments = 10) |> futurize()
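Once the cross-validated model is available, the CV error can be inspected in the usual way. The sketch below runs crossval() sequentially so that it is self-contained, and assumes the object returned through futurize() has the same structure. Picking the component count with the smallest RMSEP is only one simple selection rule, used here for illustration.

```r
library(pls)
data(yarn)

m1 <- plsr(density ~ NIR, ncomp = 10, data = yarn)
m_cv <- crossval(m1, segments = 10)

# Cross-validated RMSEP for 0 (intercept only) up to 10 components
rmsep <- RMSEP(m_cv, estimate = "CV")
cv_error <- rmsep$val[1, 1, ]

# Number of components with the smallest CV error
best_ncomp <- rmsep$comps[which.min(cv_error)]
```

Because the default segments are drawn at random, the selected number of components can vary between runs unless the random seed is fixed.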
The following pls functions are supported by futurize():
- mvr()
- plsr()
- pcr()
- cppls()
- crossval() with seed = TRUE as the default

For comparison, here is what it takes to parallelize pls functions
using the parallel package directly, without futurize:
library(pls)
library(parallel)
## Set up a cluster
ncpus <- 4L
cl <- makeCluster(ncpus)
## Configure pls to use the cluster
old_opts <- pls.options(parallel = cl)
## Run regression with cross-validation
data(yarn)
m <- plsr(density ~ NIR, ncomp = 10, data = yarn, validation = "CV")
## Restore original options and stop the cluster
pls.options(old_opts)
stopCluster(cl)
This requires you to manually manage the cluster lifecycle and the
global pls.options(). With futurize, the cluster setup and
option management are handled automatically and localized to the
function call.
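To illustrate what localizing this bookkeeping by hand might look like, here is a sketch of a wrapper that uses on.exit() so that the cluster is stopped and the options are restored even if the model fit fails. The helper with_pls_cluster() and its ncpus argument are hypothetical names introduced for this example; they are not part of pls, parallel, or futurize.

```r
library(pls)
library(parallel)

# Hypothetical helper: evaluates 'expr' with pls configured to use a
# temporary cluster, then cleans up regardless of success or failure.
with_pls_cluster <- function(expr, ncpus = 2L) {
  cl <- makeCluster(ncpus)
  on.exit(stopCluster(cl), add = TRUE)

  # Record the current options before changing them
  old_opts <- pls.options()
  on.exit(pls.options(old_opts), add = TRUE)
  pls.options(parallel = cl)

  # 'expr' is evaluated lazily, i.e. only here, after the cluster is set
  force(expr)
}

data(yarn)
m <- with_pls_cluster(
  plsr(density ~ NIR, ncomp = 10, data = yarn, validation = "CV")
)
```

Even with such a helper, the caller still has to wrap each call and choose the cluster size, which is the bookkeeping that piping to futurize() is meant to remove.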