The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

1 Why memshare?

Most parallel R workflows duplicate large objects into every worker process. That wastes RAM and time. memshare stores big objects once in shared memory and lets workers attach to them as ordinary R vectors/matrices via ALTREP views. You get:

  • minimal memory use (one in-RAM copy),
  • no serialization of big objects to workers,
  • drop‑in apply/lapply-style APIs that manage sharing for you.

This vignette is a quick, practical guide, for technical details we refer to [Thrun and Märte, 2025]


2 Install

install.packages("memshare")         # CRAN
# remotes::install_github("yourname/memshare")  # dev

Requirements: R ≥ 4.0, C++17 toolchain.


3 5‑minute tour

3.1 1) Column-wise work on a matrix (memApply)

library(memshare)

set.seed(1)
n <- 10000; p <- 2000
X <- matrix(rnorm(n * p), n, p)   # numeric/double matrix
y <- rnorm(n)

# Correlate each column with y, in parallel, without copying X to workers
res <- memApply(
  X = X, MARGIN = 2,
  FUN = function(v, y) cor(v, y),
  VARS = list(y = y)           # shared side data
)
str(res)

What happened?
X and y were placed in shared memory; workers received views (ALTREP) instead of copies. Each worker extracted the i-th column as v, ran FUN(v, y), and returned a result. All views were released automatically at the end.

3.2 2) List workloads (memLapply)

list_length <- 1000
d <- 200
L <- lapply(1:list_length, function(i) matrix(rnorm(d * d), d, d))
w <- rnorm(d)

ans <- memLapply(L, function(el, w) el %*% w, VARS = list(w = w))
length(ans); dim(ans[[1]])

3.3 3) Low-level control (register / retrieve / release)

ns <- "demo"
X  <- matrix(rnorm(1e6), 1000, 1000)
registerVariables(ns, list(X = X))

vw <- retrieveViews(ns, "X")
mean(vw$X[ , 1])
releaseViews(ns, "X")

releaseVariables(ns, "X")

4 Concepts that matter

  • Namespace: a string key that identifies a shared-memory context (e.g., "demo").
  • Pages: the actual shared-memory buffers owned by a session.
  • Views: ALTREP wrappers that let R treat shared-memory buffers like normal objects.

Unload the package (or release views/variables) to clean up. Memory is freed once no views remain.


5 Common patterns

5.1 Feature map over columns (fast and memory-light)

score <- function(v, a, b) sum((v - a)^2) / (1 + b)  # any column-wise work
ns <- "scores"
a <- rnorm(n); b <- runif(1)

out <- memApply(X = X, MARGIN = 2, FUN = score, VARS = list(a = a, b = b), NAMESPACE = ns)

5.2 Multiple passes on the same data

Reuse the same namespace to avoid re-registering large objects.

ns <- "reuse"
registerVariables(ns, list(X = X))
pass1 <- memApply("X", 2, function(v) sd(v), NAMESPACE = ns)
pass2 <- memApply("X", 2, function(v) mean(v), NAMESPACE = ns)
releaseVariables(ns, "X")

6 Tips and best practices

  • FUN’s first argument must be the vector/list element (v for memApply, el for memLapply).
    Any extra shared variables in VARS must use exactly the same names in FUN’s signature.
  • Matrices/vectors must be basic numeric (double) without S3 class attributes (ALTREP expects raw storage).
  • If you provide your own cluster, you can still use clusterExport for small copied objects; big ones belong in VARS.
  • Free memory promptly: releaseViews() in workers (handled automatically by memApply/memLapply), and releaseVariables() in the master when done.
  • Detaching the package removes handles and clears shared variables unless another R process still holds a view.
  • Keep write access simple (read-mostly is safest). If multiple workers write to the same region, coordinate externally.

7 Troubleshooting

  • “Unknown input format for X/VARS”: ensure X is a numeric matrix (double) or a character name of a registered object; VARS is either a named list (to register) or character vector of existing names.
  • Memory not freed: check viewList() in workers; any remaining views prevent releaseVariables() from reclaiming memory.
  • Anonymous functions and namespaces: if NAMESPACE is missing and FUN is an inline lambda, the default namespace is "unnamed". Prefer explicit NAMESPACE in production.

8 Essentials

  • registerVariables(namespace, variableList) — put objects into shared memory.
  • retrieveViews(namespace, variableNames) — get ALTREP views (workers).
  • releaseViews(namespace, variableNames) — release worker views.
  • releaseVariables(namespace, variableNames) — free objects (master).
  • memApply(X, MARGIN, FUN, NAMESPACE = NULL, VARS = NULL, MAX.CORES = NULL) — matrix apply with shared memory.
  • memLapply(X, FUN, NAMESPACE = NULL, VARS = NULL, MAX.CORES = NULL) — list apply with shared memory.

9 References

[Thrun and Märte, 2025] Thrun, M.C., Märte, J.: Memshare: Memory Sharing for Multicore Computation in R with an Application to Feature Selection by Mutual Information using PDE, The R Journal, in revision, 2025.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.