The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Performance, Memory, and Resilience Workflows

Introduction

R is an in-memory language. When working with large datasets, long-running loops, parallel processes, or external web APIs, system memory can quickly fill up, causing R to crash.

devkit provides resource optimization and resilience modules designed to keep your environment lean, secure, and crash-proof.


🧹 Interactive Memory Cleanup

R does not always immediately release memory back to the operating system when objects are deleted. Large data frames or matrices can linger in your environment.

Sweeping Large Global Objects

sweep_memory() inspects the global environment, identifies objects exceeding a specified size threshold (in MB), and prompts you to delete them.

# Interactively sweep objects larger than 10MB
sweep_memory(threshold = 10)

Cleaning Temporary Files & Orphaned Devices

R sessions generate temporary directories and graphics devices (e.g., PDFs, PNGs). If a script errors out before closing a device, the file handles remain locked.

hunt_zombies() scans for and closes orphaned graphics devices and removes standard R temp files.

# Close zombie graphics devices and flush temp files
hunt_zombies()

sweep_temp_cache() specifically targets cache directories created by packages (such as knitr, raster, or memoise), reclaiming disk space.

# Flush cache directories to free disk space
sweep_temp_cache()

🛡️ Safeguarding Iterations with the Loop Guardian

When running large loops that generate or accumulate data, you run the risk of running out of RAM (Out of Memory/OOM crash).

loop_guardian() checks your system’s free memory at the end of each iteration. If the available RAM drops below a critical threshold, it halts the loop safely, saving your state and preventing a system-wide crash.

# Define a long loop with the loop guardian
data_list <- list()
for (i in 1:1000) {
  # Perform heavy computation
  data_list[[i]] <- runif(1e6)
  
  # Guard loop; will halt if free memory is less than 500MB
  loop_guardian(threshold_mb = 500, current_iteration = i)
}

💾 Crash-Resilient Batch Processing (Save & Resume)

For jobs that run for hours or days, an unexpected error or power outage can wipe out all progress.

dispatch_checkpoints() wraps batch operations in a checkpointing system. It saves progress at specified intervals. If the run is interrupted, re-running the command automatically resumes execution from the last saved checkpoint.

# List of items to process
items <- paste0("item_", 1:100)

# Resilient batch processing with checkpoints
results <- dispatch_checkpoints(
  items = items,
  process_fun = function(item) {
    # Perform computation
    Sys.sleep(0.1)
    return(paste(item, "processed"))
  },
  checkpoint_dir = "checkpoints",
  checkpoint_interval = 10
)

⚡ Scaffolding Parallel Pipelines

Setting up parallel clusters in R requires boilerplate code (registering cores, setting up clusters, handling errors, and cleaning up clusters on exit).

scaffold_parallel() generates a production-ready parallel execution template tailored to your specific data object and core requirements.

# Generate parallel setup code for a dataframe 'sales_data' inside a function
scaffold_parallel(
  data_obj = "sales_data",
  func_name = "process_sales",
  cores = 4
)

🌐 Resilient and Polite Network Requests

When fetching data from web APIs, network hiccups or rate limits (HTTP status 429) can break your pipeline.

network_diplomat() wraps standard HTTP requests, implementing exponential backoff (retrying with increasing delays) and automatically respecting the rate limit headers sent by servers.

# Make a rate-resilient API request
api_response <- network_diplomat(
  url = "https://api.example.com/data",
  method = "GET",
  max_retries = 5,
  backoff_factor = 2
)

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.