
# staRburst

**Seamless AWS cloud bursting for parallel R workloads**

staRburst lets you run parallel R code on AWS with zero infrastructure management. Scale from your laptop to 100+ cloud workers with a single function call. Supports both EC2 (recommended for performance and cost) and Fargate (serverless) backends.
Everything runs through a single `starburst_map()` function - no new concepts to learn.

## Installation

CRAN submission in progress for v0.3.6 (expected within 2-4 weeks). Once available:
```r
install.packages("starburst")
```

Development version from GitHub:

```r
remotes::install_github("scttfrdmn/starburst")
```

## Quick start

```r
library(starburst)

# One-time setup (2 minutes)
starburst_setup()

# Run parallel computation on AWS
results <- starburst_map(
  1:1000,
  function(x) expensive_computation(x),
  workers = 50
)
#> 🚀 Starting starburst cluster with 50 workers
#> 💰 Estimated cost: ~$2.80/hour
#> 📊 Processing 1000 items with 50 workers
#> 📦 Created 50 chunks (avg 20 items per chunk)
#> 🚀 Submitting tasks...
#> ✓ Submitted 50 tasks
#> ⏳ Progress: 50/50 tasks (3.2 minutes elapsed)
#>
#> ✓ Completed in 3.2 minutes
#> 💰 Estimated cost: $0.15
```
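If the call pattern looks familiar, that's intentional: `starburst_map(.x, .f, ...)` takes the same first two arguments as base R's `lapply()`, so moving a loop between local and cloud execution is a one-line change. A minimal local sketch of the equivalence (pure base R, no AWS required; `expensive_computation` here is a hypothetical stand-in for any per-item function):

```r
# Hypothetical stand-in for a costly per-item computation
expensive_computation <- function(x) x^2

# Base R, single core:
local_results <- lapply(1:1000, expensive_computation)

# staRburst, 50 cloud workers - same inputs, same function:
# results <- starburst_map(1:1000, expensive_computation, workers = 50)

local_results[[7]]  # 49, i.e. 7^2
```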
## Example: Monte Carlo portfolio simulation

```r
library(starburst)

# Define simulation
simulate_portfolio <- function(seed) {
  set.seed(seed)
  returns <- rnorm(252, mean = 0.0003, sd = 0.02)
  prices <- cumprod(1 + returns)
  list(
    final_value = prices[252],
    sharpe_ratio = mean(returns) / sd(returns) * sqrt(252)
  )
}

# Run 10,000 simulations on 100 AWS workers
results <- starburst_map(
  1:10000,
  simulate_portfolio,
  workers = 100
)
#> 🚀 Starting starburst cluster with 100 workers
#> 💰 Estimated cost: ~$5.60/hour
#> 📊 Processing 10000 items with 100 workers
#> ⏳ Progress: 100/100 tasks (3.1 minutes elapsed)
#>
#> ✓ Completed in 3.1 minutes
#> 💰 Estimated cost: $0.29

# Extract results
final_values <- sapply(results, function(x) x$final_value)
sharpe_ratios <- sapply(results, function(x) x$sharpe_ratio)

# Summary
mean(final_values)                     # Average portfolio outcome
quantile(final_values, c(0.05, 0.95))  # Risk range

# Comparison:
# Local (single core): ~4 hours
# Cloud (100 workers): 3 minutes, $0.29
```
## Reusable clusters

```r
# Create cluster once
cluster <- starburst_cluster(workers = 50, cpu = 4, memory = "8GB")

# Run multiple analyses
results1 <- cluster$map(dataset1, analysis_function)
results2 <- cluster$map(dataset2, processing_function)
results3 <- cluster$map(dataset3, modeling_function)

# All use the same Docker image and configuration
```
## Worker sizing

```r
# For memory-intensive workloads
results <- starburst_map(
  large_datasets,
  memory_intensive_function,
  workers = 20,
  cpu = 8,
  memory = "16GB"
)

# For CPU-intensive workloads
results <- starburst_map(
  cpu_tasks,
  cpu_intensive_function,
  workers = 50,
  cpu = 4,
  memory = "8GB"
)
```

## Detached sessions

Run long jobs and disconnect - results persist in S3:
```r
# Start detached session
session <- starburst_session(workers = 50, detached = TRUE)

# Submit work and get session ID
session$submit(quote({
  results <- starburst_map(huge_dataset, expensive_function)
  saveRDS(results, "results.rds")
}))
session_id <- session$session_id

# Disconnect - job continues running
# Later (hours/days), reconnect:
session <- starburst_session_attach(session_id)
status <- session$status()    # Check progress
results <- session$collect()  # Get results

# Cleanup when done
session$cleanup(force = TRUE)
```

## Cost controls
```r
# Set cost limits
starburst_config(
  max_cost_per_job = 10,     # Hard limit
  cost_alert_threshold = 5   # Warning at $5
)

# Costs shown transparently
results <- starburst_map(data, fn, workers = 100)
#> 💰 Estimated cost: ~$3.50/hour
#> ✓ Completed in 23 minutes
#> 💰 Estimated cost: $1.34
```

## Quota handling

staRburst automatically handles AWS Fargate quota limitations:
```r
results <- starburst_map(data, fn, workers = 100, cpu = 4)
#> ⚠ Requested 100 workers (400 vCPUs) but quota allows 25 workers (100 vCPUs)
#> ⚠ Using 25 workers instead
#> 💰 Estimated cost: ~$1.40/hour
```

Your work still completes, just with fewer workers. You can request quota increases through AWS Service Quotas.
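Quota increases can be filed from the AWS CLI as well as the console. A sketch using the Service Quotas API (the quota code shown is an assumption for the Fargate On-Demand vCPU limit - confirm it in your account with `aws service-quotas list-service-quotas --service-code fargate`):

```shell
# Check the current Fargate On-Demand vCPU quota
aws service-quotas get-service-quota \
  --service-code fargate \
  --quota-code L-3032A538

# Request an increase to 400 vCPUs (enough for 100 workers at 4 vCPUs each)
aws service-quotas request-service-quota-increase \
  --service-code fargate \
  --quota-code L-3032A538 \
  --desired-value 400
```

Increases are typically approved within a day or two, after which the same `starburst_map()` call will get the full worker count.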
## API reference

- `starburst_map(.x, .f, workers, ...)` - Parallel map over data
- `starburst_cluster(workers, cpu, memory)` - Create reusable cluster
- `starburst_setup()` - Initial AWS configuration
- `starburst_config(...)` - Update configuration
- `starburst_status()` - Check cluster status

Example configuration:

```r
starburst_config(
  region = "us-east-1",
  max_cost_per_job = 10,
  cost_alert_threshold = 5
)
```

Full documentation available at starburst.ing.
## Comparison

| Feature | staRburst | RStudio Server on EC2 | Coiled (Python) |
|---|---|---|---|
| Setup time | 2 minutes | 30+ minutes | 5 minutes |
| Infrastructure management | Zero | Manual | Zero |
| Learning curve | Minimal | Medium | Medium |
| Auto scaling | Yes | No | Yes |
| Cost optimization | Automatic | Manual | Automatic |
| R-native | Yes | Yes | No (Python) |
## Requirements

- AWS credentials configured (`AWS_PROFILE` set)
- `starburstECSExecutionRole` - for ECS/ECR access
- `starburstECSTaskRole` - for S3 access

For detailed setup instructions, see the Getting Started guide.
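Before running `starburst_setup()`, it's worth confirming that your shell can actually reach AWS with the intended profile. A quick check (assumes the AWS CLI is installed; `my-profile` is a hypothetical profile name):

```shell
# Point the CLI at the profile staRburst should use
export AWS_PROFILE=my-profile

# Prints your account ID, user ID, and caller ARN if credentials are valid
aws sts get-caller-identity
```

If this fails, fix credentials first (`aws configure --profile my-profile`) - staRburst picks up the same `AWS_PROFILE` environment variable.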
## Roadmap

- Core parallel map API (`starburst_map`, `starburst_cluster`)
- `future` backend integration (`future.apply`, `furrr`, `targets`)

Contributions welcome! See the GitHub repository for contribution guidelines.
## License

Apache License 2.0 - see LICENSE.

Copyright 2026 Scott Friedman

## Citation

```bibtex
@software{starburst,
  title   = {staRburst: Seamless AWS Cloud Bursting for R},
  author  = {Scott Friedman},
  year    = {2026},
  version = {0.3.6},
  url     = {https://starburst.ing},
  license = {Apache-2.0}
}
```

## Acknowledgments

Built using the paws AWS SDK for R.
Container management with renv and rocker.
Inspired by Coiled for Python/Dask.