Bounded Outcome Risk Guard for Model Evaluation
BORG catches data leakage that inflates your model’s performance — before you report the wrong number.
```r
library(BORG)

# You scaled the data, then split it. Looks fine?
data_scaled <- scale(iris[, 1:4])
train_idx <- 1:100
test_idx <- 101:150

borg_inspect(data_scaled, train_idx = train_idx, test_idx = test_idx)
#> INVALID — Hard violation: preprocessing_leak
#> "Normalization parameters were computed on data beyond training set"
```

The test set means leaked into the scaler. Your reported accuracy is wrong. BORG finds this automatically — for scaling, PCA, recipes, caret pipelines, and more.
A model shows 95% accuracy on test data, then drops to 60% in production. The usual cause: data leakage. Information from the test set contaminated training, and the reported metrics were wrong.
A Princeton meta-analysis found leakage errors in 648 published papers across 30 fields. In civil war prediction research, correcting leakage revealed that “complex ML models do not perform substantively better than decades-old Logistic Regression.” The reported gains were artifacts.
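The mechanics are easy to see without any package at all. A minimal base-R sketch (synthetic data of my own, not from BORG's documentation): when test rows are distributed differently from training rows, scaling parameters computed on the full data encode test-set information.

```r
# Preprocessing leakage in miniature: the scaler's mean depends on
# whether it saw the test rows.
set.seed(1)
x <- c(rnorm(100, mean = 0),   # training rows
       rnorm(50,  mean = 3))   # test rows, shifted distribution
train_idx <- 1:100

mean_full  <- mean(x)             # what scale() on the full data uses
mean_train <- mean(x[train_idx])  # what a leak-free pipeline must use

# The gap is test-set information baked into the preprocessing step
abs(mean_full - mean_train) > 0.5
#> TRUE
```

A leak-free pipeline computes `mean_train` and `sd(x[train_idx])` first, then applies those fixed parameters to the test rows.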
BORG addresses this problem by automatically detecting six categories of leakage — index overlap, duplicate rows, preprocessing leakage, target leakage, group leakage, and temporal violations — across common R frameworks (base R, caret, tidymodels, mlr3). Beyond detection, BORG diagnoses data dependencies (spatial, temporal, clustered), generates appropriate cross-validation schemes, and produces publication-ready methods paragraphs with test statistics.
These features are exposed through a small set of functions:
- `borg()`: Main entry point for all validation
- `borg_inspect()`: Detailed inspection of specific objects (`caret::preProcess`, `recipes::recipe`, `prcomp`, `rsample` resampling objects, fitted models such as `lm`, `glm`, `ranger`, etc.)
- `borg_diagnose()`: Analyze data for dependency structure
- `borg_compare_cv()`: Run random and blocked CV side by side on the same data, with `plot()` for visual comparison
- `borg_power()`: Estimate power loss from switching to blocked CV
- `summary()`: Generate publication-ready methods paragraphs, using `borg_compare_cv()` inflation estimates when available
- `borg_certificate()` / `borg_export()`: Machine-readable validation certificates in YAML/JSON for audit trails

| Category | Impact | Response |
|---|---|---|
| Hard Violation | Results invalid | Blocks evaluation |
| Soft Inflation | Results biased | Warns, allows with caution |
Hard Violations:

- `index_overlap`: Same row in train and test
- `duplicate_rows`: Identical observations across sets
- `preprocessing_leak`: Scaler/PCA fitted on full data
- `target_leakage`: Feature with |r| > 0.99 with target
- `group_leakage`: Same group in train and test
- `temporal_leak`: Test data predates training

Soft Inflation:

- `proxy_leakage`: Feature with |r| 0.95–0.99 with target
- `spatial_proximity`: Test points close to training
- `spatial_overlap`: Test points inside training convex hull
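The correlation thresholds above are easy to check by hand. A small base-R illustration (synthetic data of my own, not BORG code) of where the hard/soft boundary falls:

```r
set.seed(42)
outcome <- rnorm(1000)
leaked  <- outcome + rnorm(1000, sd = 0.01)  # near-copy of the target
proxy   <- outcome + rnorm(1000, sd = 0.25)  # weaker stand-in

abs(cor(leaked, outcome))  # ~0.9999 -> hard violation (target_leakage)
abs(cor(proxy,  outcome))  # ~0.97   -> soft inflation (proxy_leakage)
```

The near-copy lands above the 0.99 hard cutoff; the noisier proxy falls in the 0.95–0.99 band that BORG flags as inflation rather than blocking outright.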
```r
# Install from GitHub
# install.packages("pak")
pak::pak("gcol33/BORG")

# Or using devtools
# install.packages("devtools")
devtools::install_github("gcol33/BORG")
```

```r
library(BORG)

# Clean split — passes validation
result <- borg(iris, train_idx = 1:100, test_idx = 101:150)
result
#> Status: VALID
#> Hard violations: 0
#> Soft inflations: 0

# Overlapping indices — caught immediately
borg(iris, train_idx = 1:100, test_idx = 51:150)
#> INVALID — index_overlap: Train and test indices overlap (50 shared indices)
```

```r
# caret preProcess fitted on ALL data (common mistake)
library(caret)
pp <- preProcess(mtcars, method = c("center", "scale"))
borg_inspect(pp, train_idx = 1:25, test_idx = 26:32, data = mtcars)
#> Hard violation: preprocessing_leak
#> "preProcess centering parameters were computed on data beyond training set"
```

```r
# Feature highly correlated with outcome
leaky_data <- data.frame(
  x = rnorm(100),
  outcome = rnorm(100)
)
leaky_data$leaked <- leaky_data$outcome + rnorm(100, sd = 0.01)

borg_inspect(leaky_data, train_idx = 1:70, test_idx = 71:100, target = "outcome")
#> Hard violation: target_leakage_direct
```

```r
# Clinical data with patient IDs
clinical <- data.frame(
  patient_id = rep(1:10, each = 10),
  measurement = rnorm(100)
)

# Random split ignoring patients
set.seed(123)
idx <- sample(100)
train_idx <- idx[1:70]
test_idx <- idx[71:100]

borg_inspect(clinical, train_idx, test_idx, groups = "patient_id")
#> Hard violation: group_leakage
```

```r
spatial_data <- data.frame(
  lon = runif(200, -10, 10),
  lat = runif(200, -10, 10),
  response = rnorm(200)
)

# Let BORG diagnose and generate appropriate CV folds
result <- borg(spatial_data, coords = c("lon", "lat"), target = "response", v = 5)
result$diagnosis@recommended_cv
#> "spatial_block"
```

```r
# Prove to reviewers that random CV inflates metrics
comparison <- borg_compare_cv(
  spatial_data,
  formula = response ~ lon + lat,
  coords = c("lon", "lat"),
  repeats = 10
)
print(comparison)
plot(comparison)
```

```r
# summary() writes a publication-ready methods paragraph
result <- borg(spatial_data, coords = c("lon", "lat"), target = "response")
summary(result)
#> Model performance was evaluated using spatial block cross-validation
#> (k = 5 folds). Spatial autocorrelation was detected in the data
#> (Moran's I = 0.12, p < 0.001)...

# Three citation styles
summary(result, style = "nature")
summary(result, style = "ecology")
```

BORG works with common ML frameworks:

```r
# caret
library(caret)
pp <- preProcess(mtcars[, -1], method = c("center", "scale"))
borg_inspect(pp, train_idx = 1:25, test_idx = 26:32, data = mtcars)

# tidymodels
library(recipes)
rec <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors()) |>
  prep()
borg_inspect(rec, train_idx = 1:25, test_idx = 26:32, data = mtcars)
```

| Function | Purpose |
|---|---|
| `borg()` | Main entry point — diagnose data or validate splits |
| `borg_inspect()` | Detailed inspection of objects |
| `borg_diagnose()` | Analyze data dependencies |
| `borg_validate()` | Validate complete workflow |
| `borg_assimilate()` | Assimilate leaky pipelines into compliance |
| `borg_compare_cv()` | Empirical random vs blocked CV comparison |
| `borg_power()` | Power analysis after blocking |
| `plot()` | Visualize results |
| `summary()` | Generate methods text for papers |
| `borg_certificate()` | Create validation certificate |
| `borg_export()` | Export certificate to YAML/JSON |
“Software is like sex: it’s better when it’s free.” — Linus Torvalds
I’m a PhD student who builds R packages in my free time because I believe good tools should be free and open. I started these projects for my own work and figured others might find them useful too.
If this package saved you some time, buying me a coffee is a nice way to say thanks. It helps with my coffee addiction.
MIT (see the LICENSE.md file)
```bibtex
@software{BORG,
  author = {Colling, Gilles},
  title  = {BORG: Bounded Outcome Risk Guard for Model Evaluation},
  year   = {2026},
  url    = {https://github.com/gcol33/BORG}
}
```