Benchmark genomic-selection models — classic and machine-learning — from SNP marker data, through one interface, with breeding-relevant cross-validation and honest accuracy reporting.
The problem GSbench addresses: people increasingly throw
glmnet, ranger, or xgboost at
marker matrices, but hand-roll the cross-validation (often incorrectly)
and compare models on unequal footing. GSbench fits the standard
baselines (GBLUP, ridge marker effects) and the ML
methods behind a single gs_fit()/predict()
API, runs them through the same CV, and reports predictive ability you
can actually trust — plus a stacked ensemble that combines them.
# install.packages("remotes")
remotes::install_github("mqfarooqi1/GSbench")

Only graphics, stats and withr are required. The ML backends glmnet, ranger and xgboost are optional (Suggests); install whichever you want to use.
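Because the ML backends are optional, it can help to check which of them are installed before fitting anything. The following is a minimal sketch in base R only; it does not call GSbench, and the variable names are illustrative.

```r
# Optional backends named in the text above
backends <- c("glmnet", "ranger", "xgboost")

# TRUE where the package's namespace can be loaded, FALSE otherwise
installed <- vapply(backends, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Names of any backends still missing; install only the ones you want, e.g.
# install.packages(missing_backends)
missing_backends <- names(installed)[!installed]
```

Within the package itself, available_models() is the documented way to see which models are usable in the current session.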
library(GSbench)
sim <- simulate_population(n = 300, m = 2000, h2 = 0.5, seed = 1)
# one model
fit <- gs_fit(sim$pheno, sim$geno, model = "gblup")
gebv <- predict(fit, sim$geno)
# compare every available model (incl. the stacked ensemble) under one CV
bench <- gs_benchmark(sim$pheno, sim$geno, k = 5, seed = 1)
bench
plot(bench)

| model | mean | sd | n_folds |
|---|---|---|---|
| elastic_net | 0.367 | 0.187 | 5 |
| gblup | 0.334 | 0.189 | 5 |
| ensemble | 0.328 | 0.165 | 5 |
| random_forest | 0.269 | 0.185 | 5 |
| xgboost | 0.185 | 0.318 | 5 |

(accuracy = predictive ability, cor(pred, observed) on held-out data)
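To make the accuracy definition concrete, here is a rough sketch of a random k-fold loop written by hand in base R. It is not GSbench's implementation: the toy data, the helpers `ridge_fit()` and `ridge_predict()`, and the penalty `lambda = m` are all assumptions made for illustration, with closed-form ridge regression standing in for a real model. The point it tries to show is that everything estimated from data, here the marker means used for centring, comes from the training rows only.

```r
set.seed(1)

# Toy data: 120 individuals, 300 SNPs coded 0/1/2, additive trait
n <- 120; m <- 300
geno  <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n)
beta  <- rnorm(m, sd = 0.1)
g     <- as.vector(geno %*% beta)
pheno <- g + rnorm(n, sd = sd(g))            # roughly h2 = 0.5

# Hypothetical stand-in model: ridge regression on centred markers,
# solved in its n x n dual form: coef = X'(XX' + lambda I)^-1 y
ridge_fit <- function(y, X, lambda) {
  mu_x  <- colMeans(X)                       # learned from training rows only
  Xc    <- sweep(X, 2, mu_x)
  alpha <- solve(tcrossprod(Xc) + lambda * diag(nrow(Xc)), y - mean(y))
  list(coef = as.vector(crossprod(Xc, alpha)), mu_x = mu_x, mu_y = mean(y))
}
ridge_predict <- function(fit, X) {
  as.vector(fit$mu_y + sweep(X, 2, fit$mu_x) %*% fit$coef)
}

# Random k-fold assignment: each individual is held out exactly once
k    <- 5
fold <- sample(rep(seq_len(k), length.out = n))

acc <- vapply(seq_len(k), function(i) {
  test <- fold == i
  fit  <- ridge_fit(pheno[!test], geno[!test, , drop = FALSE], lambda = m)
  pred <- ridge_predict(fit, geno[test, , drop = FALSE])
  cor(pred, pheno[test])                     # predictive ability for fold i
}, numeric(1))

c(mean = mean(acc), sd = sd(acc), n_folds = k)
```

For a leave-one-group-out variant, `fold` would instead be an integer code of a group label, for example `as.integer(factor(family))` with a hypothetical `family` vector, and `k` the number of groups.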
Core (base R, no compiled code, no heavy deps):
| Function | Purpose |
|---|---|
| simulate_population() | Reproducible SNP + phenotype simulator with known h² |
| qc_markers(), impute_markers() | Call-rate / MAF / monomorphic filtering, mean imputation |
| Gmatrix() | VanRaden additive genomic relationship matrix |
| gblup() | GBLUP by REML, validated to match rrBLUP::mixed.solve to 6×10⁻⁵ |
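The preprocessing steps in this table can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the code behind qc_markers(), impute_markers() or Gmatrix(): the thresholds (call rate 0.90, MAF 0.05) are arbitrary, and the relationship matrix uses VanRaden's first method, G = ZZ' / (2 Σ p(1 − p)) with Z = M − 2p, which may differ in detail from what the package does.

```r
set.seed(1)

# Toy marker matrix: 50 individuals x 200 SNPs coded 0/1/2, with 2% missing
n <- 50; m <- 200
p_true <- runif(m, 0.02, 0.5)
M <- matrix(rbinom(n * m, size = 2, prob = rep(p_true, each = n)), nrow = n)
M[sample(length(M), round(0.02 * length(M)))] <- NA

# QC: keep markers with enough calls and a minor allele frequency above the
# threshold (this also drops monomorphic markers, whose MAF is 0)
call_rate <- colMeans(!is.na(M))
p_hat     <- colMeans(M, na.rm = TRUE) / 2
maf       <- pmin(p_hat, 1 - p_hat)
keep      <- call_rate >= 0.90 & maf >= 0.05   # illustrative thresholds
M <- M[, keep, drop = FALSE]

# Mean imputation: replace each missing call with that marker's mean dosage
col_mean <- colMeans(M, na.rm = TRUE)
miss     <- which(is.na(M), arr.ind = TRUE)
M[miss]  <- col_mean[miss[, "col"]]

# VanRaden (method 1) additive relationship matrix
p <- colMeans(M) / 2
Z <- sweep(M, 2, 2 * p)                        # centre each marker by 2p
G <- tcrossprod(Z) / (2 * sum(p * (1 - p)))

dim(G)          # n x n
mean(diag(G))   # roughly 1 for an unrelated, non-inbred sample
```

Because each marker is centred on its own sample mean, the rows of G sum to approximately zero, which is a quick sanity check on any implementation of this method.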
Modelling & evaluation:
| Function | Purpose |
|---|---|
| gs_fit() / predict() | Unified interface: "gblup", "elastic_net", "random_forest", "xgboost", "ensemble" |
| gs_cv() | Cross-validation: random k-fold (CV1) or leave-one-group-out (family/environment) |
| gs_ensemble() | Stacked super-learner: combines base models with non-negative CV-learned weights |
| gs_benchmark() + plot() | Run all available models through one CV and compare |
| available_models() | Which models are usable in your session |
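One common way to learn non-negative stacking weights is to regress the observed phenotype on the base models' out-of-fold predictions under a non-negativity constraint. The sketch below does this with base R's `optim()` and box constraints. It is only an illustration of the idea: the predictions are simulated stand-ins, the model names are made up, and both the optimiser and the rescaling of the weights to sum to 1 are assumptions rather than a description of gs_ensemble().

```r
set.seed(1)

# Toy stand-ins for out-of-fold predictions from three base models:
# each is the true signal plus a different amount of noise
n      <- 200
signal <- rnorm(n)
y      <- signal + rnorm(n)
P <- cbind(model_a = signal + rnorm(n, sd = 0.5),
           model_b = signal + rnorm(n, sd = 1.0),
           model_c = rnorm(n))                  # pure noise, carries no signal

# Centre predictions and response so that no intercept is needed
Pc <- scale(P, center = TRUE, scale = FALSE)
yc <- y - mean(y)

# Non-negative least squares via box-constrained optimisation
sse  <- function(w) sum((yc - Pc %*% w)^2)
grad <- function(w) as.vector(-2 * crossprod(Pc, yc - Pc %*% w))
opt  <- optim(par = rep(1 / ncol(P), ncol(P)), fn = sse, gr = grad,
              method = "L-BFGS-B", lower = 0)

w <- setNames(opt$par, colnames(P))
w <- w / sum(w)            # assumption: rescale weights to sum to 1
round(w, 3)

# Ensemble prediction = weighted combination of the base predictions
ens <- as.vector(P %*% w)
```

Rescaling the weights does not change a correlation-based accuracy, since cor() is invariant to positive scaling of the predictions. The weights should be learned from out-of-fold predictions; fitting them on in-sample predictions would favour whichever base model overfits most.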
gblup() is validated against rrBLUP in the test suite: same variance components, GEBVs correlating at 1.0.

Muhammad Farooqi · https://github.com/mqfarooqi1