An intelligent teaching assistant based on LLMs to help interpret
statistical model outputs in R.
EnTraineR builds audience-aware prompts (beginner, applied,
advanced) that never invent numbers: it passes verbatim
outputs from R and instructs how to explain them.
Works out of the box to produce high-quality prompts.
Optionally, you can connect your own LLM backend (via your functions built on top of trainer_core_generate_or_return()).
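The idea of passing verbatim R output together with audience-specific instructions can be sketched in base R. This is only an illustration under stated assumptions: build_prompt_sketch() is a hypothetical helper, not a function of EnTraineR, and the wording of its instructions is made up for the example.

```r
# Hypothetical helper, NOT part of EnTraineR: wraps the verbatim printed
# output of an R object in audience-specific instructions.
build_prompt_sketch <- function(obj,
                                audience = c("beginner", "applied", "advanced"),
                                introduction = NULL) {
  audience <- match.arg(audience)
  # Capture exactly what R prints, so no number is retyped or rounded by hand
  verbatim <- paste(utils::capture.output(print(obj)), collapse = "\n")
  style <- switch(audience,
    beginner = "Use plain language and define each statistical term.",
    applied  = "Focus on decisions and practical implications.",
    advanced = "Be technical but concise, with appropriate cautions."
  )
  paste(
    c(
      if (!is.null(introduction)) paste("Context:", introduction),
      paste("Audience:", audience),
      style,
      "Do not invent numbers; use only values that appear in the output below.",
      "--- R output (verbatim) ---",
      verbatim
    ),
    collapse = "\n"
  )
}

set.seed(42)
tt <- t.test(rnorm(20, mean = 0.1), mu = 0)
prompt <- build_prompt_sketch(tt, audience = "beginner")
cat(prompt)
```

Because the output is captured rather than summarized, every figure in the prompt is one that R actually printed.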
From GitHub:
# install.packages("remotes")
remotes::install_github("Sebastien-Le/EnTraineR")

Optional but recommended packages for examples:
- FactoMineR, SensoMineR (model objects used in examples)
- stringr (to squish multi-line intros)
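If you would rather not install stringr just to tidy an introduction, a base R stand-in is easy to write. The sketch below assumes that collapsing runs of whitespace and trimming both ends is all you need; squish_base() is a hypothetical name, not part of EnTraineR or stringr.

```r
# Base R stand-in for stringr::str_squish(); squish_base is a hypothetical name
squish_base <- function(x) {
  # Collapse every run of whitespace (spaces, tabs, newlines) into one space,
  # then trim leading and trailing whitespace
  trimws(gsub("\\s+", " ", x))
}

intro <- "Six chocolates have been evaluated
          by a sensory panel."
squish_base(intro)
# "Six chocolates have been evaluated by a sensory panel."
```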
Supported model outputs include:
- ANOVA (AovSum) with F-tests and T-tests
- Linear models (FactoMineR::LinearModel) including model selection notes

Audience levels:
- beginner: plain-language teaching focus
- applied: decisions and practical implications
- advanced: technical but concise, with appropriate cautions

gemini_generate() sends your prompt to Google Gemini (Generative Language API) and returns the text reply.

The package ships 3 small datasets for teaching:
deforestation
Air and water temperatures before/after riparian deforestation.
Variables: Temp_water, Temp_air,
Deforestation (BEFORE/AFTER).
ham
Sensory descriptors for 21 hams and an Overall liking
score.
Useful for multiple regression demonstrations.
poussin
Chick weights by brooding Temperature (T1/T2/T3) and
Gender (Female/Male).
Useful for two-factor ANOVA examples.
These datasets are the intellectual property of L’Institut Agro Rennes Angers and are used for the “Statistical Approach” course module.
data(deforestation); str(deforestation)
data(ham); summary(ham)
data(poussin); with(poussin, table(Temperature, Gender))

# install.packages("SensoMineR")
library(SensoMineR)
data(chocolates)
# Build an AovSum object; data(chocolates) loads the sensochoc data frame used below
res <- AovSum(Granular ~ Product*Panelist, data = sensochoc)
intro <- "Six chocolates have been evaluated by a sensory panel,
according to a sensory attribute: granular.
The panel has been trained according to this attribute
and panellists should be reproducible when rating this attribute."
intro <- gsub("\n", " ", intro)
intro <- stringr::str_squish(intro)
p <- trainer_AovSum(
aovsum_obj = res,
audience = "applied",
t_test = c("Product", "Panelist"), # filter T-test section
introduction = intro
)
cat(p) # a ready-to-use prompt for an LLM or for teaching

# install.packages("FactoMineR"); install.packages("stringr")
library(FactoMineR)
intro_ham <- "Can we predict ham overall liking from its sensory profile?"
intro_ham <- stringr::str_squish(gsub("\n", " ", intro_ham))
fit <- LinearModel(`Overall liking` ~ ., data = ham, selection = "bic")
pr <- trainer_LinearModel(
lm_obj = fit,
introduction = intro_ham,
audience = "advanced"
)
cat(pr)

Another linear model with interaction and a categorical factor:
fit2 <- LinearModel(Temp_water ~ Temp_air * Deforestation,
data = deforestation, selection = "none")
pr2 <- trainer_LinearModel(
lm_obj = fit2,
introduction = "Effect of deforestation on the air-water temperature link.",
audience = "beginner"
)
cat(pr2)

t-test:
tt <- t.test(rnorm(20, 0.1), mu = 0)
cat(trainer_t_test(tt, audience = "beginner"))

Variance F-test:
vt <- var.test(rnorm(25, sd = 1.0), rnorm(30, sd = 1.3))
cat(trainer_var_test(vt, audience = "applied"))

Proportion test:
pt <- prop.test(x = c(42, 35), n = c(100, 90))
cat(trainer_prop_test(pt, audience = "advanced", summary_only = TRUE))

Correlation test:
set.seed(1)
x <- rnorm(30); y <- 0.5 * x + rnorm(30, sd = 0.8)
ct <- cor.test(x, y, method = "pearson")
cat(trainer_cor_test(ct, audience = "applied"))

Chi-squared test:
m <- matrix(c(10, 20, 30, 40), nrow = 2)
cx <- chisq.test(m, correct = TRUE)
cat(trainer_chisq_test(cx, audience = "beginner"))

gemini_generate() lets you send a prompt to Google Gemini and get the response back as text.
# 1) Set your API key once per session (or in .Renviron)
Sys.setenv(GEMINI_API_KEY = "your_key_here")
# 2) Send a prompt
txt <- gemini_generate(
prompt = "Say hello in one short sentence.",
model = "gemini-2.5-flash", # accepts "gemini-2.5-flash" or "models/gemini-2.5-flash"
temperature = 0.2,
user_agent = "EnTraineR/0.9.0 (https://github.com/Sebastien-Le/EnTraineR)"
)
cat(txt)

All prompts emphasize: do not invent numbers; use only what appears in the printed output.
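One rough way to audit a reply against this rule is to compare the numeric tokens it contains with those in the printed output. The sketch below is an assumption-laden illustration, not a feature of EnTraineR: extract_numbers() and invented_numbers() are hypothetical helpers, and the comparison is purely textual, so a legitimately rounded value (0.03 for 0.0322) or a conventional threshold (0.05) would also be flagged.

```r
# Hypothetical helpers, NOT part of EnTraineR.
# Pull out numeric tokens such as 19, 2.31 or 1e-04 (signs are ignored).
extract_numbers <- function(text) {
  pattern <- "[0-9]+(\\.[0-9]+)?([eE][-+]?[0-9]+)?"
  unlist(regmatches(text, gregexpr(pattern, text)))
}

# Numbers present in the reply but absent from the verbatim R output
invented_numbers <- function(reply, r_output) {
  setdiff(extract_numbers(reply), extract_numbers(r_output))
}

r_output   <- "t = 2.31, df = 19, p-value = 0.0322"
good_reply <- "With t = 2.31 on 19 degrees of freedom, the p-value is 0.0322."
bad_reply  <- "The p-value is 0.0322, an effect of about 12 percent."

invented_numbers(good_reply, r_output)  # character(0)
invented_numbers(bad_reply, r_output)   # "12"
```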
By default, trainers return a prompt string (i.e.,
generate = FALSE).
If you have a generator backend, you can pass
generate = TRUE and an llm_model name;
implement your own trainer_core_generate_or_return() to
call your LLM API.
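The return-the-prompt-by-default behaviour described above can be pictured with a small dispatcher. This is a minimal sketch under assumptions: the signature of the real trainer_core_generate_or_return() is not shown in this README, so generate_or_return_sketch(), its backend argument, and echo_backend() are all hypothetical names chosen for illustration.

```r
# Hypothetical sketch; the real trainer_core_generate_or_return() may differ.
generate_or_return_sketch <- function(prompt, generate = FALSE,
                                      llm_model = NULL, backend = NULL) {
  if (!isTRUE(generate)) {
    return(prompt)  # default: hand back the prompt string untouched
  }
  if (!is.function(backend)) {
    stop("generate = TRUE requires a backend function(prompt, model).")
  }
  # Delegate to the user-supplied function that talks to the LLM API
  backend(prompt, model = llm_model)
}

# Stub backend standing in for a real API call (assumption: no network access)
echo_backend <- function(prompt, model) {
  paste0("[", model, "] reply to a prompt of ", nchar(prompt), " characters")
}

p <- "Explain this t-test output."
generate_or_return_sketch(p)  # returns the prompt unchanged
generate_or_return_sketch(p, generate = TRUE, llm_model = "my-model",
                          backend = echo_backend)
```

Keeping the backend as a plain function argument means the prompt-building code never needs to know which LLM service, if any, sits behind it.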
Issues and pull requests are welcome. Please:
- Keep code ASCII and roxygen2-ready.
- Add tests and examples where relevant.
- Follow the audience style guidelines.
See the DESCRIPTION file for license terms.
If EnTraineR helps your teaching or analyses, starring the
repo is appreciated.
Thanks to the R community and the authors of FactoMineR
and SensoMineR for inspiring teaching tools and example
datasets used in demonstrations.