The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
agriReg provides a streamlined workflow for the most
common regression tasks in agricultural research:
| Task | Function |
|---|---|
| Clean field data | clean_agri_data() |
| Normalise yield | yield_normalize() |
| Flag outliers | outlier_flag() |
| Linear / mixed model | fit_linear() |
| Polynomial selection | fit_polynomial() |
| Nonlinear growth curves | fit_nonlinear() |
| Optimum dose / rate | optimum_dose() |
| Dose-response (LL.4/5) | fit_dose_response() |
| Effective dose (ED50…) | ed_estimates() |
| Compare models | compare_models() |
| Goodness-of-fit stats | gof_stats() |
| Residual diagnostics | residual_check() |
Load one of the bundled example datasets and inspect it.
wheat <- load_example_data("wheat_trial")
head(wheat)
#> block nitrogen phosphorus variety yield biomass
#> 1 A 0 low V1 3.362 4.935
#> 2 A 0 low V2 3.746 5.826
#> 3 A 0 high V1 3.799 5.679
#> 4 A 0 high V2 4.239 6.296
#> 5 A 60 low V1 5.408 7.625
#> 6 A 60 low V2 5.381 7.965wheat_clean <- clean_agri_data(wheat,
yield_col = "yield",
flag_outliers = TRUE)
summary(wheat_clean)
#> agriData summary -- response: 'yield'
#>
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 3.343 5.041 6.097 5.719 6.703 7.328
#>
#> Outliers flagged: 0 / 48Normalise yield within each block using z-scoring:
wheat_norm <- yield_normalize(wheat_clean,
yield_col = "yield",
method = "zscore",
group_by = "block")
head(wheat_norm[, c("block", "yield", "yield_norm")])
#> agriData object
#> Rows : 6
#> Columns : 3
#>
#> First 6 rows:
#> block yield yield_norm
#> 1 A 3.362 -1.8258491
#> 2 A 3.746 -1.5347896
#> 3 A 3.799 -1.4946173
#> 4 A 4.239 -1.1611116
#> 5 A 5.408 -0.2750475
#> 6 A 5.381 -0.2955126The standalone outlier_flag() supports three
methods:
wheat_clean <- outlier_flag(wheat_clean, col = "yield", method = "modz")
table(wheat_clean$yield_outlier)
#>
#> FALSE
#> 48m_lm <- fit_linear(wheat_clean, "yield ~ nitrogen + phosphorus")
print(m_lm)
#> ==============================
#> agriReg: Linear Model (agriLM)
#> ==============================
#> Engine : lm
#> Formula : yield ~ nitrogen + phosphorus
#>
#> Coefficients:
#> (Intercept) nitrogen phosphoruslow
#> 4.4240083 0.0167675 -0.4274167
#>
#> R2: 0.8784 | Adj-R2: 0.8701 | RMSE: 0.4261The plot() method returns ggplot2 objects —
pipe them, save them, or patch them together with
patchwork.
plot(m_lm, type = "fit")plot(m_lm, type = "residuals")plot(m_lm, type = "qq")residual_check(m_lm)Add a random intercept per block to account for field heterogeneity:
m_lmer <- fit_linear(wheat_clean,
formula = "yield ~ nitrogen + phosphorus + variety",
random = "(1|block)")
summary(m_lmer)
#> ==============================
#> agriReg: Linear Model Summary
#> ==============================
#> Engine : lmer
#>
#> Formula: yield ~ nitrogen + phosphorus + variety
#> Random : (1|block)
#>
#> Fixed effects:
#> (Intercept) nitrogen phosphoruslow varietyV2
#> 4.2722167 0.0167675 -0.4274167 0.3035833
#>
#> Random effects (variance components):
#> grp var1 vcov sdcor
#> block (Intercept) 0.000000 0.0000
#> Residual <NA> 0.172902 0.4158
#>
#> Number of obs: 48 | Groups: block, Residual
#>
#> --- Goodness-of-fit ---
#> R2 : 0.8938
#> Adj-R2 : 0.8915
#> RMSE : 0.3981
#> MAE : 0.3771
#> AIC : 80.77
#> BIC : 92.00Automatically test degrees 1–3 and return the best-fitting model:
m_poly <- fit_polynomial(wheat_clean,
x_col = "nitrogen",
y_col = "yield",
max_degree = 3)
coef(m_poly)
#> (Intercept) I(nitrogen^1) I(nitrogen^2)
#> 3.8360916667 0.0354779167 -0.0001039468Load the maize growth dataset:
maize <- load_example_data("maize_growth")
# Use well-watered plants only
maize_ww <- maize[maize$treatment == "WW", ]
head(maize_ww)
#> plant_id days biomass_g treatment
#> 1 1 1 0.00 WW
#> 2 1 2 10.98 WW
#> 3 1 3 0.00 WW
#> 4 1 4 14.10 WW
#> 5 1 5 0.00 WW
#> 6 1 6 0.28 WWThe 3-parameter logistic is the most common growth curve in agronomy. Parameter interpretation:
m_log <- fit_nonlinear(maize_ww,
x_col = "days",
y_col = "biomass_g",
model = "logistic")
summary(m_log)
#> ==================================
#> agriReg: Nonlinear Model Summary
#> ==================================
#> Model : logistic
#>
#>
#> Formula: biomass_g ~ Asym/(1 + exp((xmid - days)/scal))
#>
#> Parameters:
#> Estimate Std. Error t value Pr(>|t|)
#> Asym 497.8621 2.1803 228.34 <2e-16 ***
#> xmid 47.0653 0.1869 251.81 <2e-16 ***
#> scal 8.5507 0.1569 54.52 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 23.27 on 497 degrees of freedom
#>
#> Number of iterations to convergence: 5
#> Achieved convergence tolerance: 2.749e-07
#>
#>
#> --- Goodness-of-fit ---
#> R2 : 0.9869
#> RMSE : 23.1983
#> MAE : 17.2210
#> AIC : 4571.02
#> BIC : 4587.88plot(m_log)The Gompertz is often a better fit for biological growth that is asymmetric around the inflection point:
m_gomp <- fit_nonlinear(maize_ww, "days", "biomass_g", "gompertz")
coef(m_gomp)Classic model for nutrient-response data where yield approaches a plateau:
m_asym <- fit_nonlinear(wheat_clean,
x_col = "nitrogen",
y_col = "yield",
model = "asymptotic")
plot(m_asym)The quadratic polynomial is widely used to find the economic optimum nitrogen rate:
m_quad <- fit_nonlinear(wheat_clean,
x_col = "nitrogen",
y_col = "yield",
model = "quadratic")
opt <- optimum_dose(m_quad)
cat(sprintf("Optimum nitrogen rate : %.1f kg/ha\n", opt["x_opt"]))
#> Optimum nitrogen rate : 170.7 kg/ha
cat(sprintf("Predicted max yield : %.2f t/ha\n", opt["y_max"]))
#> Predicted max yield : 6.86 t/haSuitable when yield increases linearly up to a critical input level and then plateaus (common for phosphorus response):
m_lp <- fit_nonlinear(wheat_clean,
x_col = "nitrogen",
y_col = "yield",
model = "linear_plateau")
cat("Critical point (plateau onset):", optimum_dose(m_lp)["x_opt"], "kg/ha\n")
#> Critical point (plateau onset): 94.86183 kg/haLoad the herbicide dataset:
herb <- load_example_data("herbicide_trial")
# Subset to one species
amaranth <- herb[herb$species == "Amaranth", ]The LL.4 model is the industry standard for herbicide dose-response:
\[y = c + \frac{d - c}{1 + \exp(b(\log(x) - \log(e)))}\]
where b = slope, c = lower asymptote, d = upper asymptote, e = ED50.
m_dr <- fit_dose_response(amaranth,
dose_col = "dose_g_ha",
resp_col = "weed_control_pct",
fct = "LL.4")
summary(m_dr)
#> ==================================
#> agriReg: Dose-Response Summary
#> ==================================
#> Model : LL.4
#> Dose : dose_g_ha
#> Response: weed_control_pct
#>
#>
#> Model fitted: Log-logistic (ED50 as parameter) (4 parms)
#>
#> Parameter estimates:
#>
#> Estimate Std. Error t-value p-value
#> b:(Intercept) 1.905897 0.095867 19.8807 < 2e-16 ***
#> c:(Intercept) 1.576654 0.799879 1.9711 0.05603 .
#> d:(Intercept) 97.986892 0.875390 111.9351 < 2e-16 ***
#> e:(Intercept) 82.259133 2.334523 35.2360 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error:
#>
#> 2.74034 (38 degrees of freedom)
#>
#> ED10 / ED50 / ED90:
#>
#> Estimated effective doses
#>
#> Estimate Std. Error Lower Upper
#> e:1:10 25.9720 1.7769 22.3749 29.5691
#> e:1:50 82.2591 2.3345 77.5331 86.9851
#> e:1:90 260.5332 15.7523 228.6443 292.4221
#> Estimate Std. Error Lower Upper
#> e:1:10 25.97199 1.776870 22.37490 29.56907
#> e:1:50 82.25913 2.334523 77.53314 86.98513
#> e:1:90 260.53321 15.752300 228.64435 292.42208plot(m_dr, log_dose = TRUE)ed_estimates(m_dr, respLev = c(10, 50, 90))
#>
#> Estimated effective doses
#>
#> Estimate Std. Error Lower Upper
#> e:1:10 25.9720 1.7769 22.3749 29.5691
#> e:1:50 82.2591 2.3345 77.5331 86.9851
#> e:1:90 260.5332 15.7523 228.6443 292.4221
#> Estimate Std. Error Lower Upper
#> e:1:10 25.97199 1.776870 22.37490 29.56907
#> e:1:50 82.25913 2.334523 77.53314 86.98513
#> e:1:90 260.53321 15.752300 228.64435 292.42208Compare all models fitted to the wheat data:
cmp <- compare_models(
ols = m_lm,
mixed = m_lmer,
asymptotic = m_asym,
quadratic = m_quad,
lp = m_lp
)
print(cmp)
#> model engine n_par AIC BIC RMSE MAE R2
#> 1 quadratic quadratic 3 27.105 34.590 0.2953 0.2419 0.9416
#> 2 lp linear_plateau 3 33.296 40.781 0.3149 0.2568 0.9335
#> 3 ols lm 3 62.315 69.800 0.4261 0.3753 0.8784
#> 4 mixed lmer 1 80.769 91.997 0.3981 0.3771 0.8938
#> 5 asymptotic asymptotic 2 205.438 211.051 1.9320 1.1392 -1.5010
#> delta_AIC
#> 1 0.000
#> 2 6.191
#> 3 35.210
#> 4 53.664
#> 5 178.333The table is ranked by AIC. delta_AIC shows the
difference from the best model — values below 2 are considered
equivalent; above 10 is strong evidence against the weaker model.
All agriReg objects are compatible with
broom for tidy output:
library(broom)
tidy(m_lm$fit)
#> # A tibble: 3 × 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 4.42 0.124 35.7 1.13e-34
#> 2 nitrogen 0.0168 0.000947 17.7 6.59e-22
#> 3 phosphoruslow -0.427 0.127 -3.36 1.57e- 3
glance(m_lm$fit)
#> # A tibble: 1 × 12
#> r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0.878 0.873 0.440 162. 2.60e-21 2 -27.2 62.3 69.8
#> # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>p <- plot(m_log)
ggplot2::ggsave("logistic_growth.png", p, width = 7, height = 4.5, dpi = 300)sessionInfo()
#> R version 4.2.1 (2022-06-23 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 26200)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=C
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] broom_1.0.12 ggplot2_4.0.2 emmeans_2.0.2 agriReg_0.1.0
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.1.1 mvtnorm_1.3-6 lattice_0.20-45 tidyr_1.3.2
#> [5] gtools_3.9.5 reformulas_0.4.4 zoo_1.8-15 utf8_1.2.6
#> [9] digest_0.6.39 R6_2.6.1 backports_1.5.0 evaluate_1.0.5
#> [13] coda_0.19-4.1 pillar_1.11.1 Rdpack_2.6.6 rlang_1.1.7
#> [17] otel_0.2.0 multcomp_1.4-30 rstudioapi_0.18.0 minqa_1.2.8
#> [21] car_3.1-5 nloptr_2.2.1 jquerylib_0.1.4 Matrix_1.6-5
#> [25] rmarkdown_2.30 labeling_0.4.3 splines_4.2.1 lme4_2.0-1
#> [29] S7_0.2.1 compiler_4.2.1 xfun_0.57 pkgconfig_2.0.3
#> [33] mgcv_1.8-40 htmltools_0.5.9 tidyselect_1.2.1 tibble_3.3.1
#> [37] codetools_0.2-18 dplyr_1.2.0 withr_3.0.2 MASS_7.3-57
#> [41] rbibutils_2.4.1 grid_4.2.1 nlme_3.1-168 jsonlite_2.0.0
#> [45] xtable_1.8-8 gtable_0.3.6 lifecycle_1.0.5 magrittr_2.0.3
#> [49] scales_1.4.0 estimability_1.5.1 cli_3.6.5 cachem_1.1.0
#> [53] carData_3.0-6 farver_2.1.2 bslib_0.10.0 drc_3.0-1
#> [57] generics_0.1.4 vctrs_0.7.2 boot_1.3-28 sandwich_3.1-1
#> [61] Formula_1.2-5 TH.data_1.1-5 RColorBrewer_1.1-3 tools_4.2.1
#> [65] glue_1.7.0 purrr_1.0.2 plotrix_3.8-14 abind_1.4-8
#> [69] pbkrtest_0.5.5 parallel_4.2.1 fastmap_1.2.0 survival_3.3-1
#> [73] yaml_2.3.12 knitr_1.51 patchwork_1.3.2 sass_0.4.10These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.