The olr package provides a systematic way to identify
the best linear regression model by testing all
combinations of predictor variables. You can choose to optimize
based on either R-squared or adjusted
R-squared.
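The idea of testing all predictor combinations can be sketched in base R. This is a minimal illustration only, not the olr package's internal code: the function name `best_subset_lm` and the use of the built-in `mtcars` data are assumptions made for the example. For 12 predictors, fitting every non-empty subset would mean 2^12 - 1 = 4095 models; whether olr evaluates exactly that set is not stated here.

```r
# Illustrative sketch only (hypothetical helper, not part of olr):
# fit every non-empty subset of predictors and keep the best-scoring model.
best_subset_lm <- function(data, response, predictors, adjr2 = TRUE) {
  best_model <- NULL
  best_score <- -Inf
  for (k in seq_along(predictors)) {
    # All subsets of size k, returned as a list of character vectors
    subsets <- combn(predictors, k, simplify = FALSE)
    for (vars in subsets) {
      fit <- lm(reformulate(vars, response = response), data = data)
      s <- summary(fit)
      # Score by adjusted R-squared or plain R-squared
      score <- if (adjr2) s$adj.r.squared else s$r.squared
      if (score > best_score) {
        best_score <- score
        best_model <- fit
      }
    }
  }
  best_model
}

# Example on a built-in dataset, chosen here purely for illustration
fit <- best_subset_lm(mtcars, "mpg", c("wt", "hp", "qsec", "drat"))
formula(fit)
```

Because plain R-squared cannot decrease when a predictor is added, scoring with `adjr2 = FALSE` in this sketch ends up selecting the model with all predictors, which mirrors the behaviour shown in the full-model output in this vignette.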
# Load the package and data
library(olr)
crudeoildata <- read.csv(system.file("extdata", "crudeoildata.csv", package = "olr"))
dataset <- crudeoildata[, -1]
# Define variables
responseName <- 'CrudeOil'
predictorNames <- c('RigCount', 'API', 'FieldProduction', 'RefinerNetInput',
                    'OperableCapacity', 'Imports', 'StocksExcludingSPR',
                    'NonCommercialLong', 'NonCommercialShort',
                    'CommercialLong', 'CommercialShort', 'OpenInterest')
# Full model using R-squared
model_r2 <- olr(dataset, responseName, predictorNames, adjr2 = FALSE)
## Returning model with max R-squared.
##
## Call:
## lm(formula = CrudeOil ~ RigCount + API + FieldProduction + RefinerNetInput +
## OperableCapacity + Imports + StocksExcludingSPR + NonCommercialLong +
## NonCommercialShort + CommercialLong + CommercialShort + OpenInterest,
## data = dataset)
##
## Coefficients:
## (Intercept) RigCount API FieldProduction
## 0.0068578950 -0.3551354134 0.0004393875 0.2670366950
## RefinerNetInput OperableCapacity Imports StocksExcludingSPR
## 0.3535677365 0.0030449534 -0.1034192549 0.7417144521
## NonCommercialLong NonCommercialShort CommercialLong CommercialShort
## -0.5643353759 0.0207113857 -1.3007001952 1.8508558043
## OpenInterest
## -0.0409690597
# Adjusted R-squared model
model_adjr2 <- olr(dataset, responseName, predictorNames, adjr2 = TRUE)
## Returning model with max adjusted R-squared.
##
## Call:
## lm(formula = CrudeOil ~ RigCount + RefinerNetInput + Imports +
## StocksExcludingSPR + NonCommercialLong + CommercialLong +
## CommercialShort, data = dataset)
##
## Coefficients:
## (Intercept) RigCount RefinerNetInput Imports
## 0.008256759 -0.380836990 0.322995592 -0.102405212
## StocksExcludingSPR NonCommercialLong CommercialLong CommercialShort
## 0.694028117 -0.528991035 -1.219766893 1.676484528
# Actual values
actual <- dataset[[responseName]]
fitted_r2 <- model_r2$fitted.values
fitted_adjr2 <- model_adjr2$fitted.values
# Data frames for ggplot
plot_data <- data.frame(
  Index = seq_along(actual),
  Actual = actual,
  R2_Fitted = fitted_r2,
  AdjR2_Fitted = fitted_adjr2
)
# Plot both fits (requires ggplot2)
library(ggplot2)
# linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
# Full model (R-squared)
ggplot(plot_data, aes(x = Index)) +
  geom_line(aes(y = Actual), color = "black", linewidth = 1, linetype = "dashed") +
  geom_line(aes(y = R2_Fitted), color = "steelblue", linewidth = 1) +
  labs(
    title = "Full Model (R-squared): Actual vs Fitted Values",
    subtitle = "Observation Index used in place of dates (parsed from original dataset)",
    x = "Observation Index",
    y = "CrudeOil % Change"
  ) +
  theme_minimal()
# Optimal model (adjusted R-squared)
ggplot(plot_data, aes(x = Index)) +
  geom_line(aes(y = Actual), color = "black", linewidth = 1, linetype = "dashed") +
  geom_line(aes(y = AdjR2_Fitted), color = "limegreen", linewidth = 1.1) +
  labs(
    title = "Optimal Model (Adjusted R-squared): Actual vs Fitted Values",
    subtitle = "Observation Index used in place of dates (parsed from original dataset)",
    x = "Observation Index",
    y = "CrudeOil % Change"
  ) +
  theme_minimal() +
  theme(plot.background = element_rect(color = "limegreen", linewidth = 2))

| Metric | adjr2 = FALSE (All 12 Predictors) | adjr2 = TRUE (Best Subset of 7 Predictors) |
|---|---|---|
| Adjusted R-squared | 0.6145 | 0.6531 ✅ (higher is better) |
| Multiple R-squared | 0.7018 | 0.699 |
| Residual Std. Error | 0.02388 | 0.02265 ✅ (lower is better) |
| F-statistic (p-value) | 8.042 (1.88e-07) | 15.26 (3.99e-10) ✅ (stronger model) |
| Model Complexity | 12 predictors | 7 predictors ✅ (simpler, more robust) |
| Significant Coeffs | 4 | 6 ✅ (more signal, less noise) |
| R² Difference | — | ~0.003 ❗ (negligible) |
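
The metrics in the table above are the kind of values that `summary()` reports for an `lm` fit. As a hedged sketch, a small helper can collect them into one vector. The name `model_metrics` is hypothetical, and the 5% threshold and the exclusion of the intercept when counting significant coefficients are assumptions; the source does not say how that row was counted.

```r
# Hypothetical helper (not part of olr): summarise an lm fit in one vector.
model_metrics <- function(model) {
  s <- summary(model)
  f <- s$fstatistic  # named vector: value, numdf, dendf
  coefs <- s$coefficients
  # Assumption: count predictors significant at the 5% level, intercept excluded
  pvals <- coefs[rownames(coefs) != "(Intercept)", "Pr(>|t|)"]
  c(
    adj_r_squared = s$adj.r.squared,
    r_squared     = s$r.squared,
    residual_se   = s$sigma,
    f_statistic   = unname(f["value"]),
    f_p_value     = unname(pf(f["value"], f["numdf"], f["dendf"],
                              lower.tail = FALSE)),
    n_predictors  = unname(f["numdf"]),
    n_significant = sum(pvals < 0.05)
  )
}

# Example on a built-in dataset, used here only for illustration
model_metrics(lm(mpg ~ wt + hp, data = mtcars))
```

With the two models fitted earlier, `sapply(list(full = model_r2, best = model_adjr2), model_metrics)` would return a matrix with one column per model, which makes a side-by-side comparison easier to reproduce.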
- The olr() function automates model selection by testing every valid predictor combination.
- Use adjr2 = TRUE to prioritize models that balance accuracy and parsimony.
- The adjusted R² model outperformed the full model on:
  - Adjusted R²
  - F-statistic
  - Residual error
  - Model simplicity
  - Number of significant coefficients
👉 Use adjusted R² (adjr2 = TRUE) in practice to
reduce the risk of overfitting and keep the model interpretable.
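
For reference, the standard definition of adjusted R² shows why it penalizes extra predictors. The symbols $n$ (number of observations) and $p$ (number of predictors) are introduced here for illustration and do not appear in the original text:

$$
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
$$

As a rough check, assuming $n = 54$ (a value inferred from the reported R² and F-statistic, not stated in the source), the full model gives $1 - (1 - 0.7018)\cdot\frac{53}{41} \approx 0.6145$ and the 7-predictor model gives $1 - (1 - 0.699)\cdot\frac{53}{46} \approx 0.653$, in line with the comparison table. The small drop in R² is more than offset by the smaller penalty for 7 predictors instead of 12.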
Created by Mathew Fok • Author of the olr
package
Contact:
quiksilver67213@yahoo.com