The goal of NumericEnsembles is to automatically conduct a thorough analysis of numeric data. The user only needs to provide the data and answer a few questions (such as which column to analyze). NumericEnsembles fits 23 individual models to the training data, then makes predictions and checks accuracy for each of them. It also builds 17 ensembles from the 23 individual models, fits each ensemble model to the training data, then makes predictions and tracks accuracy for each ensemble, for 40 models in total. The package also automatically returns 26 plots (such as train vs holdout for the best model), 6 tables (such as the head of the data), and a grand summary table sorted by accuracy, with the best model at the top of the report.
You can install the development version of NumericEnsembles like so:
devtools::install_github("InfiniteCuriosity/NumericEnsembles")
In this example, NumericEnsembles automatically builds 40 models to predict the median value of houses in Boston (column 14, medv) from the Boston housing data set.
library(NumericEnsembles)
Numeric(data = MASS::Boston,
        colnum = 14,
        numresamples = 25,
        how_to_handle_strings = 0,
        do_you_have_new_data = "N",
        save_all_trained_models = "N",
        remove_ensemble_correlations_greater_than = 1.00,
        use_parallel = "Y",
        train_amount = 0.60,
        test_amount = 0.20,
        validation_amount = 0.20
)
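To make the train_amount, test_amount, and validation_amount arguments more concrete, here is a minimal base-R sketch of a 60/20/20 split followed by one model fit and an accuracy check. It is not the package's internal code: the seed, the choice of a plain linear model, and the use of RMSE as the accuracy measure are all assumptions made for illustration.

```r
# Illustrative sketch only; this is not NumericEnsembles' internal code.
set.seed(123)                       # arbitrary seed so the split is repeatable
df <- MASS::Boston                  # same data as the example above
n  <- nrow(df)

# Shuffle the row indices, then cut them into 60% / 20% / 20%
shuffled <- sample(n)
n_train  <- floor(0.60 * n)
n_test   <- floor(0.20 * n)

train      <- df[shuffled[seq_len(n_train)], ]
test       <- df[shuffled[n_train + seq_len(n_test)], ]
validation <- df[shuffled[(n_train + n_test + 1):n], ]

# Fit one simple model (ordinary least squares) on the training rows.
# Column 14 of MASS::Boston is medv, the target selected by colnum = 14.
fit <- lm(medv ~ ., data = train)

# Root mean squared error, used here as an assumed accuracy measure
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

rmse_test       <- rmse(test$medv, predict(fit, newdata = test))
rmse_validation <- rmse(validation$medv, predict(fit, newdata = validation))
print(c(test = rmse_test, validation = rmse_validation))
```

The package repeats this kind of fit-and-score step for every model and, with numresamples = 25, over repeated resamples, so the summary table reflects more than a single split.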
The 40 models which are all built automatically and without error are:
The 26 plots created automatically:
The tables created automatically are:
The NumericEnsembles package can also save its trained models and apply those same models, without retraining, to totally unseen data.
The package contains two example data sets to demonstrate this. Boston_housing is the Boston Housing data set with the first five rows removed; we will build our models on that data set. NewBoston is the totally new data: it consists of the first five rows of the original Boston Housing data set, which the models never see during training.
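The relationship between the two data sets can be sketched in base R as follows. This assumes the packaged data sets were derived from MASS::Boston in this way; the packaged copies may differ in column names or order, and the variable names below are illustrative only.

```r
# Assumption: the two example data sets relate to MASS::Boston like this.
full <- MASS::Boston

new_rows      <- full[1:5, ]      # plays the role of NewBoston (held back)
training_rows <- full[-(1:5), ]   # plays the role of the training data set

# No row appears in both, so the held-back rows are genuinely unseen
stopifnot(length(intersect(rownames(new_rows), rownames(training_rows))) == 0)

dim(training_rows)  # 501 rows, 14 columns
dim(new_rows)       # 5 rows, 14 columns
```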
library(NumericEnsembles)
Numeric(data = Boston_housing,
colnum = 14,
numresamples = 25,
how_to_handle_strings = 0,
do_you_have_new_data = "Y",
save_all_trained_models = "Y",
remove_ensemble_correlations_greater_than = 1.00,
use_parallel = "Y",
train_amount = 0.60,
test_amount = 0.20,
validation_amount = 0.20
)
When prompted with “What is the URL of the new data?”, supply the NewBoston data set. Its URL is: https://raw.githubusercontent.com/InfiniteCuriosity/EnsemblesData/refs/heads/main/NewBoston.csv
External data may be used to accomplish the same result.
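Before supplying external data, it can help to confirm that it has the same columns as the data the models were trained on. The sketch below assumes that requirement (it is not stated by the package), and check_new_data is a hypothetical helper written for this example, not a NumericEnsembles function.

```r
# Hypothetical helper, not part of NumericEnsembles: stop if the new data's
# columns differ from the training data's, else return them in training order.
check_new_data <- function(training, new_data) {
  missing_cols <- setdiff(names(training), names(new_data))
  extra_cols   <- setdiff(names(new_data), names(training))
  if (length(missing_cols) > 0 || length(extra_cols) > 0) {
    stop("Column mismatch. Missing: ", paste(missing_cols, collapse = ", "),
         "; unexpected: ", paste(extra_cols, collapse = ", "))
  }
  new_data[, names(training), drop = FALSE]
}

# Stand-in for the training data (assumed to mirror MASS::Boston minus 5 rows)
training <- MASS::Boston[-(1:5), ]

# Stand-in for an external CSV file: write five rows out, then read them back
csv_path <- tempfile(fileext = ".csv")
write.csv(MASS::Boston[1:5, ], csv_path, row.names = FALSE)
external <- read.csv(csv_path)

checked <- check_new_data(training, external)
str(checked)
```

Note that row.names = FALSE keeps write.csv from adding an extra index column, which would otherwise show up as an unexpected column in the check.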