The goal of LogisticEnsembles is to perform a complete analysis of logistic data. The package automatically returns 36 models (23 individual and 13 ensembles of models).
You can install the development version of LogisticEnsembles like so:
devtools::install_github("InfiniteCuriosity/LogisticEnsembles")
This is a basic example which shows you how to solve a common problem:
library(LogisticEnsembles)
Logistic(data = LogisticEnsembles::Lebron,
         colnum = 6,
         numresamples = 25,
         remove_VIF_greater_than = 4.00,
         remove_ensemble_correlations_greater_than = 0.98,
         save_all_trained_models = "N",
         save_all_plots = "N",
         how_to_handle_strings = 0,
         do_you_have_new_data = "N",
         use_parallel = "Y",
         train_amount = 0.60,
         test_amount = 0.20,
         validation_amount = 0.20)
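The `train_amount`, `test_amount`, and `validation_amount` arguments in the call above describe a 60/20/20 split of the rows. As a rough illustration of what such a split looks like, here is a minimal base R sketch. It is not the package's internal code: the data frame `df` and the shuffling approach are assumptions made for the example.

```r
set.seed(42)  # reproducible shuffle

# Hypothetical data frame standing in for the user's data
df <- data.frame(x = rnorm(100), y = rbinom(100, 1, 0.5))

train_amount <- 0.60
test_amount <- 0.20
validation_amount <- 0.20

n <- nrow(df)
idx <- sample(n)                      # shuffled row indices
n_train <- round(train_amount * n)    # rows for training
n_test <- round(test_amount * n)      # rows for testing

# Consecutive, non-overlapping slices of the shuffled indices
train <- df[idx[1:n_train], ]
test <- df[idx[(n_train + 1):(n_train + n_test)], ]
validation <- df[idx[(n_train + n_test + 1):n], ]  # whatever remains
```

Slicing one shuffled index vector keeps the three sets disjoint, so no row is used both to fit a model and to score it.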
Each of the 33 models returns a probability between 0 and 1. Each model is fit on the training set, then makes predictions, and its accuracy is measured on the test and validation sets.
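To make the fit, predict, and measure cycle concrete, the sketch below uses base R's `glm()` as a stand-in for a single model. The simulated data, the 0.5 cutoff, and the variable names are assumptions for illustration, and the package may fit and score its models differently.

```r
set.seed(1)

# Simulated data with a binary target (hypothetical, for illustration only)
n <- 300
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- rbinom(n, 1, plogis(0.8 * sim$x1 - 0.5 * sim$x2))

train <- sim[1:180, ]    # 60% of the rows for fitting
test <- sim[181:240, ]   # 20% of the rows for scoring

# Fit a logistic regression on the training set
fit <- glm(y ~ x1 + x2, data = train, family = binomial)

# Predicted probabilities between 0 and 1 on the held-out test set
test_prob <- predict(fit, newdata = test, type = "response")

# Convert probabilities to 0/1 classes with an assumed 0.5 cutoff
test_pred <- as.integer(test_prob > 0.5)

# Accuracy: share of test rows where the predicted class matches the target
test_accuracy <- mean(test_pred == test$y)
```

The same steps would be repeated on the validation rows to get a second holdout estimate.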
The list of 33 models:
The 25 plots automatically created by the package are:
1. Correlation of the data as numbers and colors
2. Correlation of the data as colors and circles
3. 33 ROC curves (specificity vs sensitivity), including ROC value
4. Accuracy by model, fixed scales
5. Accuracy data including train and holdout results
6. Model accuracy barchart
7. Overfitting plot by model and resample
8. Duration barchart
9. Over or underfitting barchart
10. Boxplots of the numeric data
11. Barchart of target (0 or 1) vs target
12. True positive rate by model, fixed scales
13. True positive rate by model, free scales
14. True negative rate by model, fixed scales
15. True negative rate by model, free scales
16. False positive rate by model, fixed scales
17. False positive rate by model, free scales
18. False negative rate by model, fixed scales
19. False negative rate by model, free scales
20. F1 score by model, fixed scales
21. F1 score by model, free scales
22. Positive predictive value by model, fixed scales
23. Positive predictive value by model, free scales
24. Negative predictive value by model, fixed scales
25. Negative predictive value by model, free scales
The tables and reports automatically created:
1. Summary report. This includes the Model, Accuracy, True Positive, True Negative, False Positive, False Negative, Positive Predictive Value, Negative Predictive Value, F1 score, Area under the curve, overfitting min, overfitting mean, overfitting max, and duration.
2. Data summary
3. Head of the ensemble
4. Correlation of the ensemble
5. Variance inflation factor
6. Correlation of the data
7. Head of the data frame
The package also returns all 33 summary confusion matrices, alphabetical by model. When resampling is used, the counts are summed across all resamples, so any error is visible. For example, for the Lebron data:
Summary_tables$`Random Forest`
                     y_test
rf_test_probabilities    0    1
                    0 7586    0
                    1    0 7779
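The sketch below illustrates the two ideas in this section: summing per-resample confusion matrices, and deriving the summary report rates from a summed matrix. The two small per-resample tables are hypothetical. The second matrix reuses the Random Forest counts shown above, assuming rows are predictions and columns are the true `y_test` values. The variable names are illustrative and are not the package's API.

```r
# Helper to build a 2x2 confusion matrix (rows = predicted, columns = actual)
make_cm <- function(values) {
  matrix(values, nrow = 2, byrow = TRUE,
         dimnames = list(predicted = c("0", "1"), actual = c("0", "1")))
}

# Two hypothetical per-resample tables, summed element by element
resample_tables <- list(make_cm(c(50, 2, 3, 45)), make_cm(c(48, 4, 1, 47)))
summed <- Reduce(`+`, resample_tables)

# The Random Forest counts reported above for the Lebron data
lebron <- make_cm(c(7586, 0, 0, 7779))

tn <- lebron["0", "0"]  # predicted 0, actual 0
fn <- lebron["0", "1"]  # predicted 0, actual 1
fp <- lebron["1", "0"]  # predicted 1, actual 0
tp <- lebron["1", "1"]  # predicted 1, actual 1

accuracy <- (tp + tn) / sum(lebron)
tpr <- tp / (tp + fn)   # true positive rate
tnr <- tn / (tn + fp)   # true negative rate
ppv <- tp / (tp + fp)   # positive predictive value
npv <- tn / (tn + fn)   # negative predictive value
f1 <- 2 * ppv * tpr / (ppv + tpr)
```

Because both off-diagonal counts in the Lebron matrix are 0, every rate computed here evaluates to 1. A single misclassification in any resample would appear as a nonzero off-diagonal count in the summed table.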