collinear 3.0.0

Breaking Changes

API Changes

Renamed Functions

Old Name (v2.0)                      New Name (v3.0)
identify_predictors()                Split into identify_valid_variables(), identify_numeric_variables(), identify_categorical_variables(), identify_logical_variables()
identify_predictors_categorical()    identify_categorical_variables()
identify_predictors_numeric()        identify_numeric_variables()
identify_predictors_zero_variance()  identify_zero_variance_variables()
identify_predictors_type()           Removed (merged into identify_valid_variables())

Renamed f_ Functions for Preference Order

Old Name (v2.0)       New Name (v3.0)
f_r2_glm_gaussian()   f_numeric_glm()
f_r2_gam_gaussian()   f_numeric_gam()
f_r2_rf()             f_numeric_rf()
f_r2_glm_poisson()    f_count_glm()
f_r2_gam_poisson()    f_count_gam()
f_auc_glm_binomial()  f_binomial_glm()
f_auc_gam_binomial()  f_binomial_gam()
f_auc_rf_binomial()   f_binomial_rf()
f_v_rf()              f_categorical_rf()
(none)                f_count_rf() (new)
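
For scripts written against v2.0, the one-to-one renames listed in the two tables above can be applied mechanically. The sketch below is not part of the package: `renames` and `migrate_calls()` are hypothetical helpers that use only base R, and `df` in the sample input is a placeholder. `identify_predictors()` and `identify_predictors_type()` are left out on purpose because they have no single replacement and need a manual review.

```r
# Hypothetical lookup table: v2.0 names (vector names) -> v3.0 names (values).
# Only the one-to-one renames from the tables above are included.
renames <- c(
  identify_predictors_categorical   = "identify_categorical_variables",
  identify_predictors_numeric       = "identify_numeric_variables",
  identify_predictors_zero_variance = "identify_zero_variance_variables",
  f_r2_glm_gaussian  = "f_numeric_glm",
  f_r2_gam_gaussian  = "f_numeric_gam",
  f_r2_rf            = "f_numeric_rf",
  f_r2_glm_poisson   = "f_count_glm",
  f_r2_gam_poisson   = "f_count_gam",
  f_auc_glm_binomial = "f_binomial_glm",
  f_auc_gam_binomial = "f_binomial_gam",
  f_auc_rf_binomial  = "f_binomial_rf",
  f_v_rf             = "f_categorical_rf"
)

# Replace whole-word occurrences of old names in a character vector of code.
# "\\b" keeps a shorter name from matching inside a longer identifier.
migrate_calls <- function(code, map = renames) {
  for (old in names(map)) {
    pattern <- paste0("\\b", old, "\\b")
    code <- gsub(pattern, map[[old]], code, perl = TRUE)
  }
  code
}

old_code <- c(
  "f = f_r2_rf",
  "x <- identify_predictors_numeric(df)"
)

migrate_calls(old_code)
```

A text substitution like this only renames identifiers. Arguments or return values that changed between versions still have to be checked against the v3.0 documentation.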

Major New Features

Adaptive Multicollinearity Thresholds

When both max_cor = NULL and max_vif = NULL, the function now automatically determines optimal filtering thresholds using:

This data-driven approach adapts to each dataset’s correlation structure, preventing over-filtering while maintaining statistically meaningful bounds.
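
As background on why a correlation threshold and a VIF threshold can be related at all, the base R sketch below computes variance inflation factors from a correlation matrix. It does not reproduce the package's adaptive threshold selection. The helper names `vif_from_cor()` and `vif_two_predictors()` are assumptions made for illustration. With exactly two predictors the VIF reduces to 1 / (1 - r^2); with more predictors the relationship depends on the whole correlation structure.

```r
# VIF of each predictor: the diagonal of the inverse of the
# predictors' correlation matrix.
vif_from_cor <- function(cor_matrix) {
  diag(solve(cor_matrix))
}

# Two-predictor special case: VIF = 1 / (1 - r^2).
vif_two_predictors <- function(r) {
  1 / (1 - r^2)
}

# Toy correlation matrix for two predictors "a" and "b".
r <- 0.75
m <- matrix(
  c(1, r, r, 1),
  nrow = 2,
  dimnames = list(c("a", "b"), c("a", "b"))
)

vif_from_cor(m)        # both predictors: about 2.29
vif_two_predictors(r)  # same value from the closed form
```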

Tidymodels Integration

Cross-Validation Support in Preference Order

Rich Output Structure

collinear() now returns comprehensive results including:

S3 methods print() and summary() for collinear_output and collinear_selection classes provide clean output formatting.

Correlation Matrix Improvements


New Functions

Multicollinearity Assessment

Preference Order

S3 Methods

New Datasets and Models

Name                            Description
experiment_adaptive_thresholds  Validation experiment results (10,000 iterations)
experiment_cor_vs_vif           Correlation vs VIF equivalence experiment results
gam_cor_to_vif                  Fitted GAM for mapping max_cor to max_vif
prediction_cor_to_vif           Look-up table for threshold equivalence
toy                             Simple dataset illustrating multicollinearity concepts
vi_smol                         Smaller version of vi dataset (610 rows) for faster examples
vi_responses                    Character vector of response variable names

Improvements

VIF Computation

Validation

Documentation


Bug Fixes


Deprecated


collinear 2.0.0

Main Improvements

  1. Expanded Functionality: Functions collinear() and preference_order() support both categorical and numeric responses and predictors, and can handle several responses at once.

  2. Robust Selection Algorithms: Enhanced selection in vif_select() and cor_select().

  3. Enhanced Functionality to Rank Predictors: New functions to compute association between response and predictors covering most use-cases, and automated function selection depending on data features.

  4. Simplified Target Encoding: Streamlined and parallelized for better efficiency; the new default method is "loo" (leave-one-out).

  5. Parallelization and Progress Bars: Utilizes future and progressr for enhanced performance and user experience.
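
To illustrate the "loo" (leave-one-out) idea mentioned in item 4, here is a minimal base R sketch. It is not the package's implementation: `loo_encode()` is a hypothetical helper, and the fallback to the global mean for categories with a single row is an assumption made here. Each row is encoded as the mean response of the other rows in the same category, so a row's own response never contributes to its encoded value.

```r
# Leave-one-out target encoding of a categorical predictor.
# response: numeric vector; category: character or factor of equal length.
loo_encode <- function(response, category) {
  group_sum <- ave(response, category, FUN = sum)
  group_n   <- ave(response, category, FUN = length)

  # Mean of the other rows in the same category.
  encoded <- (group_sum - response) / (group_n - 1)

  # Assumption: single-row categories fall back to the global mean.
  encoded[group_n == 1] <- mean(response)
  encoded
}

response <- c(1, 2, 3, 10, 20)
category <- c("a", "a", "a", "b", "b")

loo_encode(response, category)
# 2.5 2.0 1.5 20.0 10.0
```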


collinear 1.1.1
