- Updates to `nestcv.glmnet`, `nestcv.train`, `outercv` and `repeatcv` (thanks to Ryan Thompson for useful code for `repeatcv`).
- `multicore_fork` has been removed.
- New functions `pred_nestcv_glmnet_class()` and `pred_train_class()` (thanks to SamGG).
- Fix for `inner_folds` in `nestcv.train()` (thanks to Ryan Thompson).
- `repeatcv` updated to enable return of fitted models from the outer CV for variable importance or SHAP value calculation.
- R-squared for regression models is no longer based on the `defaultSummary()` function from caret, which uses the square of the Pearson correlation coefficient instead of the correct coefficient of determination, calculated as 1 - RSS/TSS, where RSS = residual sum of squares and TSS = total sum of squares. The correct formula for R-squared is now applied (see the sketch below).
- Fixed handling when `x` is a single predictor.
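The distinction matters because squared Pearson correlation ignores systematic bias in the predictions. A minimal base-R illustration with made-up numbers (not code from the package):

```r
# Hypothetical example: observed values and biased predictions
obs  <- c(2.1, 3.4, 5.0, 6.2, 8.1)
pred <- c(4.5, 5.0, 6.2, 7.8, 9.5)   # well correlated but shifted upwards

rss <- sum((obs - pred)^2)           # residual sum of squares
tss <- sum((obs - mean(obs))^2)      # total sum of squares

1 - rss / tss                        # coefficient of determination (~0.35 here)
cor(obs, pred)^2                     # squared Pearson r: misleadingly high (~0.99)
```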
- New function `prc()` which enables easy building of precision-recall curves from nestedcv models and `repeatcv()` results.
- Added a `predict` method for `cva.glmnet`.
- The magrittr pipe is no longer re-exported; the native pipe `|>` can be used instead.
- New function `metrics()` which gives additional performance metrics for binary classification models such as the F1 score, Matthews correlation coefficient and precision-recall AUC (see the sketch below).
- New function `pls_filter()` which uses partial least squares regression to filter features.
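A sketch of how the new `metrics()` and `prc()` functions might be used; the data are synthetic and the exact call signatures and return types should be checked against the package documentation:

```r
library(nestedcv)

## Synthetic binary-classification data (illustration only)
set.seed(1)
x <- matrix(rnorm(200 * 20), 200, 20)
colnames(x) <- paste0("var", 1:20)
y <- factor(rbinom(200, 1, plogis(x[, 1])))

fit <- nestcv.glmnet(y, x, family = "binomial")

metrics(fit)     # F1 score, Matthews correlation coefficient, PR-AUC, ...
pr <- prc(fit)   # precision-recall curve from the outer CV predictions
plot(pr)         # assumes the returned object has a plot method
```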
- Performance improvements to `repeatcv()`, leading to a significant improvement in speed.
- Parallelisation change in `nestcv.train()`: if argument `cv.cores > 1`, OpenMP multithreading is now disabled, which prevents the caret models `xgbTree` and `xgbLinear` from crashing, and allows them to be parallelised efficiently over the outer CV loops (see the sketch below).
- Improvements to `var_stability()` and its plots.
- Fixes to `nestcv.glmnet()`.
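A minimal sketch of parallelised nested CV; it assumes `nestcv.train()` forwards the caret model name via `method`, as in caret itself:

```r
library(nestedcv)

set.seed(1)
x <- matrix(rnorm(200 * 10), 200, 10)
colnames(x) <- paste0("var", 1:10)
y <- factor(rbinom(200, 1, plogis(x[, 1])))

## With cv.cores > 1, OpenMP multithreading is disabled internally, so
## xgbTree is parallelised over the outer CV folds rather than crashing.
fit <- nestcv.train(y, x,
                    method = "xgbTree",
                    cv.cores = 4)
```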
- New function `repeatcv()` to apply repeated nested CV to the main nestedcv model functions for robust measurement of model performance.
- Added `modifyX` argument to all nestedcv models. This allows more powerful manipulation of the predictors, such as scaling, imputing missing values or adding extra columns through variable manipulations. Importantly, these are applied to train and test input data separately (see the sketch below).
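A hedged sketch of `modifyX`; the exact interface (for example, how transformations fitted on training data are re-applied to test data) may differ from what is shown here:

```r
library(nestedcv)

set.seed(1)
x <- matrix(rnorm(200 * 10), 200, 10)
colnames(x) <- paste0("var", 1:10)
y <- factor(rbinom(200, 1, plogis(x[, 1])))

## Hypothetical predictor-modifying function: centre and scale columns.
## nestedcv applies the modification to train and test data separately,
## so test-fold information does not leak into training.
scale_cols <- function(x) scale(x)

fit <- nestcv.glmnet(y, x,
                     family = "binomial",
                     modifyX = scale_cols)
```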
- Added a `predict()` function for `nestcv.SuperLearner()`.
- Added `pred_SuperLearner` wrapper for use with `fastshap::explain`.
- Fix for `nestcv.SuperLearner()` on Windows.
- Fixes to `nestcv.glmnet()`.
- Added argument `verbose` in `nestcv.train()`, `nestcv.glmnet()` and `outercv()` to show progress.
- Added argument `multicore_fork` in `nestcv.train()` and `outercv()` to allow a choice of parallelisation between forked multicore processing using `mclapply` or non-forked using `parLapply`. This can help prevent errors with certain multithreaded caret models, e.g. `model = "xgbTree"` (see the sketch below).
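A sketch of the non-forked option (note that, per the entry near the top of this page, `multicore_fork` was removed in a later release):

```r
library(nestedcv)

set.seed(1)
x <- matrix(rnorm(200 * 10), 200, 10)
colnames(x) <- paste0("var", 1:10)
y <- factor(rbinom(200, 1, plogis(x[, 1])))

## Non-forked cluster workers (parLapply) instead of forked mclapply,
## which can crash with multithreaded models such as xgbTree.
fit <- nestcv.train(y, x,
                    method = "xgbTree",
                    cv.cores = 4,
                    multicore_fork = FALSE)
```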
- `one_hot()` changed the `all_levels` argument default to `FALSE` to be compatible with regression models by default.
- Added full results table to `lm_filter()`.
- Fixed bug in `lm_filter()` where variables with zero variance were incorrectly reporting very low p-values in linear models instead of returning `NA`. This is due to how rank-deficient models are handled by `RcppEigen::fastLmPure`. The default method for `fastLmPure` has been changed to 0 to allow detection of rank-deficient models (see the sketch below).
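The underlying problem can be reproduced in base R: a zero-variance predictor makes the design matrix rank deficient, and `lm()` reports `NA` for its coefficient, which is the behaviour the filter now mirrors:

```r
set.seed(1)
x1 <- rnorm(20)
x2 <- rep(1, 20)          # zero-variance predictor
y  <- x1 + rnorm(20)

coef(lm(y ~ x1 + x2))     # coefficient for x2 is NA (rank deficient)
```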
- Fixed bug in `weight()` caused by `NA`. Allow `weight()` to tolerate character vectors.
- `keep_factors` option has been added to filters to control filtering of factors with 3 or more levels.
- New function `one_hot()` for fast one-hot encoding of factors and character columns by creating dummy variables (see the sketch below).
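A sketch of `one_hot()`; the exact output format (matrix versus dataframe) is an assumption:

```r
library(nestedcv)

## Hypothetical dataframe with mixed column types
df <- data.frame(age   = c(21, 35, 58, 44),
                 sex   = c("M", "F", "F", "M"),
                 group = factor(c("a", "b", "c", "a")))

x <- one_hot(df)   # factor/character columns become dummy variables
head(x)
```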
- New function `stat_filter()` which applies univariate filtering to dataframes with mixed datatypes (continuous and categorical combined).
- Switched `anova_filter()` from `Rfast::ftests()` to `matrixTests::col_oneway_welch()` for much better accuracy.
- Improvement to `nestcv.train()` (Matt Siggins suggestion).
- Added `n_inner_folds` argument to `nestcv.train()` to make it easier to set the number of inner CV folds, and `inner_folds` argument which enables setting the inner CV fold indices directly (suggestion Aline Wildberger).
- Fixed bug in `plot_shap_beeswarm()` caused by the change in fastshap 0.1.0 output from tibble to matrix.
- Fixes to `nestcv.train()`.
- Added `pass_outer_folds` to both `nestcv.glmnet` and `nestcv.train`: this enables passing of outer CV fold indices stored in `outer_folds` to the final round of CV. Note this can only work if `n_outer_folds` equals the number of inner CV folds and balancing is not applied, so that `y` is a consistent length (see the sketch below).
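A hedged sketch of `pass_outer_folds`; apart from the argument names mentioned in the entry above, details are assumptions:

```r
library(nestedcv)

set.seed(1)
x <- matrix(rnorm(200 * 10), 200, 10)
colnames(x) <- paste0("var", 1:10)
y <- factor(rbinom(200, 1, plogis(x[, 1])))

## Reuse outer CV fold indices in the final round of CV. This requires
## n_outer_folds to equal the number of inner folds, with no balancing.
fit <- nestcv.glmnet(y, x,
                     family = "binomial",
                     n_outer_folds = 10,
                     n_inner_folds = 10,
                     pass_outer_folds = TRUE)
```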
- `nfolds` for the final CV now equals `n_inner_folds` in `nestcv.glmnet()`.
- Improved `plot_var_stability()` to be more user friendly.
- Added `top` argument to SHAP plots.
- Support for `fastshap` for calculating SHAP values.
- Added `force_vars` argument to `glmnet_filter()`.
- New function `ranger_filter()`.
- Suppressed console output in `nestcv.train()` from models such as `gbm`. This fixes a multicore bug when using the standard R GUI on Mac/Linux.
- Fixed case where a `nestcv.glmnet()` model has 0 or 1 coefficients.
- nestedcv models now return `xsub` containing a subset of the predictor matrix `x` with filtered variables across outer folds and the final fit.
- `boxplot_model()` no longer needs the predictor matrix to be specified as it is contained in `xsub` in nestedcv models.
- `boxplot_model()` now works for all nestedcv model types.
- New function `var_stability()` to assess variance and stability of variable importance across outer folds, and directionality for binary outcomes.
- New function `plot_var_stability()` to plot variable stability across outer folds.
- Added `finalCV = NA` option which skips fitting the final model completely. This gives a useful speed boost if performance metrics are all that is needed (see the sketch below).
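A minimal sketch of the speed-up option:

```r
library(nestedcv)

set.seed(1)
x <- matrix(rnorm(200 * 10), 200, 10)
colnames(x) <- paste0("var", 1:10)
y <- factor(rbinom(200, 1, plogis(x[, 1])))

## Skip fitting the final model; only the nested CV performance metrics
## are computed, which is faster when benchmarking filters or models.
fit <- nestcv.glmnet(y, x,
                     family = "binomial",
                     finalCV = NA)
summary(fit)   # outer CV performance is still reported
```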
- The `model` argument in `outercv` now prefers a character value instead of a function for the model to be fitted.
- Fixes to `outercv`.
- Improved error handling in `nestcv.train`, which improves error detection in caret, so `nestcv.train` can be run in multicore mode straight away.
- Fixes to `nestcv.glmnet`.
- Added `outer_train_predict` argument to enable saving of predictions on outer training folds.
- Added `train_preds` to obtain outer training fold predictions.
- Added `train_summary` to show performance metrics on outer training folds.
- Fix to `smote()`.
- Support for the SuperLearner package.
- Fixes to `nestcv.train` and `nestcv.glmnet`.
- Fixed `nestcv.train` for caret models with tuning parameters which are factors.
- Fixed `nestcv.train` for caret models using regression.
- Added option in `nestcv.train` and `nestcv.glmnet` to tune final model parameters using a final round of CV on the whole dataset.
- Fixes to `nestcv.train` and `outercv`.
- New function `randomsample()` to handle class imbalance using random over/undersampling.
- New function `smote()` implementing the SMOTE algorithm for increasing minority class data (see the sketch below).
- New function `boot_ttest()`.
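A hedged sketch of the class-imbalance helpers; the exact signatures and return values of `randomsample()` and `smote()` are assumptions:

```r
library(nestedcv)

## Synthetic imbalanced data (illustration only)
set.seed(1)
x <- matrix(rnorm(200 * 10), 200, 10)
colnames(x) <- paste0("var", 1:10)
y <- factor(rbinom(200, 1, 0.15))   # minority class ~15%

bal <- randomsample(y, x)   # random over/undersampling
sm  <- smote(y, x)          # synthetic minority oversampling (SMOTE)
```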
- The final lambda in `nestcv.glmnet()` is the mean of the best lambdas on the log scale.
- New function `plot_varImp` for plotting variable importance for `nestcv.glmnet` final models.
- Fixes to `nestcv.glmnet()`.
- New function `cva.glmnet()`.
- New plot method `plot.cva.glmnet`.
- Added `alphaSet` in `plot.cva.glmnet`.
- Support for the `train` function of caret.
- Passing arguments to `filterFUN` is no longer done through `...` but with a list of arguments passed through a new argument, `filter_options` (see the sketch below).
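A sketch of the `filter_options` mechanism using the package's `ttest_filter`; the `nfilter` option is assumed from the filter documentation:

```r
library(nestedcv)

set.seed(1)
x <- matrix(rnorm(150 * 100), 150, 100)
colnames(x) <- paste0("var", 1:100)
y <- factor(rbinom(150, 1, plogis(x[, 1])))

## Filter arguments go in a list via filter_options, not through `...`:
fit <- nestcv.glmnet(y, x,
                     family = "binomial",
                     filterFUN = ttest_filter,
                     filter_options = list(nfilter = 20))
```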