Author: Maciej Nasinski
Check the miceFast website for more details
miceFast provides fast methods for imputing missing
data, leveraging an object-oriented programming paradigm and optimized
linear algebra routines.
The package includes convenient helper functions compatible with
data.table, dplyr, and other popular R
packages.
Major speed improvements occur when:
- Using a grouping variable, where the data is automatically sorted by group, significantly reducing computation time.
- Performing multiple imputations, by evaluating the underlying quantitative model only once for multiple draws.
- Running Predictive Mean Matching (PMM), thanks to presorting and binary search.
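The presorting and binary search idea behind fast PMM can be sketched in base R. This is only an illustration under stated assumptions, not miceFast's internal code: `pmm_sketch()` and its arguments are hypothetical names, and the predictions are assumed to come from a model fitted beforehand.

```r
# Illustrative PMM sketch in base R (NOT miceFast's implementation).
# pmm_sketch() and its argument names are hypothetical.
pmm_sketch <- function(pred_obs, y_obs, pred_mis, k = 3) {
  # Presort the observed predictions once, so each lookup is a binary search.
  ord <- order(pred_obs)
  p_sorted <- pred_obs[ord]
  y_sorted <- y_obs[ord]
  n <- length(p_sorted)
  k <- min(k, n)

  vapply(pred_mis, function(p) {
    # findInterval() runs a binary search on the sorted vector.
    pos <- findInterval(p, p_sorted)
    # The k nearest donors lie within k positions of the insertion point.
    lo <- max(1L, pos - k + 1L)
    hi <- min(n, pos + k)
    window <- lo:hi
    nearest <- window[order(abs(p_sorted[window] - p))[seq_len(k)]]
    # Draw one of the k donors at random and return its observed value.
    y_sorted[nearest[sample.int(k, 1L)]]
  }, numeric(1))
}

# Usage with toy predictions (values are made up for illustration).
set.seed(1234)
imputed <- pmm_sketch(
  pred_obs = c(5, 1, 3, 9, 7),
  y_obs    = c(50, 10, 30, 90, 70),
  pred_mis = c(1.2, 6.4),
  k = 2
)
```

Because the observed predictions are sorted once, each missing value costs a logarithmic-time search plus a small local scan instead of a pass over all donors.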
For performance details, see performance_validity.R in the extdata folder.
It is recommended to read the Advanced Usage Vignette.
You can install miceFast from CRAN:
install.packages("miceFast")
Or install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("polkas/miceFast")
Below is a short demonstration. See the vignette for advanced usage and best practices.
library(miceFast)
set.seed(1234)
data(air_miss)
# Visualize the NA structure
upset_NA(air_miss, 6)
# Simple and naive fill
imputed_data <- naive_fill_NA(air_miss)
# Compare with other packages:
# Hmisc
library(Hmisc)
data.frame(Map(function(x) Hmisc::impute(x, "random"), air_miss))
# mice
library(mice)
mice::complete(mice::mice(air_miss, printFlag = FALSE))
- miceFast objects (Rcpp modules).
- fill_NA(): Single imputation (lda, lm_pred, lm_bayes, lm_noise).
- fill_NA_N(): Multiple imputations (pmm, lm_bayes, lm_noise).
- VIF(): Variance Inflation Factor calculations.
- naive_fill_NA(): Automatic naive imputations.
- compare_imp(): Compare original vs. imputed values.
- upset_NA(): Visualize NA structure using UpSetR.

Quick Reference Table:
Function | Description |
---|---|
new(miceFast) | Creates an OOP instance with numerous imputation methods (see the vignette). |
fill_NA() | Single imputation: lda, lm_pred, lm_bayes, lm_noise. |
fill_NA_N() | Multiple imputations (N repeats): pmm, lm_bayes, lm_noise. |
VIF() | Computes Variance Inflation Factors. |
naive_fill_NA() | Performs automatic, naive imputations. |
compare_imp() | Compares imputations vs. original data. |
upset_NA() | Visualizes NA structure using an UpSet plot. |
Benchmark testing (on R 4.2, macOS M1) shows miceFast can significantly reduce computation time, especially in these scenarios:
- Multiple imputations: x * (number of multiple imputations) faster, since the model is computed only once.

For performance details, see performance_validity.R in the extdata folder.
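Why multiple imputations get cheaper when the model is computed only once can be shown with a small base R sketch. It is an assumption-laden illustration, not the package's code: `fit_once_draw_many()` is a hypothetical helper, and it uses a plain linear model with Gaussian residual noise as a stand-in for the package's noise-based methods.

```r
# Illustrative sketch only: fit_once_draw_many() is a hypothetical helper,
# not part of miceFast.
fit_once_draw_many <- function(X_obs, y_obs, X_mis, n_draws = 10) {
  # Fit the linear model a single time on the complete rows.
  fit <- lm.fit(X_obs, y_obs)
  sigma <- sqrt(sum(fit$residuals^2) / fit$df.residual)
  # Point predictions for the rows with missing y, also computed once.
  pred <- drop(X_mis %*% fit$coefficients)
  # Every imputation reuses the same fit and only adds fresh residual noise.
  noise <- matrix(rnorm(length(pred) * n_draws, sd = sigma), ncol = n_draws)
  pred + noise  # one column per imputation
}

# Usage on simulated data (all values are made up for illustration).
set.seed(1)
X <- cbind(Intercept = 1, x = rnorm(50))
y <- drop(X %*% c(2, 3)) + rnorm(50)
mis <- 41:50
draws <- fit_once_draw_many(X[-mis, ], y[-mis], X[mis, ], n_draws = 5)
```

Refitting the model for each of the N imputations would repeat the most expensive step N times. Here only the random draws scale with N.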