Autoencoding Random Forests (‘RFAE’) provide a method to autoencode data using Random Forests (‘RF’): the data are projected to a latent feature space of chosen dimensionality (usually lower than that of the input), and the latent representations are then decoded back into the input space. The encoding stage is useful for feature engineering and data visualisation tasks, akin to how principal component analysis (‘PCA’) is used, and the decoding stage is useful for compression and denoising tasks. At its core, ‘RFAE’ is a post-processing pipeline on a trained random forest model. This means that it can accept any trained RF of ranger object type: ‘RF’, ‘URF’ or ‘ARF’. Because of this, it inherits RFs’ robust performance and capacity to seamlessly handle mixed-type tabular data.
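To make the encode/decode idea concrete, here is a minimal sketch of the PCA analogy mentioned above, using base R only. It is not how RFAE works internally: PCA is a linear method and handles only the numeric columns, whereas RFAE operates on a trained forest. The variable names (`z`, `x_hat`, `mse`) are chosen for illustration.

```r
# Analogy only: PCA as a linear encoder/decoder on the numeric iris columns.
x <- as.matrix(iris[, 1:4])
pca <- prcomp(x, center = TRUE, scale. = FALSE)

k <- 2                                   # chosen latent dimensionality
z <- pca$x[, 1:k, drop = FALSE]          # encode: scores on the first k components

# Decode: map the latent coordinates back and undo the centring
x_hat <- z %*% t(pca$rotation[, 1:k, drop = FALSE])
x_hat <- sweep(x_hat, 2, pca$center, "+")

# Mean squared reconstruction error over all cells
mse <- mean((x - x_hat)^2)
```

The same three steps (fit an encoder, embed, decode) appear in the RFAE example further down, with the forest taking the place of the linear projection.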
The package can be installed by running:
devtools::install_github("bips-hb/RFAE")
You can also clone the repository and run:
devtools::install("RFAE")
Using Fisher’s iris dataset, we train a RF and pass it through the autoencoding pipeline:
# Load the package
library(RFAE)
# Set seed for reproducibility
set.seed(1)
# Split training and test
trn <- sample(1:nrow(iris), 100)
tst <- setdiff(1:nrow(iris), trn)
# Train RF
rf <- ranger::ranger(Species ~ ., data = iris[trn, ], num.trees=50)
Encode data and project test data to create new embeddings:
# Fit encoder object
emap <- encode(rf, iris[trn, ], k=2)
# Embed new test samples
emb <- predict(emap, rf, iris[tst, ])
Decode test samples back to the input space:
# Decode samples
out <- decode_knn(rf, emap, emb, k=5)$x_hat
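As a rough intuition for nearest-neighbour decoding, the sketch below averages the k training rows that lie closest to each new point in a latent space. It is a simplified illustration under stated assumptions, not the algorithm implemented by `decode_knn`: the helper `knn_decode_sketch` is hypothetical, the latent space comes from PCA rather than from a forest, and only numeric columns are handled.

```r
# Hypothetical helper: average the k nearest training rows in latent space.
# This is NOT the package's decode_knn(); it only illustrates the general idea.
knn_decode_sketch <- function(z_train, x_train, z_new, k = 5) {
  t(apply(z_new, 1, function(z) {
    d <- sqrt(colSums((t(z_train) - z)^2))  # Euclidean distance to each training point
    nn <- order(d)[seq_len(k)]              # indices of the k nearest neighbours
    colMeans(x_train[nn, , drop = FALSE])   # decoded row = neighbour average
  }))
}

# Usage with a PCA embedding standing in for the forest-based one
set.seed(1)
trn <- sample(seq_len(nrow(iris)), 100)
tst <- setdiff(seq_len(nrow(iris)), trn)
x <- as.matrix(iris[, 1:4])
pca <- prcomp(x[trn, ])
z_trn <- pca$x[, 1:2]
z_tst <- predict(pca, x[tst, ])[, 1:2]
x_hat <- knn_decode_sketch(z_trn, x[trn, ], z_tst, k = 5)
```

A larger `k` smooths the reconstruction, while a smaller `k` follows individual training points more closely.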
Measure the reconstruction error between decoded and actual samples:
error <- reconstruction_error(out, iris[tst, ])
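For mixed-type data, one plausible way to score a reconstruction is to treat numeric and categorical columns separately. The sketch below is an assumption about what such a metric could look like, not the definition used by `reconstruction_error`; the helper `recon_error_sketch` is hypothetical and uses base R only.

```r
# Hypothetical helper, not the package's reconstruction_error():
# RMSE per numeric column, misclassification rate per categorical column.
recon_error_sketch <- function(x_hat, x_true) {
  stopifnot(identical(names(x_hat), names(x_true)))
  is_num <- vapply(x_true, is.numeric, logical(1))
  num_err <- vapply(names(x_true)[is_num], function(j) {
    sqrt(mean((x_hat[[j]] - x_true[[j]])^2))
  }, numeric(1))
  cat_err <- vapply(names(x_true)[!is_num], function(j) {
    mean(as.character(x_hat[[j]]) != as.character(x_true[[j]]))
  }, numeric(1))
  list(rmse = num_err, misclassification = cat_err)
}

# Usage on a small, deliberately perturbed copy of the data
truth <- iris[1:5, ]
noisy <- truth
noisy$Sepal.Length <- noisy$Sepal.Length + 0.1  # shift one numeric column
noisy$Species[1] <- "virginica"                 # flip one label
err <- recon_error_sketch(noisy, truth)
```

Reporting the two kinds of error separately avoids having to put distances and label mismatches on a common scale.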
For more detailed examples, refer to the package vignette.
The Python version of RFAE is under development. A preliminary version is available at RFAE_py.