README


utiml: Utilities for Multi-label Learning

The utiml package is an R framework for multi-label learning, similar to what Mulan provides for Weka.

The main methods available in this package are organized into the following groups:
- Classification methods
- Evaluation methods
- Pre-process utilities
- Sampling methods
- Threshold methods

Installation

install.packages("utiml")

This will also install mldr. To run the examples in this document, you also need to install the packages:

# Base classifiers (SVM and Random Forest)
install.packages(c("e1071", "randomForest"))
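
Before running the examples, it can help to confirm which of the required packages are already available. The following is a minimal sketch using base R's requireNamespace(); the variable names (required, installed, missing_pkgs) are illustrative and not part of utiml. It only reports missing packages and does not install anything.

```r
# Packages used by the examples in this document
required <- c("utiml", "e1071", "randomForest")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still need to be installed
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```

Any package reported as missing can then be passed to install.packages().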

Install via GitHub (development version)

devtools::install_github("rivolli/utiml")

Multi-label Classification

Running Binary Relevance Method

library(utiml)

# Create two partitions (train and test) of toyml multi-label dataset
ds <- create_holdout_partition(toyml, c(train=0.65, test=0.35))

# Create a Binary Relevance Model using e1071::svm method
brmodel <- br(ds$train, "SVM", seed=123)

# Predict
prediction <- predict(brmodel, ds$test)

# Show the predictions
head(as.bipartition(prediction))
head(as.ranking(prediction))

# Apply a threshold
newpred <- rcut_threshold(prediction, 2)

# Evaluate the models
result <- multilabel_evaluate(ds$test, prediction, "bipartition")
thresres <- multilabel_evaluate(ds$test, newpred, "bipartition")

# Print the result
print(round(cbind(Default=result, RCUT=thresres), 3))
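
To make the thresholding step less opaque, here is a rough base R sketch of the idea behind a rank-cut threshold: for each instance, keep only the k labels with the highest scores. This is an illustration under stated assumptions, not utiml's implementation. The score matrix and the helper top_k_bipartition() are hypothetical, and ties are broken by column order, which may differ from what the package does.

```r
# Hypothetical score matrix: 3 instances (rows) x 4 labels (columns)
scores <- matrix(
  c(0.9, 0.2, 0.6, 0.1,
    0.3, 0.8, 0.4, 0.7,
    0.5, 0.5, 0.1, 0.9),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("i", 1:3), paste0("y", 1:4))
)

# Mark the k highest-scoring labels of each instance as relevant (1),
# all others as irrelevant (0). Ties are broken by column order.
top_k_bipartition <- function(scores, k) {
  result <- t(apply(scores, 1, function(row) {
    as.integer(rank(-row, ties.method = "first") <= k)
  }))
  dimnames(result) <- dimnames(scores)
  result
}

bipartition <- top_k_bipartition(scores, k = 2)
print(bipartition)
```

With k = 2, every row of the resulting 0/1 matrix contains exactly two relevant labels, regardless of how high or low the individual scores are.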

Running Ensemble of Classifier Chains

library(utiml)

# Create three partitions (train, val, test) of emotions dataset
partitions <- c(train = 0.6, val = 0.2, test = 0.2)
ds <- create_holdout_partition(emotions, partitions, method="iterative")

# Create an Ensemble of Classifier Chains using Random Forest (randomForest package)
eccmodel <- ecc(ds$train, "RF", m=3, cores=parallel::detectCores(), seed=123)

# Predict
val <- predict(eccmodel, ds$val, cores=parallel::detectCores())
test <- predict(eccmodel, ds$test, cores=parallel::detectCores())

# Apply a threshold
thresholds <- scut_threshold(val, ds$val, cores=parallel::detectCores())
new.val <- fixed_threshold(val, thresholds)
new.test <- fixed_threshold(test, thresholds)

# Evaluate the models
measures <- c("subset-accuracy", "F1", "hamming-loss", "macro-based")

result <- cbind(
  Test = multilabel_evaluate(ds$test, test, measures),
  TestWithThreshold = multilabel_evaluate(ds$test, new.test, measures),
  Validation = multilabel_evaluate(ds$val, val, measures),
  ValidationWithThreshold = multilabel_evaluate(ds$val, new.val, measures)
)

print(round(result, 3))
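
For readers unfamiliar with the measures requested above, the following base R sketch computes three of them by hand on a tiny made-up example, using their usual textbook definitions. The truth and pred matrices are hypothetical, and the code is not how utiml computes its results. The F1 shown is the example-based variant and assumes no instance has both an empty true label set and an empty predicted label set.

```r
# Hypothetical true and predicted label matrices: 4 instances x 3 labels
truth <- matrix(c(1, 0, 1,
                  0, 1, 0,
                  1, 1, 0,
                  0, 0, 1), nrow = 4, byrow = TRUE)
pred  <- matrix(c(1, 0, 1,
                  0, 1, 1,
                  1, 0, 0,
                  0, 0, 1), nrow = 4, byrow = TRUE)

# Hamming loss: fraction of instance-label pairs predicted incorrectly
hamming_loss <- mean(truth != pred)

# Subset accuracy: fraction of instances whose whole label set is correct
subset_accuracy <- mean(rowSums(truth != pred) == 0)

# Example-based F1: harmonic mean of precision and recall per instance,
# averaged over instances
f1_per_instance <- 2 * rowSums(truth * pred) / (rowSums(truth) + rowSums(pred))
f1 <- mean(f1_per_instance)

print(round(c("hamming-loss" = hamming_loss,
              "subset-accuracy" = subset_accuracy,
              "F1" = f1), 3))
```

Here two of the twelve instance-label pairs are wrong and two of the four instances are predicted exactly, which illustrates why subset accuracy is usually the strictest of the three measures.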

More examples and details are available in the function documentation and the package vignettes.

How to cite?

@article{RJ-2018-041,
  author = {Adriano Rivolli and Andre C. P. L. F. de Carvalho},
  title = {{The utiml Package: Multi-label Classification in R}},
  year = {2018},
  journal = {{The R Journal}},
  doi = {10.32614/RJ-2018-041},
  url = {https://doi.org/10.32614/RJ-2018-041},
  pages = {24--37},
  volume = {10},
  number = {2}
}
