R package for converting R models to PMML
This library is a thin R wrapper around the JPMML-R library.
The current version is 0.29.0 (10 November, 2024):
See the NEWS.md file.
Installing a release version from CRAN:
install.packages("r2pmml")
Alternatively, installing the latest snapshot version from GitHub using the devtools package:
library("devtools")
install_github("jpmml/r2pmml")
Loading the package:
library("r2pmml")
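Because the package delegates the conversion work to the JPMML-R Java library, a Java runtime has to be reachable from the R session. The following is a minimal sketch for checking this up front, assuming the `java` executable is expected on the system `PATH`; the variable names `java_path` and `java_available` are illustrative only.

```r
# Look up the `java` executable on the PATH.
# Sys.which() returns an empty string when nothing is found.
java_path = Sys.which("java")
java_available = nzchar(java_path)[[1]]

if (java_available) {
  message("Found Java at: ", java_path)
} else {
  message("No `java` executable found on the PATH")
}
```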
Training and exporting a simple randomForest model:
library("randomForest")
library("r2pmml")
data(iris)
# Train a model using raw Iris dataset
iris.rf = randomForest(Species ~ ., data = iris, ntree = 7)
print(iris.rf)
# Export the model to PMML
r2pmml(iris.rf, "iris_rf.pmml")
The r2pmml function takes an optional argument preProcess, which associates the model with data pre-processing transformations.
Training and exporting a more sophisticated randomForest model:
library("caret")
library("randomForest")
library("r2pmml")
data(iris)
# Create a preprocessor
iris.preProcess = preProcess(iris, method = c("range"))
# Use the preprocessor to transform raw Iris dataset to pre-processed Iris dataset
iris.transformed = predict(iris.preProcess, newdata = iris)
# Train a model using pre-processed Iris dataset
iris.rf = randomForest(Species ~ ., data = iris.transformed, ntree = 7)
print(iris.rf)
# Export the model to PMML.
# Pass the preprocessor as the `preProcess` argument
r2pmml(iris.rf, "iris_rf.pmml", preProcess = iris.preProcess)
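For intuition about what the `range` method does to each numeric column, the following base-R sketch rescales the Iris measurements to the interval [0, 1] by hand. The helper `rescale_range` is a hypothetical name introduced here for illustration; it is not part of caret or r2pmml, and the sketch assumes the default target range of 0 to 1.

```r
data(iris)

# Hypothetical helper: min-max scaling of one numeric vector to [0, 1]
rescale_range = function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Rescale the numeric columns only; the factor column Species is left as-is
numeric_cols = sapply(iris, is.numeric)
iris.scaled = iris
iris.scaled[numeric_cols] = lapply(iris[numeric_cols], rescale_range)

print(summary(iris.scaled))
```

Associating the preprocessor with the model matters because the exported PMML document then expects raw inputs and applies the scaling itself, instead of requiring callers to pre-scale their data.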
Alternatively, it is possible to associate lm, glm and randomForest models with data pre-processing transformations using model formulae.
Training and exporting a glm model:
library("plyr")
library("r2pmml")
# Load and prepare the Auto-MPG dataset
auto = read.table("http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data", quote = "\"", header = FALSE, na.strings = "?", row.names = NULL, col.names = c("mpg", "cylinders", "displacement", "horsepower", "weight", "acceleration", "model_year", "origin", "car_name"))
auto$origin = as.factor(auto$origin)
auto$car_name = NULL
auto = na.omit(auto)
# Train a model
auto.glm = glm(mpg ~ (. - horsepower - weight - origin) ^ 2 + I(displacement / cylinders) + cut(horsepower, breaks = c(0, 50, 100, 150, 200, 250)) + I(log(weight)) + revalue(origin, replace = c("1" = "US", "2" = "Europe", "3" = "Japan")), data = auto)
# Export the model to PMML
r2pmml(auto.glm, "auto_glm.pmml")
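To see what two of the in-formula transformations compute, they can be evaluated on their own, outside of any model. This is only an illustrative sketch: the `horsepower` and `weight` values below are made up and are not rows of the Auto-MPG dataset.

```r
# Toy values, made up for illustration only
horsepower = c(46, 95, 130, 230)
weight = c(1800, 2500, 3400, 4700)

# Binning: each value falls into a half-open interval (lower, upper]
hp_bin = cut(horsepower, breaks = c(0, 50, 100, 150, 200, 250))
print(table(hp_bin))

# Log transformation, as used in the I(log(weight)) term
log_weight = log(weight)
print(round(log_weight, 3))
```

Note that a bin with no observations, such as the 150 to 200 interval here, still exists as a factor level.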
ranger
Training and exporting a ranger model:
library("ranger")
library("r2pmml")
data(iris)
# Train a model.
# Keep the forest data structure by specifying `write.forest = TRUE`
iris.ranger = ranger(Species ~ ., data = iris, num.trees = 7, write.forest = TRUE)
print(iris.ranger)
# Export the model to PMML.
# Pass the training dataset as the `data` argument
r2pmml(iris.ranger, "iris_ranger.pmml", data = iris)
xgboost
Training and exporting an xgb.Booster model:
library("xgboost")
library("r2pmml")
data(iris)
iris_X = iris[, 1:4]
iris_y = as.integer(iris[, 5]) - 1
# Generate R model matrix
iris.matrix = model.matrix(~ . - 1, data = iris_X)
# Generate XGBoost DMatrix and feature map based on R model matrix
iris.DMatrix = xgb.DMatrix(iris.matrix, label = iris_y)
iris.fmap = as.fmap(iris.matrix)
# Train a model
iris.xgb = xgboost(data = iris.DMatrix, missing = NULL, objective = "multi:softmax", num_class = 3, nrounds = 13)
# Export the model to PMML.
# Pass the feature map as the `fmap` argument.
# Pass the name and category levels of the target field as the `response_name` and `response_levels` arguments, respectively.
# Pass the value that denotes missing values as the `missing` argument.
# Pass the optimal number of trees as the `ntreelimit` argument (analogous to the `ntreelimit` argument of the `predict.xgb.Booster` function of the xgboost package)
r2pmml(iris.xgb, "iris_xgb.pmml", fmap = iris.fmap, response_name = "Species", response_levels = c("setosa", "versicolor", "virginica"), missing = NULL, ntreelimit = 7, compact = TRUE)
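The target vector in the example above is encoded as zero-based integers, which is why the category names have to be supplied separately through `response_levels`. The following base-R sketch shows how the two line up; the names `response_levels` and `decoded` are local variables chosen for illustration.

```r
data(iris)

# Zero-based integer class labels, as in the example above
iris_y = as.integer(iris[, 5]) - 1

# The i-th factor level corresponds to the class label i - 1
response_levels = levels(iris$Species)
print(response_levels)

# Map the encoded labels back to category names
decoded = response_levels[iris_y + 1]
print(table(decoded))
```

Taking the levels from the factor itself, rather than typing them out, is one way to keep their order consistent with the integer encoding.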
Tweaking JVM configuration:
Sys.setenv(JAVA_TOOL_OPTIONS = "-Xms4G -Xmx8G")
r2pmml(iris.rf, "iris_rf.pmml")
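Since `Sys.setenv` changes the environment for the rest of the R session, it can be useful to restore the previous value once the export is done. A minimal sketch in base R, assuming that the JVM options only need to apply to a single conversion; the variable names `old_opts` and `current_opts` are illustrative.

```r
# Remember the previous value (NA if the variable was unset)
old_opts = Sys.getenv("JAVA_TOOL_OPTIONS", unset = NA)

# Apply the JVM options for the duration of the export
Sys.setenv(JAVA_TOOL_OPTIONS = "-Xms4G -Xmx8G")
current_opts = Sys.getenv("JAVA_TOOL_OPTIONS")

# ... the r2pmml(...) call would go here ...

# Restore the previous state
if (is.na(old_opts)) {
  Sys.unsetenv("JAVA_TOOL_OPTIONS")
} else {
  Sys.setenv(JAVA_TOOL_OPTIONS = old_opts)
}
```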
Employing a custom converter class:
r2pmml(iris.rf, "iris_rf.pmml", converter = "com.mycompany.MyRandomForestConverter", converter_classpath = "/path/to/myconverter-1.0-SNAPSHOT.jar")
Removing the package:
remove.packages("r2pmml")
Up-to-date:
Slightly outdated:
R2PMML is licensed under the terms and conditions of the GNU Affero General Public License, Version 3.0.
If you would like to use R2PMML in a proprietary software project, then it is possible to enter into a licensing agreement which makes R2PMML available under the terms and conditions of the BSD 3-Clause License instead.
R2PMML is developed and maintained by Openscoring Ltd, Estonia.
Interested in using Java PMML API software in your company? Please contact info@openscoring.io