Random Forest


Function	Works
`tidypredict_fit()`, `tidypredict_sql()`, `parse_model()`	✔
`tidypredict_to_column()`	✗
`tidypredict_test()`	✗
`tidypredict_interval()`, `tidypredict_sql_interval()`	✗
`parsnip`	✔

How it works

Here is a simple randomForest() model fitted to the iris dataset. No seed is set, so the exact splits shown below will vary from run to run:

library(dplyr)
library(tidypredict)
library(randomForest)

model <- randomForest(Species ~ ., data = iris, ntree = 100, proximity = TRUE)

Under the hood

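The parser is based on the output of the randomForest::getTree() function, which describes one tree of the forest at a time (the first tree by default). The parser returns as many decision paths as there are non-NA rows in the prediction column, that is, one path per terminal node.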

getTree(model, labelVar = TRUE) %>%
  head()
#>   left daughter right daughter    split var split point status prediction
#> 1             2              3 Petal.Length        2.50      1       <NA>
#> 2             0              0         <NA>        0.00     -1     setosa
#> 3             4              5 Petal.Length        5.05      1       <NA>
#> 4             6              7  Petal.Width        1.90      1       <NA>
#> 5             0              0         <NA>        0.00     -1  virginica
#> 6             8              9 Sepal.Length        4.95      1       <NA>
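
To make the link between terminal nodes and decision paths concrete, the following sketch counts them for the first tree. It assumes only the randomForest package; the seed, the smaller ntree, and the names tree, n_paths, and n_terminal are illustrative choices, not part of tidypredict.

```r
library(randomForest)

set.seed(100)  # assumption: fixed only so repeated runs give the same tree
model <- randomForest(Species ~ ., data = iris, ntree = 10)

# One row per node of the first tree; labelVar = TRUE gives readable labels
tree <- getTree(model, k = 1, labelVar = TRUE)

# Terminal nodes carry a class label; split nodes have NA in `prediction`
n_paths <- sum(!is.na(tree$prediction))

# The same nodes are flagged with status == -1
n_terminal <- sum(tree$status == -1)

c(paths = n_paths, terminal_nodes = n_terminal)
```

The two counts should agree, since each terminal node closes exactly one path from the root.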

The output from parse_model() is transformed into a dplyr, a.k.a. Tidy Eval, formula. Each decision tree becomes one dplyr::case_when() statement, and tidypredict_fit() returns them as a list, so [1] shows the first tree:

tidypredict_fit(model)[1]
#> [[1]]
#> case_when(Petal.Length < 2.5 ~ "setosa", Petal.Length >= 5.05 & 
#>     Petal.Length >= 2.5 ~ "virginica", Petal.Width >= 1.9 & Petal.Length < 
#>     5.05 & Petal.Length >= 2.5 ~ "virginica", Sepal.Length < 
#>     4.95 & Petal.Width < 1.9 & Petal.Length < 5.05 & Petal.Length >= 
#>     2.5 ~ "virginica", Petal.Width < 1.75 & Sepal.Length >= 4.95 & 
#>     Petal.Width < 1.9 & Petal.Length < 5.05 & Petal.Length >= 
#>     2.5 ~ "versicolor", Sepal.Width < 3 & Petal.Width >= 1.75 & 
#>     Sepal.Length >= 4.95 & Petal.Width < 1.9 & Petal.Length < 
#>     5.05 & Petal.Length >= 2.5 ~ "virginica", Sepal.Width >= 
#>     3 & Petal.Width >= 1.75 & Sepal.Length >= 4.95 & Petal.Width < 
#>     1.9 & Petal.Length < 5.05 & Petal.Length >= 2.5 ~ "versicolor")
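
Each branch of the case_when() is the conjunction of every split condition on the way from the root to one terminal node. A minimal sketch of how such an expression evaluates, using a hand-written single-split stand-in rather than real parser output (tree_expr and the "other" label are hypothetical):

```r
library(dplyr)
library(rlang)

# Hand-written stand-in for a tree with a single split on Petal.Length
tree_expr <- expr(case_when(
  Petal.Length < 2.5 ~ "setosa",
  Petal.Length >= 2.5 ~ "other"
))

# Unquote the expression so mutate() evaluates it against the data frame
scored <- mutate(iris, tree_1 = !!tree_expr)

# Cross-tabulate the true species against the stand-in prediction
count(scored, Species, tree_1)
```

Because the expression is unevaluated code, the same object can be evaluated against a local data frame or handed to a database backend for translation.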

From there, the Tidy Eval formula can be used anywhere it can be evaluated. tidypredict provides three paths:

Use it directly inside dplyr, one tree at a time: mutate(iris, !! tidypredict_fit(model)[[1]])
Add tidypredict_to_column(model) to a piped command set (the table above marks this function as not supported for randomForest models)
Use tidypredict_sql(model, con), where con is a database connection, to retrieve the SQL statement
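
The first and third paths might look like the sketch below. It assumes tidypredict_fit() returns a list with one expression per tree, as in the output above, and that dbplyr is installed so that dbplyr::simulate_dbi() can stand in for a real database connection. The names fit, first_tree, scored, and sql_trees are illustrative.

```r
library(dplyr)
library(tidypredict)
library(randomForest)

set.seed(100)  # assumption: fixed only for repeatable splits
model <- randomForest(Species ~ ., data = iris, ntree = 10)

# Assumption: one expression per tree; fall back to the object itself
# if the installed version returns a single expression instead
fit <- tidypredict_fit(model)
first_tree <- if (is.list(fit)) fit[[1]] else fit

# Path 1: evaluate the first tree's formula inside dplyr
scored <- mutate(iris, tree_1 = !!first_tree)

# Path 3: translate the model to SQL against a simulated connection
sql_trees <- tidypredict_sql(model, dbplyr::simulate_dbi())
```

Under that assumption, tree_1 holds the prediction of a single tree, not the combined vote of the forest.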

parsnip

tidypredict also supports randomForest model objects fitted via the parsnip package.

library(parsnip)

parsnip_model <- rand_forest(mode = "classification") %>%
  set_engine("randomForest") %>%
  fit(Species ~ ., data = iris)

tidypredict_fit(parsnip_model)[[1]]
#> case_when(Petal.Length < 2.45 & Sepal.Length < 5.45 ~ "setosa", 
#>     Petal.Width < 1.6 & Petal.Length >= 2.45 & Sepal.Length < 
#>         5.45 ~ "versicolor", Petal.Width >= 1.6 & Petal.Length >= 
#>         2.45 & Sepal.Length < 5.45 ~ "virginica", Petal.Length >= 
#>         4.7 & Sepal.Length < 5.75 & Sepal.Length >= 5.45 ~ "virginica", 
#>     Petal.Width < 0.65 & Petal.Length < 4.7 & Sepal.Length < 
#>         5.75 & Sepal.Length >= 5.45 ~ "setosa", Petal.Width >= 
#>         0.65 & Petal.Length < 4.7 & Sepal.Length < 5.75 & Sepal.Length >= 
#>         5.45 ~ "versicolor", Petal.Length < 2.55 & Petal.Length < 
#>         4.95 & Sepal.Length >= 5.75 & Sepal.Length >= 5.45 ~ 
#>         "setosa", Petal.Length >= 5.05 & Petal.Length >= 4.95 & 
#>         Sepal.Length >= 5.75 & Sepal.Length >= 5.45 ~ "virginica", 
#>     Petal.Width < 1.7 & Petal.Length >= 2.55 & Petal.Length < 
#>         4.95 & Sepal.Length >= 5.75 & Sepal.Length >= 5.45 ~ 
#>         "versicolor", Petal.Width >= 1.7 & Petal.Length >= 2.55 & 
#>         Petal.Length < 4.95 & Sepal.Length >= 5.75 & Sepal.Length >= 
#>         5.45 ~ "virginica", Sepal.Length < 6.5 & Petal.Length < 
#>         5.05 & Petal.Length >= 4.95 & Sepal.Length >= 5.75 & 
#>         Sepal.Length >= 5.45 ~ "virginica", Sepal.Length >= 6.5 & 
#>         Petal.Length < 5.05 & Petal.Length >= 4.95 & Sepal.Length >= 
#>         5.75 & Sepal.Length >= 5.45 ~ "versicolor")
