Sören Künzel, Theo Saarinen, Simon Walter, Edward Liu, Allen Tang, Jasjeet Sekhon
Rforestry is a fast implementation of Random Forests, Gradient Boosting, and Linear Random Forests, with an emphasis on inference and interpretability.
Installation requires devtools and a working development environment. You can run

devtools::has_devel()

to check whether you have one. If no development environment exists, Windows users download and install Rtools and macOS users download and install Xcode. Then install the package from GitHub:

install.packages("devtools")
devtools::install_github("forestry-labs/Rforestry")

Windows users will need to skip 64-bit compilation, due to an outstanding gcc issue:

devtools::install_github("forestry-labs/Rforestry", INSTALL_opts = c('--no-multiarch'))

Example:

set.seed(292315)
library(Rforestry)
test_idx <- sample(nrow(iris), 3)
x_train <- iris[-test_idx, -1]
y_train <- iris[-test_idx, 1]
x_test <- iris[test_idx, -1]

rf <- forestry(x = x_train, y = y_train)

# Get the weights each training observation contributes to each prediction
weights <- predict(rf, x_test, aggregation = "weightMatrix")$weightMatrix

weights %*% y_train
predict(rf, x_test)
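Each row of the weight matrix holds the weights that one test point places on the training responses, so multiplying by y_train should reproduce the forest's predictions. A quick sanity check in plain base R (the row/column layout of weightMatrix is inferred from the example above):

# The weighted average of training responses should match predict()
# up to floating-point tolerance.
all.equal(as.numeric(weights %*% y_train), as.numeric(predict(rf, x_test)))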
Rforestry also provides ridge random forests: a fast implementation of random forests using ridge-penalized splitting and ridge regression for predictions.
Example:
set.seed(49)
library(Rforestry)
n <- 100
a <- rnorm(n)
b <- rnorm(n)
c <- rnorm(n)
y <- 4*a + 5.5*b - .78*c
x <- data.frame(a, b, c)

# Fit a ridge random forest and predict on the training data
forest <- forestry(x, y, ridgeRF = TRUE)
predict(forest, x)
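Since y above is exactly linear in a, b, and c, ridge regression in the leaves should track the signal more closely than constant-leaf averaging. A minimal sketch comparing held-out error against a default forest (only forestry() and predict() as shown in this README; the fresh test data is generated here purely for illustration):

# Hedged comparison on new draws from the same linear model.
x_new <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
y_new <- 4*x_new$a + 5.5*x_new$b - .78*x_new$c
plain_forest <- forestry(x, y)
mean((predict(forest, x_new) - y_new)^2)        # ridge random forest
mean((predict(plain_forest, x_new) - y_new)^2)  # default forest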
The monotonicConstraints parameter controls monotonic constraints for individual features in forestry: 1 constrains predictions to be monotone increasing in that feature, -1 monotone decreasing, and 0 leaves it unconstrained.
library(Rforestry)
library(dplyr)  # for %>% and select()

x <- rnorm(150) + 5
y <- .15*x + .5*sin(3*x)
data_train <- data.frame(x1 = x, x2 = rnorm(150) + 5, y = y + rnorm(150, sd = .4))

# Constrain predictions to be monotone decreasing in both features
monotone_rf <- forestry(x = data_train %>% select(-y),
                        y = data_train$y,
                        monotonicConstraints = c(-1, -1),
                        nodesizeStrictSpl = 5,
                        nthread = 1,
                        ntree = 25)
predict(monotone_rf, feature.new = data_train %>% select(-y))
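To see the constraint in action, one can predict along a grid in x1 with x2 held fixed and check that the fitted curve never increases. This is an illustrative sketch (the grid construction is ordinary R, not part of the forestry API):

# Predictions along an x1 grid, with x2 fixed at its mean, should be
# non-increasing under monotonicConstraints = c(-1, -1).
grid <- data.frame(x1 = seq(min(data_train$x1), max(data_train$x1), length.out = 100),
                   x2 = mean(data_train$x2))
preds <- predict(monotone_rf, feature.new = grid)
all(diff(preds) <= 0)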
We can return the predictions for the training dataset using only the trees in which each observation was out of bag. Note that when there are few trees, or a high proportion of the observations is sampled for each tree, there may be some observations which are not out of bag for any tree; the predictions for these are returned as NaN.
library(Rforestry)
# Train a forest
rf <- forestry(x = iris[,-1],
               y = iris[,1],
               ntree = 500)

# Get the OOB predictions for the training set
oob_preds <- getOOBpreds(rf)
# This should be equal to the OOB error
sum((oob_preds - iris[,1])^2)
getOOB(rf)
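As noted above, when the forest has few trees, some observations may never be out of bag and their OOB predictions come back as NaN. A minimal sketch of filtering those out before computing the error (only getOOBpreds() from above plus base R; the tiny ntree is chosen purely to provoke NaNs):

# With a very small forest, expect some NaN OOB predictions.
small_rf <- forestry(x = iris[,-1], y = iris[,1], ntree = 5)
oob_small <- getOOBpreds(small_rf)
valid <- !is.nan(oob_small)
sum(!valid)  # observations that were never out of bag
sum((oob_small[valid] - iris[valid, 1])^2)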