
Type: Package
Title: An Implementation of Isolation Forest
Version: 1.1.3
Description: Isolation forest is an anomaly detection method introduced by the paper Isolation based Anomaly Detection (Liu, Ting and Zhou <doi:10.1145/2133360.2133363>).
URL: https://github.com/talegari/solitude
BugReports: https://github.com/talegari/solitude/issues
Imports: ranger (≥ 0.11.0), data.table (≥ 1.11.4), igraph (≥ 1.2.2), future.apply (≥ 0.2.0), R6 (≥ 2.4.0), lgr (≥ 0.3.4)
Depends: R (≥ 3.5.0)
Suggests: tidyverse, uwot, mlbench, rsample
License: GPL-3
Encoding: UTF-8
RoxygenNote: 7.1.1
NeedsCompilation: no
Packaged: 2021-07-29 19:14:19 UTC; dattachidambara
Author: Komala Sheshachala Srikanth [aut, cre], David Zimmermann [ctb]
Maintainer: Komala Sheshachala Srikanth <sri.teach@gmail.com>
Repository: CRAN
Date/Publication: 2021-07-29 20:00:02 UTC

An Implementation of Isolation Forest

Description

Isolation forest is an anomaly detection method introduced by the paper Isolation based Anomaly Detection (Liu, Ting and Zhou <doi:10.1145/2133360.2133363>).

Author(s)

Srikanth Komala Sheshachala

See Also

Useful links:

https://github.com/talegari/solitude

Report bugs at https://github.com/talegari/solitude/issues


Check for a single integer

Description

Checks whether the input is a single integer.

Usage

is_integerish(x)

Arguments

x

input

Value

TRUE or FALSE

Examples

## Not run: is_integerish(1)

Fit an Isolation Forest

Description

The 'solitude' class implements the isolation forest method introduced by the paper Isolation based Anomaly Detection (Liu, Ting and Zhou <doi:10.1145/2133360.2133363>). The extremely randomized trees (extratrees) required to build the isolation forest are grown using the ranger function from the ranger package.

Design

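$new() initiates a new 'solitude' object. The possible arguments are sample_size, num_trees, replace, seed, nproc, respect_unordered_factors and max_depth; their defaults are shown in the usage of Method new().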

$fit() fits an isolation forest for the given dataframe or sparse matrix, computes the depths of the terminal nodes of each tree, and stores the anomaly scores and average depth values in the $scores object as a data.table.

$predict() returns anomaly scores for new data as a data.table.
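
How an average depth turns into an anomaly score can be sketched in base R from the formula in the cited paper. This is an illustration under assumptions, not the package's code: the function names average_path_length and anomaly_score are made up here, and whether 'solitude' computes its scores in exactly this way is not confirmed by this page.

```r
# Sketch of the anomaly score from Liu, Ting and Zhou; names are illustrative.

# c(n): average path length of an unsuccessful binary search tree lookup
# among n points, used to normalise the average depth.
average_path_length = function(n) {
  if (n > 2) {
    harmonic = log(n - 1) + 0.5772156649  # H(n - 1) ~ ln(n - 1) + Euler's constant
    return(2 * harmonic - 2 * (n - 1) / n)
  }
  if (n == 2) return(1)
  0
}

# s(x, n) = 2^(-E[h(x)] / c(n)): close to 1 for short average depths
# (likely anomalies), well below 0.5 for long ones.
anomaly_score = function(average_depth, sample_size) {
  2^(-average_depth / average_path_length(sample_size))
}

c_256 = average_path_length(256)
anomaly_score(c(2, c_256, 2 * c_256), sample_size = 256)
# the middle value is 0.5 and the last is 0.25; the first is the largest
```

An observation whose average depth equals c(n) scores exactly 0.5, so scores well above 0.5 point to observations that are isolated unusually early.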

Details

Methods

Public methods


Method new()

Usage
isolationForest$new(
  sample_size = 256,
  num_trees = 100,
  replace = FALSE,
  seed = 101,
  nproc = NULL,
  respect_unordered_factors = NULL,
  max_depth = ceiling(log2(sample_size))
)
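
The default max_depth is derived from sample_size. A quick base R check of that arithmetic follows; the sample sizes 64 and 1000 are arbitrary illustrations, and only 256 is the documented default.

```r
# Default depth limit from the usage above: ceiling(log2(sample_size))
sample_sizes = c(64, 256, 1000)
max_depths   = ceiling(log2(sample_sizes))
data.frame(sample_size = sample_sizes, max_depth = max_depths)
# max_depth is 6, 8 and 10 respectively
```

The limit grows only logarithmically with the sample size, which keeps trees shallow even for larger samples.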

Method fit()

Usage
isolationForest$fit(dataset)

Method predict()

Usage
isolationForest$predict(data)
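
A minimal sketch of the fit and predict calls on a built-in dataset, assuming 'solitude' is installed. The argument values are illustrative; sample_size is set to 64 on the assumption that it should not exceed the 150 rows of iris.

```r
library("solitude")

# Keep only the four numeric columns of iris
numeric_iris = iris[, 1:4]

# Illustrative settings, not recommendations
iso = isolationForest$new(sample_size = 64, num_trees = 50, seed = 1)
iso$fit(numeric_iris)

# Scores for the data used in fitting are stored on the object
head(iso$scores)

# Scores for other rows are returned as a data.table
new_scores = iso$predict(numeric_iris[1:5, ])
new_scores
```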

Method clone()

The objects of this class are cloneable with this method.

Usage
isolationForest$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.
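
The difference between a shallow and a deep clone only shows up for fields with reference semantics. The sketch below demonstrates general R6 behaviour with two hypothetical classes, Inner and Outer, which are not part of 'solitude'; how this applies to the internals of an isolationForest object is not stated on this page.

```r
library("R6")

# Hypothetical classes, for illustration only
Inner = R6Class("Inner", public = list(value = 1))
Outer = R6Class("Outer", public = list(
  inner = NULL,
  initialize = function() {
    self$inner = Inner$new()
  }
))

a       = Outer$new()
shallow = a$clone()             # copies the reference to the same Inner object
deep    = a$clone(deep = TRUE)  # also clones the nested Inner object

a$inner$value = 99

shallow$inner$value  # 99: the shallow clone still shares 'inner' with 'a'
deep$inner$value     # 1: the deep clone has its own copy
```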

Examples

## Not run: 
library("solitude")
library("tidyverse")
library("mlbench")

data(PimaIndiansDiabetes)
PimaIndiansDiabetes = as_tibble(PimaIndiansDiabetes)
PimaIndiansDiabetes

splitter   = PimaIndiansDiabetes %>%
  select(-diabetes) %>%
  rsample::initial_split(prop = 0.5)
pima_train = rsample::training(splitter)
pima_test  = rsample::testing(splitter)

iso = isolationForest$new()
iso$fit(pima_train)

scores_train = pima_train %>%
  iso$predict() %>%
  arrange(desc(anomaly_score))

scores_train

umap_train = pima_train %>%
  scale() %>%
  uwot::umap() %>%
  magrittr::set_colnames(c("V1", "V2")) %>%  # umap() returns a matrix, so set column names
  as_tibble() %>%
  rowid_to_column() %>%
  left_join(scores_train, by = c("rowid" = "id"))

umap_train

umap_train %>%
  ggplot(aes(V1, V2)) +
  geom_point(aes(size = anomaly_score))

scores_test = pima_test %>%
  iso$predict() %>%
  arrange(desc(anomaly_score))

scores_test

## End(Not run)

Depth of each terminal node of all trees in a ranger model

Description

Depth of each terminal node of all trees in a ranger model is returned as a three-column tibble with column names: 'id_tree', 'id_node', 'depth'. Note that the root node has id_node = 0.

Usage

terminalNodesDepth(model)

Arguments

model

A ranger model

Details

This function may be parallelized using a future backend.
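
A sketch of running the function under a parallel plan, assuming the 'future' package is installed and that the function picks up whichever plan is active, as is the usual convention for code built on future.apply. The worker count and the number of trees are arbitrary.

```r
library("solitude")

# Run futures in two background R sessions (illustrative worker count)
future::plan(future::multisession, workers = 2)

rf     = ranger::ranger(Species ~ ., data = iris, num.trees = 20)
depths = terminalNodesDepth(rf)
head(depths)

# Return to sequential processing afterwards
future::plan(future::sequential)
```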

Value

A tibble with three columns: 'id_tree', 'id_node', 'depth'.

Examples

rf = ranger::ranger(Species ~ ., data = iris, num.trees = 100)
terminalNodesDepth(rf)

Depth of each terminal node of a single tree in a ranger model

Description

Depth of each terminal node of a single tree in a ranger model. Note that the root node has id_node = 0.

Usage

terminalNodesDepthPerTree(treelike)

Arguments

treelike

Output of 'ranger::treeInfo'

Value

A data.table with two columns: 'id_node' and 'depth'.

Examples

## Not run: 
  rf = ranger::ranger(Species ~ ., data = iris)
  terminalNodesDepthPerTree(ranger::treeInfo(rf, 1))

## End(Not run)
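
What "depth of a terminal node" means can be shown on a tiny hand-made tree in base R. This is a sketch under assumptions, not the package's implementation: the table mimics only a few columns of 'ranger::treeInfo' output (nodeID, leftChild, rightChild, terminal), and the helper depth_of_terminal_nodes is hypothetical.

```r
# Hypothetical treeInfo-like table for this tree (root has id 0):
#        0
#       / \
#      1   2
#         / \
#        3   4
tree = data.frame(
  nodeID     = 0:4,
  leftChild  = c(1, NA, 3, NA, NA),
  rightChild = c(2, NA, 4, NA, NA),
  terminal   = c(FALSE, TRUE, FALSE, TRUE, TRUE)
)

# Breadth-first walk from the root; each child is one level deeper than its parent
depth_of_terminal_nodes = function(tree) {
  depth = rep(NA_integer_, nrow(tree))
  names(depth) = tree$nodeID
  depth["0"] = 0L
  queue = 0
  while (length(queue) > 0) {
    node  = queue[1]
    queue = queue[-1]
    row      = tree[tree$nodeID == node, ]
    children = c(row$leftChild, row$rightChild)
    children = children[!is.na(children)]
    if (length(children) > 0) {
      depth[as.character(children)] = depth[as.character(node)] + 1L
      queue = c(queue, children)
    }
  }
  leaves = tree$nodeID[tree$terminal]
  data.frame(id_node = leaves, depth = unname(depth[as.character(leaves)]))
}

depth_of_terminal_nodes(tree)
# terminal nodes 1, 3 and 4 have depths 1, 2 and 2
```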
