banditpam is an R package that performs \(k\)-medoids clustering efficiently, as described in Tiwari et al. (2020).
We illustrate with a simple example using data simulated from a Gaussian mixture model with the following component means: \((0, 0)\), \((-5, 5)\) and \((5, 5)\). Each component contributes 40 observations with identity covariance.
set.seed(10)
n_per_cluster <- 40
means <- list(c(0, 0), c(-5, 5), c(5, 5))
X <- do.call(rbind, lapply(means, MASS::mvrnorm, n = n_per_cluster, Sigma = diag(2)))
Let’s cluster the observations in this X matrix using 3 clusters. The first step is to create a KMedoids object.
Next we fit the data with a specified loss, \(l_2\) here. A good habit is to set the seed before fitting for reproducibility.
And we can now extract the medoid observation indices.
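The three steps above (create the object, fit, extract the medoids) can be sketched as follows. This is a minimal sketch rather than the package's definitive usage: it assumes that `KMedoids` is an R6 class exported by banditpam with `$new(k = )`, `$fit(data = , loss = )` and `$get_medoids_final()` methods, and that the \(l_2\) loss is selected with the string `"l2"`. The seed value 1234 is arbitrary. The simulation is repeated at the top only so that the block runs on its own.

```r
library(banditpam)

# Same simulated data as above, repeated so this block is self-contained
set.seed(10)
n_per_cluster <- 40
means <- list(c(0, 0), c(-5, 5), c(5, 5))
X <- do.call(rbind, lapply(means, MASS::mvrnorm, n = n_per_cluster, Sigma = diag(2)))

# Step 1: create the clustering object, asking for 3 medoids
# (assumed constructor signature; see the help on KMedoids)
obj <- KMedoids$new(k = 3)

# Step 2: fit with the l2 loss, setting a seed first for reproducibility
set.seed(1234)  # arbitrary seed, chosen for illustration
obj$fit(data = X, loss = "l2")

# Step 3: extract the row indices of X that were chosen as medoids
med_indices <- obj$get_medoids_final()
med_indices
```

The resulting `med_indices` is the vector of row indices used to pick out the medoid observations in the plot.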
A plot shows the results where we color the medoids in red.
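library(ggplot2)
# Observations as a data frame with named coordinate columns
d <- as.data.frame(X)
names(d) <- c("x", "y")
# Rows of d at the medoid indices
dd <- d[med_indices, ]
ggplot(data = d) +
  geom_point(aes(x, y)) +
  geom_point(aes(x, y), data = dd, color = "red")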
We can also change the loss function and see how the medoids change.
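As a hedged sketch of such a comparison, the hypothetical helpers below (`medoids_for_loss` and `compare_losses` are not part of banditpam) refit a fresh object for each loss and return the medoid indices side by side. They assume the same `KMedoids` methods as elsewhere in this example and that `"l1"` is an accepted loss string alongside `"l2"`. The help on KMedoids lists the supported losses.

```r
library(banditpam)

# Hypothetical helper: medoid indices of X under a single loss.
# A fresh object and a fixed seed keep the fits comparable across losses.
medoids_for_loss <- function(X, loss, k = 3, seed = 1234) {
  obj <- KMedoids$new(k = k)
  set.seed(seed)
  obj$fit(data = X, loss = loss)
  obj$get_medoids_final()
}

# Hypothetical helper: named list of medoid indices, one entry per loss
compare_losses <- function(X, losses = c("l2", "l1")) {
  res <- lapply(losses, function(l) medoids_for_loss(X, l))
  names(res) <- losses
  res
}
```

Calling `compare_losses(X)` on the simulated matrix returns one index vector per loss. Passing either vector as `med_indices` to the plotting code redraws the figure for that loss.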
One can query some performance statistics too; see help on KMedoids.
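A rough sketch of such a query follows. It assumes that a fitted object exposes a `$get_statistic(what)` method and that `"dist_computations"` and `"cache_misses"` are valid values of `what`. Both the method and the statistic names are assumptions here, so the help on KMedoids remains the authoritative list. The wrapper `fit_with_stats` is a hypothetical helper, not a banditpam function.

```r
library(banditpam)

# Hypothetical helper: fit a fresh object, then collect the medoids together
# with a few counters reported by the fitted object.
fit_with_stats <- function(X, k = 3, loss = "l2",
                           what = c("dist_computations", "cache_misses")) {
  obj <- KMedoids$new(k = k)
  obj$fit(data = X, loss = loss)
  list(
    medoids = obj$get_medoids_final(),
    # One entry per requested statistic name (names assumed, see lead-in)
    stats = lapply(setNames(what, what), function(w) obj$get_statistic(w))
  )
}
```

For the simulated matrix, `fit_with_stats(X)$stats` would return a named list with one value per requested statistic.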