
sparsediscrim

The R package sparsediscrim provides a collection of sparse and regularized discriminant analysis classifiers that are especially useful when applied to small-sample, high-dimensional data sets.

The package was archived in 2018 and was re-released in 2021. The package code was forked from John Ramey’s repo and subsequently modified.

Installation

You can install the stable version from CRAN:

install.packages('sparsediscrim', dependencies = TRUE)

If you prefer the latest development version, install it from GitHub instead:

library(devtools)
install_github('topepo/sparsediscrim')

Usage

Both the formula interface and the non-formula (x/y) interface can be used:

library(sparsediscrim)

data(parabolic, package = "modeldata")

qda_mod <- qda_shrink_mean(class ~ ., data = parabolic)
# or
qda_mod <- qda_shrink_mean(x = parabolic[, 1:2], y = parabolic$class)

qda_mod
#> Shrinkage-Mean-Based Diagonal QDA
#> 
#> Sample Size: 500 
#> Number of Features: 2 
#> 
#> Classes and Prior Probabilities:
#>   Class1 (48.8%), Class2 (51.2%)

# Prediction uses the `type` argument: 

parabolic_grid <-
   expand.grid(X1 = seq(-5, 5, length.out = 100),
               X2 = seq(-5, 5, length.out = 100))


parabolic_grid$qda <- predict(qda_mod, parabolic_grid, type = "prob")$Class1

library(ggplot2)
ggplot(parabolic, aes(x = X1, y = X2)) +
   geom_point(aes(col = class), alpha = .5) +
   geom_contour(data = parabolic_grid, aes(z = qda), col = "black", breaks = .5) +
   theme_bw() +
   theme(legend.position = "top") +
   coord_equal()
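
The example above requests class probabilities. As a minimal sketch, hard class labels can be requested in the same way, assuming that `type = "class"` is accepted by `predict()` and returns one label per row (this is how the package documentation describes it, but treat it as an assumption here). The hold-out split, the seed, and the object names `train`, `test`, and `fit` are illustrative choices, not part of the original example.

```r
library(sparsediscrim)

data(parabolic, package = "modeldata")

# Hold out 100 randomly chosen rows (the split size and seed are arbitrary)
set.seed(1)
test_rows <- sample(nrow(parabolic), size = 100)
train <- parabolic[-test_rows, ]
test  <- parabolic[test_rows, ]

fit <- qda_shrink_mean(class ~ ., data = train)

# Assumption: type = "class" yields one predicted label per row of newdata
pred_class <- predict(fit, test[, c("X1", "X2")], type = "class")

# Confusion table and hold-out accuracy
conf_tab <- table(predicted = as.character(pred_class),
                  observed  = as.character(test$class))
accuracy <- mean(as.character(pred_class) == as.character(test$class))

conf_tab
accuracy
```

Comparing the labels as character vectors avoids an error if the predicted factor and the observed factor happen to carry different level sets.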

Classifiers

The sparsediscrim package features the following classifier (the R function is given in parentheses):

High-Dimensional Regularized Discriminant Analysis (rda_high_dim()) from Ramey et al. (2015)

The sparsediscrim package also includes a variety of additional classifiers intended for small-sample, high-dimensional data sets. These include:

Classifier	Author	R Function
Diagonal Linear Discriminant Analysis	Dudoit et al. (2002)	`lda_diag()`
Diagonal Quadratic Discriminant Analysis	Dudoit et al. (2002)	`qda_diag()`
Shrinkage-based Diagonal Linear Discriminant Analysis	Pang et al. (2009)	`lda_shrink_cov()`
Shrinkage-based Diagonal Quadratic Discriminant Analysis	Pang et al. (2009)	`qda_shrink_cov()`
Shrinkage-mean-based Diagonal Linear Discriminant Analysis	Tong et al. (2012)	`lda_shrink_mean()`
Shrinkage-mean-based Diagonal Quadratic Discriminant Analysis	Tong et al. (2012)	`qda_shrink_mean()`
Minimum Distance Empirical Bayesian Estimator (MDEB)	Srivastava and Kubokawa (2007)	`lda_emp_bayes()`
Minimum Distance Rule using Modified Empirical Bayes (MDMEB)	Srivastava and Kubokawa (2007)	`lda_emp_bayes_eigen()`
Minimum Distance Rule using Moore-Penrose Inverse (MDMP)	Srivastava and Kubokawa (2007)	`lda_eigen()`
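
Because the functions in the table share the interface shown in the Usage section, several of them can be fit in a loop and compared. The following is a rough sketch that assumes each listed function accepts `x` and `y` arguments like `qda_shrink_mean()` does and that `predict()` accepts `type = "class"`. The list name `fitters` and the choice of four classifiers are illustrative. The accuracy is computed on the training data itself, so it is optimistic and only meant to show the mechanics.

```r
library(sparsediscrim)

data(parabolic, package = "modeldata")

x <- parabolic[, 1:2]
y <- parabolic$class

# A few of the diagonal classifiers from the table above
fitters <- list(
  lda_diag        = lda_diag,
  qda_diag        = qda_diag,
  lda_shrink_mean = lda_shrink_mean,
  qda_shrink_mean = qda_shrink_mean
)

# Resubstitution accuracy: fit on all rows, then predict the same rows
acc <- sapply(fitters, function(fit_fun) {
  mod  <- fit_fun(x = x, y = y)
  pred <- predict(mod, x, type = "class")
  mean(as.character(pred) == as.character(y))
})

acc
```

For an honest comparison, replace the resubstitution step with a hold-out split or cross-validation.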

We also include modifications to Linear Discriminant Analysis (LDA) with regularized covariance-matrix estimators:

Moore-Penrose Pseudo-Inverse (lda_pseudo())
Schafer-Strimmer estimator (lda_schafer())
Thomaz-Kitani-Gillies estimator (lda_thomaz())
