
sparsediscrim

Lifecycle: experimental

The R package sparsediscrim provides a collection of sparse and regularized discriminant analysis classifiers that are especially useful when applied to small-sample, high-dimensional data sets.

The package was archived in 2018 and was re-released in 2021. The package code was forked from John Ramey’s repo and subsequently modified.

Installation

You can install the stable version from CRAN:

install.packages('sparsediscrim', dependencies = TRUE)

If you prefer the latest development version from GitHub, type instead:

library(devtools)
install_github('topepo/sparsediscrim')

Usage

Both the formula interface and the non-formula (x/y) interface can be used:

library(sparsediscrim)

data(parabolic, package = "modeldata")

qda_mod <- qda_shrink_mean(class ~ ., data = parabolic)
# or
qda_mod <- qda_shrink_mean(x = parabolic[, 1:2], y = parabolic$class)

qda_mod
#> Shrinkage-Mean-Based Diagonal QDA
#> 
#> Sample Size: 500 
#> Number of Features: 2 
#> 
#> Classes and Prior Probabilities:
#>   Class1 (48.8%), Class2 (51.2%)

# Prediction uses the `type` argument: 

parabolic_grid <-
   expand.grid(X1 = seq(-5, 5, length.out = 100),
               X2 = seq(-5, 5, length.out = 100))


parabolic_grid$qda <- predict(qda_mod, parabolic_grid, type = "prob")$Class1

library(ggplot2)
ggplot(parabolic, aes(x = X1, y = X2)) +
   geom_point(aes(col = class), alpha = .5) +
   geom_contour(data = parabolic_grid, aes(z = qda), col = "black", breaks = .5) +
   theme_bw() +
   theme(legend.position = "top") +
   coord_equal()
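
Hard class labels can be requested in the same way as probabilities. The following is a minimal sketch, assuming that predict() also accepts type = "class" and returns a factor of predicted labels. The hold-out split and the object names (test_rows, qda_fit, pred_class) are illustrative and not part of the package.

```r
library(sparsediscrim)

data(parabolic, package = "modeldata")

# Hold out every fifth row for evaluation (an arbitrary, illustrative split)
test_rows <- seq(1, nrow(parabolic), by = 5)
train <- parabolic[-test_rows, ]
test  <- parabolic[test_rows, ]

qda_fit <- qda_shrink_mean(class ~ ., data = train)

# Assumption: type = "class" returns a factor of predicted class labels
pred_class <- predict(qda_fit, test[, c("X1", "X2")], type = "class")

# Confusion matrix and held-out accuracy
table(truth = test$class, predicted = pred_class)
mean(as.character(pred_class) == as.character(test$class))
```

Comparing the labels as character vectors avoids an error if the predicted factor and the true factor happen to carry different level sets.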

Classifiers

The sparsediscrim package features the following classifier (the R function is included within parentheses):

The sparsediscrim package also includes a variety of additional classifiers intended for small-sample, high-dimensional data sets. These include:

Classifier | Author | R Function
--- | --- | ---
Diagonal Linear Discriminant Analysis | Dudoit et al. (2002) | lda_diag()
Diagonal Quadratic Discriminant Analysis | Dudoit et al. (2002) | qda_diag()
Shrinkage-based Diagonal Linear Discriminant Analysis | Pang et al. (2009) | lda_shrink_cov()
Shrinkage-based Diagonal Quadratic Discriminant Analysis | Pang et al. (2009) | qda_shrink_cov()
Shrinkage-mean-based Diagonal Linear Discriminant Analysis | Tong et al. (2012) | lda_shrink_mean()
Shrinkage-mean-based Diagonal Quadratic Discriminant Analysis | Tong et al. (2012) | qda_shrink_mean()
Minimum Distance Empirical Bayesian Estimator (MDEB) | Srivastava and Kubokawa (2007) | lda_emp_bayes()
Minimum Distance Rule using Modified Empirical Bayes (MDMEB) | Srivastava and Kubokawa (2007) | lda_emp_bayes_eigen()
Minimum Distance Rule using Moore-Penrose Inverse (MDMP) | Srivastava and Kubokawa (2007) | lda_eigen()
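
The functions listed above are expected to follow the same calling pattern as qda_shrink_mean() in the Usage section. As a rough sketch, assuming lda_diag() supports the formula interface and that its predict() method accepts type = "class", the diagonal LDA classifier could be fit to the built-in iris data. The data set is used only for convenience (it is neither small-sample nor high-dimensional), and the object names are illustrative.

```r
library(sparsediscrim)

# Hold out every tenth row of iris for prediction
test_rows <- seq(1, nrow(iris), by = 10)

# Fit diagonal LDA on the remaining rows via the formula interface
dlda_fit <- lda_diag(Species ~ ., data = iris[-test_rows, ])

# Predict on the held-out rows; column 5 (Species) is the outcome, so drop it
dlda_pred <- predict(dlda_fit, iris[test_rows, -5], type = "class")

# Cross-tabulate true and predicted species
table(truth = iris$Species[test_rows], predicted = dlda_pred)
```

Swapping lda_diag() for another function in the table should require no other changes if the interfaces match as assumed.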

We also include modifications to Linear Discriminant Analysis (LDA) with regularized covariance-matrix estimators:
