
LearnNonparam

Overview

This R package implements several non-parametric tests from chapters 1-5 of Higgins (2004), including tests for one sample, two samples, k samples, paired comparisons, blocked designs, trends and association. Built with Rcpp for efficiency and R6 for flexible, object-oriented design, it provides a unified framework for running the built-in permutation tests and for defining custom ones.

Installation

Install the stable version from CRAN:

install.packages("LearnNonparam")

Install the development version from GitHub:

# install.packages("remotes")
remotes::install_github("qddyy/LearnNonparam")

Usage

library(LearnNonparam)

Construct a test object

from some R6 class directly

t <- Wilcoxon$new(n_permu = 1e6)

using the pmt (permutation test) wrapper

# recommended for a unified API
t <- pmt("twosample.wilcoxon", n_permu = 1e6)

Provide it with samples

set.seed(-1)

t$test(rnorm(10, 1), rnorm(10, 0))

Check the results

t$statistic

t$p_value

options(digits = 3)

t$print()

ggplot2::theme_set(ggplot2::theme_minimal())

t$plot(style = "ggplot2", binwidth = 1)

Modify some settings and observe the change
```
t$type <- "asymp"
t$p_value
```
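
To give a feel for what an asymptotic p-value means for a rank-sum statistic, here is a minimal base R sketch of the textbook normal approximation. It does not call LearnNonparam, and the package's own "asymp" computation may differ in details such as tie handling or a continuity correction. The names x_demo, y_demo, w and p_asymp are illustrative only.

```r
set.seed(1)
x_demo <- rnorm(10, mean = 1)
y_demo <- rnorm(10, mean = 0)

m <- length(x_demo)
n <- length(y_demo)

# rank-sum statistic: sum of the ranks of the first sample within the pooled data
w <- sum(rank(c(x_demo, y_demo))[seq_len(m)])

# mean and variance of the rank sum under the null hypothesis (assuming no ties)
w_mean <- m * (m + n + 1) / 2
w_var <- m * n * (m + n + 1) / 12

# two-sided p-value from the normal approximation, without continuity correction
z <- (w - w_mean) / sqrt(w_var)
p_asymp <- 2 * pnorm(-abs(z))
p_asymp
```

Base R's wilcox.test(x_demo, y_demo, exact = FALSE, correct = FALSE) uses the same approximation, which makes it a convenient cross-check.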

See pmts() for tests implemented in this package.

key	class	test
onesample.quantile	Quantile	Quantile Test
onesample.cdf	CDF	Inference on Cumulative Distribution Function
twosample.difference	Difference	Two-Sample Test Based on Mean or Median
twosample.wilcoxon	Wilcoxon	Two-Sample Wilcoxon Test
twosample.scoresum	ScoreSum	Two-Sample Test Based on Sum of Scores
twosample.ansari	AnsariBradley	Ansari-Bradley Test
twosample.siegel	SiegelTukey	Siegel-Tukey Test
twosample.rmd	RatioMeanDeviance	Ratio Mean Deviance Test
twosample.ks	KolmogorovSmirnov	Two-Sample Kolmogorov-Smirnov Test
ksample.oneway	OneWay	One-Way Test for Equal Means
ksample.kw	KruskalWallis	Kruskal-Wallis Test
ksample.jt	JonckheereTerpstra	Jonckheere-Terpstra Test
multcomp.studentized	Studentized	Multiple Comparison Based on Studentized Statistic
paired.sign	Sign	Two-Sample Sign Test
paired.difference	PairedDifference	Paired Comparison Based on Differences
rcbd.oneway	RCBDOneWay	One-Way Test for Equal Means in RCBD
rcbd.friedman	Friedman	Friedman Test
rcbd.page	Page	Page Test
association.corr	Correlation	Test for Association Between Paired Samples
table.chisq	ChiSquare	Chi-Square Test on Contingency Table
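
Any key in the table can be passed to pmt in the same way as "twosample.wilcoxon" above. The sketch below uses "ksample.kw" and assumes that a k-sample test accepts one numeric vector per group, mirroring the two-sample call shown earlier; the group names g1, g2 and g3 are made up for illustration. The requireNamespace guard only keeps the snippet from failing where the package is not installed.

```r
if (requireNamespace("LearnNonparam", quietly = TRUE)) {
    library(LearnNonparam)

    # silence the progress bar (option taken from the benchmark section of this README)
    options(LearnNonparam.pmt_progress = FALSE)

    set.seed(1)
    g1 <- rnorm(10, mean = 0)
    g2 <- rnorm(10, mean = 1)
    g3 <- rnorm(10, mean = 2)

    # look up the Kruskal-Wallis test by its key and run it on three groups
    # (assumption: one vector per group, as in the two-sample example)
    t_kw <- pmt("ksample.kw", n_permu = 1e4)
    t_kw$test(g1, g2, g3)

    t_kw$statistic
    t_kw$p_value
}
```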

Extending

define_pmt allows users to define new permutation tests. Take a two-sample test as an example; the statistic below is the difference between the two sample means:

t_custom <- define_pmt(
    # this is a two-sample permutation test
    inherit = "twosample",
    statistic = function(x, y) {
        # (optional) pre-calculate certain constants that remain invariant during permutation
        m <- length(x)
        n <- length(y)
        # return a closure to calculate the test statistic
        function(x, y) sum(x) / m - sum(y) / n
    },
    # reject the null hypothesis when the test statistic is too large or too small
    rejection = "lr", n_permu = 1e5
)

The statistic can also be written in C++. Thanks to Rcpp sugar and C++14 features, only minor modifications are needed to turn the R version into valid C++.

t_cpp <- define_pmt(
    inherit = "twosample", rejection = "lr", n_permu = 1e5,
    statistic = "[](const auto& x, const auto& y) {
        auto m = x.length();
        auto n = y.length();
        return [=](const auto& x, const auto& y) {
            return sum(x) / m - sum(y) / n;
        };
    }"
)

It’s easy to check that t_custom and t_cpp are equivalent:

x <- rnorm(10, mean = 0)
y <- rnorm(10, mean = 5)

set.seed(0)
t_custom$test(x, y)$print()

set.seed(0)
t_cpp$test(x, y)$print()
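
For readers curious about what such a test does conceptually, the following base R sketch runs a Monte Carlo permutation test with a statistic written in the same two-level closure shape. It is not the package's implementation: permute_twosample and mean_diff_factory are hypothetical helpers, and the two-sided p-value convention used here (share of permuted statistics at least as extreme in absolute value) is an assumption that may differ from what rejection = "lr" computes.

```r
# statistic factory: the outer call fixes the sample sizes once,
# the returned closure is evaluated for every permutation
mean_diff_factory <- function(x, y) {
    m <- length(x)
    n <- length(y)
    function(x, y) sum(x) / m - sum(y) / n
}

# hypothetical helper, not part of LearnNonparam
permute_twosample <- function(x, y, factory, n_permu = 1e4) {
    stat <- factory(x, y)
    observed <- stat(x, y)
    pooled <- c(x, y)
    m <- length(x)
    permuted <- replicate(n_permu, {
        # randomly reassign m of the pooled observations to the first group
        idx <- sample.int(length(pooled), m)
        stat(pooled[idx], pooled[-idx])
    })
    # assumed two-sided convention; the tolerance guards against floating-point ties
    tol <- sqrt(.Machine$double.eps)
    p_value <- mean(abs(permuted) >= abs(observed) - tol)
    list(statistic = observed, p_value = p_value)
}

set.seed(0)
x_demo <- rnorm(10, mean = 0)
y_demo <- rnorm(10, mean = 5)

result <- permute_twosample(x_demo, y_demo, mean_diff_factory)
result$statistic
result$p_value
```

Pre-computing m and n in the outer function is what makes the inner closure cheap, since it is the part that runs once per permutation.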

Performance

coin is a commonly used R package for performing permutation tests. Below is a benchmark:

library(coin)

data <- c(x, y)
group <- factor(c(rep("x", length(x)), rep("y", length(y))))

options(LearnNonparam.pmt_progress = FALSE)
benchmark <- microbenchmark::microbenchmark(
    R = t_custom$test(x, y),
    Rcpp = t_cpp$test(x, y),
    coin = wilcox_test(data ~ group, distribution = approximate(nresample = 1e5, parallel = "no"))
)

benchmark

The benchmark shows that a statistic written in C++ performs significantly better than one written in pure R, and even surpasses the coin package (under sequential execution). However, all tests in this package are currently written in R, and there are no plans to migrate them to C++. The primary goal of this package is not to maximize performance but to offer a flexible framework for permutation tests.

References

Higgins, J. J. 2004. An Introduction to Modern Nonparametric Statistics. Duxbury Advanced Series. Brooks/Cole.
