This package includes one function, wmwm.test(), which performs the two-sample hypothesis test proposed in (Zeng et al., 2024) for univariate data when the data are not fully observed. This method is a theoretical extension of the Wilcoxon-Mann-Whitney test to the presence of missing data, and it controls the Type I error regardless of the values of the missing data.
When data are missing, the function computes bounds on the Wilcoxon-Mann-Whitney test statistic and on its p-value. The p-value of the test proposed in (Zeng et al., 2024) is then returned as the maximum possible p-value of the Wilcoxon-Mann-Whitney test.
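To make the idea of bounding the statistic concrete, here is a minimal base R sketch for the case without ties. It is not the package's implementation: `wmw_statistic_bounds()` is a hypothetical helper written for illustration, and it assumes the statistic is the number of pairs with X_i > Y_j and that each missing value may take any real value.

```r
# Hypothetical helper, not part of wmwm: bounds of the Mann-Whitney statistic
# (number of pairs with X_i > Y_j) when NA entries may take any value.
# Assumes no ties among the samples.
wmw_statistic_bounds <- function(x, y) {
  x_obs <- x[!is.na(x)]
  y_obs <- y[!is.na(y)]
  # Pairs whose ordering is already fixed by the observed values.
  u_observed <- sum(outer(x_obs, y_obs, ">"))
  # Every pair involving at least one missing value can go either way.
  n_undetermined <- length(x) * length(y) - length(x_obs) * length(y_obs)
  c(lower = u_observed, upper = u_observed + n_undetermined)
}

X <- c(6.2, 3.5, NA, 7.6, 9.2)
Y <- c(0.2, 1.3, -0.5, -1.7)
wmw_statistic_bounds(X, Y)
# lower = 16, upper = 20
```

The lower bound corresponds to every missing X value lying below all Y values and every missing Y value lying above all X values. The upper bound corresponds to the reverse arrangement.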
You can install the development version of wmwm from GitHub with:
# install.packages("devtools")
devtools::install_github("Yijin-Zeng/Wilcoxon-Mann-Whitney-Test-with-Missing-data")
This is a basic example which shows you how to perform the test with missing data:
library(wmwm)
#### Assume all samples are distinct.
X <- c(6.2, 3.5, NA, 7.6, 9.2)
Y <- c(0.2, 1.3, -0.5, -1.7)
## By default, when the sample sizes of both X and Y are smaller than 50,
## the exact distribution will be used.
wmwm.test(X, Y, ties = FALSE, alternative = 'two.sided')
#> $p.value
#> [1] 0.1904762
#>
#> $bounds.statistic
#> [1] 16 20
#>
#> $bounds.pvalue
#> [1] 0.01587302 0.19047619
#>
#> $alternative
#> [1] "two.sided"
#>
#> $ties.method
#> [1] FALSE
#>
#> $description.bounds
#> [1] "bounds.pvalue is the bounds of the exact p-value"
#>
#> $data.name
#> [1] "X and Y"
## using normality approximation with continuity correction:
wmwm.test(X, Y, ties = FALSE, alternative = 'two.sided', exact = FALSE, correct = TRUE)
#> $p.value
#> [1] 0.1779096
#>
#> $bounds.statistic
#> [1] 16 20
#>
#> $bounds.pvalue
#> [1] 0.01996445 0.17790959
#>
#> $alternative
#> [1] "two.sided"
#>
#> $ties.method
#> [1] FALSE
#>
#> $description.bounds
#> [1] "bounds.pvalue is the bounds of the p-value obtained using normal approximation with continuity correction"
#>
#> $data.name
#> [1] "X and Y"
#### Assume samples can be tied.
X <- c(6, 9, NA, 7, 9)
Y <- c(0, 1, 0, -1)
## When the samples can be tied, the normal approximation will be used.
## By default, lower.boundary = -Inf, upper.boundary = Inf.
wmwm.test(X, Y, ties = TRUE, alternative = 'two.sided')
#> Warning in boundsPValueWithTies(X, Y, alternative = alternative, lower.boundary
#> = lower.boundary, : cannot bound exact p-value with ties
#> $p.value
#> [1] 0.174277
#>
#> $bounds.statistic
#> [1] 16 20
#>
#> $bounds.pvalue
#> [1] 0.01745104 0.17427702
#>
#> $alternative
#> [1] "two.sided"
#>
#> $ties.method
#> [1] TRUE
#>
#> $description.bounds
#> [1] "bounds.pvalue is the bounds of the p-value obtained using normal approximation with continuity correction"
#>
#> $data.name
#> [1] "X and Y"
## specifying lower.boundary and upper.boundary:
wmwm.test(X, Y, ties = TRUE, alternative = 'two.sided', lower.boundary = -1, upper.boundary = 9)
#> Warning in boundsPValueWithTies(X, Y, alternative = alternative, lower.boundary
#> = lower.boundary, : cannot bound exact p-value with ties
#> $p.value
#> [1] 0.1383146
#>
#> $bounds.statistic
#> [1] 16.5 20.0
#>
#> $bounds.pvalue
#> [1] 0.01745104 0.13831461
#>
#> $alternative
#> [1] "two.sided"
#>
#> $ties.method
#> [1] TRUE
#>
#> $description.bounds
#> [1] "bounds.pvalue is the bounds of the p-value obtained using normal approximation with continuity correction"
#>
#> $data.name
#> [1] "X and Y"
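The effect of lower.boundary and upper.boundary on bounds.statistic in the tied example can be reproduced by hand. The sketch below is an illustration in base R, not the package's code: `u_stat()` and `bounds_with_boundaries()` are hypothetical helpers, and the sketch assumes a tied pair contributes 1/2 to the statistic and that the extremes are reached by placing missing values at the boundaries.

```r
# Hypothetical helpers, not part of wmwm.
# Mann-Whitney statistic where a tied pair (X_i == Y_j) counts as 1/2.
u_stat <- function(x, y) {
  sum(outer(x, y, ">")) + 0.5 * sum(outer(x, y, "=="))
}

bounds_with_boundaries <- function(x, y,
                                   lower.boundary = -Inf,
                                   upper.boundary = Inf) {
  # Smallest statistic: missing X as low as allowed, missing Y as high as allowed.
  u_min <- u_stat(replace(x, is.na(x), lower.boundary),
                  replace(y, is.na(y), upper.boundary))
  # Largest statistic: the reverse placement.
  u_max <- u_stat(replace(x, is.na(x), upper.boundary),
                  replace(y, is.na(y), lower.boundary))
  c(u_min, u_max)
}

X <- c(6, 9, NA, 7, 9)
Y <- c(0, 1, 0, -1)
bounds_with_boundaries(X, Y)                       # 16 20
bounds_with_boundaries(X, Y, lower.boundary = -1,
                       upper.boundary = 9)         # 16.5 20.0
```

With lower.boundary = -1, the missing X value can at worst tie with the smallest Y value, which raises the lower bound from 16 to 16.5.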
The R function stats::wilcox.test() performs the Wilcoxon-Mann-Whitney two-sample test when all samples are observed.
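As a rough cross-check, the p-value bounds from the first example can be recovered with stats::wilcox.test() alone by filling in the missing value at two extreme positions. This is a sketch under stated assumptions, not a general algorithm. The imputed values are arbitrary choices that avoid ties. Checking only the two extremes is enough here because both extreme statistics (16 and 20) lie on the same side of the null mean of 10. In general, the largest two-sided p-value may occur at an intermediate value of the statistic.

```r
X <- c(6.2, 3.5, NA, 7.6, 9.2)
Y <- c(0.2, 1.3, -0.5, -1.7)

# Illustrative imputations (an assumption of this sketch): put the missing
# X value below, then above, every observed value, without creating ties.
observed <- c(X, Y)
X_low  <- replace(X, is.na(X), min(observed, na.rm = TRUE) - 1)
X_high <- replace(X, is.na(X), max(observed, na.rm = TRUE) + 1)

# With fully observed, untied samples of this size, wilcox.test() uses the
# exact null distribution by default.
p_at_low  <- stats::wilcox.test(X_low,  Y, alternative = "two.sided")$p.value
p_at_high <- stats::wilcox.test(X_high, Y, alternative = "two.sided")$p.value

c(p_at_high, p_at_low)
# 0.01587302 0.19047619
```

The larger of the two values, 0.1904762, is the p-value that wmwm.test() returns in the first example.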
Zeng Y, Adams NM, Bodenham DA. On two-sample testing for data with arbitrarily missing values. arXiv preprint arXiv:2403.15327. 2024 Mar 22.
Mann HB, Whitney DR. On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics. 1947:50-60.