cdf.test {spsurvey}    R Documentation

Cumulative Distribution Function - Inference

Description

This function calculates the Wald, Rao-Scott first-order corrected (mean eigenvalue corrected), and Rao-Scott second-order corrected (Satterthwaite corrected) statistics for categorical data to test for differences between two cumulative distribution functions (CDFs). The function calculates both the standard versions of these three statistics, which are distributed as Chi-squared random variables, and modified versions of the statistics, which are distributed as F random variables.

Usage

cdf.test(sample1, sample2, bounds, vartype="Local")

Arguments

sample1 the sample from the first population in the form of a list containing the following components:
z = the response value for each site
wgt = the final adjusted weight (inverse of the sample inclusion probability) for each site
x = x-coordinate for location for each site, which may be NULL
y = y-coordinate for location for each site, which may be NULL
sample2 the sample from the second population in the form of a list containing the following components:
z = the response value for each site
wgt = the final adjusted weight (inverse of the sample inclusion probability) for each site
x = x-coordinate for location for each site, which may be NULL
y = y-coordinate for location for each site, which may be NULL
bounds upper bounds for calculating the classes for the CDF.
vartype the choice of variance estimator, where "Local" = local mean estimator and "SRS" = SRS estimator. The default is "Local".

Details

The user supplies the set of upper bounds that define the classes for the CDFs. The Horvitz-Thompson ratio estimator, i.e., the ratio of two Horvitz-Thompson estimators, is used to estimate the class proportions for the CDFs. Variance estimates for the test statistics are calculated using either the local mean variance estimator or the simple random sampling (SRS) variance estimator, as selected by the vartype argument. The SRS variance estimator uses the independent random sample approximation to calculate joint inclusion probabilities. The function checks the input values for compatibility and removes missing values.
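
As an illustration of the class proportion estimates described above, the following base R sketch computes a Horvitz-Thompson ratio estimate for each class: the sum of the weights of the sites whose response falls in the class, divided by the sum of all weights. The helper name class_props, the use of right-closed classes, and the requirement that every response lie at or below the largest bound are assumptions made for this illustration; it is not the code used internally by cdf.test.

```r
# Hypothetical helper, not part of spsurvey: weighted class proportions.
# Assumes classes are (-Inf, b1], (b1, b2], ..., closed on the right.
class_props <- function(z, wgt, bounds) {
  keep <- !is.na(z) & !is.na(wgt)   # drop sites with missing values
  z <- z[keep]
  wgt <- wgt[keep]
  stopifnot(all(z <= max(bounds)))  # assumption: largest bound covers all data
  # Index of the class into which each response falls
  cls <- cut(z, breaks = c(-Inf, sort(bounds)), right = TRUE, labels = FALSE)
  # Numerator: estimated total weight in each class
  num <- tapply(wgt, factor(cls, levels = seq_along(bounds)), sum)
  num[is.na(num)] <- 0              # empty classes contribute zero weight
  # Denominator: estimated population size (sum of all weights)
  as.numeric(num) / sum(wgt)
}

z <- c(1, 2, 3, 4, NA)
wgt <- c(10, 10, 20, 20, 10)
p <- class_props(z, wgt, bounds = c(2, 4))
p          # estimated class proportions
cumsum(p)  # estimated CDF evaluated at each upper bound
```

The cumulative sum of the class proportions gives the estimated CDF at each upper bound, which is why a test on the class proportions can serve as a test for a difference between two CDFs.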

Value

Value is a data frame containing the test statistic, degrees of freedom (two values labeled Degrees of Freedom_1 and Degrees of Freedom_2), and p value for the Wald, mean eigenvalue, and Satterthwaite test procedures, including both the Chi-squared distribution and F distribution versions of each procedure. For the Chi-squared versions of the test procedures, Degrees of Freedom_1 contains the relevant value and Degrees of Freedom_2 is set to missing (NA). For the F-based versions of the test procedures, Degrees of Freedom_1 contains the numerator degrees of freedom and Degrees of Freedom_2 contains the denominator degrees of freedom.
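
To make the relationship between these columns concrete, the sketch below shows how an upper-tail p value follows from a test statistic and its degrees of freedom under the two reference distributions named above. The numeric values are made up for illustration and are not output from cdf.test, and the assumption that the reported p values are upper-tail probabilities is ours; pchisq and pf are base R functions.

```r
# Made-up values for illustration only (not cdf.test output)
chisq_stat <- 7.2   # a Chi-squared version of a statistic
df1 <- 2            # plays the role of Degrees of Freedom_1
p_chisq <- pchisq(chisq_stat, df = df1, lower.tail = FALSE)

f_stat <- 3.5       # an F version of a statistic
df2 <- 98           # plays the role of Degrees of Freedom_2 (denominator)
p_f <- pf(f_stat, df1 = df1, df2 = df2, lower.tail = FALSE)

c(chisq = p_chisq, F = p_f)
```

In the Chi-squared case only one degrees of freedom value is needed, which matches Degrees of Freedom_2 being reported as NA for those rows.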

Author(s)

Tom Kincaid Kincaid.Tom@epa.gov

References

Kincaid, T.M. (2000). Testing for differences between cumulative distribution functions from complex environmental sampling surveys. In 2000 Proceedings of the Section on Statistics and the Environment, American Statistical Association, Alexandria, VA.

Examples

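library(spsurvey)
# Simulated responses and survey weights for 100 sites
resp <- rnorm(100, 10, 1)
wgt <- runif(100, 40, 60)
sample1 <- list(z=resp, wgt=wgt)
sample2 <- list(z=resp+0.5, wgt=wgt)
# Upper bounds at the 66th, 133rd, and 200th (largest) pooled order statistics
bounds <- sort(c(sample1$z, sample2$z))[floor(seq(200/3, 200, length.out=3))]
# Test using the SRS variance estimator (no coordinates supplied)
cdf.test(sample1, sample2, bounds, vartype="SRS")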

# Supply site coordinates and use the default (local mean) variance estimator
xcoord <- runif(100)
ycoord <- runif(100)
sample1 <- list(z=resp, wgt=wgt, x=xcoord, y=ycoord)
sample2 <- list(z=1.05*resp, wgt=wgt, x=xcoord, y=ycoord)
cdf.test(sample1, sample2, bounds)

[Package spsurvey version 1.6.2 Index]