Comparison of Simulated Distribution to Theoretical Distribution or Empirical Data

The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Allison C Fialkowski

2018-06-28

Headrick and Kowalchuk (2007) outlined a general method for comparing a simulated distribution \(\Large Y\) to a given theoretical distribution \(\Large Y^*\). Note that these could easily be modified for comparison to an empirical vector of data:

Example

Use these steps to compare a simulated exponential(mean = 2) variable to the theoretical exponential(mean = 2) density. (Note that the printr package is invoked to display the tables.)

Step 1: Obtain the standardized cumulants

In R, the exponential parameter is rate <- 1/mean.

library("SimMultiCorrData")
library("printr")
stcums <- calc_theory(Dist = "Exponential", params = 0.5)

Step 2: Simulate the variable

Note that calc_theory returns the standard deviation, not the variance. The simulation functions require variance as the input.

H_exp <- nonnormvar1("Polynomial", means = stcums[1], vars = stcums[2]^2, 
                     skews = stcums[3], skurts = stcums[4], 
                     fifths = stcums[5], sixths = stcums[6], n = 10000, 
                     seed = 1234)

## Constants: Distribution  1  
## 
## Constants calculation time: 0 minutes 
## Total Simulation time: 0 minutes

Look at the power method constants.

as.matrix(H_exp$constants, nrow = 1, ncol = 6, byrow = TRUE)

c0	c1	c2	c3	c4	c5
-0.3077396	0.8005605	0.318764	0.0335001	-0.0036748	0.0001587

Look at a summary of the target distribution.

as.matrix(round(H_exp$summary_targetcont[, c("Distribution", "mean", "sd", 
                                             "skew", "skurtosis", "fifth", 
                                             "sixth")], 5), nrow = 1, ncol = 7,
          byrow = TRUE)

	Distribution	mean	sd	skew	skurtosis	fifth	sixth
mean	1	2	2	2	6	24	120

Compare to a summary of the simulated distribution.

as.matrix(round(H_exp$summary_continuous[, c("Distribution", "mean", "sd", 
                                             "skew", "skurtosis", "fifth", 
                                             "sixth")], 5), nrow = 1, ncol = 7,
          byrow = TRUE)

	Distribution	mean	sd	skew	skurtosis	fifth	sixth
X1	1	1.99987	2.0024	2.03382	6.18067	23.74145	100.3358

Step 3: Determine if the constants generate a valid power method pdf

H_exp$valid.pdf

## [1] "TRUE"

Step 4: Select a critical value

Let \(\Large \alpha = 0.05\).

y_star <- qexp(1 - 0.05, rate = 0.5) # note that rate = 1/mean
y_star

## [1] 5.991465

Step 5: Solve for \(\Large z'\)

Since the exponential(2) distribution has a mean and standard deviation equal to 2, solve \(\Large 2 * p(z') + 2 - y_star = 0\) for \(\Large z'\). Here, \(\Large p(z') = c0 + c1 * z' + c2 * z'^2 + c3 * z'^3 + c4 * z'^4 + c5 * z'^5\).

f_exp <- function(z, c, y) {
  return(2 * (c[1] + c[2] * z + c[3] * z^2 + c[4] * z^3 + c[5] * z^4 + 
                c[6] * z^5) + 2 - y)
}

z_prime <- uniroot(f_exp, interval = c(-1e06, 1e06), 
                   c = as.numeric(H_exp$constants), y = y_star)$root
z_prime

## [1] 1.644926

Step 6: Calculate \(\Large \Phi(z')\)

1 - pnorm(z_prime)

## [1] 0.04999249

This is approximately equal to the \(\Large \alpha\) value of 0.05, indicating the method provides a good approximation to the actual distribution.

Step 7: Plot graphs

plot_sim_pdf_theory(sim_y = H_exp$continuous_variable[, 1], 
                    Dist = "Exponential", params = 0.5)

We can also plot the empirical cdf and show the cumulative probability up to y_star.

plot_sim_cdf(sim_y = H_exp$continuous_variable[, 1], calc_cprob = TRUE, 
             delta = y_star)

Calculate descriptive statistics.

as.matrix(t(stats_pdf(c = H_exp$constants[1, ], method = "Polynomial", 
                    alpha = 0.025, mu = stcums[1], sigma = stcums[2])))

trimmed_mean	median	mode	max_height
1.858381	1.384521	0.104872	1.094213

References

Headrick, T C, and R K Kowalchuk. 2007. “The Power Method Transformation: Its Probability Density Function, Distribution Function, and Its Further Use for Fitting Data.” Journal of Statistical Computation and Simulation 77: 229–49. doi:10.1080/10629360600605065.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.