This vignette describes the algorithm used in this package and checks its correctness through simulation experiments. The algorithm comes from Moschopoulos (1985) [1]. The vignette also gives some practical notes on using the package.
Assume that we have \(n\) independent random variables \(X_1, ..., X_n\), where \(X_i\) follows a gamma distribution with shape parameter \(\alpha_i\) and scale parameter \(\beta_i\), for \(i = 1, ..., n\). Then the density of \(Y = X_1 + ... + X_n\) can be expressed as:
\[g(y) = C \sum_{k=0}^{\infty} \lambda_k y^{\rho + k - 1} e^{-y/\beta_1} / (\Gamma(\rho + k) \beta_{1}^{\rho + k})\]
The distribution function \(G(w) = \Pr(Y < w)\) is expressed as:
\[G(w) = C \sum_{k=0}^{\infty} \lambda_k \int_{0}^{w} (y^{\rho + k - 1} e^{-y/\beta_1} / (\Gamma(\rho + k) \beta_{1}^{\rho + k})) dy\]
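The integrand in this formula is the density of a gamma distribution with shape \(\rho + k\) and scale \(\beta_1\), so each integral is a regularized incomplete gamma function and can be evaluated with the distribution function of the gamma distribution.
More details about this algorithm can be found in Moschopoulos (1985) [1].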
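The series above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: `coga_series`, `n_terms` and `cdf` are hypothetical names, and the series is simply cut off after a fixed number of terms. The definitions used for the constants are those recalled from Moschopoulos (1985), where the coefficients are written \(\delta_k\) rather than \(\lambda_k\): \(\beta_1 = \min_i \beta_i\), \(\rho = \sum_i \alpha_i\), \(C = \prod_i (\beta_1 / \beta_i)^{\alpha_i}\), \(\gamma_k = \sum_i \alpha_i (1 - \beta_1 / \beta_i)^k / k\), \(\lambda_0 = 1\) and \(\lambda_{k+1} = \frac{1}{k+1} \sum_{i=1}^{k+1} i \gamma_i \lambda_{k+1-i}\).

```r
# Sketch of the truncated Moschopoulos (1985) series; `coga_series` is a
# hypothetical helper, not a function exported by the package.
coga_series <- function(y, shape, scale, n_terms = 200, cdf = FALSE) {
  beta1 <- min(scale)                   # beta_1: the smallest scale
  rho   <- sum(shape)                   # rho: sum of the shapes
  C     <- prod((beta1 / scale)^shape)  # normalising constant C
  k     <- seq_len(n_terms)
  # gamma_k = sum_i shape_i * (1 - beta1 / scale_i)^k / k
  gam <- vapply(k, function(j) sum(shape * (1 - beta1 / scale)^j) / j,
                numeric(1))
  # lambda_0 = 1; lambda_m = (1 / m) * sum_{i=1}^{m} i * gamma_i * lambda_{m-i}
  lambda <- numeric(n_terms + 1)        # lambda[m + 1] stores lambda_m
  lambda[1] <- 1
  for (m in k) {
    i <- seq_len(m)
    lambda[m + 1] <- sum(i * gam[i] * lambda[m + 1 - i]) / m
  }
  # Each series term is a gamma density (or CDF) with shape rho + k, scale beta1
  term <- if (cdf) pgamma else dgamma
  vapply(y, function(v) {
    C * sum(lambda * term(v, shape = rho + 0:n_terms, scale = beta1))
  }, numeric(1))
}

# Example: shapes 3 and 4 with rates 2 and 3, i.e. scales 1/2 and 1/3
dens <- coga_series(2, shape = c(3, 4), scale = 1 / c(2, 3))
prob <- coga_series(2, shape = c(3, 4), scale = 1 / c(2, 3), cdf = TRUE)
```

When all scales are equal, every \(\gamma_k\) is zero and the series collapses to a single gamma term, which is a quick sanity check. A fixed truncation is the simplest choice for a sketch; a production implementation would more likely stop once the remaining terms fall below a tolerance.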
Assume that we have two independent random variables, \(X_1\) and \(X_2\), where \(X_1\) follows a gamma distribution with shape parameter \(3\) and rate parameter \(2\), and \(X_2\) follows a gamma distribution with shape parameter \(4\) and rate parameter \(3\). The rate is the reciprocal of the scale, so the corresponding scale parameters are \(1/2\) and \(1/3\). We compute the density and distribution function of \(Y = X_1 + X_2\) and compare them with simulated samples.
Correctness check for density function:
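library(coga)
set.seed(123)                            # reproducible simulation
y <- rcoga(1000000, c(3,4), c(2,3))      # one million draws of Y (shape, rate)
grid <- seq(0, 8, length.out=1000)
pdf <- dcoga(grid, shape=c(3, 4), rate=c(2, 3))
plot(density(y), col="blue")             # kernel density estimate of the draws
lines(grid, pdf, col="red")              # density computed by dcoga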
Correctness check for distribution function:
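y <- rcoga(1000000, c(3,4), c(2,3))      # one million draws of Y (shape, rate)
grid <- seq(0, 8, length.out=1000)
cdf <- pcoga(grid, shape=c(3, 4), rate=c(2, 3))
plot(ecdf(y), col="blue")                # empirical distribution function of the draws
lines(grid, cdf, col="red")              # distribution function computed by pcoga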
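The visual comparison can be complemented by a numeric one: the largest absolute gap between the empirical distribution function and a candidate distribution function over the grid. The sketch below is self-contained base R, so it uses stand-ins that are assumptions of this example rather than package code: the draws come from summing two `rgamma` samples instead of `rcoga`, the reference distribution function is obtained by numerical convolution with `integrate`, and `max_cdf_gap` and `ref_cdf` are hypothetical helper names.

```r
# Largest absolute gap between the empirical CDF of a sample and a candidate CDF
max_cdf_gap <- function(sample, cdf_fun, grid) {
  max(abs(ecdf(sample)(grid) - cdf_fun(grid)))
}

set.seed(1)
# Base-R stand-in for rcoga(): sum of independent gamma draws
y <- rgamma(1e5, shape = 3, rate = 2) + rgamma(1e5, shape = 4, rate = 3)
grid <- seq(0, 8, length.out = 200)

# Reference CDF by numerical convolution:
# P(Y <= w) = integral over x in [0, w] of f_1(x) * F_2(w - x)
ref_cdf <- function(w) {
  vapply(w, function(v) {
    if (v <= 0) return(0)
    integrate(function(x) dgamma(x, 3, rate = 2) * pgamma(v - x, 4, rate = 3),
              lower = 0, upper = v)$value
  }, numeric(1))
}

gap <- max_cdf_gap(y, ref_cdf, grid)
```

With the package loaded, `function(w) pcoga(w, shape = c(3, 4), rate = c(2, 3))` can be passed in place of `ref_cdf`. For a sample of this size, a gap well below 0.01 is what sampling error alone would produce.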
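The ‘dcoga’ and ‘pcoga’ functions in the ‘coga’ package are implemented in C++. The following benchmark, run on an Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, compares them with the package's internal R versions, ‘dcoga.R’ and ‘pcoga.R’. In this run the C++ versions were roughly 29 times (‘dcoga’) and 11 times (‘pcoga’) faster by median time.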
grid <- seq(0, 15, length.out=10)
microbenchmark::microbenchmark(
  dcoga(grid, shape=c(3,4,5), rate=c(2,3,4)),
  coga:::dcoga.R(grid, shape=c(3,4,5), rate=c(2,3,4)),
  pcoga(grid, shape=c(3,4,5), rate=c(2,3,4)),
  coga:::pcoga.R(grid, shape=c(3,4,5), rate=c(2,3,4))
)
## Unit: milliseconds
##                                                        expr       min        lq      mean    median        uq       max neval
##          dcoga(grid, shape = c(3, 4, 5), rate = c(2, 3, 4))  1.317128  1.420452  1.922557  1.606208  1.866404  22.05880   100
## coga:::dcoga.R(grid, shape = c(3, 4, 5), rate = c(2, 3, 4)) 40.042678 43.311773 52.665573 46.301379 62.560726 113.68096   100
##          pcoga(grid, shape = c(3, 4, 5), rate = c(2, 3, 4))  4.266748  4.955067  5.783562  5.157502  5.749830  27.32323   100
## coga:::pcoga.R(grid, shape = c(3, 4, 5), rate = c(2, 3, 4)) 50.086309 53.192434 65.423030 58.647805 76.481842 118.76037   100
Note that ‘dcoga’ and ‘pcoga’ accept ‘shape’ and ‘rate’ vectors of different lengths and recycle the shorter one to the length of the longer. For example, dcoga(3, c(2,3), c(3,4,5,3,4)) and dcoga(3, c(2,3,2,3,2), c(3,4,5,3,4)) give the same result.
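The recycling rule can be reproduced in base R with `rep_len()`, which shows how the shorter vector in the example is expanded. Whether the package uses `rep_len()` internally is an assumption; the snippet only mirrors the behaviour described above, and `shape_full` and `rate_full` are illustrative names.

```r
# Base-R illustration of recycling the shorter parameter vector
shape <- c(2, 3)
rate  <- c(3, 4, 5, 3, 4)

n <- max(length(shape), length(rate))  # target length: the longer vector
shape_full <- rep_len(shape, n)        # shorter vector repeated up to length n
rate_full  <- rep_len(rate, n)         # already length n, so unchanged
```

Here `shape_full` is `c(2, 3, 2, 3, 2)`, the second shape vector used in the example call.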
[1] Moschopoulos, Peter G. “The distribution of the sum of independent gamma random variables.” Annals of the Institute of Statistical Mathematics 37.1 (1985): 541-544.