The R package coga calculates the density and distribution function of a convolution of gamma distributions, that is, the distribution of the sum of a series of independent gamma random variables. The algorithm used in this package comes from Moschopoulos (1985) [1]. The R code in this vignette can also serve as a set of usage examples.
Assume that we have \(n\) independent random variables \(X_1, ..., X_n\), where \(X_i\) follows a gamma distribution with shape parameter \(\alpha_i\) and scale parameter \(\beta_i\), for \(i = 1, ..., n\). Then the density of \(Y = X_1 + ... + X_n\) can be expressed as:
\[g(y) = C \sum_{k=0}^{\infty} \lambda_k y^{\rho + k - 1} e^{-y/\beta_1} / (\Gamma(\rho + k) \beta_{1}^{\rho + k})\]
The distribution function \(G(w) = \Pr(Y < w)\) can be expressed as:
\[G(w) = C \sum_{k=0}^{\infty} \lambda_k \int_{0}^{w} (y^{\rho + k - 1} e^{-y/\beta_1} / (\Gamma(\rho + k) \beta_{1}^{\rho + k})) dy\]
The integral in this formula is an incomplete gamma function and can be evaluated with the distribution function of a gamma distribution with shape \(\rho + k\) and scale \(\beta_1\).
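As a small illustration of this identity, the sketch below integrates one term of the series numerically and compares it with the corresponding gamma distribution function. It uses base R only; the values of `a` (standing in for \(\rho + k\)), `b` (standing in for \(\beta_1\)) and `w` are arbitrary choices made for the example, not values taken from the package.

```r
# One term of the series: the integral of a gamma density with
# shape a (playing the role of rho + k) and scale b (playing beta_1).
a <- 7.5   # assumed example value for rho + k
b <- 0.4   # assumed example value for beta_1
w <- 3     # upper limit of integration

term <- function(y) y^(a - 1) * exp(-y / b) / (gamma(a) * b^a)

by_quadrature <- integrate(term, lower = 0, upper = w)$value
by_pgamma     <- pgamma(w, shape = a, scale = b)

# The two values should agree up to the quadrature error.
print(c(quadrature = by_quadrature, pgamma = by_pgamma))
```

Using `pgamma` avoids numerical quadrature for each term of the series.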
More details about this algorithm can be found in Moschopoulos (1985) [1].
Assume that we have two random variables, \(X_1\) and \(X_2\), where \(X_1\) follows a gamma distribution with shape parameter \(3\) and rate parameter \(2\), and \(X_2\) follows a gamma distribution with shape parameter \(4\) and rate parameter \(3\). Note that the functions in this package take rate parameters, which are the reciprocals of the scale parameters \(\beta_i\) used above. The density and distribution function of \(Y = X_1 + X_2\) will be calculated.
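To make the series concrete, here is a minimal base R sketch of a truncated version of it. It assumes the constants as we understand them from Moschopoulos (1985): \(\beta_1 = \min_i \beta_i\), \(\rho = \sum_i \alpha_i\), \(C = \prod_i (\beta_1 / \beta_i)^{\alpha_i}\), \(\gamma_k = \sum_i \alpha_i (1 - \beta_1 / \beta_i)^k / k\), and \(\lambda_0 = 1\), \(\lambda_{k+1} = \frac{1}{k+1} \sum_{i=1}^{k+1} i \gamma_i \lambda_{k+1-i}\). The function name `coga_series_sketch`, its arguments, and the fixed truncation point `n_terms` are illustrative assumptions; this is not the package's C++ implementation, which may choose the number of terms differently.

```r
# Truncated series for the density and distribution function of a sum of
# independent gamma variables. Hypothetical helper, not part of coga.
coga_series_sketch <- function(y, shape, scale, n_terms = 200) {
  beta1 <- min(scale)                 # smallest scale parameter
  rho   <- sum(shape)
  C     <- prod((beta1 / scale)^shape)
  k     <- seq_len(n_terms)

  # gamma_k for k = 1, ..., n_terms
  gam <- vapply(k, function(j) sum(shape * (1 - beta1 / scale)^j) / j,
                numeric(1))

  # lam[k + 1] stores lambda_k, built with the recursion stated above
  lam <- numeric(n_terms + 1)
  lam[1] <- 1
  for (j in k) {
    i <- seq_len(j)
    lam[j + 1] <- sum(i * gam[i] * lam[j + 1 - i]) / j
  }

  # Each term is a gamma density (or distribution function) with
  # shape rho + k and scale beta1, weighted by C * lambda_k.
  shapes_k <- rho + 0:n_terms
  list(
    density = vapply(y, function(v)
      C * sum(lam * dgamma(v, shape = shapes_k, scale = beta1)), numeric(1)),
    cdf = vapply(y, function(v)
      C * sum(lam * pgamma(v, shape = shapes_k, scale = beta1)), numeric(1))
  )
}

# Example: shapes 3 and 4 with scales 1/2 and 1/3 (i.e. rates 2 and 3).
res <- coga_series_sketch(c(1, 2.5, 4), shape = c(3, 4), scale = c(1/2, 1/3))

# Independent check of the density at y = 2.5 by direct numerical convolution.
conv <- integrate(function(x) dgamma(x, 3, scale = 1/2) *
                    dgamma(2.5 - x, 4, scale = 1/3), 0, 2.5)$value
print(c(series = res$density[2], convolution = conv))
```

A fixed number of terms keeps the sketch short. The series converges quickly when the scale parameters are close to each other and more slowly when they differ widely, so a practical implementation would stop based on a tolerance instead.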
Correctness check for density function:
library(coga)  # load the package before running the examples
y <- rcoga(1000000, c(3,4), c(2,3))
grid <- seq(0, 8, length.out=1000)
pdf <- dcoga(grid, shape=c(3, 4), rate=c(2, 3))
plot(density(y), col="blue")
lines(grid, pdf, col="red")
Correctness check for distribution function:
y <- rcoga(1000000, c(3,4), c(2,3))
grid <- seq(0, 8, length.out=1000)
cdf <- pcoga(grid, shape=c(3, 4), rate=c(2, 3))
plot(ecdf(y), col="blue")
lines(grid, cdf, col="red")
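The plots above give a visual check. A complementary numerical check, sketched here with base R only, compares simulated moments with the theoretical mean \(\sum_i \alpha_i / r_i\) and variance \(\sum_i \alpha_i / r_i^2\) of the sum, where \(r_i\) denotes the rate parameter. The simulation uses `rgamma` directly rather than `rcoga`, so it runs without the package; the seed and the sample size are arbitrary choices.

```r
set.seed(1)                      # arbitrary seed, for reproducibility
shape <- c(3, 4)
rate  <- c(2, 3)
n     <- 1e6

# Simulate Y = X_1 + X_2 by summing independent gamma draws.
y_sim <- rgamma(n, shape = shape[1], rate = rate[1]) +
         rgamma(n, shape = shape[2], rate = rate[2])

# Mean and variance of a sum of independent variables are the sums of
# the individual means and variances.
theory <- c(mean = sum(shape / rate), var = sum(shape / rate^2))
sim    <- c(mean = mean(y_sim),       var = var(y_sim))

print(rbind(theory, sim))
```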
The functions ‘dcoga’ and ‘pcoga’ in the package ‘coga’ are implemented in C++. The following benchmark, run on an Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, shows the speed advantage of the C++ code over the pure R versions.
grid <- seq(0, 15, length.out=10)
microbenchmark::microbenchmark(
dcoga(grid, shape=c(3,4,5), rate=c(2,3,4)),
coga:::dcoga.R(grid, shape=c(3,4,5), rate=c(2,3,4)),
pcoga(grid, shape=c(3,4,5), rate=c(2,3,4)),
coga:::pcoga.R(grid, shape=c(3,4,5), rate=c(2,3,4))
)
## Unit: milliseconds
##                                                         expr       min        lq      mean    median        uq       max neval
##           dcoga(grid, shape = c(3, 4, 5), rate = c(2, 3, 4))  1.291005  1.449642  2.072026  1.733290  1.864550  38.17066   100
##  coga:::dcoga.R(grid, shape = c(3, 4, 5), rate = c(2, 3, 4)) 29.972928 32.316995 38.529133 34.938479 35.619007 104.49107   100
##           pcoga(grid, shape = c(3, 4, 5), rate = c(2, 3, 4))  4.075797  4.931446  8.275947  5.248308  5.574377  46.21576   100
##  coga:::pcoga.R(grid, shape = c(3, 4, 5), rate = c(2, 3, 4)) 37.945149 39.865460 50.433972 43.021125 53.660163  85.49018   100
Note: In this example, ‘dcoga.R’ and ‘pcoga.R’ are the pure R versions of the density and distribution functions of the convolution of gamma distributions. These two R functions are not exported by the package ‘coga’, but you can still call them as ‘coga:::dcoga.R’ and ‘coga:::pcoga.R’.
The convolution of two gamma distributions is a special case of the convolution of gamma distributions. The functions ‘dcoga2dim’ and ‘pcoga2dim’ handle this case more efficiently: in the benchmark below they are much faster than the general functions ‘dcoga’ and ‘pcoga’.
grid <- seq(0, 15, length.out=100)
microbenchmark::microbenchmark(
dcoga(grid, shape=c(3,4), rate=c(2,3)),
dcoga2dim(grid, 3, 4, 2, 3),
pcoga(grid, shape=c(3,4), rate=c(2,3)),
pcoga2dim(grid, 3, 4, 2, 3))
## Unit: microseconds
##                                          expr       min        lq        mean     median        uq        max neval
##  dcoga(grid, shape = c(3, 4), rate = c(2, 3)) 16252.613 17328.591 25420.42989 18585.6300 30307.958 108431.324   100
##                   dcoga2dim(grid, 3, 4, 2, 3)    57.900    60.961    68.83258    70.4005    74.751    104.662   100
##  pcoga(grid, shape = c(3, 4), rate = c(2, 3)) 37377.394 39449.186 55977.01005 60603.5225 65257.359  72600.251   100
##                   pcoga2dim(grid, 3, 4, 2, 3)  3815.546  3822.076  3856.40268  3831.1960  3839.320   4843.248   100
Please note that the R functions ‘dcoga’, ‘pcoga’, and ‘rcoga’ in this package can handle ‘shape’ and ‘rate’ parameters of different lengths by recycling the shorter one. That means that ‘dcoga(3, c(2,3), c(3,4,5,3,4))’ and ‘dcoga(3, c(2,3,2,3,2), c(3,4,5,3,4))’ will give the same result. If the length of the longer parameter is not a multiple of the length of the shorter one, these three R functions will give a warning message.
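The recycling rule can be pictured with base R's `rep_len`, which repeats a vector until it reaches a target length. This is only an illustration of the rule described above, under the assumption that the package recycles in the same way as base R does; it is not the package's internal code.

```r
shape <- c(2, 3)
rate  <- c(3, 4, 5, 3, 4)

# Recycle the shorter vector to the length of the longer one.
n_par      <- max(length(shape), length(rate))
shape_full <- rep_len(shape, n_par)
rate_full  <- rep_len(rate, n_par)

print(shape_full)   # 2 3 2 3 2

# A warning is warranted when the longer length is not a multiple
# of the shorter length.
needs_warning <- n_par %% min(length(shape), length(rate)) != 0
print(needs_warning)
```

Writing out the recycled vectors explicitly, as in the second call in the text, is a simple way to avoid relying on recycling at all.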
[1] Moschopoulos, Peter G. “The distribution of the sum of independent gamma random variables.” Annals of the Institute of Statistical Mathematics 37.1 (1985): 541-544.
[2] Mathai, A. M. “Storage capacity of a dam with gamma type inputs.” Annals of the Institute of Statistical Mathematics 34 (1982): 591-597.