The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
Recall that a random variable \(X\) has (univariate) negative binomial law with parameters \(\kappa>0, 0<p<1\), i.e., \(X\sim \text{NB}(\kappa, p)\) if its probability mass function is given by \[ P(X=x) = {\kappa+x-1 \choose x} p^x(1-p)^{\kappa}, \quad x \in \{0,1,\ldots\}. \]
A random vector \({\bf X}=(X_1,\dots,X_n)'\) is said to follow the multivariate negative binomial distribution with parameters \(\kappa, p_1, \dots, p_n\) if its probability mass function is given by \[ P({\bf X}={\bf x})=\frac{\Gamma(x_1+\cdots+x_n+\kappa)}{x_1! \cdots x_n! \Gamma(\kappa)}p_1^{x_1}\cdots p_n^{x_n}(1-p_1-\cdots-p_n)^{\kappa}, \] where, for \(i=1,\dots,n\), \(x_i\in\{0,1,\dots\}\), \(0<p_i<1\) such that \(\sum_{i=1}^np_i<1\) and \(\kappa>0\).
We note that the function stats::rnbinom
can be used to simulate from the univariate negative binomial distribution. The trawl
package introduces the function Bivariate_NBsim
which generates samples from the bivariate negative binomial distribution. The simulation algorithm proceeds in two steps: First, we simulate \(X_1\) from the univariate negative binomial distribution NB(\(\kappa\),\(p_1/(1-p_2)\)). Second, we simulate \(X_2|X_1=x_1\) from the univariate negative binomial distribution NB(\(\kappa+x_1,p_2\)), see for instance (Dunn 1967).
###Example
set.seed(1)
<- 3
kappa<- 0.1
p1 <- 0.85
p2 <- p1+p2
p <-1-p1-p2
p0
<- 10000
N
#Simulate from the bivariate negative binomial distribution
<- trawl::Bivariate_NBsim(N,kappa,p1,p2)
y
#Compare the empirical and theoretical mean of the first component
::mean(y[,1])
base#> [1] 5.9653
*p1/(1-p)
kappa#> [1] 6
#Compare the empirical and theoretical variance of the first component
::var(y[,1])
stats#> [1] 18.0047
*p1*(1-p2)/(1-p)^2
kappa#> [1] 18
#Compare the empirical and theoretical mean of the second component
::mean(y[,2])
base#> [1] 50.9742
*p2/(1-p)
kappa#> [1] 51
#Compare the empirical and theoretical variance of the second component
::var(y[,2])
stats#> [1] 926.3358
*p2*(1-p1)/(1-p)^2
kappa#> [1] 918
#Compare the empirical and theoretical correlation between the two components
::cor(y[,1],y[,2])
stats#> [1] 0.7891007
*p2/(p0+p1)*(p0+p2))^(1/2)
(p1#> [1] 0.7141428
We say that a vector \({\bf X}=(X_1,\dots,X_n)'\) follows the multivariate logarithmic series distribution (LSD), see, e.g., (Patil and Bildikar 1967). \({\bf X} \sim \text{LSD}(p_1,\ldots,p_n)\), where \(0<p_i<1, p:=\sum_{i=1}^np_i <1\) if for \({\bf x}\in \mathbb{N}_0^n \setminus \{{\bf 0} \}\), if its probability mass function is given by \[
P({\bf X}={\bf x})=\frac{\Gamma(x_1+\cdots+x_n)}{x_1! \cdots x_n!}\frac{p_1^{x_1}\cdots p_n^{x_n}}{\{-\ln(1-p)\}}.
\] Note that each component \(X_i\) follows the modified univariate logarithmic distribution with parameters \(\tilde p_i = p_i/(1-p+p_i)\) and \(\delta_i= \ln(1-p+p_i)/\ln(1-p)\), i.e., \(X_i\sim\text{ModLSD}(\tilde p_i, \delta_i)\) with \[
P(X_i =x_i) = \left\{ \begin{array}{ll} \delta_i, & \text{for } x_i=0\\
(1-\delta_i) \frac{1}{x_i}\frac{\tilde p_i^{x_i}}{\{-\ln(1-\tilde p_i)\}}, & \text{for } x_i \in \mathbb{N}. \end{array} \right.
\] Simulations from the univariate LSD can be carried out using the function Runuran::urlogarithmic
. The trawl
package implements the functions Bivariate_LSDsim
and Trivariate_LSDsim
to simulate from both the bivariate and the trivariate logarithmic series distribution.
The function Bivariate_NBsim
can be used to simulate from the bivariate logarithmic series distribution. To this end, note that the probability mass function of a random vector \({\bf X}=(X_1,X_2)'\) following the bivariate logarithmic series distribution with parameters \(0<p_1, p_2<1\) with \(p:=p_1+p_2<1\) is given by \[P(X_1=x_1,X_2=x_2)=\frac{\Gamma(x_1+x_2)}{x_1!x_2!}
\frac{p_1^{x_1}p_2^{x_2}}{\{-\ln(1-p)\}},\] for \(x_1,x_2=0,1,2,\dots\) such that \(x_1+x_2>0\).
The simulation proceeds in two steps: First, \(X_1\) is simulated from the modified logarithmic distribution with parameters \(\tilde p_1=p_1/(1-p_2)\) and \(\delta_1=\ln(1-p_2)/\ln(1-p)\). Then we simulate \(X_2\) conditional on \(X_1\). We note that \(X_2|X_1=x_1\) follows the logarithmic series distribution with parameter \(p_2\) when \(x_1=0\), and the negative binomial distribution with parameters \((x_1,p_2)\) when \(x_1>0\).
Next we provide an example of a simulation from the bivariate LSD and we showcase the functions ModLSD_Mean
, ModLSD_Var
, BivLSD_Cor
and BivLSD_Cov
which compute the mean and the variance of the univariate modified LSD and the correlation and covariance of the bivariate LSD, respectively.
set.seed(1)
<-0.15
p1<-0.3
p2
<-10000
N
#Simulate N realisations from the bivariate LSD
<-trawl::Bivariate_LSDsim(N, p1, p2)
y
#Compute the empirical and theoretical mean of the first component
::mean(y[,1])
base#> [1] 0.4575
::ModLSD_Mean(base::log(1-p2)/base::log(1-p1-p2),p1/(1-p2))
trawl#> [1] 0.45619
#Compute the empirical and theoretical mean of the second component
::mean(y[,2])
base#> [1] 0.8995
::ModLSD_Mean(base::log(1-p1)/base::log(1-p1-p2),p2/(1-p1))
trawl#> [1] 0.91238
#Compute the empirical and theoretical variance of the first component
::var(y[,1])
stats#> [1] 0.3686306
::ModLSD_Var(base::log(1-p2)/base::log(1-p1-p2),p1/(1-p2))
trawl#> [1] 0.3724961
#Compute the empirical and theoretical variance of the second component
::var(y[,2])
stats#> [1] 0.5690567
::ModLSD_Var(base::log(1-p1)/base::log(1-p1-p2),p2/(1-p1))
trawl#> [1] 0.5776045
##Compute the empirical and theoretical correlation between the two components
::cor(y[,1],y[,2])
stats#> [1] -0.3961486
::BivLSD_Cor(p1,p2)
trawl#> [1] -0.3608673
##Compute the empirical and theoretical covariance between the two components
::cov(y[,1],y[,2])
stats#> [1] -0.1814394
::BivLSD_Cov(p1,p2)
trawl#> [1] -0.1673877
The function Trivariate_NBsim
can be used to simulate from the trivariate logarithmic series distribution. The simulation proceeds in two steps: First, \(X_1\) is simulated from the modified logarithmic distribution with parameters \(\tilde p_1=p_1/(1-p_2-p_3)\) and \(\delta_1=\ln(1-p_2-p_3)/\ln(1-p)\). Then we simulate \((X_2,X_3)'\) conditional on \(X_1\). We note that \((X_2,X_3)'|X_1=x_1\) follows the bivariate logarithmic series distribution with paramaters \((p_2,p_3)\) when \(x_1=0\), and the bivariate negative binomial distribution with parameters \((x_1,p_2,p_3)\) when \(x_1>0\).
set.seed(1)
<-0.15
p1<-0.25
p2<-0.55
p3
<- 10000
N
#Simulate N realisations from the bivariate LSD
<-trawl::Trivariate_LSDsim(N, p1, p2, p3)
y
#Compute the empirical and theoretical mean of the first component
::mean(y[,1])
base#> [1] 1.0152
::ModLSD_Mean(base::log(1-p2-p3)/base::log(1-p1-p2-p3),p1/(1-p2-p3))
trawl#> [1] 1.001425
#Compute the empirical and theoretical mean of the second component
::mean(y[,2])
base#> [1] 1.6857
::ModLSD_Mean(base::log(1-p1-p3)/base::log(1-p1-p2-p3),p2/(1-p1-p3))
trawl#> [1] 1.669041
#Compute the empirical and theoretical mean of the third component
::mean(y[,3])
base#> [1] 3.7275
::ModLSD_Mean(base::log(1-p1-p2)/base::log(1-p1-p2-p3),p3/(1-p1-p2))
trawl#> [1] 3.67189
#Compute the empirical and theoretical variance of the first component
::var(y[,1])
stats#> [1] 2.999069
::ModLSD_Var(base::log(1-p2-p3)/base::log(1-p1-p2-p3),p1/(1-p2-p3))
trawl#> [1] 3.002847
#Compute the empirical and theoretical variance of the second component
::var(y[,2])
stats#> [1] 7.255041
::ModLSD_Var(base::log(1-p1-p3)/base::log(1-p1-p2-p3),p2/(1-p1-p3))
trawl#> [1] 7.228548
#Compute the empirical and theoretical variance of the third component
::var(y[,3])
stats#> [1] 30.87613
::ModLSD_Var(base::log(1-p1-p2)/base::log(1-p1-p2-p3),p3/(1-p1-p2))
trawl#> [1] 30.5799
#Computing the bivariate covariances and correlations
#Cor(X1,X2):
<- base::log(1-p3)/base::log(1-p1-p2-p3)
delta <-p1/(1-p3)
hatp1 <-p2/(1-p3)
hatp2
::cov(y[,1],y[,2])
stats#> [1] 3.316309
::BivModLSD_Cov(delta,hatp1,hatp2)
trawl#> [1] 3.335704
::cor(y[,1],y[,2])
stats#> [1] 0.7109545
::BivModLSD_Cor(delta,hatp1,hatp2)
trawl#> [1] 0.6041258
#Cor(X1,X3):
<- log(1-p2)/log(1-p1-p2-p3)
delta <-p1/(1-p2)
hatp1 <-p3/(1-p2)
hatp2
::cov(y[,1],y[,3])
stats#> [1] 7.287671
::BivModLSD_Cov(delta,hatp1,hatp2)
trawl#> [1] 7.338549
::cor(y[,1],y[,3])
stats#> [1] 0.7573281
::BivModLSD_Cor(delta,hatp1,hatp2)
trawl#> [1] 0.7220044
#Cor(X2,X3):
<- log(1-p1)/log(1-p1-p2-p3)
delta <-p2/(1-p1)
hatp1 <-p3/(1-p1)
hatp2
::cov(y[,2],y[,3])
stats#> [1] 12.31899
::BivModLSD_Cov(delta,hatp1,hatp2)
trawl#> [1] 12.23092
::cor(y[,2],y[,3])
stats#> [1] 0.8230829
::BivModLSD_Cor(delta,hatp1,hatp2)
trawl#> [1] 0.7969081
##References
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.