Rats: a normal hierarchical model
This example is taken from section 6 of Gelfand et al (1990), and concerns 30 young rats whose weights were measured weekly for five weeks. Part of the data is shown below, where $Y_{ij}$ is the weight of the $i$th rat measured at age $x_j$.
![[temp1]](temp1.bmp)
A plot of the 30 growth curves suggests some evidence of downward curvature.
The model is essentially a random effects linear growth curve
$$Y_{ij} \sim \text{Normal}(\alpha_i + \beta_i (x_j - \bar{x}),\ \tau_c)$$
$$\alpha_i \sim \text{Normal}(\alpha_c,\ \tau_\alpha)$$
$$\beta_i \sim \text{Normal}(\beta_c,\ \tau_\beta)$$
where $\bar{x} = 22$, and $\tau$ represents the precision (1/variance) of a normal distribution. We note the absence of a parameter representing correlation between $\alpha_i$ and $\beta_i$, unlike in Gelfand et al (1990). However, see the Birats example in Volume 2, which does explicitly model the covariance between $\alpha_i$ and $\beta_i$. For now, we standardise the $x_j$'s around their mean to reduce dependence between $\alpha_i$ and $\beta_i$ in their likelihood: in fact for the full balanced data, complete independence is achieved. (Note that, in general, prior independence does not force the posterior distributions to be independent.)
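The effect of centering can be checked numerically. The sketch below assumes the five measurement ages are 8, 15, 22, 29 and 36 days; these values are not listed in the text above, although they are consistent with weekly measurement and a mean of 22. The helper name `cross_term` is illustrative only. For a straight line with intercept and slope, the likelihood couples the two parameters through the sum of the ages measured from the chosen origin, so a zero sum means no coupling.

```python
# Assumed measurement ages in days (not given in the text; consistent with xbar = 22).
x = [8.0, 15.0, 22.0, 29.0, 36.0]
xbar = sum(x) / len(x)

def cross_term(ages, origin):
    """Off-diagonal entry of X'X for a design with columns 1 and (age - origin)."""
    return sum(age - origin for age in ages)

uncentred = cross_term(x, 0.0)   # ages measured from zero
centred = cross_term(x, xbar)    # ages measured from their mean

print(xbar)       # 22.0
print(uncentred)  # 110.0
print(centred)    # 0.0
```

Because every rat is measured at the same ages in the full data, the zero cross term holds for all 30 rats at once, which is the balanced case mentioned in the text.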
$\alpha_c$, $\tau_\alpha$, $\beta_c$, $\tau_\beta$ and $\tau_c$ are given independent "noninformative" priors. Interest particularly focuses on the intercept at zero time (birth), denoted $\alpha_0 = \alpha_c - \beta_c \bar{x}$.
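The expression for the intercept at birth can be read off the population mean growth line. As a short worked step, using the symbols of the model and the stated value of 22 for the mean age:

```latex
\begin{align*}
E[Y \mid x] &= \alpha_c + \beta_c\,(x - \bar{x}) \\
\alpha_0 = E[Y \mid x = 0] &= \alpha_c - \beta_c\,\bar{x} = \alpha_c - 22\,\beta_c
\end{align*}
```

This corresponds to the line `alpha0 <- alpha.c - xbar * beta.c` in the model code.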
Graphical model for rats example:
BUGS language for rats example:
model
{
    for( i in 1 : N ) {
        for( j in 1 : T ) {
            # weight of rat i at age x[j]
            Y[i , j] ~ dnorm(mu[i , j],tau.c)
            mu[i , j] <- alpha[i] + beta[i] * (x[j] - xbar)
        }
        # rat-specific intercept and slope
        alpha[i] ~ dnorm(alpha.c,alpha.tau)
        beta[i] ~ dnorm(beta.c,beta.tau)
    }
    # priors; dnorm takes a mean and a precision
    tau.c ~ dgamma(0.001,0.001)
    sigma <- 1 / sqrt(tau.c)
    alpha.c ~ dnorm(0.0,1.0E-6)
    alpha.tau ~ dgamma(0.001,0.001)
    beta.c ~ dnorm(0.0,1.0E-6)
    beta.tau ~ dgamma(0.001,0.001)
    # intercept at age zero (birth)
    alpha0 <- alpha.c - xbar * beta.c
}
Note the use of a very flat but conjugate prior for the population effects: a locally uniform prior could also have been used.
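To get a feel for how flat these priors are, it helps to convert them to more familiar scales. The sketch below relies on BUGS parameterising `dnorm` by mean and precision and `dgamma` by shape and rate; the Python function names are illustrative and not part of BUGS.

```python
import math

def precision_to_sd(tau):
    """Standard deviation implied by a normal precision tau = 1 / variance."""
    return 1.0 / math.sqrt(tau)

def gamma_mean_and_variance(shape, rate):
    """Mean and variance of a gamma distribution with the given shape and rate."""
    return shape / rate, shape / rate ** 2

# dnorm(0.0, 1.0E-6): prior on alpha.c and beta.c
prior_sd = precision_to_sd(1.0e-6)

# dgamma(0.001, 0.001): prior on tau.c, alpha.tau and beta.tau
prior_mean, prior_var = gamma_mean_and_variance(0.001, 0.001)

print(f"normal prior sd: {prior_sd:.0f}")
print(f"gamma prior mean: {prior_mean:.0f}, variance: {prior_var:.0f}")
```

A prior standard deviation of about 1000 is very wide compared with weights of a few hundred, which is the sense in which the normal prior is "very flat". The same conversion underlies `sigma <- 1 / sqrt(tau.c)` in the model.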
Data
Inits
(Note: the response data (Y) for the rats example can also be found in the file ratsy.odc in rectangular format. The covariate data (X) can be found in S-Plus format in file ratsx.odc. To load data from each of these files, focus the window containing the open data file before clicking on "load data" from the "Specification" dialog.)
Results
A 1000 update burn in followed by a further 10000 updates gave the parameter estimates:
These results may be compared with Figure 5 of Gelfand et al (1990); we note that the mean gradient of independent fitted straight lines is 6.19.
Gelfand et al (1990) also consider the problem of missing data, and delete the last observation of cases 6-10, the last two from 11-20, the last 3 from 21-25 and the last 4 from 26-30. The appropriate data file is obtained by simply replacing data values by NA (see below). The model specification is unchanged, since the distinction between observed and unobserved quantities is made in the data file and not the model specification.
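The deletion pattern described above can be sketched as a small masking routine. This is an illustration only: the function names are hypothetical, `None` stands in for NA, and the table of zeros is a placeholder rather than the real rat weights.

```python
N, T = 30, 5  # 30 rats, 5 weekly measurements

def n_deleted(rat):
    """Number of trailing observations deleted for a rat (1-based case number)."""
    if 6 <= rat <= 10:
        return 1
    if 11 <= rat <= 20:
        return 2
    if 21 <= rat <= 25:
        return 3
    if 26 <= rat <= 30:
        return 4
    return 0

def mask_missing(weights):
    """Copy an N x T table, replacing the deleted trailing values with None (NA)."""
    masked = []
    for rat, row in enumerate(weights, start=1):
        keep = T - n_deleted(rat)
        masked.append(list(row[:keep]) + [None] * (T - keep))
    return masked

# Placeholder table (not the real data), used only to show the pattern.
placeholder = [[0.0] * T for _ in range(N)]
masked = mask_missing(placeholder)

n_missing = sum(value is None for row in masked for value in row)
print(n_missing)    # 60
print(masked[25])   # rat 26: [0.0, None, None, None, None]
```

Under this pattern 60 of the 150 observations are treated as missing, and rat 26 keeps only its first measurement.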
Data
Gelfand et al (1990) focus on the parameter estimates and the predictions for the final 4 observations on rat 26. These predictions are obtained automatically in BUGS by monitoring the relevant Y[] nodes. The following estimates were obtained:
We note that our estimate 6.58 of $\beta_c$ is substantially greater than that shown in Figure 6 of Gelfand et al (1990). However, plotting the growth curves indicates some curvature with steeper gradients at the beginning: the mean of the estimated gradients of the reduced data is 6.66, compared to 6.19 for the full data. Hence we are inclined to believe our analysis. The observed weights for rat 26 were 207, 257, 303 and 345, compared to our predictions of 204, 250, 295 and 341.
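As a quick check on how close the predictions for rat 26 are, the quoted observed and predicted weights can be compared directly. The summary used here (mean absolute error) is our own choice for illustration and does not appear in the original analysis.

```python
observed = [207, 257, 303, 345]    # observed weights for rat 26, from the text
predicted = [204, 250, 295, 341]   # BUGS predictions, from the text

errors = [o - p for o, p in zip(observed, predicted)]
mean_abs_error = sum(abs(e) for e in errors) / len(errors)

print(errors)          # [3, 7, 8, 4]
print(mean_abs_error)  # 5.5
```

All four predictions fall slightly below the observed weights, by between 3 and 8 units.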
Seeds: Random effect logistic regression
This example is taken from Table 3 of Crowder (1978), and concerns the proportion of seeds that germinated on each of 21 plates arranged according to a 2 by 2 factorial layout by seed and type of root extract. The data are shown below, where $r_i$ and $n_i$ are the number of germinated and the total number of seeds on the $i$th plate, $i = 1, \ldots, N$. These data are also analysed by, for example, Breslow and Clayton (1993).
![[temp6]](temp6.bmp)
The model is essentially a random effects logistic, allowing for over-dispersion. If $p_i$ is the probability of germination on the $i$th plate, we assume
$$r_i \sim \text{Binomial}(p_i,\ n_i)$$
$$\text{logit}(p_i) = \alpha_0 + \alpha_1 x_{1i} + \alpha_2 x_{2i} + \alpha_{12} x_{1i} x_{2i} + b_i$$
$$b_i \sim \text{Normal}(0,\ \tau)$$
where $x_{1i}$, $x_{2i}$ are the seed type and root extract of the $i$th plate, and an interaction term $\alpha_{12} x_{1i} x_{2i}$ is included.
$\alpha_0$, $\alpha_1$, $\alpha_2$, $\alpha_{12}$ and $\tau$ are given independent "noninformative" priors.
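To make the role of the plate-level random effect concrete, the sketch below evaluates the germination probability for plates that share the same covariates but differ in $b_i$. It assumes the seed type and root extract are coded as 0/1 indicators, and the coefficient values are hypothetical, chosen only for illustration; the function names are not part of BUGS.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def germination_prob(x1, x2, b, alpha0, alpha1, alpha2, alpha12):
    """p_i for a plate with indicators x1, x2 and plate-level random effect b."""
    eta = alpha0 + alpha1 * x1 + alpha2 * x2 + alpha12 * x1 * x2 + b
    return inv_logit(eta)

# Hypothetical coefficient values, not estimates from the data.
coef = dict(alpha0=-0.5, alpha1=0.1, alpha2=1.3, alpha12=-0.8)

# Same covariates (x1 = 1, x2 = 1), three different plate effects b_i.
probs = [germination_prob(1, 1, b, **coef) for b in (-0.5, 0.0, 0.5)]
print([round(p, 3) for p in probs])
```

Plates in the same cell of the factorial layout therefore have different germination probabilities, which is how the random effect accommodates over-dispersion relative to a plain binomial model.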
Graphical model for seeds example
![[temp7]](temp7.bmp)
BUGS language for seeds example
model
{
    for( i in 1 : N ) {
        # germinated seeds out of n[i] on plate i
        r[i] ~ dbin(p[i],n[i])
        # plate-level random effect
        b[i] ~ dnorm(0.0,tau)
        logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
            alpha12 * x1[i] * x2[i] + b[i]
    }
    # priors; dnorm takes a mean and a precision
    alpha0 ~ dnorm(0.0,1.0E-6)
    alpha1 ~ dnorm(0.0,1.0E-6)
    alpha2 ~ dnorm(0.0,1.0E-6)
    alpha12 ~ dnorm(0.0,1.0E-6)
    tau ~ dgamma(0.001,0.001)
    sigma <- 1 / sqrt(tau)
}
Data
Inits
Results
A burn in of 1000 updates followed by a further 10000 updates gave the following parameter estimates:
We may compare simple logistic, maximum likelihood (from EGRET) and penalized quasi-likelihood (PQL; Breslow and Clayton, 1993) estimates with the BUGS results:
![[temp9]](temp9.bmp)
Hierarchical centering is an interesting reformulation of random effects models. Introduce the variables
$$\mu_i = \alpha_0 + \alpha_1 x_{1i} + \alpha_2 x_{2i} + \alpha_{12} x_{1i} x_{2i}$$
$$\beta_i = \mu_i + b_i$$
The model then becomes
$$r_i \sim \text{Binomial}(p_i,\ n_i)$$
$$\text{logit}(p_i) = \beta_i$$
$$\beta_i \sim \text{Normal}(\mu_i,\ \tau)$$
The graphical model is shown below
![[temp10]](temp10.bmp)
This formulation of the model has two advantages: the sequence of random numbers generated by the Gibbs sampler has better correlation properties, and the time per update is reduced because the updating for the $\alpha$ parameters is now conjugate.
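The reformulation changes how the model is written, not the distribution it describes. The sketch below illustrates this by drawing the plate-level quantity both ways from the same random stream. The precision and the linear predictor values are hypothetical, and this only illustrates the equivalence of the two parameterisations; it does not reproduce the Gibbs sampler or its mixing behaviour.

```python
import math
import random

tau = 4.0                      # hypothetical precision of the random effects
sigma = 1.0 / math.sqrt(tau)   # corresponding standard deviation
mu = [-0.5, 0.1, 0.8]          # hypothetical linear predictors mu_i

# Two generators with the same seed, so both versions see the same noise.
rng_a = random.Random(1)
rng_b = random.Random(1)

# Original form: b_i ~ Normal(0, tau), then beta_i = mu_i + b_i
beta_uncentred = [m + rng_a.gauss(0.0, sigma) for m in mu]

# Hierarchically centered form: beta_i ~ Normal(mu_i, tau), drawn directly
beta_centred = [rng_b.gauss(m, sigma) for m in mu]

same = all(math.isclose(u, c, abs_tol=1e-12)
           for u, c in zip(beta_uncentred, beta_centred))
print(same)
```

The practical difference lies in which quantities the sampler updates: in the centered form the sampler works with the plate-level values directly, and the regression coefficients enter only through their means.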