The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
This document provides a brief tutorial to analyzing twin data using
the mets
package:
\(\newcommand{\cov}{\mathbb{C}\text{ov}} \newcommand{\cor}{\mathbb{C}\text{or}} \newcommand{\var}{\mathbb{V}\text{ar}} \newcommand{\E}{\mathbb{E}} \newcommand{\unitfrac}[2]{#1/#2} \newcommand{\n}{}\)
In the following we examine the heritability of Body Mass Indexkorkeila_bmi_1991 hjelmborg_bmi_2008, based on data on self-reported BMI-values from a random sample of 11,411 same-sex twins. First, we will load data
library(mets)
data("twinbmi")
head(twinbmi)
#> tvparnr bmi age gender zyg id num
#> 1 1 26.33289 57.51212 male DZ 1 1
#> 2 1 25.46939 57.51212 male DZ 1 2
#> 3 2 28.65014 56.62696 male MZ 2 1
#> 5 3 28.40909 57.73097 male DZ 3 1
#> 7 4 27.25089 53.68683 male DZ 4 1
#> 8 4 28.07504 53.68683 male DZ 4 2
The data is on long format with one subject per row.
tvparnr
: twin idbmi
: Body Mass Index (\(\mathrm{kg}/{\mathrm{m}^2}\))age
: Age (years)gender
: Gender factor
(male,female)zyg
: zygosity (MZ, DZ)We transpose the data allowing us to do pairwise analyses
twinwide <- fast.reshape(twinbmi, id="tvparnr",varying=c("bmi"))
head(twinwide)
#> tvparnr bmi1 age gender zyg id num bmi2
#> 1 1 26.33289 57.51212 male DZ 1 1 25.46939
#> 3 2 28.65014 56.62696 male MZ 2 1 NA
#> 5 3 28.40909 57.73097 male DZ 3 1 NA
#> 7 4 27.25089 53.68683 male DZ 4 1 28.07504
#> 9 5 27.77778 52.55838 male DZ 5 1 NA
#> 11 6 28.04282 52.52231 male DZ 6 1 22.30936
Next we plot the association within each zygosity group
We here show the log-transformed data which is slightly more symmetric and more appropiate for the twin analysis (see Figure @ref(fig:scatter1) and @ref(fig:scatter2))
The plots and raw association measures shows considerable stronger dependence in the MZ twins, thus indicating genetic influence of the trait
cor.test(mz[,1],mz[,2], method="spearman")
#>
#> Spearman's rank correlation rho
#>
#> data: mz[, 1] and mz[, 2]
#> S = 165457624, p-value < 2.2e-16
#> alternative hypothesis: true rho is not equal to 0
#> sample estimates:
#> rho
#> 0.6956209
cor.test(dz[,1],dz[,2], method="spearman")
#>
#> Spearman's rank correlation rho
#>
#> data: dz[, 1] and dz[, 2]
#> S = 2162514570, p-value < 2.2e-16
#> alternative hypothesis: true rho is not equal to 0
#> sample estimates:
#> rho
#> 0.4012686
Ńext we examine the marginal distribution (GEE model with working independence)
l0 <- lm(bmi ~ gender + I(age-40), data=twinbmi)
estimate(l0, id=twinbmi$tvparnr)
#> Estimate Std.Err 2.5% 97.5% P-value
#> (Intercept) 23.3687 0.054534 23.2618 23.4756 0.000e+00
#> gendermale 1.4077 0.073216 1.2642 1.5512 2.230e-82
#> I(age - 40) 0.1177 0.004787 0.1083 0.1271 1.499e-133
library("splines")
l1 <- lm(bmi ~ gender*ns(age,3), data=twinbmi)
marg1 <- estimate(l1, id=twinbmi$tvparnr)
dm <- lava::Expand(twinbmi,
bmi=0,
gender=c("male"),
age=seq(33,61,length.out=50))
df <- lava::Expand(twinbmi,
bmi=0,
gender=c("female"),
age=seq(33,61,length.out=50))
plot(marg1, function(p) model.matrix(l1,data=dm)%*%p,
data=dm["age"], ylab="BMI", xlab="Age",
ylim=c(22,26.5))
plot(marg1, function(p) model.matrix(l1,data=df)%*%p,
data=df["age"], col="red", add=TRUE)
legend("bottomright", c("Male","Female"),
col=c("black","red"), lty=1, bty="n")
We can decompose the trait into the following variance components
\[\begin{align*} Y_i = A_i + D_i + C + E_i, \quad i=1,2 \end{align*}\]
Dissimilarity of MZ twins arises from unshared environmental effects only, \(\cor(E_1,E_2)=0\) and
\[\begin{align*} \cor(A_1^{MZ},A_2^{MZ}) = 1, \quad \cor(D_1^{MZ},D_2^{MZ}) = 1, \end{align*}\]
\[\begin{align*} \cor(A_1^{DZ},A_2^{DZ}) = 0.5, \quad \cor(D_1^{DZ},D_2^{DZ}) = 0.25, \end{align*}\]
\[\begin{align*} Y_i = A_i + C_i + D_i + E_i \end{align*}\]
\[\begin{align*} A_i \sim\mathcal{N}(0,\sigma_A^2), C_i \sim\mathcal{N}(0,\sigma_C^2), D_i \sim\mathcal{N}(0,\sigma_D^2), E_i \sim\mathcal{N}(0,\sigma_E^2) \end{align*}\]
\[\begin{gather*} \cov(Y_{1},Y_{2}) = \\ \begin{pmatrix} \sigma_A^2 & 2\Phi\sigma_A^2 \\ 2\Phi\sigma_A^2 & \sigma_A^2 \end{pmatrix} + \begin{pmatrix} \sigma_C^2 & \sigma_C^2 \\ \sigma_C^2 & \sigma_C^2 \end{pmatrix} + \begin{pmatrix} \sigma_D^2 & \Delta_{7}\sigma_D^2 \\ \Delta_{7}\sigma_D^2 & \sigma_D^2 \end{pmatrix} + \begin{pmatrix} \sigma_E^2 & 0 \\ 0 & \sigma_E^2 \end{pmatrix} \end{gather*}\]
Saturated model (different marginals in MZ and DZ twins and different marginals for twin 1 and twin 2):
Different marginals for MZ and DZ twins (but same marginals within a pair)
Same marginals but free correlation with MZ, DZ
lu <- twinlm(bmi ~ age+gender, data=dd, DZ="DZ", zyg="zyg", id="tvparnr", type="eqmarg")
estimate(lu)
#> Estimate Std.Err 2.5% 97.5% P-value
#> bmi.1@1 18.6037 0.251036 18.1116 19.0957 0.000e+00
#> bmi.1~age.1@1 0.1189 0.005635 0.1078 0.1299 9.177e-99
#> bmi.1~gendermale.1@1 1.3848 0.086573 1.2151 1.5544 1.376e-57
#> log(var)@1 2.4424 0.022095 2.3991 2.4857 0.000e+00
#> atanh(rhoMZ)@1 0.7803 0.036249 0.7092 0.8513 9.008e-103
#> atanh(rhoDZ)@2 0.2987 0.020953 0.2576 0.3397 4.288e-46
A formal test of genetic effects can be obtained by comparing the MZ and DZ correlation:
estimate(lu,lava::contr(5:6,6))
#> Estimate Std.Err 2.5% 97.5% P-value
#> [atanh(rhoMZ)@1] - [a.... 0.4816 0.04177 0.3997 0.5635 9.431e-31
#>
#> Null Hypothesis:
#> [atanh(rhoMZ)@1] - [atanh(rhoDZ)@2] = 0
We also consider the ACE model
ace0 <- twinlm(bmi ~ age+gender, data=dd, DZ="DZ", zyg="zyg", id="tvparnr", type="ace")
summary(ace0)
#> Estimate Std. Error Z value Pr(>|z|)
#> bmi 1.8599e+01 2.5576e-01 72.720 <2e-16
#> sd(A) 2.7270e+00 4.2658e-02 63.927 <2e-16
#> sd(C) -1.7119e-06 3.1064e-01 0.000 1
#> sd(E) 2.0276e+00 3.4787e-02 58.286 <2e-16
#> bmi~age 1.1892e-01 5.6246e-03 21.142 <2e-16
#> bmi~gendermale 1.3846e+00 8.8748e-02 15.601 <2e-16
#>
#> MZ-pairs DZ-pairs
#> 1483 2788
#>
#> Variance decomposition:
#> Estimate 2.5% 97.5%
#> A 0.64399 0.61793 0.67005
#> C 0.00000 0.00000 0.00000
#> E 0.35601 0.32995 0.38207
#>
#>
#> Estimate 2.5% 97.5%
#> Broad-sense heritability 0.64399 0.61793 0.67005
#>
#> Estimate 2.5% 97.5%
#> Correlation within MZ: 0.64399 0.61718 0.66931
#> Correlation within DZ: 0.32200 0.30890 0.33497
#>
#> 'log Lik.' -22019.66 (df=6)
#> AIC: 44051.32
#> BIC: 44089.47
[korkeila_bmi_1991] Korkeila, Kaprio, Rissanen & Koskenvuo, Effects of gender and age on the heritability of body mass index, Int J Obes, 15(10), 647-654 (1991). ↩︎
[hjelmborg_bmi_2008] Hjelmborg, Fagnani, Silventoinen, McGue, Korkeila, Christensen, Rissanen & Kaprio, Genetic influences on growth traits of BMI: a longitudinal study of adult twins, Obesity (Silver Spring), 16(4), 847-852 (2008). ↩︎
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.