Introduction to Multivariable Functional Mendelian Randomization

The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Nicole Fontana

2026-02-05

Overview

This vignette introduces Multivariable Functional Mendelian Randomization (MV-FMR), a method to estimate time-varying causal effects of multiple correlated longitudinal exposures on health outcomes.

Key Features

Joint estimation of multiple time-varying exposures
Automatic component selection via cross-validation
Support for continuous and binary outcomes
Bootstrap inference for confidence intervals
Instrument strength diagnostics

When to Use MV-FMR

Use joint multivariable estimation (mvfmr()) when:

You have multiple exposures that may be correlated
Exposures may share horizontal pleiotropy (common genetic variants)
There may be mediation between exposures
You want to estimate independent causal effects of each exposure

Installation

# Install from GitHub
devtools::install_github("NicoleFontana/mvfmr")

library(mvfmr)
library(fdapace)
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 4.3.3

Example: Joint Estimation of Two Exposures

Step 1: Simulate Data

We generate genetic instruments and two longitudinal exposures:

set.seed(12345)

# Generate exposure data
sim_data <- getX_multi_exposure(
  N = 300,              # Sample size
  J = 25,               # Number of genetic instruments
  nSparse = 10          # Observations per subject
)

# Check dimensions
cat("Sample size:", nrow(sim_data$details$G), "\n")
#> Sample size: 300
cat("Number of instruments:", ncol(sim_data$details$G), "\n")
#> Number of instruments: 25

Step 2: Generate Outcome

We create an outcome with linear effect from X1 and quadratic effect from X2:

outcome_data <- getY_multi_exposure(
  sim_data,
  X1Ymodel = "2",     # Linear: β₁(t) = 0.02 * t
  X2Ymodel = "5",     # Late-age effect: β₂(t) = 0.1 * (t > 30)
  X1_effect = TRUE,
  X2_effect = TRUE,
  outcome_type = "continuous"
)

cat("Outcome summary:\n")
#> Outcome summary:
summary(outcome_data$Y)
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> -111.070  -31.849   -5.190   -3.012   26.394  112.621

Step 3: Functional Principal Component Analysis

We apply FPCA to both exposures to extract principal components:

# FPCA for exposure 1
fpca1 <- FPCA(
  sim_data$X1$Ly_sim, 
  sim_data$X1$Lt_sim,
  list(dataType = 'Sparse', error = TRUE, verbose = FALSE)
)

# FPCA for exposure 2
fpca2 <- FPCA(
  sim_data$X2$Ly_sim, 
  sim_data$X2$Lt_sim,
  list(dataType = 'Sparse', error = TRUE, verbose = FALSE)
)

cat("FPCA completed:\n")
#> FPCA completed:
cat("  Exposure 1:", fpca1$selectK, "components selected\n")
#>   Exposure 1: 3 components selected
cat("  Exposure 2:", fpca2$selectK, "components selected\n")
#>   Exposure 2: 4 components selected

Step 4: Joint Multivariable Estimation

Now we perform joint estimation using mvfmr():

result_joint <- mvfmr(
  G = sim_data$details$G,
  fpca_results = list(fpca1, fpca2),
  Y = outcome_data$Y,
  outcome_type = "continuous",
  method = "gmm",
  max_nPC1 = 4,
  max_nPC2 = 4,
  improvement_threshold = 0.001,
  bootstrap = FALSE,
  n_cores = 1,
  true_effects = list(model1 = "2", model2 = "5"),
  X_true = list(X1_true = sim_data$details$X1, X2_true = sim_data$details$X2),
  verbose = FALSE
)

# View results
print(result_joint)
#> 
#> Functional Multivariable MR Result
#> ==================================
#> Exposures: 2 
#> Sample size: 300 
#> Outcome: continuous 
#> Method: gmm 
#> Components selected: nPC1 = 2 , nPC2 = 2 
#> 
#> Performance Metrics:
#>   Exposure 1 - MISE: 0.001899 , Coverage: 0.824 
#>   Exposure 2 - MISE: 0.001591 , Coverage: 0.922

Step 5: Visualize Time-Varying Effects

The estimated time-varying causal effects can be visualized using the built-in plot method:

# Plot both effects
plot(result_joint)

The solid colored lines show the estimated time-varying causal effects, with shaded bands representing 95% confidence intervals. The dashed red lines (when present) indicate the true time-varying effects used in the simulation.

Step 6: Extract Coefficients

# Estimated beta coefficients for basis functions
coef(result_joint)
#> [1] -3.7775154  1.5166093 -0.3552057  0.4624341

# Time-varying effects at each time point
head(result_joint$effects$effect1)
#> [1] 0.02665083 0.03882021 0.05180349 0.06570553 0.08061147 0.09654441
head(result_joint$effects$effect2)
#> [1] -0.04923079 -0.05503448 -0.06005135 -0.06404133 -0.06679739 -0.06817082

Step 7: Performance Metrics

Since we simulated data with known truth, we can evaluate performance:

cat("Performance Metrics:\n")
#> Performance Metrics:
cat("\nExposure 1 (Linear effect):\n")
#> 
#> Exposure 1 (Linear effect):
cat("  MISE:", round(result_joint$performance$MISE1, 6), "\n")
#>   MISE: 0.001899
cat("  Coverage:", round(result_joint$performance$Coverage1, 3), "\n")
#>   Coverage: 0.824

cat("\nExposure 2 (Quadratic effect):\n")
#> 
#> Exposure 2 (Quadratic effect):
cat("  MISE:", round(result_joint$performance$MISE2, 6), "\n")
#>   MISE: 0.001591
cat("  Coverage:", round(result_joint$performance$Coverage2, 3), "\n")
#>   Coverage: 0.922

MISE (Mean Integrated Squared Error): measures accuracy of effect estimation
Coverage: proportion of time points where true effect falls within 95% CI

Comparison: Joint vs. Separate Estimation

We can compare joint (multivariable) with separate (univariable) estimation:

result_separate <- mvfmr_separate(
  G1 = sim_data$details$G,
  G2 = sim_data$details$G,
  fpca_results = list(fpca1, fpca2),
  Y = outcome_data$Y,
  outcome_type = "continuous",
  method = "gmm",
  max_nPC1 = 4,
  max_nPC2 = 4,
  n_cores = 1,
  true_effects = list(model1 = "2", model2 = "5"),
  verbose = FALSE
)
#> [1] "Processing X1"
#> [1] "Processing X2"

print(result_separate)
#> 
#> Separate Univariable MR Results
#> ================================
#> Exposures: 2 
#> Separate instruments: TRUE 
#> Outcome: continuous 
#> Method: gmm 
#> 
#> Exposure 1:
#>   Components: 2 
#>   MSE: 0.005231 
#>   Coverage: 1 
#> 
#> Exposure 2:
#>   Components: 2 
#>   MSE: 0.075436 
#>   Coverage: 0.706

Performance Comparison

comparison <- data.frame(
  Method = rep(c("Joint (MV-FMR)", "Separate (U-FMR)"), each = 2),
  Exposure = rep(c("X1", "X2"), times = 2),
  MISE = c(
    result_joint$performance$MISE1,
    result_joint$performance$MISE2,
    result_separate$exposure1$performance$MISE,
    result_separate$exposure2$performance$MISE
  ),
  Coverage = c(
    result_joint$performance$Coverage1,
    result_joint$performance$Coverage2,
    result_separate$exposure1$performance$Coverage,
    result_separate$exposure2$performance$Coverage
  )
)

print(comparison)
#>             Method Exposure        MISE  Coverage
#> 1   Joint (MV-FMR)       X1 0.001899365 0.8235294
#> 2   Joint (MV-FMR)       X2 0.001591153 0.9215686
#> 3 Separate (U-FMR)       X1 0.005230941 1.0000000
#> 4 Separate (U-FMR)       X2 0.075436010 0.7058824

Key Insight: Joint estimation (MV-FMR) typically performs better when exposures are correlated or share pleiotropic instruments.

Instrument Strength Diagnostics

Check instrument strength using F-statistics:

# Calculate F-statistics
K_total <- result_joint$nPC_used$nPC1 + result_joint$nPC_used$nPC2

fstats <- IS(
  J = ncol(sim_data$details$G),
  K = K_total,
  PC = 1:K_total,
  datafull = cbind(
    sim_data$details$G,
    cbind(fpca1$xiEst[, 1:result_joint$nPC_used$nPC1], 
          fpca2$xiEst[, 1:result_joint$nPC_used$nPC2])
  ),
  Y = outcome_data$Y
)

print(fstats)
#>      PC         RR        FF       cFF   Qvalue df       pvalue
#> [1,]  1 0.06929360 0.8160015 0.5010741 227.2324 21 1.468923e-36
#> [2,]  2 0.06516010 0.7639326 0.5788612 227.2324 21 1.468923e-36
#> [3,]  3 0.08536319 1.0228984 0.4612395 227.2324 21 1.468923e-36
#> [4,]  4 0.09989841 1.2164033 1.0989657 227.2324 21 1.468923e-36

FF: Standard F-statistic for instrument strength
cFF: Conditional F-statistic (accounts for other exposures)
Qvalue: Hansen’s J overidentification test statistic
pvalue: P-value for Q-test (high p-value = instruments are valid)

Rule of thumb: cFF > 10 indicates strong instruments.

Binary Outcomes

MV-FMR also supports binary outcomes using the control function approach:

# Generate binary outcome
outcome_binary <- getY_multi_exposure(
  sim_data,
  X1Ymodel = "2",
  X2Ymodel = "5",
  outcome_type = "binary"
)

# Estimate with control function
result_binary <- mvfmr(
  G = sim_data$details$G,
  fpca_results = list(fpca1, fpca2),
  Y = outcome_binary$Y,
  outcome_type = "binary",
  method = "cf",  # Control function for binary
  max_nPC1 = 3,
  max_nPC2 = 3,
  n_cores = 1,
  verbose = FALSE
)

print(result_binary)

Next Steps

Learn More

Univariable FMR: See vignette("univariable-fmr") for single exposure analysis
Advanced examples: Check inst/examples/test_MV-FMR.R for complete scenarios
Manuscript: See paper for methodological details

Citation

If you use this package, please cite:

Fontana, N., Ieva, F., Zuccolo, L., Di Angelantonio, E., & Secchi, P. (2025). Unraveling time-varying causal effects of multiple exposures: integrating Functional Data Analysis with Multivariable Mendelian Randomization. arXiv preprint arXiv:2512.19064.

Session Info

sessionInfo()
#> R version 4.3.0 (2023-04-21)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Ventura 13.6
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Europe/Rome
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] ggplot2_3.5.2 fdapace_0.5.9 mvfmr_0.1.0  
#> 
#> loaded via a namespace (and not attached):
#>  [1] gtable_0.3.6        shape_1.4.6         xfun_0.41          
#>  [4] bslib_0.6.1         htmlwidgets_1.6.4   lattice_0.21-8     
#>  [7] numDeriv_2016.8-1.1 vctrs_0.6.5         tools_4.3.0        
#> [10] generics_0.1.3      parallel_4.3.0      tibble_3.2.1       
#> [13] fansi_1.0.6         highr_0.10          cluster_2.1.4      
#> [16] pkgconfig_2.0.3     Matrix_1.6-4        data.table_1.14.10 
#> [19] checkmate_2.3.1     lifecycle_1.0.4     farver_2.1.1       
#> [22] compiler_4.3.0      stringr_1.5.1       progress_1.2.3     
#> [25] munsell_0.5.0       codetools_0.2-19    htmltools_0.5.7    
#> [28] sass_0.4.8          yaml_2.3.8          glmnet_4.1-8       
#> [31] htmlTable_2.4.2     Formula_1.2-5       pracma_2.4.4       
#> [34] pillar_1.9.0        crayon_1.5.3        jquerylib_0.1.4    
#> [37] MASS_7.3-58.4       cachem_1.0.8        Hmisc_5.1-1        
#> [40] iterators_1.0.14    rpart_4.1.19        foreach_1.5.2      
#> [43] tidyselect_1.2.0    digest_0.6.33       stringi_1.8.3      
#> [46] dplyr_1.1.4         labeling_0.4.3      splines_4.3.0      
#> [49] fastmap_1.1.1       grid_4.3.0          colorspace_2.1-0   
#> [52] cli_3.6.2           magrittr_2.0.3      base64enc_0.1-3    
#> [55] survival_3.5-7      utf8_1.2.4          withr_2.5.2        
#> [58] foreign_0.8-84      prettyunits_1.2.0   scales_1.3.0       
#> [61] backports_1.4.1     rmarkdown_2.25      nnet_7.3-18        
#> [64] gridExtra_2.3       hms_1.1.3           evaluate_0.23      
#> [67] knitr_1.45          doParallel_1.0.17   rlang_1.1.6        
#> [70] Rcpp_1.0.11         glue_1.6.2          pROC_1.18.5        
#> [73] rstudioapi_0.15.0   jsonlite_1.8.8      R6_2.5.1           
#> [76] plyr_1.8.9

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.