The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

mvfmr: Multivariable Functional Mendelian Randomization

arXiv

Citation

If you use this package, please cite:

Fontana, N., Ieva, F., Zuccolo, L., Di Angelantonio, E., & Secchi, P. (2025). Unraveling time-varying causal effects of multiple exposures: Integrating functional data analysis with multivariable Mendelian randomization. arXiv. https://arxiv.org/abs/2512.19064
@misc{fontana2025unravelingtimevaryingcausaleffects,
      title={Unraveling time-varying causal effects of multiple exposures: integrating Functional Data Analysis with Multivariable Mendelian Randomization}, 
      author={Nicole Fontana and Francesca Ieva and Luisa Zuccolo and Emanuele Di Angelantonio and Piercesare Secchi},
      year={2025},
      eprint={2512.19064},
      archivePrefix={arXiv},
      primaryClass={stat.AP},
      url={https://arxiv.org/abs/2512.19064}, 
}

Overview

The mvfmr package implements Multivariable Functional Mendelian Randomization methods to estimate time-varying causal effects of longitudinal exposures on health outcomes. The package supports:

Installation

Install the package:

install.packages("devtools") # Install devtools if not already installed
devtools::install_github("NicoleFontana/mvfmr") # Install mvfmr from GitHub

Required dependencies:

install.packages(c("fdapace", "ggplot2", "glmnet", "pROC", "parallel", "doParallel", "foreach", "progress", "dplyr", "gridExtra"))

Test Scripts and Simulations

The package includes comprehensive test scripts demonstrating different use cases:

1. Manuscript Simulations (tests_manuscript.R)

Reproduces main simulation scenarios from the manuscript: - Scenario 1-3: pleiotropy, null effects, mediation - Exposure effects: linear and quadratic - Performance comparison: MV-FMR vs U-FMR across scenarios - Evaluation: MISE, coverage rates

# Download and run manuscript simulations
source("https://raw.githubusercontent.com/NicoleFontana/mvfmr/demo/tests_manuscript.R")

2. Multivariable FMR Tutorial (test_MV-FMR.R)

Complete tutorial for using joint multivariable estimation: - Data simulation with two exposures - FPCA and component selection - Joint estimation with fmvmr() - Instrument diagnostics - Performance metrics and visualization - Binary outcome analysis - Comparison with univariable estimation

# Download and explore MV-FMR tutorial
source("https://raw.githubusercontent.com/NicoleFontana/mvfmr/demo/test_MV-FMR.R")

3. Univariable FMR Tutorial (test_U-FMR.R)

Complete tutorial for single exposure analysis: - Single exposure simulation - Univariable estimation with fmvmr_separate() - Instrument diagnostics - Performance metrics and visualization - Comparison of different exposure effects - Binary outcome analysis

# Download and explore U-FMR tutorial
source("https://raw.githubusercontent.com/NicoleFontana/mvfmr/demo/test_U-FMR.R")

Note: These scripts serve as templates for your own analyses. Download them, modify the parameters, and adapt to your specific research questions.

Quick Start

Example 1: Multivariable Functional MR (two exposures)

library(fdapace)

# Step 1: Simulate exposures data
set.seed(12345)
sim_data <- getX_multi_exposure(
  N = 1000,           # Sample size
  J = 50,             # Number of genetic instruments
  nSparse = 10       # Sparse observations per subject
)

# Step 2: Generate outcome 
outcome_data <- getY_multi_exposure(
  sim_data,
  X1Ymodel = "2",     # Linear effect for exposure 1
  X2Ymodel = "8",     # Quadratic effect for exposure 2
  X1_effect = TRUE,
  X2_effect = TRUE,
  outcome_type = "continuous"
)

# Step 3: Functional PCA for both the exposures
fpca1 <- FPCA(
  sim_data$X1$Ly_sim, 
  sim_data$X1$Lt_sim,
  list(dataType = 'Sparse', error = TRUE, verbose = FALSE)
)

fpca2 <- FPCA(
  sim_data$X2$Ly_sim, 
  sim_data$X2$Lt_sim,
  list(dataType = 'Sparse', error = TRUE, verbose = FALSE)
)

# Step 4: Joint estimation with MV-FMR
result <- fmvmr(
  G = sim_data$details$G,
  fpca_results = list(fpca1, fpca2),
  Y = outcome_data$Y,
  outcome_type = "continuous",
  method = "gmm",
  max_nPC1 = 10,
  max_nPC2 = 10,
  bootstrap = TRUE,
  n_bootstrap = 100
)

# View results
print(result)
summary(result)
plot(result)  # Visualize time-varying effects

# Extract coefficients and effects
coef(result)
result$effects$effect1  # Time-varying effect for exposure 1
result$effects$effect2  # Time-varying effect for exposure 2

Example 2: Univariable Functional MR (one exposure)

The package can also be used for single exposure analysis (U-FMR):

library(fdapace)

# Step 1: Simulate exposure data
set.seed(12345)
sim_data <- getX_multi_exposure(
  N = 1000,
  J = 50,
  nSparse = 10,
  shared_effect = FALSE
)

# Step 2: Generate outcome (only exposure 1 has effect)
outcome_data <- getY_multi_exposure(
  sim_data,
  X1Ymodel = "2",     # Linear effect for exposure 1
  X2Ymodel = "0",     # No effect for exposure 2
  X1_effect = TRUE,
  X2_effect = FALSE,
  outcome_type = "continuous"
)

# Step 3: FPCA for exposure 1 only
fpca1 <- FPCA(
  sim_data$X1$Ly_sim, 
  sim_data$X1$Lt_sim,
  list(dataType = 'Sparse', error = TRUE, verbose = FALSE)
)

# Step 4: Univariable estimation for exposure 1 only
result <- fmvmr_separate(
  G1 = sim_data$details$G,  # Instruments for exposure 1
  G2 = NULL,                # No instruments for exposure 2
  fpca_results = list(fpca1),
  Y = outcome_data$Y,
  outcome_type = "continuous",
  method = "gmm",
  max_nPC1 = 10,
  max_nPC2 = 10
)

# View results for exposure 1
print(result)
coef(result, exposure = 1)
result$exposure1$effect  # Time-varying effect

Example 3: Univariable separate estimation of multiple exposures

Compare joint vs. univariable separate estimation:

# Joint estimation (MV-FMR)
result_joint <- fmvmr(
  G = sim_data$details$G,
  fpca_results = list(fpca1, fpca2),
  Y = outcome_data$Y
)

# Separate estimation (U-FMR for each exposure independently)
result_separate <- fmvmr_separate(
  G1 = sim_data$details$G,
  G2 = sim_data$details$G,
  fpca_results = list(fpca1, fpca2),
  Y = outcome_data$Y
)

# Compare performance
result_joint$performance
result_separate$exposure1$performance
result_separate$exposure2$performance

Example 4: Multvariable two-sample Functional MR

Use outcome GWAS summary statistics instead of individual-level outcome data.

library(fdapace)

# Step 1: Simulate exposure data (individual-level)
set.seed(12345)
sim_data <- getX_multi_exposure(
  N = 5000,           # Exposure sample size
  J = 30,             # Number of genetic instruments (SNPs)
  nSparse = 10
)

# Perform FPCA on longitudinal exposures
fpca1 <- FPCA(
  sim_data$X1$Ly_sim, 
  sim_data$X1$Lt_sim,
  list(dataType = 'Sparse', error = TRUE, verbose = FALSE)
)

fpca2 <- FPCA(
  sim_data$X2$Ly_sim, 
  sim_data$X2$Lt_sim,
  list(dataType = 'Sparse', error = TRUE, verbose = FALSE)
)

# Step 2: Get outcome GWAS summary statistics (from separate study)
# Load by_outcome, sy_outcome, ny_outcome from a public GWAS

# Simulate obtaining summary statistics from a separate GWAS
# (This mimics what you'd get from a published GWAS)
by_outcome <- rnorm(30, mean = 0.02, sd = 0.01)  # SNP-outcome associations
sy_outcome <- runif(30, 0.005, 0.015)            # Standard errors
ny_outcome <- 100000                             # GWAS sample size

# Step 3: Two-sample MV-FMR estimation
result_twosample <- fmvmr_twosample(
  G_exposure = sim_data$details$G,        # Genotypes from exposure sample
  fpca_results = list(fpca1, fpca2),      # FPCA from exposures
  by_outcome = by_outcome,                # GWAS betas (from outcome study)
  sy_outcome = sy_outcome,                # GWAS standard errors
  ny_outcome = ny_outcome,                # GWAS sample size
  max_nPC1 = 3,
  max_nPC2 = 3,
  verbose = TRUE
)

# Step 4: View results
print(result_twosample)

# Extract time-varying effects
result_twosample$effects$effect1 
result_twosample$effects$effect2  

Main Functions

Data Simulation

getX_multi_exposure() - Generate genetic instruments and exposure data

getX_multi_exposure(
  N = 1000,                  # Sample size
  J = 50,                    # Number of genetic instruments
  nSparse = 10,              # Observations per subject
  shared_G_proportion = 0.15 # Proportion of shared instruments (0-1)
)

getX_multi_exposure_mediation() - Generate data with mediation

getX_multi_exposure_mediation(
  N = 1000,                  # Sample size
  J = 50,                    # Number of genetic instruments
  nSparse = 10,              # Observations per subject
  mediation_strength = 0.3,  # Strength of X1 → X2
  mediation_type = "linear"  # "linear", "nonlinear", "time_varying"
)

getY_multi_exposure() - Generate outcome with time-varying effects

getY_multi_exposure(
  RES,                         # Output from getX_multi_exposure()
  X1Ymodel = "2",              # Effect model for X1 (see below)
  X2Ymodel = "8",              # Effect model for X2
  X1_effect = TRUE,            # X1 has an effect on Y
  X2_effect = TRUE,            # X2 has an effect on Y
  outcome_type = "continuous"  # "continuous" or "binary"
)

Available effect models: - "0" - No effect (null) - "1" - Constant effect (β = 0.1) - "2" - Linear increasing (β(t) = 0.02×t) - "3" - Linear decreasing (β(t) = 0.5 - 0.02×t) - "4" - Early life effect (β(t) = 0.1 for t < 20) - "5" - Late life effect (β(t) = 0.1 for t > 30) - "6" - Early decreasing (β(t) = 0.05×(20-t) for t < 20) - "7" - Late increasing (β(t) = 0.05×(t-30) for t > 30) - "8" - Quadratic (β(t) = 0.002×t² - 0.11×t + 0.5) - "9" - Cubic (β(t) = -0.00002×t³ + 0.004×t² - 0.2×t + 1)

Estimation Functions

fmvmr() - Joint multivariable estimation

fmvmr(
  G,                                     # Genetic instrument matrix (N × J)
  fpca_results,                          # List of 2 FPCA objects
  Y,                                     # Outcome vector
  outcome_type = "continuous",           # Type of outcome: "continuous" for numeric outcomes, "binary" for 0/1 outcomes
  method = "gmm",                        # Estimation method: "gmm" (Generalized Method of Moments), "cf" (control function), or "cf-lasso" (control function with Lasso)
  nPC1_selected = NA,                    # Fixed number of principal components to retain for exposure 1 (NA = select automatically)
  max_nPC1 = NA,                         # Maximum number of principal components to retain for exposure 1 (NA = automatically determined)
  nPC2_selected = NA,                    # Fixed number of principal components to retain for exposure 2 (NA = select automatically)
  max_nPC2 = NA,                         # Maximum number of principal components to retain for exposure 2 (NA = automatically determined)
  improvement_threshold = 0.001,         # Minimum cross-validation improvement required to add an additional principal component
  bootstrap = FALSE,                     # Whether to compute confidence intervals using bootstrap resampling
  n_bootstrap = 100,                     # Number of bootstrap replicates (only used if bootstrap = TRUE)
  n_cores = parallel::detectCores() - 1, # Number of CPU cores to use for parallel computations
  verbose = TRUE                         # Print progress and diagnostic messages during computation
)

fmvmr_separate() - Separate univariable estimation

fmvmr_separate(
  G1,                    # Genetic instrument matrix for exposure 1
  G2,                    # Genetic instrument matrix for exposure 2, or NULL if only a single exposure is analyzed
  fpca_results,          # List of FPCA objects
  Y,                     # Outcome vector
  outcome_type = "continuous", # Type of outcome: "continuous" for numeric outcomes, "binary" for 0/1 outcomes
  method = "gmm",        # Estimation method: "gmm" (Generalized Method of Moments), "cf" (control function), or "cf-lasso" (control function with Lasso)
  nPC1_selected = NA,    # Fixed number of principal components to retain for exposure 1 (NA = select automatically)
  max_nPC1 = NA,         # Maximum number of principal components to retain for exposure 1 (NA = automatically determined)
  nPC2_selected = NA,    # Fixed number of principal components to retain for exposure 2 (NA = select automatically)
  max_nPC2 = NA,         # Maximum number of principal components to retain for exposure 2 (NA = automatically determined; ignored if G2 is NULL)
  improvement_threshold = 0.001,         # Minimum cross-validation improvement required to add an additional principal component
  bootstrap = FALSE,     # Whether to compute confidence intervals using bootstrap resampling
  n_bootstrap = 100,     # Number of bootstrap replicates (only used if bootstrap = TRUE)
  n_cores = parallel::detectCores() - 1, # Number of CPU cores to use for parallel computations
  verbose = TRUE                         # Print progress and diagnostic messages during computation
)

fmvmr_twosample() - Two-sample joint multivariable estimation

rfmvmr_twosample(
  G_exposure,            # Genetic instrument matrix from the exposure sample (N × J)
  fpca_results,          # List of 2 FPCA objects corresponding to the two exposures from the exposure data
  by_outcome,            # Vector of SNP-outcome effect estimates (betas) from the outcome GWAS, length J
  sy_outcome,            # Vector of standard errors for SNP-outcome effects, length J
  ny_outcome,            # Sample size of the outcome GWAS
  max_nPC1 = NA,         # Maximum number of principal components to retain for exposure 1 (NA = default)
  max_nPC2 = NA,         # Maximum number of principal components to retain for exposure 2 (NA = default)
  true_effects = NULL,   # For simulation studies: list containing true effects for exposure 1 and exposure 2 (e.g., list(model1, model2))
  verbose = TRUE         # Print progress messages and diagnostics during computation
)

fmvmr_separate_twosample() - Two-sample separate univariable estimation

rfmvmr_separate_twosample(
  G1_exposure,           # Genetic instrument matrix from exposure 1 (N × J1)
  G2_exposure = NULL,    # Genetic instrument matrix from exposure 2 (N × J2) or NULL for single exposure
  fpca_results,          # List of 2 FPCA objects
  by_outcome1,           # SNP-outcome betas for exposure 1 instruments
  by_outcome2 = NULL,    # SNP-outcome betas for exposure 2 or NULL
  sy_outcome1,           # Standard errors for exposure 1
  sy_outcome2 = NULL,    # Standard errors for exposure 2 or NULL
  ny_outcome,            # Outcome GWAS sample size
  max_nPC1 = NA,         # Maximum number of principal components to retain for exposure 1 (NA = automatically determined)
  max_nPC2 = NA,         # Maximum number of principal components to retain for exposure 2 (NA = automatically determined)
  true_effects = NULL,   # List containing true effects for exposure 1 and exposure 2 (simulation only)
  verbose = TRUE         # Print progress messages and diagnostics during computation
)

Utility Functions

IS() - Calculate instrument strength (F-statistics)

IS(
  J,                     # Number of genetic instruments
  K,                     # Number of exposures
  PC,                    # Vector of indices indicating which columns in datafull correspond to the principal components
  datafull,              # Data frame containing instruments (first J columns) and principal components (subsequent columns) [G, X]
  Y                      # Optional outcome vector; if provided, Q-statistic for overidentification is calculated
)

Methods

The package supports three estimation methods:

  1. GMM (Generalized Method of Moments) - For continuous outcomes
  2. Control Function (CF) - For binary outcomes
  3. Control Function (CF)-LASSO - Control function with LASSO regularization

Output Objects

fmvmr object (from fmvmr())

result <- fmvmr(...)
names(result)

Components: - coefficients - Estimated β coefficients for basis functions - vcov - Variance-covariance matrix - effects - List with effect1, effect2, time_grid - confidence_intervals - Upper and lower bounds - nPC_used - Selected components (nPC1, nPC2) - performance - MISE and coverage (only for simulations) - plots - ggplot2 objects for visualization

Methods: - print(), summary() - Display results - plot() - Visualize time-varying effects - coef() - Extract coefficients - vcov() - Extract variance-covariance matrix

fmvmr_separate object (from fmvmr_separate())

result <- fmvmr_separate(...)
names(result)

Components: - exposure1 - Results for exposure 1 - coefficients, vcov, effect, nPC_used, performance - exposure2 - Results for exposure 2 (if provided) - coefficients, vcov, effect, nPC_used, performance - plots - Visualization objects

Methods: - coef(result, exposure = 1) - Extract coefficients for specific exposure - vcov(result, exposure = 1) - Extract variance-covariance matrix

Binary Outcomes

For binary outcomes, use method = "cf" or method = "cf-lasso":

# Generate binary outcome
outcome_binary <- getY_multi_exposure(
  sim_data,
  X1Ymodel = "2",
  X2Ymodel = "8",
  outcome_type = "binary"
)

# Estimate with control function
result <- fmvmr(
  G = sim_data$details$G,
  fpca_results = list(fpca1, fpca2),
  Y = outcome_binary$Y,
  outcome_type = "binary",
  method = "cf"
)

Advanced Features

Component Selection

Automatic selection via cross-validation:

result <- fmvmr(
  G = G,
  fpca_results = list(fpca1, fpca2),
  Y = Y,
  max_nPC1 = 10,              # Search up to 10 components
  max_nPC2 = 10,
  improvement_threshold = 0.01 # Stop if improvement < 1%
)

# View selected components
result$nPC_used

Bootstrap Inference

result <- fmvmr(
  G = G,
  fpca_results = list(fpca1, fpca2),
  Y = Y,
  bootstrap = TRUE,
  n_bootstrap = 200  # Number of bootstrap replicates
)

# Bootstrap confidence intervals available in:
result$confidence_intervals

Parallel Processing

result <- fmvmr(
  G = G,
  fpca_results = list(fpca1, fpca2),
  Y = Y,
  n_cores = 4  # Use 4 cores for cross-validation
)

Mediation Analysis

# Generate data with X1 → X2 mediation
sim_mediation <- getX_multi_exposure_mediation(
  N = 1000,
  J = 50,
  mediation_strength = 0.5,
  mediation_type = "linear"
)

outcome <- getY_multi_exposure(
  sim_mediation,
  X1Ymodel = "2",  # Direct effect of X1
  X2Ymodel = "1",  # Effect of X2 (mediator)
  outcome_type = "continuous"
)

# Estimate with MV-FMR to capture mediation
result <- fmvmr(
  G = sim_mediation$details$G,
  fpca_results = list(fpca1, fpca2),
  Y = outcome$Y
)

Instrument Strength Diagnostics

Check instrument strength with F-statistics:

# After FPCA
K_total <- fpca1$selectK + fpca2$selectK

fstats <- IS(
  J = ncol(G),
  K = K_total,
  PC = 1:K_total,
  datafull = cbind(
    G,
    cbind(fpca1$xiEst[, 1:fpca1$selectK], 
          fpca2$xiEst[, 1:fpca2$selectK])
  )
)

# View conditional F-statistics (cFF)
print(fstats)

Performance Metrics

When true effects are provided (simulations):

This package extends the univariable functional Mendelian Randomization framework to the multivariable setting. Key related work:

TVMR Package (Univariable Functional MR)

Our implementation builds upon and extends the TVMR package by Tian et al.:

Tian, H., Mason, A. M., Liu, C., & Burgess, S. (2024). Estimating time‐varying exposure effects through continuous‐time modelling in Mendelian randomization. Statistics in Medicine, 43(26), 5006-5024. https://doi.org/10.1002/sim.10222

GitHub: https://github.com/HDTian/TVMR

Author

Nicole Fontana

License

[License information]

Getting Help

For questions and issues: - Open an issue on GitHub - Email: nicole.fontana@polimi.it

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.