The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
The need for larger samples in human subjects research have led to a trend of aggregating data across multiple locations (sites). This trend is especially prevalent in neuroimaging research. However, while the larger samples promoted greater power to detect significant associations as well as better generalizability of results, multiple-site study designs also introduce heterogeneity in acquisition and processing, which might consequently impact study findings.
ComBat is used as the major harmonization technique in neuroimaging
and the ComBat Family further extends the original ComBat methodology to
enable flexible covariate modeling, leveraging efficient R
implementations of regression models. However, it remains a challenge to
evaluate potential batch effects as well as the performance of
harmonization. ComBatFamQC
provides a useful visualization
tool through Rshiny for interactive batch effect diagnostics before and
after harmonization. To streamline the harmonization process and improve
efficiency, transparency, and reproducibility, ComBatFamQC
also provides default harmonization methods by integrating the
ComBatFamily
package.
The ComBatFamQC
visualization includes three key
functions:
visual_prep
: provides all relevant statistical test
results for batch effect visualization and evaluation.combat_harm
: provides default harmonization methods
from the ComBatFamily package.comfam_shiny
: generate interactive visualization
through Rshiny.The ComBatFamQC
includes the following
harmonizationmethods:
Install the ComBatFamQC
package and read in the data set
to be harmonized. ADNI data is used in the vignette for illustration. To
be noticed, the data set should include at least the following
columns:
Three types of regression models can be considered for batch effect evaluation:
lm
: Linear regression model, which assumes that the
relationship between the variables is linear.gam
: Generalized additive model, which models the
relationship between the dependent variable and certain independent
variable as a smooth, non-linear function, typically using splines.lmer
: Linear mixed-effects model, which extends the
linear regression model to account for both fixed effects and random
effects. It is commonly used in longitudinal datasets.(Note: For Windows users, make sure to set
cores = 1
in visual_prep
function. The MDMR
test can be time-consuming, especially in large datasets. Users have the
option to disable the MDMR test by setting
mdmr = FALSE
.)
features <- colnames(adni)[c(43:104)]
covariates <- c("timedays", "AGE", "SEX", "DIAGNOSIS")
interaction <- c("timedays,DIAGNOSIS")
batch <- "manufac"
result_orig <- visual_prep(type = "lm", features = features, batch = batch, covariates = covariates, interaction = interaction, smooth = NULL, random = NULL, df = adni)
comfam_shiny(result_orig)
There are two export options: 1) generate a Quarto report (requires Quarto to be installed), and 2) generate a combined Excel file.
#library(quarto)
temp_dir <- tempfile()
dir.create(temp_dir)
diag_save(path = temp_dir, result = result_lmer, use_quarto = TRUE)
There are two types of harmonization scenarios users can choose from:
Specify parameters carefully based on the harmonization method to be applied.
features <- colnames(adni)[c(43:104)]
covariates <- c("timedays", "AGE", "SEX", "DIAGNOSIS")
interaction <- c("timedays,DIAGNOSIS")
batch <- "manufac"
## Harmonize using evaluation results as the inputs
combat_model <- combat_harm(result = result_orig, type = "lm", interaction = interaction, smooth = NULL, random = NULL, df = adni)
## Harmonize through specifying features, batch, covariates and df arguments
combat_model_copy <- combat_harm(type = "lm", features = features, batch = batch, covariates = covariates, interaction = interaction, smooth = NULL, random = NULL, df = adni)
## Expect to get the same harmonization results
identical(combat_model$harmonized_df, combat_model_copy$harmonized_df)
# save harmonized data
write.csv(combat_model$harmonized_df, file.path(temp_dir, "harmonized.csv"))
# save combat model
saveRDS(combat_model$combat.object, file.path(temp_dir, "combat_model.rds"))
# Clean up the temporary file
unlink(temp_dir, recursive = TRUE)
## Harmonize using evaluation results as the inputs
combat_model_lmer <- combat_harm(result = result_lmer, type = "lmer", interaction = interaction, smooth = NULL, random = "subid", df = adni)
## Harmonize through specifying features, batch, covariates and df arguments
combat_model_lmer_copy <- combat_harm(type = "lmer", features = features, batch = batch, covariates = covariates, interaction = interaction, smooth = NULL, random = "subid", df = adni)
## Expect to get the same harmonization results
identical(combat_model_lmer$harmonized_df, combat_model_lmer_copy$harmonized_df)
## Harmonize using evaluation results as the inputs
combat_model_gam <- combat_harm(result = result_gam, type = "gam", interaction = interaction, smooth = "AGE", smooth_int_type = "linear", df = adni)
## Harmonize through specifying features, batch, covariates and df arguments
combat_model_gam_copy <- combat_harm(type = "gam", features = features, batch = batch, covariates = covariates, interaction = interaction, smooth = "AGE", smooth_int_type = "linear", df = adni)
## Expect to get the same harmonization results
identical(combat_model_gam$harmonized_df, combat_model_gam_copy$harmonized_df)
## Harmonize using evaluation results as the inputs
covbat_model <- combat_harm(result = result_gam, type = "gam", interaction = interaction, smooth = "AGE", smooth_int_type = "linear", df = adni, family = "covfam")
## Harmonize through specifying features, batch, covariates and df arguments
covbat_model_copy <- combat_harm(type = "gam", features = features, batch = batch, covariates = covariates, interaction = interaction, smooth_int_type = "linear", smooth = "AGE", df = adni, family = "covfam")
## Expect to get the same harmonization results
identical(covbat_model$harmonized_df, covbat_model_copy$harmonized_df)
Specify predict
parameter to be TRUE and
object
parameter to be saved ComBat model.
Specify reference
parameter to be saved reference data.
To be noticed, the reference data should have identical columns as the
new data and the new data should contain reference data as its sub
sample.
# harmonize reference data
reference_site <- adni %>% group_by(site) %>% summarize(count = n()) %>% arrange(desc(count)) %>% pull(site) %>% head(30)
reference_df <- adni %>% filter(site %in% reference_site)
features <- colnames(reference_df)[c(43:104)]
covariates <- c("timedays", "AGE", "SEX", "DIAGNOSIS")
interaction <- c("timedays,DIAGNOSIS")
batch <- "site"
ref_model <- combat_harm(type = "lmer", features = features, batch = batch, covariates = covariates, interaction = interaction, smooth = NULL, random = "subid", df = reference_df)
# harmonize new data to the reference data
harm_new <- combat_harm(type = "lmer", features = features, batch = batch, covariates = covariates, interaction = interaction, smooth = NULL, random = "subid", df = adni, reference = ref_model$harmonized_df)
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.