The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Post-Harmonization Downstream Analysis

Introduction

After harmonization, multiple-sites batch effect get controlled (or ideally eliminated). Some studies require further investigation of life span age trend of brain structures or other significant variable effects on brain structures. ComBatFamQC provides a post harmonization tool to:

Set up

Import ComBatFamQC package and read in harmonized data set for age trend visualization. We use age_df data for age trend visualization in the vignette. To be noticed the read-in data set should be a data frame (not tibble).

library(ComBatFamQC)
data(age_df)

Life Span Age Trend Visualization

In this step, we need to generate a list of data sets for all ROIs. Each ROI’s data set contains four columns:

age_df <- data.frame(age_df)
features <- colnames(age_df)[c(6:56)]
age <- "age"
sex <- "sex"
icv <- "ICV_baseline"
age_df[[sex]] <- as.factor(age_df[[sex]])

Create a list of data sets for all ROIs

# Create sub_df for different features
sub_df_list <- lapply(seq_len(length(features)), function(i){
    sub_df <- age_df[,c(features[i], age, sex, icv)] %>% na.omit()
    colnames(sub_df) <- c(features[i], "age", "sex", "icv")
    return(sub_df)
  })

Create age trend estimation for all ROIs

# For MAC users
library(parallel)
age_list <- mclapply(seq_len(length(features)), function(w){
  age_sub <- age_list_gen (sub_df = sub_df_list[[w]],  lq = 0.25, hq = 0.75)
  return(age_sub)
}, mc.cores = detectCores()) 

# For Windows users
age_list <- mclapply(1:length(features), function(w){
  age_sub <- age_list_gen (sub_df = sub_df_list[[w]],  lq = 0.25, hq = 0.75)
  return(age_sub)
}, mc.cores = 1) 

names(age_list) <- features

quantile_type <- c(paste0("quantile_", 100*0.25), "median", paste0("quantile_", 100*0.75))

Launch Shiny App for Visualization

Users can choose to generate age trend plots using the ggplot package or the plotly package (if plotly is installed).

# plotly: interactive plot
ComBatFamQC::age_shiny(age_list, features, quantile_type, use_plotly = TRUE)
# ggplot: static plot
ComBatFamQC::age_shiny(age_list, features, quantile_type, use_plotly = FALSE)

Save Age Trend Table and GAMLSS Model

# Save age trend table
temp_dir <- tempfile()
dir.create(temp_dir)
age_save(path = temp_dir, age_list = age_list)

# Save GAMLSS Model
gamlss_model <- lapply(seq_len(length(age_list)), function(i){
        g_model <- age_list[[i]]$model
        return(g_model)})
names(gamlss_model) <- names(age_list)
saveRDS(gamlss_model, file = file.path(temp_dir, "gamlss_model.rds"))

Residual Generation

In this step, we would like to generate different sets of residuals removing specific covariates’ effects.

Get harmonized data set

features <- colnames(adni)[c(43:104)]
covariates <- c("timedays", "AGE", "SEX", "DIAGNOSIS")
interaction <- c("timedays,DIAGNOSIS")
batch <- "manufac"
combat_model <- combat_harm(type = "lm", features = features, batch = batch, covariates = covariates, interaction = interaction, smooth = NULL, random = NULL, df = adni)
harmonized_df <- combat_model$harmonized_df

Generate residual data set

Specify parameters carefully based on regression type and which covariates’ effects to remove.

Generate residuals from scratch

# generate residuals by removing timedays and DIAGNOSIS effects, while preserving AGE and SEX effects.
result_residual <- residual_gen(type = "lm", features = features, covariates = covariates, interaction = interaction, smooth = NULL, df = harmonized_df, rm = c("timedays", "DIAGNOSIS"))

# save residual data set
write.csv(result_residual$residual, file.path(temp_dir, "residual.csv"))

# save regression model
saveRDS(result_residual$model, file.path(temp_dir, "regression_model.rds"))

Generate residuals from existing model

result_residual <- residual_gen(df = harmonized_df, rm = c("timedays", "DIAGNOSIS"), model = TRUE, model_path = file.path(temp_dir, "regression_model.rds"))

# save residual data set
write.csv(result_residual$residual, file.path(temp_dir, "residual.csv"))
# Clean up the temporary file
unlink(temp_dir, recursive = TRUE)

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.