comorbidPGS

comorbidPGS is a tool for analysing the distribution of an already computed Polygenic Score (PGS, also known as PRS/GRS for binary outcomes) to investigate shared genetic aetiology across multiple conditions.

comorbidPGS is released under the GPL-3 license and is freely available for download.

Prerequisite

Installation

comorbidPGS is available on CRAN. You can install it with the following command:

install.packages("comorbidPGS")

If you prefer the latest stable development version, you can install it from GitHub with:

# install devtools only if it is missing, without attaching it
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("VP-biostat/comorbidPGS")

Example

Building an Association Table of PGS

This basic example shows how to run an association analysis with the example dataset:

library(comorbidPGS)
#> 
#> Attaching package: 'comorbidPGS'
#> The following object is masked from 'package:graphics':
#> 
#>     assocplot

# use the demo dataset
dataset <- comorbidData
# NOTE: The dataset must have at least 3 different columns:
# - an ID column (the first one)
# - a PGS column (must be numeric, by default it is the column named "SCORESUM" or the second column if "SCORESUM" is not present)
# - a Phenotype column, can be factors, numbers or characters

# do an association of one PGS with one Phenotype
result_1 <- assoc(dataset, prs_col = "t2d_PGS", phenotype_col = "t2d")
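
| PGS | Phenotype | Phenotype_type | Statistical_method | Covar | N_cases | N_controls | N | Effect | SE | lower_CI | upper_CI | P_value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| t2d_PGS | t2d | Cases/Controls | Binary logistic regression | NA | 730 | 9270 | 10000 | 1.688258 | NA | 1.561821 | 1.824931 | 0 |
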
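
To make the Effect, lower_CI and upper_CI columns easier to read, here is a minimal base-R sketch of a binary logistic regression on a simulated score. It does not use comorbidPGS and is not the package's internal code. The data, the variable names and the assumed effect size are all hypothetical, and it assumes the score is standardised so that the effect is an odds ratio per standard deviation. The interval reported above is symmetric around the Effect on the log scale, which is consistent with a Wald interval of the kind computed here.

```r
# Simulated data (hypothetical; not the comorbidData demo set)
set.seed(1)
n <- 5000
pgs <- rnorm(n)                       # raw polygenic score
true_log_or <- log(1.7)               # assumed effect per SD of the score
disease <- rbinom(n, 1, plogis(-2.5 + true_log_or * pgs))

# Standardise the score so the effect is per standard deviation
pgs_std <- as.numeric(scale(pgs))

# Binary logistic regression of case/control status on the score
fit <- glm(disease ~ pgs_std, family = binomial())
est <- coef(summary(fit))["pgs_std", ]

# Odds ratio and Wald 95% confidence interval (symmetric on the log scale)
odds_ratio <- exp(est[["Estimate"]])
ci <- exp(est[["Estimate"]] + c(-1, 1) * qnorm(0.975) * est[["Std. Error"]])
print(round(c(OR = odds_ratio, lower_CI = ci[1], upper_CI = ci[2]), 3))
```

An odds ratio above 1 with an interval that excludes 1 indicates that higher scores go with higher odds of being a case.
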
# do multiple associations
# (the table is named assoc_table so that it does not mask the assoc() function)
assoc_table <- expand.grid(c("t2d_PGS", "ldl_PGS"), c("ethnicity","brc","t2d","log_ldl","sbp_cat"))
result_2 <- multiassoc(df = dataset, assoc_table = assoc_table, covar = c("age", "sex", "gen_array"))
#> Warning in phenotype_type(df = df, phenotype_col = phenotype_col): Phenotype
#> column log_ldl is continuous and not normal, please normalise prior association

#> Warning in phenotype_type(df = df, phenotype_col = phenotype_col): Phenotype
#> column log_ldl is continuous and not normal, please normalise prior association

| | PGS | Phenotype | Phenotype_type | Statistical_method | Covar | N_cases | N_controls | N | Effect | SE | lower_CI | upper_CI | P_value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | t2d_PGS | ethnicity 1 ~ 2 | Categorical | Multinomial logistic regression | age+sex+gen_array | 2142 | 6381 | 8523 | 0.9814174 | NA | 0.9345150 | 1.0306739 | 0.4528020 |
| 3 | t2d_PGS | ethnicity 1 ~ 3 | Categorical | Multinomial logistic regression | age+sex+gen_array | 1205 | 6381 | 7586 | 1.0178971 | NA | 0.9570931 | 1.0825640 | 0.5724292 |
| 4 | t2d_PGS | ethnicity 1 ~ 4 | Categorical | Multinomial logistic regression | age+sex+gen_array | 272 | 6381 | 6653 | 0.9434640 | NA | 0.8355980 | 1.0652542 | 0.3474694 |
| 21 | ldl_PGS | ethnicity 1 ~ 2 | Categorical | Multinomial logistic regression | age+sex+gen_array | 2142 | 6381 | 8523 | 0.9925623 | NA | 0.9451678 | 1.0423334 | 0.7648927 |
| 31 | ldl_PGS | ethnicity 1 ~ 3 | Categorical | Multinomial logistic regression | age+sex+gen_array | 1205 | 6381 | 7586 | 1.0083869 | NA | 0.9481215 | 1.0724830 | 0.7905175 |
| 41 | ldl_PGS | ethnicity 1 ~ 4 | Categorical | Multinomial logistic regression | age+sex+gen_array | 272 | 6381 | 6653 | 0.9760204 | NA | 0.8647226 | 1.1016433 | 0.6943783 |
| 1 | t2d_PGS | brc | Cases/Controls | Binary logistic regression | age+sex+gen_array | 402 | 5041 | 5443 | 1.0061678 | NA | 0.9087543 | 1.1140235 | 0.9057882 |
| 11 | ldl_PGS | brc | Cases/Controls | Binary logistic regression | age+sex+gen_array | 402 | 5041 | 5443 | 1.1037106 | NA | 0.9956370 | 1.2235153 | 0.0605407 |
| 12 | t2d_PGS | t2d | Cases/Controls | Binary logistic regression | age+sex+gen_array | 730 | 9270 | 10000 | 1.7359738 | NA | 1.6029867 | 1.8799938 | 0.0000000 |
| 13 | ldl_PGS | t2d | Cases/Controls | Binary logistic regression | age+sex+gen_array | 730 | 9270 | 10000 | 0.9823272 | NA | 0.9102411 | 1.0601223 | 0.6465580 |
| 14 | t2d_PGS | log_ldl | Continuous | Linear regression | age+sex+gen_array | NA | NA | 10000 | 0.0059961 | 0.0022747 | 0.0015378 | 0.0104544 | 0.0084010 |
| 15 | ldl_PGS | log_ldl | Continuous | Linear regression | age+sex+gen_array | NA | NA | 10000 | 0.0828545 | 0.0021183 | 0.0787027 | 0.0870064 | 0.0000000 |
| 16 | t2d_PGS | sbp_cat | Ordered Categorical | Ordinal logistic regression | age+sex+gen_array | NA | NA | 10000 | 1.0628744 | NA | 1.0236044 | 1.1036509 | 0.0015002 |
| 17 | ldl_PGS | sbp_cat | Ordered Categorical | Ordinal logistic regression | age+sex+gen_array | NA | NA | 10000 | 1.0078855 | NA | 0.9707330 | 1.0464598 | 0.6818849 |
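
The warnings above ask for continuous phenotypes to be normalised before association, but they do not say which transformation to use. One common choice is a rank-based inverse normal transformation. The sketch below is plain base R on simulated data. The helper `inverse_normal()` is hypothetical and is not a comorbidPGS function.

```r
# Rank-based inverse normal transformation (one common normalisation;
# an assumption here, not a comorbidPGS function)
inverse_normal <- function(x) {
  out <- rep(NA_real_, length(x))
  ok <- !is.na(x)
  # Map ranks to (0, 1), then to standard normal quantiles
  out[ok] <- qnorm((rank(x[ok]) - 0.5) / sum(ok))
  out
}

set.seed(1)
skewed <- rexp(1000)                  # hypothetical right-skewed phenotype
normalised <- inverse_normal(skewed)

shapiro.test(skewed)$p.value          # very small: normality is rejected
shapiro.test(normalised)$p.value      # large: no evidence against normality
```

The transformed values keep the original ordering of individuals, so only the shape of the distribution changes. In practice the result would be stored in a new column of the dataset and that column passed as the phenotype.
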

Extension of association analysis: one-sample MR using the Wald Ratio and 2SLS methods

# MR using Wald ratio method
mr1 <- mr_ratio(df = dataset, prs_col = "ldl_PGS", exposure_col = "log_ldl", outcome_col = "sbp") 
#> Warning in phenotype_type(df = df, phenotype_col = exposure_col): Phenotype
#> column log_ldl is continuous and not normal, please normalise prior association
#> Warning in phenotype_type(df = df, phenotype_col = outcome_col): Phenotype
#> column sbp is continuous and not normal, please normalise prior association

| | PGS | Exposure | Outcome | Method | N_cases | N_controls | N | MR_estimate | SE | F_stat |
|---|---|---|---|---|---|---|---|---|---|---|
| ldl_PGS | ldl_PGS | log_ldl | sbp | Ratio | NA | NA | 10000 | 0.0321099 | 2.387691 | 1449.37 |

# MR using 2-stage least square method (2SLS)
mr2 <- mr_2sls(df = dataset, prs_col = "ldl_PGS", exposure_col = "log_ldl", outcome_col = "sbp") 
#> Warning in phenotype_type(df = df, phenotype_col = exposure_col): Phenotype
#> column log_ldl is continuous and not normal, please normalise prior association
#> Warning in phenotype_type(df = df, phenotype_col = outcome_col): Phenotype
#> column sbp is continuous and not normal, please normalise prior association

| | PGS | Exposure | Outcome | Method | N_cases | N_controls | N | MR_estimate | SE | F_stat |
|---|---|---|---|---|---|---|---|---|---|---|
| value | ldl_PGS | log_ldl | sbp | 2SLS | NA | NA | 10000 | 0.0321099 | 2.387532 | 1449.37 |
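
Both methods report the same MR_estimate above. With a single instrument and no covariates, the Wald ratio and the 2SLS point estimate are algebraically identical. The base-R sketch below illustrates this on simulated data. It is not the implementation behind mr_ratio() or mr_2sls(), and all variable names and effect sizes are hypothetical.

```r
set.seed(1)
n <- 10000
pgs <- rnorm(n)                                    # instrument (hypothetical score)
confounder <- rnorm(n)                             # unobserved confounder
exposure <- 0.4 * pgs + confounder + rnorm(n)
outcome <- 0.3 * exposure + confounder + rnorm(n)  # assumed causal effect: 0.3

# Wald ratio: (score -> outcome effect) / (score -> exposure effect)
first_stage <- lm(exposure ~ pgs)
reduced_form <- lm(outcome ~ pgs)
wald_ratio <- coef(reduced_form)[["pgs"]] / coef(first_stage)[["pgs"]]

# 2SLS: regress the outcome on the exposure predicted from the score
exposure_hat <- fitted(first_stage)
tsls <- coef(lm(outcome ~ exposure_hat))[["exposure_hat"]]

# Instrument strength: F-statistic of the first-stage regression
f_stat <- summary(first_stage)$fstatistic[["value"]]

# Naive regression for comparison: biased upwards by the confounder
naive <- coef(lm(outcome ~ exposure))[["exposure"]]

print(c(wald_ratio = wald_ratio, tsls = tsls, naive = naive, F_stat = f_stat))
```

Note that the standard error taken from the second-stage `lm()` fit in this sketch is not a valid 2SLS standard error, so the sketch reports point estimates only.
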

Examples of data visualisation using comorbidPGS

densityplot(dataset, prs_col = "ldl_PGS", phenotype_col = "sbp_cat")

# show multiple associations in a plot
assoplot <- assocplot(score_table = result_2)
assoplot$continuous_phenotype

assoplot$discrete_phenotype

NOTE: The score_table must have the same format as the output of assoc().

centileplot(dataset, prs_col = "brc_PGS", phenotype_col = "brc")
#> Warning in centileplot(dataset, prs_col = "brc_PGS", phenotype_col = "brc"):
#> The dataset has less than 10,000 individuals, centiles plot may not look good!
#> Use the argument decile = TRUE to adapt to small datasets
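
The warning above suggests deciles for small datasets, because each centile of a small sample contains too few individuals to give a stable estimate. As a rough illustration of the kind of summary such a plot is built on, the base-R sketch below computes the proportion of cases within each score decile on simulated data. It assumes the plot summarises prevalence per quantile group, and it is not the code used by centileplot(). All names and values are hypothetical.

```r
set.seed(1)
n <- 2000
pgs <- rnorm(n)                                  # hypothetical score
disease <- rbinom(n, 1, plogis(-2 + 0.6 * pgs))  # hypothetical case status

# Assign each individual to a score decile (1 = lowest, 10 = highest)
breaks <- quantile(pgs, probs = seq(0, 1, by = 0.1))
decile <- cut(pgs, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Prevalence (proportion of cases) within each decile
prevalence <- tapply(disease, decile, mean)
print(round(prevalence, 3))
```

With 2,000 individuals each decile holds 200 people, whereas each centile would hold only 20.
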

Because these plotting functions use ggplot2, you can customise the plots further:

library(ggplot2)
centileplot(dataset, prs_col = "t2d_PGS", phenotype_col = "t2d") + 
  scale_color_gradient(low = "green", high = "red")

decileboxplot(dataset, prs_col = "ldl_PGS", phenotype_col = "ldl")

Citation

To cite comorbidPGS in publications, please use:

Pascat V, Zudina L, Ulrich A, Maina JG, Kaakinen M, Pupko I, Bonnefond A, Demirkan A, Balkhiyarova Z, Froguel P, Prokopenko I (2024). “comorbidPGS: an R package assessing shared predisposition between Phenotypes using Polygenic Scores.” Human Heredity. doi: 10.1159/000539325.
