gwas2crispr: From GWAS to CRISPR-ready Files

Overview

gwas2crispr prepares genome-wide association study (GWAS) results for downstream clustered regularly interspaced short palindromic repeats (CRISPR) workflows.

The package retrieves significant single-nucleotide polymorphisms (SNPs) for an Experimental Factor Ontology (EFO) trait from the EMBL-EBI GWAS Catalog REST API v2 and returns CRISPR-ready outputs for the GRCh38/hg38 human genome build.

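The main outputs are a full SNP table (CSV), a BED file of hg38 coordinates, and, when the optional genome packages are installed, a FASTA file of flanking sequences.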

Installation

Install from CRAN:

install.packages("gwas2crispr")

Optional packages for FASTA output:

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install(c(
  "Biostrings",
  "GenomeInfoDb",
  "BSgenome.Hsapiens.UCSC.hg38"
))
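Before requesting FASTA output, it can help to confirm that the optional packages are installed. The following is a minimal sketch using only base R; the variable names are illustrative and not part of gwas2crispr.

```r
# Optional packages named in the installation step above.
fasta_pkgs <- c("Biostrings", "GenomeInfoDb", "BSgenome.Hsapiens.UCSC.hg38")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
have_pkg <- vapply(fasta_pkgs, requireNamespace, logical(1), quietly = TRUE)

print(have_pkg)
if (!all(have_pkg)) {
  message("Missing optional packages: ",
          paste(fasta_pkgs[!have_pkg], collapse = ", "))
}
```

If any entry is FALSE, the CSV and BED outputs should still be available, but the FASTA file will not be created.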

Development version:

if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")

devtools::install_github("leopard0ly/gwas2crispr")

Fetch GWAS associations

library(gwas2crispr)

gwas_data <- fetch_gwas(
  efo_id  = "EFO_0000707",
  p_cut   = 1e-6,
  verbose = FALSE
)

names(gwas_data)
head(gwas_data$associations)
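The p_cut argument is a p-value threshold. The toy example below illustrates that kind of filter on a made-up data frame. The column names and values are hypothetical and do not reflect the package's actual output, and treating the threshold as inclusive is an assumption.

```r
# Hypothetical association table; rsIDs and p-values are made up for illustration.
toy <- data.frame(
  rsid    = c("rs0001", "rs0002", "rs0003", "rs0004"),
  p_value = c(3e-9, 5e-7, 2e-6, 0.04),
  stringsAsFactors = FALSE
)

p_cut <- 1e-6

# Keep rows whose p-value is at or below the threshold (assumed inclusive).
significant <- toy[toy$p_value <= p_cut, , drop = FALSE]
print(significant)
```

With this toy data, only the first two rows pass the 1e-6 cut-off; a stricter p_cut such as 5e-8 would keep only the first.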

Run without writing files

By default, no files are written; results are returned in memory.

res <- run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = NULL,
  verbose    = FALSE
)

res$summary
head(res$snps_full)
head(res$bed)
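To make flank_bp concrete, the arithmetic below shows one common way to turn a 1-based SNP position into a BED interval with symmetric flanks. It is a sketch under the standard BED convention (0-based start, exclusive end). That the package builds its windows exactly this way is an assumption, and the position is made up.

```r
pos      <- 1000000L  # hypothetical 1-based SNP position
flank_bp <- 300L

# BED intervals are 0-based and end-exclusive.
bed_start <- pos - flank_bp - 1L   # 0-based start of the window
bed_end   <- pos + flank_bp        # exclusive end of the window

window_len <- bed_end - bed_start  # left flank + SNP + right flank
print(c(start = bed_start, end = bed_end, length = window_len))
```

Under these assumptions, a 300 bp flank yields a 601 bp window: 300 bases on each side plus the SNP itself.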

Write files safely

To write output files, provide out_prefix. In examples, build the prefix under tempdir() so that nothing is written to the working directory.

out_prefix <- file.path(tempdir(), "lung")

res <- run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = out_prefix,
  verbose    = FALSE
)

res$written

Expected output paths:

paste0(out_prefix, "_snps_full.csv")
paste0(out_prefix, "_snps_hg38.bed")
paste0(out_prefix, "_snps_flank300.fa")

The FASTA file is created only when the optional genome packages are available.
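A quick way to see which of the expected files were actually written is to test each path with file.exists(). This sketch uses only base R and the path pattern shown above; it does not call the package, and the variable names are illustrative.

```r
out_prefix <- file.path(tempdir(), "lung")

# Same three paths as listed under "Expected output paths".
expected <- paste0(
  out_prefix,
  c("_snps_full.csv", "_snps_hg38.bed", "_snps_flank300.fa")
)

# TRUE for files present on disk; the FASTA entry may be FALSE
# when the optional genome packages are not installed.
written <- file.exists(expected)
names(written) <- basename(expected)
print(written)
```

Run before run_gwas2crispr() has written anything, every entry will be FALSE.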

Output structure

names(res)

Common outputs:

res$summary
res$snps_full
res$bed
res$fasta
res$written

Session information

sessionInfo()
#> R version 4.4.3 (2025-02-28 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 22621)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=C                  LC_CTYPE=Arabic_Libya.utf8   
#> [3] LC_MONETARY=Arabic_Libya.utf8 LC_NUMERIC=C                 
#> [5] LC_TIME=Arabic_Libya.utf8    
#> 
#> time zone: Africa/Tripoli
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.39     R6_2.6.1          fastmap_1.2.0     xfun_0.56        
#>  [5] cachem_1.1.0      knitr_1.51        htmltools_0.5.9   rmarkdown_2.30   
#>  [9] lifecycle_1.0.5   cli_3.6.5         sass_0.4.10       jquerylib_0.1.4  
#> [13] compiler_4.4.3    rstudioapi_0.18.0 tools_4.4.3       evaluate_1.0.5   
#> [17] bslib_0.10.0      yaml_2.3.10       otel_0.2.0        jsonlite_2.0.0   
#> [21] rlang_1.1.6
