
gwas2crispr: From GWAS to CRISPR-ready files (hg38)

Overview

gwas2crispr retrieves significant genome-wide association study (GWAS) SNPs for an Experimental Factor Ontology (EFO) trait, aggregates variant/gene/study metadata, and optionally exports CSV, BED, and FASTA files for downstream functional genomics and CRISPR guide design. The package targets GRCh38/hg38.

Key design for CRAN compliance: functions do not write files by default. Files are written only when you set out_prefix. In examples, tests, and vignettes, point out_prefix at a location inside tempdir().
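
The "no writes unless asked" rule can be illustrated with a small base-R sketch. The helper export_snps() and the toy data frame are hypothetical and are not part of gwas2crispr; they only mimic the behaviour described above, where a NULL out_prefix leaves the file system untouched.

```r
# Hypothetical helper, NOT the gwas2crispr implementation: it only shows
# the pattern "return data by default, write files only when out_prefix is set".
export_snps <- function(snps, out_prefix = NULL) {
  if (is.null(out_prefix)) {
    # Default: hand the data back and touch nothing on disk.
    return(list(data = snps, csv = NULL))
  }
  csv_path <- paste0(out_prefix, "_snps_full.csv")
  utils::write.csv(snps, csv_path, row.names = FALSE)
  list(data = snps, csv = csv_path)
}

# Toy placeholder data (not real GWAS results).
snps <- data.frame(
  id  = c("snpA", "snpB"),
  chr = c("chr1", "chr2"),
  pos = c(1000L, 2000L)
)

no_write   <- export_snps(snps)                                 # nothing written
with_write <- export_snps(snps, file.path(tempdir(), "demo"))   # writes under tempdir()
file.exists(with_write$csv)
```

Writing under tempdir() keeps the example safe to run in checks, because the session's temporary directory is removed when R exits.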

Runtime prerequisites: the GWAS Catalog client gwasrapidd is required for data retrieval; Biostrings + BSgenome.Hsapiens.UCSC.hg38 are required only if you want FASTA output.
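
Before running a retrieval, it can help to check which of these prerequisites are installed. The following is a minimal sketch using only base R; the helper is_installed() and the variable names are illustrative and not part of the package.

```r
# Packages named in the prerequisites above.
required   <- "gwasrapidd"                                    # needed for data retrieval
fasta_only <- c("Biostrings", "BSgenome.Hsapiens.UCSC.hg38")  # needed only for FASTA output

# Illustrative helper: TRUE/FALSE per package.
# Note: requireNamespace() loads the namespace as a side effect when it is available.
is_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

can_retrieve    <- all(is_installed(required))
can_write_fasta <- all(is_installed(fasta_only))

if (!can_retrieve) {
  message("Install 'gwasrapidd' to retrieve GWAS Catalog data.")
}
if (!can_write_fasta) {
  message("FASTA output unavailable: install Biostrings and BSgenome.Hsapiens.UCSC.hg38.")
}
```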

Core functions

The code chunks in this vignette are not evaluated (global eval = FALSE), so it makes no network calls and writes no files. This keeps CRAN checks deterministic.

Installation

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("Biostrings", "BSgenome.Hsapiens.UCSC.hg38"))

install.packages("gwasrapidd")  # required for GWAS retrieval

if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
devtools::install_github("leopard0ly/gwas2crispr")

Quick examples (primary + CRAN-safe)

A) Primary workflow — write outputs to the current working directory

library(gwas2crispr)

# Lung disease (EFO_0000707), GRCh38/hg38
run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = "lung"   # produces: lung_snps_full.csv / lung_snps_hg38.bed / lung_snps_flank300.fa
)
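
To make the role of flank_bp concrete, here is a hedged base-R sketch of how a symmetric window around a SNP maps to BED coordinates. BED intervals are 0-based and half-open. Whether gwas2crispr builds its window exactly this way (symmetric, including the SNP base, clamped at the chromosome start) is an assumption, and the positions are toy placeholders.

```r
# Toy SNP positions (1-based); placeholder values, not real GWAS hits.
snps <- data.frame(
  chr = c("chr1", "chr2"),
  pos = c(1000L, 150L),
  id  = c("snpA", "snpB")
)
flank_bp <- 300L

# BED is 0-based, half-open: start = (first base) - 1, end = last base.
bed <- data.frame(
  chrom = snps$chr,
  start = pmax(snps$pos - flank_bp - 1L, 0L),  # clamp at the chromosome start
  end   = snps$pos + flank_bp,
  name  = snps$id
)
bed$width <- bed$end - bed$start  # 2 * flank_bp + 1 unless clamped
bed
```

With flank_bp = 300, an unclamped window spans 601 bases: 300 on each side plus the SNP itself.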

B) CRAN-safe — write into a temporary directory

library(gwas2crispr)

tmp <- tempdir()  # CRAN-safe target
res <- run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = file.path(tmp, "lung"),  # writes here, not to user's home
  verbose    = FALSE
)

# Paths of the files written (assuming the result exposes csv, bed, and fasta components):
res$csv
res$bed
res$fasta  # present only if BSgenome/Biostrings are installed
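
If the structure of the return value is unclear, the output files can also be located by name. The sketch below rebuilds the file names shown in the primary example (<prefix>_snps_full.csv, <prefix>_snps_hg38.bed, <prefix>_snps_flank<N>.fa) and reports which ones exist. The helper expected_outputs() is hypothetical and not part of gwas2crispr.

```r
# Hypothetical helper: derive the expected output paths from out_prefix and flank_bp,
# following the naming pattern shown in the primary example.
expected_outputs <- function(out_prefix, flank_bp) {
  c(
    csv   = paste0(out_prefix, "_snps_full.csv"),
    bed   = paste0(out_prefix, "_snps_hg38.bed"),
    fasta = paste0(out_prefix, "_snps_flank", flank_bp, ".fa")
  )
}

paths <- expected_outputs(file.path(tempdir(), "lung"), flank_bp = 300)

# One row per expected file; 'exists' is FALSE until the workflow has been run.
data.frame(
  file      = basename(paths),
  exists    = file.exists(paths),
  row.names = names(paths)
)
```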

CLI usage (optional)

# Create the output directory first: R's tempdir() is deleted when its session exits.
OUT_DIR="$(mktemp -d)"
Rscript "$(Rscript -e "cat(system.file('scripts','gwas2crispr.R', package='gwas2crispr'))")" \
  -e EFO_0000707 -p 1e-6 -f 300 -o "$OUT_DIR/lung"

The -o path should point to a temporary or user-chosen directory. Avoid writing to the package root when reproducing examples under CRAN-like conditions.

Session info

sessionInfo()
#> R version 4.4.3 (2025-02-28 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 22621)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=C                  LC_CTYPE=Arabic_Libya.utf8   
#> [3] LC_MONETARY=Arabic_Libya.utf8 LC_NUMERIC=C                 
#> [5] LC_TIME=Arabic_Libya.utf8    
#> 
#> time zone: Africa/Tripoli
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.37     R6_2.6.1          fastmap_1.2.0     xfun_0.52        
#>  [5] cachem_1.1.0      knitr_1.50        htmltools_0.5.8.1 rmarkdown_2.29   
#>  [9] lifecycle_1.0.4   cli_3.6.5         sass_0.4.10       jquerylib_0.1.4  
#> [13] compiler_4.4.3    rstudioapi_0.17.1 tools_4.4.3       evaluate_1.0.4   
#> [17] bslib_0.9.0       yaml_2.3.10       rlang_1.1.6       jsonlite_2.0.0
