
bedrockbio

Open-Access Computational Biology Datasets

Description

Efficiently access a curated library of open-access computational biology datasets. Tables support predicate and projection pushdown to the cloud storage backend, so row filters and column selections are applied before data is downloaded. This enables quick, iterative access to otherwise massive, unwieldy tables.

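bedrockbio consists of three user-facing functions: list_tables (list the available tables), describe_table (show a table's metadata, citation, and columns), and load_table (lazily load a table, given its required partition filters).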

dplyr verbs (filter, select) can be used on the data frame returned by load_table to push down additional row filters and column selections to the storage backend.

Installation

Install from CRAN:

install.packages("bedrockbio")

Or install the current development version from GitHub:

# install.packages("pak")
pak::pak("bedrock-bio/bedrock-bio-client/r")

Examples

Load the package (and dplyr for downstream data frame manipulation):

library(bedrockbio)
library(dplyr)

List available tables:

list_tables()

Describe a table to see its metadata, citation, and columns:

describe_table("ukb_ppp.pqtls")

Lazily load a table with required partition filters, select columns, and collect the relevant subset into an in-memory data frame:

df <- load_table(
  "ukb_ppp.pqtls",
  ancestry = "EUR",
  protein_id = "A0FGR8",
  panel = "Inflammation"
) |>
  select(
    chromosome,
    position,
    effect_allele,
    other_allele,
    beta,
    neg_log_10_p_value
  ) |>
  collect()
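
The Description mentions that filter can be pushed down as well as select, but the example above only selects columns. The following is a minimal sketch of adding a row filter before collect(), reusing the same table, partition filters, and column names as above. The 7.3 cutoff is an assumption chosen for illustration (it is roughly -log10(5e-8)), and the sketch assumes neg_log_10_p_value is stored as a numeric column.

```r
library(bedrockbio)
library(dplyr)

hits <- load_table(
  "ukb_ppp.pqtls",
  ancestry = "EUR",
  protein_id = "A0FGR8",
  panel = "Inflammation"
) |>
  # Row filter placed before collect() so it can be pushed down to storage.
  # The 7.3 threshold is an illustrative assumption, not a package default.
  filter(neg_log_10_p_value >= 7.3) |>
  select(chromosome, position, beta, neg_log_10_p_value) |>
  collect()
```

Placing filter and select ahead of collect() keeps the work on the lazy table; anything applied after collect() runs on the in-memory data frame instead.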

Dataset Requests

To request the addition of a new table to the library, open an issue.
