Compare, subset or stratify codelists

Introduction: Generate codelist subsets, exploring codelist utility functions

This vignette introduces a set of functions for manipulating and exploring codelists within an OMOP CDM. Specifically, we will learn how to subset a codelist, stratify a codelist, and compare two codelists.

First of all, we will load the required packages and connect to a mock database.

library(DBI)
library(duckdb)
library(dplyr)
library(CDMConnector)
library(CodelistGenerator)

# Connect to the database and create the cdm object
con <- dbConnect(duckdb(),
                 eunomiaDir("synpuf-1k", "5.3"))
cdm <- cdmFromCon(con = con, 
                  cdmName = "Eunomia Synpuf",
                  cdmSchema   = "main",
                  writeSchema = "main", 
                  achillesSchema = "main")

We will start by generating a codelist for acetaminophen using getDrugIngredientCodes():

acetaminophen <- getDrugIngredientCodes(cdm,
                                        name = "acetaminophen",
                                        nameStyle = "{concept_name}",
                                        type = "codelist")

Subsetting a Codelist

Subsetting a codelist will allow us to reduce a codelist to only those concepts that meet certain conditions.

Subset to Codes in Use

subsetToCodesInUse() keeps only those codes that are observed in the database at least minimumCount times in the specified table (table). Note that this function depends on ACHILLES tables being available in your CDM object.

acetaminophen_in_use <- subsetToCodesInUse(x = acetaminophen, 
                                           cdm, 
                                           minimumCount = 0,
                                           table = "drug_exposure")
acetaminophen_in_use # Only the first 5 concepts will be shown
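
To make the filtering rule concrete, here is a minimal base R sketch of the idea behind a minimum-count filter. It is not the package's implementation: the concept IDs, the record counts and the object names (code_counts, minimum_count) are all hypothetical, and the sketch simply assumes that per-concept counts are available in a small table.

```r
# Hypothetical per-concept record counts (made-up IDs, not real OMOP data)
code_counts <- data.frame(
  concept_id   = c(101L, 102L, 103L, 104L),
  record_count = c(250L, 0L, 12L, 3L)
)

# A codelist represented as a named list of concept IDs;
# 105 has no entry in the counts table at all
codelist <- list(example_drug = c(101L, 102L, 103L, 104L, 105L))

# Hypothetical threshold, playing the role of minimumCount
minimum_count <- 5L

# Concepts observed at least `minimum_count` times
in_use <- code_counts$concept_id[code_counts$record_count >= minimum_count]

# Keep only those concepts in every element of the codelist
codelist_in_use <- lapply(codelist, function(ids) ids[ids %in% in_use])
codelist_in_use
```

In this toy example only concepts 101 and 103 survive: 102 and 104 fall below the threshold, and 105 is never observed.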

Subset by Domain

We will now subset to those concepts that have domain = "Drug". Remember that, to see the domains available in the cdm, you can use getDomains(cdm).

acetaminophen_drug <- subsetOnDomain(acetaminophen_in_use, cdm, domain = "Drug")

acetaminophen_drug

We can use the negate argument to exclude concepts with a certain domain:

acetaminophen_no_drug <- subsetOnDomain(acetaminophen_in_use, cdm, domain = "Drug", negate = TRUE)

acetaminophen_no_drug
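
The effect of negate can be illustrated with a small base R sketch. The helper keep_domain() and the toy concept table are hypothetical and exist only for illustration; the sketch assumes that negation simply returns the complement of the matching concepts, which is how the argument is described above.

```r
# Toy concept table with made-up IDs and domains
concepts <- data.frame(
  concept_id = c(201L, 202L, 203L),
  domain_id  = c("Drug", "Drug", "Observation")
)

# Hypothetical helper: keep concepts in `domain`, or everything else if negate = TRUE
keep_domain <- function(concepts, domain, negate = FALSE) {
  matches <- concepts$domain_id %in% domain
  if (negate) {
    matches <- !matches
  }
  concepts$concept_id[matches]
}

drug_ids     <- keep_domain(concepts, "Drug")
non_drug_ids <- keep_domain(concepts, "Drug", negate = TRUE)
```

Under this assumption the two results never overlap, and together they cover every concept in the input.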

Subset on Dose Unit

We will now filter to only include concepts with specified dose units. Remember that you can use getDoseUnit(cdm) to explore the dose units available in your cdm.

acetaminophen_mg_unit <- subsetOnDoseUnit(acetaminophen_drug, cdm, c("milligram", "unit"))
acetaminophen_mg_unit

As before, we can use the argument negate = TRUE to exclude these concepts instead.

Subset on Route Category

We will now subset to those concepts whose route category is neither “unclassified_route” nor “transmucosal_rectal”:

acetaminophen_route <- subsetOnRouteCategory(acetaminophen_mg_unit,
                                             cdm,
                                             c("transmucosal_rectal", "unclassified_route"),
                                             negate = TRUE)
acetaminophen_route

Stratify codelist

Instead of filtering, stratification allows us to split a codelist into subgroups based on defined vocabulary properties.

Stratify by Dose Unit

acetaminophen_doses <- stratifyByDoseUnit(acetaminophen, cdm, keepOriginal = TRUE)

acetaminophen_doses

Stratify by Route Category

acetaminophen_routes <- stratifyByRouteCategory(acetaminophen, cdm)

acetaminophen_routes
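
Conceptually, stratifying is a split of one set of concepts into several named sets. The base R sketch below shows that idea on made-up data. The concept IDs, the route categories assigned to them and the naming pattern for the strata are assumptions for illustration, not output from the package. The last step mimics what keepOriginal = TRUE is assumed to do, namely keep the unstratified codelist alongside its strata.

```r
# Toy concepts with a made-up route category for each one
concepts <- data.frame(
  concept_id     = c(301L, 302L, 303L, 304L),
  route_category = c("oral", "oral", "injectable", "topical")
)

# Split the concept IDs into one group per route category
strata <- split(concepts$concept_id, concepts$route_category)

# Hypothetical naming pattern: original codelist name plus the stratum
names(strata) <- paste0("example_drug_", names(strata))

# Sketch of keeping the original codelist next to its strata
with_original <- c(list(example_drug = concepts$concept_id), strata)
names(with_original)
```

Unlike subsetting, no concept is dropped here: every ID in the original set ends up in exactly one stratum.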

Compare codelists

Now we will compare two codelists to identify overlapping and unique codes.

acetaminophen <- getDrugIngredientCodes(cdm,
                                        name = "acetaminophen",
                                        nameStyle = "{concept_name}",
                                        type = "codelist_with_details")
hydrocodone <- getDrugIngredientCodes(cdm, 
                                      name = "hydrocodone", 
                                      doseUnit = "milligram", 
                                      nameStyle = "{concept_name}",
                                      type = "codelist_with_details")

Compare the two sets:

comparison <- compareCodelists(acetaminophen$acetaminophen, hydrocodone$hydrocodone)

comparison |> glimpse()

comparison |> filter(codelist == "Both")
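
The comparison boils down to set operations on concept IDs. The following base R sketch reproduces that logic on two made-up sets of codes. The IDs are hypothetical, and the labels other than "Both" (which is the value filtered on above) are assumptions chosen for illustration.

```r
# Two made-up sets of concept IDs
codes_a <- c(401L, 402L, 403L)
codes_b <- c(403L, 404L)

# Every concept that appears in either set
all_codes <- union(codes_a, codes_b)

in_a <- all_codes %in% codes_a
in_b <- all_codes %in% codes_b

# Label each concept by where it appears
comparison_sketch <- data.frame(
  concept_id = all_codes,
  codelist   = ifelse(in_a & in_b, "Both",
                      ifelse(in_a, "Only codelist 1", "Only codelist 2"))
)

# Overlapping concepts only
comparison_sketch[comparison_sketch$codelist == "Both", ]
```

Only concept 403 is shared between the two toy sets, so it is the single row labelled "Both".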
