---
title: "Case Study: GLP-1 and related incretin therapies"
author: "Steven Smith"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteEncoding{UTF-8}
  %\VignetteIndexEntry{Case Study: GLP-1 and related incretin therapies}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  markdown: 
    wrap: 72
---

```{r, include = FALSE}
# Vignettes use precomputed example data by default.
# To rebuild examples with live RxNorm/RxClass API calls, set:
# Sys.setenv(RXREF_BUILD_VIGNETTES_ONLINE = "true")

online_env <- identical(
  tolower(Sys.getenv("RXREF_BUILD_VIGNETTES_ONLINE")),
  "true"
)

has_net <- tryCatch({
  requireNamespace("curl", quietly = TRUE) && curl::has_internet()
}, error = function(e) FALSE)

run_live <- online_env && has_net

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(rxref)
library(dplyr)

read_rxref_example <- function(file) {
  path <- system.file("extdata", file, package = "rxref")
  if (!nzchar(path)) {
    stop(
      "The example data file '", file, "' was not found. ",
      "Reinstall rxref or rebuild the vignette with ",
      "RXREF_BUILD_VIGNETTES_ONLINE=true."
    )
  }
  readRDS(path)
}
```

## Case

Suppose we need to identify users of GLP-1 receptor agonists and related
incretin-based therapies from EHR prescribing data, pharmacy claims data,
or both. To accomplish this, we need a medication list that includes
relevant RxNorm product concepts and, when available, corresponding NDCs.

This vignette walks through two approaches:

1. a transparent step-by-step workflow that starts with a curated list of
   ingredient names; and
2. a compact workflow using `search_drug()`.

The examples use precomputed data by default so the vignette can be built
without querying the live RxNorm API. To rebuild the examples with live
API calls, set:

```{r eval = FALSE}
Sys.setenv(RXREF_BUILD_VIGNETTES_ONLINE = "true")
```

## Defining the ingredient list

For this example, we start with a prespecified list of GLP-1 receptor
agonists and related incretin-based therapies:

- exenatide
- liraglutide
- lixisenatide
- dulaglutide
- albiglutide
- semaglutide
- tirzepatide

Tirzepatide is included here because many applied studies group it with
GLP-1-based incretin therapies, although it is a dual GIP/GLP-1 receptor
agonist rather than a GLP-1 receptor agonist alone.

```{r define-ingredients}
glp1.names <- c(
  "semaglutide",
  "exenatide",
  "liraglutide",
  "lixisenatide",
  "dulaglutide",
  "albiglutide",
  "tirzepatide"
)
```

## Option 1: Step-by-step medication list construction

### Identify ingredient RxCUIs

First, we use `find_ingredients()` to identify ingredient-level RxCUIs.
For this example, we retain concepts with TTY = `"IN"`, corresponding to
RxNorm ingredient concepts. You can see available TTY values and
descriptions with `tty_catalogue()`.

```{r ings}
if (run_live) {
  glp1.ings <- find_ingredients(glp1.names) |>
    filter(tty == "IN") |>
    distinct(
      input,
      ingredient_rxcui = rxcui,
      ingredient_name = name,
      ingredient_tty = tty
    )
} else {
  glp1.ings <- read_rxref_example("glp1_ings.rds")
}

glp1.ings
```
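
Before expanding to products, it can help to confirm that every
requested name returned an ingredient concept and that no name resolved
to more than one. The following is a minimal sketch using a toy table in
place of `glp1.ings`. The identifiers are made-up placeholders, not real
RxCUIs, and the sketch assumes the `input` column holds each search term
exactly as it was supplied.

```{r check-ings-sketch, eval = FALSE}
library(dplyr)

# Toy stand-ins for the name vector and the ingredient lookup result.
# "X001" and "X002" are placeholders, not real RxCUIs.
toy_names <- c("semaglutide", "exenatide", "notadrug")
toy_ings <- tibble(
  input = c("semaglutide", "exenatide"),
  ingredient_rxcui = c("X001", "X002")
)

# Requested names that returned no ingredient-level concept
unmatched <- setdiff(toy_names, toy_ings$input)
unmatched

# Inputs that resolved to more than one ingredient concept
multi_match <- toy_ings |>
  count(input, name = "n_concepts") |>
  filter(n_concepts > 1)
multi_match
```

Applied to the vignette objects, the same check would compare
`glp1.names` against `glp1.ings$input`.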

### Expand ingredients to product RxCUIs

Next, we use `products_for_ingredients()` to identify product concepts
related to the ingredient RxCUIs.

By default, this workflow focuses on active RxNorm concepts. This is
usually appropriate for current medication list construction. For studies
covering older calendar periods, users may want to include historical
RxNorm concepts as well.

```{r tty-sets}
product_ttys("default")
product_ttys("extended_product")
```

The default product TTY set is intended to capture product concepts that
are commonly useful for medication list construction and NDC mapping. A
broader product-related set is available with
`product_ttys("extended_product")`. Users can also supply their own
character vector of TTYs.

In this example, we include combination products. That means products
containing a GLP-1-related ingredient plus one or more other ingredients
may be retained.

```{r prods}
if (run_live) {
  glp1.prods <- products_for_ingredients(
    glp1.ings$ingredient_rxcui,
    ttys = product_ttys("default"),
    include_combos = TRUE,
    concept_status = "active"
  )
} else {
  glp1.prods <- read_rxref_example("glp1_prods.rds")
}

glp1.prods |>
  head(30)
```

Combination products may appear when `include_combos = TRUE`. Depending
on the function, ingredient fields for combination products may be
summarized across ingredients, while product-level rows remain one row
per product concept. Users should inspect ingredient fields and
`n_ingredients` before deciding whether to include or exclude fixed-dose
combination products.
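
One way to make that decision explicit is to split the product table on
`n_ingredients`. The sketch below uses a toy table with placeholder
identifiers and names. It assumes `n_ingredients` is an integer count on
each product row, as described above.

```{r combo-split-sketch, eval = FALSE}
library(dplyr)

# Toy product table; identifiers and names are placeholders.
toy_prods <- tibble(
  product_rxcui = c("P001", "P002", "P003"),
  name = c(
    "drug A injection",
    "drug A / drug B injection",
    "drug C injection"
  ),
  n_ingredients = c(1L, 2L, 1L)
)

# Fixed-dose combination products, set aside for review
combo_prods <- toy_prods |>
  filter(n_ingredients > 1)

# Single-ingredient products only
single_prods <- toy_prods |>
  filter(n_ingredients == 1)
```

For this drug class, the combinations most likely to appear pair a
GLP-1 receptor agonist with a basal insulin, such as insulin degludec
with liraglutide.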

### Include historical RxNorm concepts when needed

For historical studies, formulary reconstruction, or claims data spanning
older calendar periods, users may want to include both active and
historical RxNorm concepts.

```{r historical-products, eval = FALSE}
glp1.prods_historical <- products_for_ingredients(
  glp1.ings$ingredient_rxcui,
  ttys = product_ttys("default"),
  include_combos = TRUE,
  concept_status = "active_and_historical"
)
```

Historical concepts can be useful when reconstructing medication exposure
during older study periods. However, some historical concepts may have
less complete clinical attribute information than active concepts. Users
should review route, dose form, ingredient count, and NDC mappings
carefully when including historical concepts.
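
As a rough illustration of that review step, the sketch below flags
product rows with a missing attribute. The `route` and `dose_form`
column names are assumptions made for this example and may not match the
columns returned by `products_for_ingredients()`. The identifiers are
placeholders.

```{r historical-review-sketch, eval = FALSE}
library(dplyr)

# Toy table with hypothetical attribute columns and placeholder IDs.
toy_hist <- tibble(
  product_rxcui = c("P001", "P002", "P003"),
  route = c("subcutaneous", NA, "subcutaneous"),
  dose_form = c("pen injector", "pen injector", NA),
  n_ingredients = c(1L, 1L, NA)
)

# Rows where at least one reviewed attribute is missing
needs_review <- toy_hist |>
  filter(if_any(c(route, dose_form, n_ingredients), is.na))
needs_review
```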

### Map product RxCUIs to NDCs

Next, we identify NDCs associated with the product RxCUIs, requesting
only NDCs with `ACTIVE` status. Not every product RxCUI maps to an NDC,
so some product concepts will have no corresponding NDC values.

```{r ndcs}
if (run_live) {
  glp1.ndc.map <- map_rxcui_to_ndc(
    unique(glp1.prods$product_rxcui),
    status = "ACTIVE"
  )
} else {
  glp1.ndc.map <- read_rxref_example("glp1_ndc_map.rds")
}

glp1.ndcs <- glp1.ndc.map |>
  left_join(
    glp1.prods,
    by = c("rxcui" = "product_rxcui")
  ) |>
  left_join(
    glp1.ings |>
      select(ingredient_rxcui, ingredient_name),
    by = "ingredient_rxcui"
  ) |>
  distinct(
    ingredient_rxcui,
    ingredient_name,
    product_rxcui = rxcui,
    ndc11,
    ndc_status,
    name,
    tty
  ) |>
  arrange(ingredient_name, product_rxcui, ndc11)

glp1.ndcs |>
  head(30)
```
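
To see which product concepts did not map to any NDC, an anti-join of
the product table against the NDC map is one option. This is a sketch on
toy data with placeholder identifiers. It assumes the map stores the
product identifier in `rxcui`, as in the join above, and that unmapped
products may appear with a missing `ndc11`.

```{r unmapped-products-sketch, eval = FALSE}
library(dplyr)

# Toy product table and NDC map; all identifiers are placeholders.
toy_prods <- tibble(product_rxcui = c("P001", "P002", "P003"))
toy_ndc_map <- tibble(
  rxcui = c("P001", "P001", "P003"),
  ndc11 = c("00000000101", "00000000102", NA)
)

# Products with no non-missing NDC in the map
unmapped <- toy_prods |>
  anti_join(
    toy_ndc_map |> filter(!is.na(ndc11)),
    by = c("product_rxcui" = "rxcui")
  )
unmapped
```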

At this point, we have a product-level and NDC-level medication list that
can be used to query EHR prescribing data, pharmacy dispensing data, or
pharmacy claims data.
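
A minimal sketch of that last step is shown below, using toy dispensing
records. The `patient_id`, `ndc`, and `fill_date` columns are
hypothetical and will differ across data sources. NDCs are kept as
character strings because leading zeros are significant.

```{r match-dispensing-sketch, eval = FALSE}
library(dplyr)

# Toy NDC list; NDCs and names are placeholders.
toy_ndc_list <- tibble(
  ndc11 = c("00000000101", "00000000102"),
  ingredient_name = c("drug A", "drug A")
)

# Toy dispensing records with hypothetical column names.
toy_dispensing <- tibble(
  patient_id = c(1L, 2L, 3L),
  ndc = c("00000000101", "99999999999", "00000000102"),
  fill_date = as.Date(c("2024-01-05", "2024-01-09", "2024-02-11"))
)

# Keep only dispensings whose NDC is on the medication list
exposed <- toy_dispensing |>
  inner_join(toy_ndc_list, by = c("ndc" = "ndc11"))
exposed
```

If the source data store NDCs in a hyphenated or 10-digit form, they
would need to be converted to the 11-digit format before joining.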

## Option 2: Use `search_drug()` for a compact workflow

The same goal can often be accomplished in one step with `search_drug()`.
This function combines ingredient searching, product expansion, optional
route filtering, and optional NDC mapping.

Suppose we want NDCs for the same ingredient list, and we want to include
active, obsolete, and unspecified NDCs.

```{r search}
if (run_live) {
  alt.glp1.ndcs <- search_drug(
    term = glp1.names,
    return = "ndc",
    concept_status = "active",
    ndc_status = c("ACTIVE", "OBSOLETE", "UNSPECIFIED")
  )
} else {
  alt.glp1.ndcs <- read_rxref_example("alt_glp1_ndc.rds")
}

alt.glp1.ndcs |>
  arrange(ingredient_name, product_rxcui, ndc11) |>
  head(30)
```

Here, `concept_status` controls whether active or historical RxNorm
concepts are considered. The `ndc_status` argument controls which NDC
status categories are returned.

For example, to include historical RxNorm concepts while keeping the same
broad set of NDC status categories, use:

```{r search-historical, eval = FALSE}
search_drug(
  term = glp1.names,
  return = "ndc",
  concept_status = "active_and_historical",
  ndc_status = c("ACTIVE", "OBSOLETE", "UNSPECIFIED")
)
```

This can be useful for studies spanning older calendar periods, but users
should carefully inspect the resulting concepts and NDCs before finalizing
an exposure definition.
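
A quick tabulation by ingredient and NDC status is one way to start that
inspection. The sketch below uses toy data with placeholder values and
assumes the result includes `ingredient_name` and `ndc_status` columns.

```{r status-summary-sketch, eval = FALSE}
library(dplyr)

# Toy NDC-level result; names and NDCs are placeholders.
toy_ndcs <- tibble(
  ingredient_name = c("drug A", "drug A", "drug A", "drug B"),
  ndc11 = c(
    "00000000101", "00000000102", "00000000103", "00000000201"
  ),
  ndc_status = c("ACTIVE", "OBSOLETE", "OBSOLETE", "ACTIVE")
)

# Number of NDCs per ingredient and status category
status_summary <- toy_ndcs |>
  count(ingredient_name, ndc_status, name = "n_ndcs")
status_summary
```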

## Comparing the two approaches

The step-by-step approach is more verbose, but it makes each decision
explicit:

1. identify ingredient RxCUIs;
2. expand ingredients to product concepts;
3. decide whether to include combination products;
4. decide whether to include active concepts only or active and historical
   concepts;
5. map product RxCUIs to NDCs.

The `search_drug()` approach is more compact and is useful for common
workflows where users want a product list or NDC list from one or more
drug names.

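Before comparing the two workflows, it helps to view the step-by-step
list with unmapped rows removed. The two lists are not expected to match
exactly: the step-by-step workflow requested only `ACTIVE` NDCs, whereas
the `search_drug()` call also requested `OBSOLETE` and `UNSPECIFIED`
NDCs.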

```{r check}
glp1.ndcs |>
  filter(!is.na(ndc11)) |>
  arrange(ingredient_name, product_rxcui, ndc11) |>
  head(30)
```
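
A direct comparison of two NDC lists can be sketched with base R set
operations. The vectors below are toy placeholders. With the vignette
objects, the inputs would be `glp1.ndcs$ndc11` and
`alt.glp1.ndcs$ndc11`.

```{r compare-sketch, eval = FALSE}
# Toy NDC vectors standing in for the two workflows (placeholders).
step_ndcs <- c("00000000101", "00000000102")
compact_ndcs <- c("00000000101", "00000000102", "00000000103")

# NDCs found by only one of the two workflows
only_in_step <- setdiff(step_ndcs, compact_ndcs)
only_in_compact <- setdiff(compact_ndcs, step_ndcs)

# NDCs found by both workflows
in_both <- intersect(step_ndcs, compact_ndcs)

lengths(list(
  only_in_step = only_in_step,
  only_in_compact = only_in_compact,
  in_both = in_both
))
```

NDCs that appear only in the compact list are worth checking against
`ndc_status`, since a difference in the requested status categories is
one likely explanation.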

## Choosing active versus historical concepts

For many current medication lists, `concept_status = "active"` is a good
default. This limits the workflow to active RxNorm concepts.

Historical concepts may be appropriate when:

- the study period includes older calendar years;
- medication exposure is being reconstructed from historical claims;
- users need to capture products that may no longer be active in RxNorm;
- obsolete NDCs are intentionally included.

However, active and historical RxNorm concepts should not be confused
with NDC status. These are separate choices:

- `concept_status` controls which RxNorm concepts are considered.
- `ndc_status` controls which NDC status categories are returned.

For example:

```{r status-example, eval = FALSE}
search_drug(
  term = "semaglutide",
  return = "ndc",
  concept_status = "active_and_historical",
  ndc_status = c("ACTIVE", "OBSOLETE", "UNSPECIFIED")
)
```

## Practical considerations

Medication list construction often requires study-specific decisions.
Before using the resulting list in an analysis, users should consider:

- whether to include fixed-dose combination products;
- whether to include branded products, clinical products, packs, or all
  product-related TTYs;
- whether the study period requires historical RxNorm concepts;
- whether active NDCs only are sufficient, or obsolete/unspecified NDCs
  should also be included;
- whether route, dose form, or strength restrictions are needed;
- whether the final list should be reviewed clinically.

For strict reproducibility, users should save the final product list, NDC
list, and package/API versions used to construct the medication exposure
definition.
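
One possible way to do this is sketched below using base R only. The
file name and the list structure are illustrative choices, and the toy
NDC list uses placeholder values. Recording the build date is a stand-in
for the API version, since the underlying RxNorm content changes over
time.

```{r save-definition-sketch, eval = FALSE}
# Toy final NDC list; identifiers are placeholders.
toy_ndc_list <- data.frame(
  product_rxcui = c("P001", "P001"),
  ndc11 = c("00000000101", "00000000102"),
  stringsAsFactors = FALSE
)

# Record versions only for packages that are actually installed
pkgs <- c("rxref", "dplyr")
installed <- pkgs[
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
]

exposure_definition <- list(
  ndc_list = toy_ndc_list,
  created = Sys.Date(),
  r_version = R.version.string,
  package_versions = vapply(
    installed,
    function(p) as.character(utils::packageVersion(p)),
    character(1)
  )
)

# Write to a temporary directory here; use a project path in practice
out_file <- file.path(tempdir(), "glp1_exposure_definition.rds")
saveRDS(exposure_definition, out_file)

# Reading the file back returns the same list
restored <- readRDS(out_file)
```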