In this vignette, we explore the OmopSketch functions that count how
often each concept appears in the clinical tables. Two functions do
most of the work: summariseConceptIdCounts() and
tableConceptIdCounts(). The former creates a summarised result
with the counts for each concept in a clinical table, and the latter
displays that result in a table. A third function,
tableTopConceptCounts(), shows only the most frequent concepts.
Let’s see these functions in action. To start, we load the required packages and create a mock cdm using the R package omock.
library(OmopSketch)
library(dplyr)
library(omock)
cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")
#> ℹ Reading GiBleed tables.
#> ℹ Adding drug_strength table.
#> ℹ Creating local <cdm_reference> object.
#> ℹ Inserting <cdm_reference> into duckdb.
cdm
#>
#> ── # OMOP CDM reference (duckdb) of GiBleed ────────────────────────────────────
#> • omop tables: care_site, cdm_source, concept, concept_ancestor, concept_class,
#> concept_relationship, concept_synonym, condition_era, condition_occurrence,
#> cost, death, device_exposure, domain, dose_era, drug_era, drug_exposure,
#> drug_strength, fact_relationship, location, measurement, metadata, note,
#> note_nlp, observation, observation_period, payer_plan_period, person,
#> procedure_occurrence, provider, relationship, source_to_concept_map, specimen,
#> visit_detail, visit_occurrence, vocabulary
#> • cohort tables: -
#> • achilles tables: -
#> • other tables: -
We now use the summariseConceptIdCounts() function from
the OmopSketch package to retrieve counts for each concept id and name,
as well as for each source concept id and name, across the clinical
tables.
summariseConceptIdCounts(cdm = cdm, omopTableName = "drug_exposure") |>
select(group_level, variable_name, variable_level, estimate_name, estimate_value, additional_name, additional_level) |>
glimpse()
#> Rows: 113
#> Columns: 7
#> $ group_level <chr> "drug_exposure", "drug_exposure", "drug_exposure", "d…
#> $ variable_name <chr> "Naproxen sodium 220 MG Oral Tablet", "Diphenhydramin…
#> $ variable_level <chr> "1115171", "40232448", "19075601", "19129655", "19079…
#> $ estimate_name <chr> "count_records", "count_records", "count_records", "c…
#> $ estimate_value <chr> "1159", "105", "363", "488", "35", "6", "27", "7", "5…
#> $ additional_name <chr> "source_concept_id &&& source_concept_name", "source_…
#> $ additional_level <chr> "1115171 &&& Naproxen sodium 220 MG Oral Tablet", "40…
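The additional_name and additional_level columns pack several values into one string, joined by " &&& ". The sketch below unpacks them in base R on two toy rows: the first is copied from the output above, the second is invented for illustration. The helper split_packed() is hypothetical and not part of OmopSketch, and it assumes every row shares the same additional_name. The omopgenerics package exports splitAdditional() for the same job, if it is available in your installed version.

```r
# Toy rows shaped like the summarised result above.
# Row 1 is copied from the glimpse() output; row 2 is made up.
res <- data.frame(
  variable_name    = c("Naproxen sodium 220 MG Oral Tablet", "Example drug"),
  variable_level   = c("1115171", "999"),
  estimate_value   = c("1159", "10"),
  additional_name  = rep("source_concept_id &&& source_concept_name", 2),
  additional_level = c("1115171 &&& Naproxen sodium 220 MG Oral Tablet",
                       "0 &&& No matching concept"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn the packed name/level pair into one column per key.
# Assumes all rows carry the same packed name.
split_packed <- function(df, name_col, level_col, sep = " &&& ") {
  keys <- strsplit(df[[name_col]][1], sep, fixed = TRUE)[[1]]
  vals <- strsplit(df[[level_col]], sep, fixed = TRUE)
  for (i in seq_along(keys)) {
    df[[keys[i]]] <- vapply(vals, function(v) v[i], character(1))
  }
  df[[name_col]] <- NULL
  df[[level_col]] <- NULL
  df
}

unpacked <- split_packed(res, "additional_name", "additional_level")
unpacked[, c("variable_level", "source_concept_id", "source_concept_name")]
```

After unpacking, the standard concept id (variable_level) and the source concept id sit side by side, which makes it easy to compare them row by row.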
By default, the function returns the number of records
(estimate_name == "count_records") for each concept_id. To
count persons instead, set the countBy argument to "person";
to obtain both record and person counts, set it to
c("record", "person").
summariseConceptIdCounts(
cdm = cdm,
omopTableName = "drug_exposure",
countBy = c("record", "person")
) |>
select(variable_name, estimate_name, estimate_value)
#> # A tibble: 226 × 3
#> variable_name estimate_name estimate_value
#> <chr> <chr> <chr>
#> 1 zoster vaccine, live count_records 2125
#> 2 zoster vaccine, live count_subjec… 1140
#> 3 Acetaminophen 160 MG Oral Tablet count_records 2158
#> 4 Acetaminophen 160 MG Oral Tablet count_subjec… 1428
#> 5 Penicillin V Potassium 500 MG Oral Tablet count_records 1087
#> 6 Penicillin V Potassium 500 MG Oral Tablet count_subjec… 856
#> 7 Acetaminophen 325 MG / Oxycodone Hydrochloride … count_records 306
#> 8 Acetaminophen 325 MG / Oxycodone Hydrochloride … count_subjec… 306
#> 9 varicella virus vaccine count_records 422
#> 10 varicella virus vaccine count_subjec… 301
#> # ℹ 216 more rows
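With both estimates in the result, each concept occupies two rows. A minimal base R sketch of how you might put them side by side and derive records per person is shown below. It uses the first three concepts from the output above, typed in by hand, and the column records_per_subject is our own addition, not something OmopSketch returns.

```r
# Values copied from the first six rows of the tibble above.
counts <- data.frame(
  variable_name = rep(c("zoster vaccine, live",
                        "Acetaminophen 160 MG Oral Tablet",
                        "Penicillin V Potassium 500 MG Oral Tablet"), each = 2),
  estimate_name = rep(c("count_records", "count_subjects"), times = 3),
  estimate_value = c("2125", "1140", "2158", "1428", "1087", "856"),
  stringsAsFactors = FALSE
)

# estimate_value is stored as character, so convert before doing arithmetic.
counts$estimate_value <- as.numeric(counts$estimate_value)

# One row per concept, one column per estimate.
wide <- reshape(counts, idvar = "variable_name", timevar = "estimate_name",
                direction = "wide")
names(wide) <- sub("^estimate_value\\.", "", names(wide))

# Illustrative derived measure: average number of records per person.
wide$records_per_subject <- wide$count_records / wide$count_subjects
wide
```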
Further stratification can be applied using the
interval, sex, and ageGroup
arguments. The interval argument supports “overall” (no time
stratification), “years”, “quarters”, or “months”.
summariseConceptIdCounts(
cdm = cdm,
omopTableName = "condition_occurrence",
countBy = "person",
interval = "years",
sex = TRUE,
ageGroup = list("<=50" = c(0, 50), ">50" = c(51, Inf))
) |>
select(group_level, strata_level, variable_name, estimate_name, additional_level) |>
glimpse()
#> Rows: 28,358
#> Columns: 5
#> $ group_level <chr> "condition_occurrence", "condition_occurrence", "cond…
#> $ strata_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name <chr> "Acute bronchitis", "Polyp of colon", "Laceration of …
#> $ estimate_name <chr> "count_subjects", "count_subjects", "count_subjects",…
#> $ additional_level <chr> "260139 &&& Acute bronchitis", "4285898 &&& Polyp of …
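To make the ageGroup definition concrete, here is a small sketch of how ages could map to the two labels used above. It assumes both bounds of each group are inclusive, which is how we read the definition but is not stated in this vignette. The function assign_age_group() is a hypothetical helper written for illustration only.

```r
# Same definition as in the call above.
age_groups <- list("<=50" = c(0, 50), ">50" = c(51, Inf))

# Hypothetical helper: label each age with the group whose bounds contain it.
# Assumption: lower and upper bounds are both inclusive.
assign_age_group <- function(age, groups) {
  out <- rep(NA_character_, length(age))
  for (label in names(groups)) {
    bounds <- groups[[label]]
    out[age >= bounds[1] & age <= bounds[2]] <- label
  }
  out
}

labels <- assign_age_group(c(0, 50, 51, 87), age_groups)
labels
```

Under that assumption, ages 50 and 51 fall on opposite sides of the boundary, so the two groups neither overlap nor leave a gap for whole-year ages.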
We can also filter the clinical table to a specific time window by setting the dateRange argument.
summarisedResult <- summariseConceptIdCounts(
cdm = cdm,
omopTableName = "condition_occurrence",
dateRange = as.Date(c("1990-01-01", "2010-01-01"))
)
summarisedResult |>
settings() |>
glimpse()
#> Rows: 1
#> Columns: 10
#> $ result_id <int> 1
#> $ result_type <chr> "summarise_concept_id_counts"
#> $ package_name <chr> "OmopSketch"
#> $ package_version <chr> "1.0.0"
#> $ group <chr> "omop_table"
#> $ strata <chr> ""
#> $ additional <chr> "source_concept_id &&& source_concept_name"
#> $ min_cell_count <chr> "0"
#> $ study_period_end <chr> "2010-01-01"
#> $ study_period_start <chr> "1990-01-01"
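Conceptually, dateRange keeps only the records that fall inside the study period. The base R sketch below illustrates the idea on a toy table. The dates are invented, and the choice of condition_start_date as the filtering column and of inclusive bounds are assumptions made for illustration rather than documented behaviour.

```r
# Same window as in the call above.
date_range <- as.Date(c("1990-01-01", "2010-01-01"))

# Toy records; the dates are made up for illustration.
records <- data.frame(
  condition_concept_id = c(260139, 260139, 4285898),
  condition_start_date = as.Date(c("1985-06-01", "1995-03-15", "2012-11-30"))
)

# Assumption: a record is kept when its start date lies within the window,
# bounds included.
keep <- records$condition_start_date >= date_range[1] &
  records$condition_start_date <= date_range[2]

in_window <- records[keep, ]
in_window
```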
You can also restrict concept counts to a subset of subjects via
the sample argument: provide an integer to randomly select
that many person_ids from the person table, or
a character string naming a cohort table to limit counts to
its subject_ids.
summariseConceptIdCounts(
cdm = cdm,
omopTableName = "condition_occurrence",
sample = 50
) |>
select(group_level, variable_name, estimate_name) |>
glimpse()
#> Rows: 66
#> Columns: 3
#> $ group_level <chr> "condition_occurrence", "condition_occurrence", "conditi…
#> $ variable_name <chr> "Acute bronchitis", "Facial laceration", "Laceration of …
#> $ estimate_name <chr> "count_records", "count_records", "count_records", "coun…
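The integer form of sample can be pictured as two steps: draw person ids, then count only their records. A minimal base R sketch on simulated data follows. The tables and the seed are made up, and this is only an illustration of the idea, not OmopSketch's implementation.

```r
set.seed(1)  # arbitrary seed so the toy example is reproducible

# Simulated person and clinical tables.
person <- data.frame(person_id = 1:200)
condition_occurrence <- data.frame(
  person_id = sample(person$person_id, 1000, replace = TRUE),
  condition_concept_id = sample(c(260139, 4285898), 1000, replace = TRUE)
)

# Step 1: randomly select 50 person ids from the person table.
sampled_ids <- sample(person$person_id, size = 50)

# Step 2: keep only the records that belong to the sampled persons.
subset_records <- condition_occurrence[
  condition_occurrence$person_id %in% sampled_ids, ]

# Record counts per concept within the sample.
table(subset_records$condition_concept_id)
```

Because the selection is random, counts from a sampled run will generally differ between runs and from the full-table counts.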
Concept counts can be visualised using
tableConceptIdCounts(). By default, it generates an
interactive reactable
table, but DT datatables are
also supported.
result <- summariseConceptIdCounts(
cdm = cdm,
omopTableName = "measurement",
countBy = "record"
)
tableConceptIdCounts(result = result, type = "reactable")
tableConceptIdCounts(result = result, type = "datatable")
The display argument in tableConceptIdCounts() controls
which concept counts are shown. The default,
display = "overall", shows both standard and source
concept counts.
tableConceptIdCounts(result = result, display = "overall")
If display = "standard" the table shows only
standard concept_id and concept_name counts.
tableConceptIdCounts(result = result, display = "standard")
If display = "source" the table shows only
source concept_id and concept_name counts.
tableConceptIdCounts(result = result, display = "source")
If display = "missing source" the table shows only
counts for concept ids that are missing a corresponding source concept
id.
tableConceptIdCounts(result = result, display = "missing source")
#> Warning in max(dplyr::pull(dplyr::tally(dplyr::group_by(result,
#> dplyr::across(-c("estimate_value")))), : no non-missing arguments to max;
#> returning -Inf
If display = "missing standard" the table shows only
counts for source concept ids that are missing a mapped standard concept
id.
tableConceptIdCounts(result = result, display = "missing standard")
#> Warning in max(dplyr::pull(dplyr::tally(dplyr::group_by(result,
#> dplyr::across(-c("estimate_value")))), : no non-missing arguments to max;
#> returning -Inf
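The warnings printed above are consistent with the filtered result being empty, that is, with the mock data having no concepts that lack a source or standard mapping, although the vignette does not say so explicitly. The sketch below reproduces the same warning on a toy table. It assumes an unmapped concept is recorded with id 0, following the usual OMOP convention; how OmopSketch identifies missing concepts internally is not shown here, and the counts are made up.

```r
# Toy counts; the concept ids appear elsewhere in this vignette,
# the n values are invented.
counts <- data.frame(
  concept_id        = c(1127433, 1118084),
  source_concept_id = c(1127433, 44923712),
  n                 = c(10, 5)
)

# Assumption: a missing source concept is stored as id 0.
missing_source <- counts[counts$source_concept_id == 0, ]
nrow(missing_source)

# Taking max() of an empty vector triggers the warning seen above.
msg <- tryCatch(max(missing_source$n),
                warning = function(w) conditionMessage(w))
msg

# A simple guard you could apply before building a table.
if (nrow(missing_source) == 0) {
  message("No concepts with a missing source concept id.")
}
```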
You can use the tableTopConceptCounts() function to
display the most frequent concepts in an OMOP CDM table as a formatted
table. By default, the function returns a gt table, but you can also choose
other output formats, including flextable, datatable, and reactable.
result <- summariseConceptIdCounts(
cdm = cdm,
omopTableName = "drug_exposure",
countBy = "record"
)
tableTopConceptCounts(result = result, type = "gt")
| Top | Cdm name: GiBleed |
|---|---|
| drug_exposure | |
| drug_exposure | |
| 1 | Standard: Acetaminophen 325 MG Oral Tablet (1127433) Source: Acetaminophen 325 MG Oral Tablet (1127433) 9365 |
| 2 | Standard: poliovirus vaccine, inactivated (40213160) Source: poliovirus vaccine, inactivated (40213160) 7977 |
| 3 | Standard: tetanus and diphtheria toxoids, adsorbed, preservative free, for adult use (40213227) Source: tetanus and diphtheria toxoids, adsorbed, preservative free, for adult use (40213227) 7430 |
| 4 | Standard: Aspirin 81 MG Oral Tablet (19059056) Source: Aspirin 81 MG Oral Tablet (19059056) 4380 |
| 5 | Standard: Amoxicillin 250 MG / Clavulanate 125 MG Oral Tablet (1713671) Source: Amoxicillin 250 MG / Clavulanate 125 MG Oral Tablet (1713671) 3851 |
| 6 | Standard: hepatitis A vaccine, adult dosage (40213296) Source: hepatitis A vaccine, adult dosage (40213296) 3211 |
| 7 | Standard: Acetaminophen 160 MG Oral Tablet (1127078) Source: Acetaminophen 160 MG Oral Tablet (1127078) 2158 |
| 8 | Standard: zoster vaccine, live (40213260) Source: zoster vaccine, live (40213260) 2125 |
| 9 | Standard: Acetaminophen 21.7 MG/ML / Dextromethorphan Hydrobromide 1 MG/ML / doxylamine succinate 0.417 MG/ML Oral Solution (40229134) Source: Acetaminophen 21.7 MG/ML / Dextromethorphan Hydrobromide 1 MG/ML / doxylamine succinate 0.417 MG/ML Oral Solution (40229134) 1993 |
| 10 | Standard: hepatitis B vaccine, adult dosage (40213306) Source: hepatitis B vaccine, adult dosage (40213306) 1916 |
By default, the function shows the top 10 concepts. You can change
this using the top argument:
tableTopConceptCounts(result = result, top = 5)
| Top | Cdm name: GiBleed |
|---|---|
| drug_exposure | |
| drug_exposure | |
| 1 | Standard: Acetaminophen 325 MG Oral Tablet (1127433) Source: Acetaminophen 325 MG Oral Tablet (1127433) 9365 |
| 2 | Standard: poliovirus vaccine, inactivated (40213160) Source: poliovirus vaccine, inactivated (40213160) 7977 |
| 3 | Standard: tetanus and diphtheria toxoids, adsorbed, preservative free, for adult use (40213227) Source: tetanus and diphtheria toxoids, adsorbed, preservative free, for adult use (40213227) 7430 |
| 4 | Standard: Aspirin 81 MG Oral Tablet (19059056) Source: Aspirin 81 MG Oral Tablet (19059056) 4380 |
| 5 | Standard: Amoxicillin 250 MG / Clavulanate 125 MG Oral Tablet (1713671) Source: Amoxicillin 250 MG / Clavulanate 125 MG Oral Tablet (1713671) 3851 |
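The ranking behind the top argument amounts to sorting concepts by count and keeping the first few. Here is a minimal base R sketch using five record counts copied from the table above, deliberately entered out of order. The helper top_n_concepts() is hypothetical and only illustrates the idea; it is not how tableTopConceptCounts() is implemented.

```r
# Record counts copied from the drug_exposure table above, shuffled.
counts <- data.frame(
  concept_name = c("Aspirin 81 MG Oral Tablet",
                   "Acetaminophen 325 MG Oral Tablet",
                   "hepatitis A vaccine, adult dosage",
                   "poliovirus vaccine, inactivated",
                   "Amoxicillin 250 MG / Clavulanate 125 MG Oral Tablet"),
  count_records = c(4380, 9365, 3211, 7977, 3851),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sort by count (descending) and keep the first n rows.
top_n_concepts <- function(df, n) {
  ranked <- df[order(df$count_records, decreasing = TRUE), ]
  ranked <- head(ranked, n)
  ranked$top <- seq_len(nrow(ranked))
  rownames(ranked) <- NULL
  ranked[, c("top", "concept_name", "count_records")]
}

top3 <- top_n_concepts(counts, 3)
top3
```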
If your summary includes both record and person counts, you must
specify which type to display using the countBy
argument:
result <- summariseConceptIdCounts(
cdm = cdm,
omopTableName = "drug_exposure",
countBy = c("record", "person")
)
tableTopConceptCounts(result = result, countBy = "person")
| Top | Cdm name: GiBleed |
|---|---|
| drug_exposure | |
| drug_exposure | |
| 1 | Standard: tetanus and diphtheria toxoids, adsorbed, preservative free, for adult use (40213227) Source: tetanus and diphtheria toxoids, adsorbed, preservative free, for adult use (40213227) 2660 |
| 2 | Standard: Acetaminophen 325 MG Oral Tablet (1127433) Source: Acetaminophen 325 MG Oral Tablet (1127433) 2580 |
| 3 | Standard: poliovirus vaccine, inactivated (40213160) Source: poliovirus vaccine, inactivated (40213160) 2140 |
| 4 | Standard: Amoxicillin 250 MG / Clavulanate 125 MG Oral Tablet (1713671) Source: Amoxicillin 250 MG / Clavulanate 125 MG Oral Tablet (1713671) 2021 |
| 5 | Standard: Aspirin 81 MG Oral Tablet (19059056) Source: Aspirin 81 MG Oral Tablet (19059056) 1927 |
| 6 | Standard: celecoxib (1118084) Source: celecoxib 200 MG Oral Capsule [Celebrex] (44923712) 1844 |
| 7 | Standard: hepatitis A vaccine, adult dosage (40213296) Source: hepatitis A vaccine, adult dosage (40213296) 1737 |
| 8 | Standard: hepatitis B vaccine, adult dosage (40213306) Source: hepatitis B vaccine, adult dosage (40213306) 1560 |
| 9 | Standard: Acetaminophen 160 MG Oral Tablet (1127078) Source: Acetaminophen 160 MG Oral Tablet (1127078) 1428 |
| 10 | Standard: Acetaminophen 21.7 MG/ML / Dextromethorphan Hydrobromide 1 MG/ML / doxylamine succinate 0.417 MG/ML Oral Solution (40229134) Source: Acetaminophen 21.7 MG/ML / Dextromethorphan Hydrobromide 1 MG/ML / doxylamine succinate 0.417 MG/ML Oral Solution (40229134) 1393 |
Finally, disconnect from the mock CDM.
cdmDisconnect(cdm = cdm)