For this example we are going to generate a candidate codelist for dementia, only looking for codes in the condition domain. Let’s first load some libraries.
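The library calls themselves are not shown here. A minimal sketch, assuming the package list can be inferred from the functions called in the rest of this example (the list and the `missingPkgs` name are our assumptions, not the original code):

```r
# Packages inferred from the calls used in this example (an assumption,
# since the original list is not shown)
pkgs <- c("DBI", "RPostgres", "dplyr", "CDMConnector", "CodelistGenerator")

# Find any that are not installed, without attaching anything yet
missingPkgs <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

if (length(missingPkgs) > 0) {
  message("Install before continuing: ", paste(missingPkgs, collapse = ", "))
} else {
  # Attach the packages whose functions are called without a namespace prefix
  library(dplyr)
  library(CDMConnector)
  library(CodelistGenerator)
}
```

DBI and RPostgres are only checked, not attached, because the connection code calls them with an explicit `DBI::` or `RPostgres::` prefix.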
CodelistGenerator works with a cdm_reference to the vocabulary tables of the OMOP CDM, created using the CDMConnector package.
# example with postgres database connection details
db <- DBI::dbConnect(RPostgres::Postgres(),
  dbname = Sys.getenv("server"),
  port = Sys.getenv("port"),
  host = Sys.getenv("host"),
  user = Sys.getenv("user"),
  password = Sys.getenv("password")
)
# create cdm reference
cdm <- CDMConnector::cdm_from_con(
  con = db,
  cdm_schema = Sys.getenv("vocabulary_schema")
)
It is important to note that the results from CodelistGenerator will be specific to a particular version of the OMOP CDM vocabularies. We can see the version of the vocabulary being used like so
#> [1] "vocabVersion"
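The call that produced this output is not shown. CodelistGenerator has a `getVocabVersion()` helper that was presumably used, but that is an assumption on our part. In the OMOP vocabulary tables, the release version is stored in the `vocabulary` table, in the row whose `vocabulary_id` is "None". A minimal sketch of that lookup on a toy table (all values below are made up):

```r
# Toy stand-in for the OMOP `vocabulary` table; the versions are invented
vocabulary <- data.frame(
  vocabulary_id = c("None", "SNOMED", "ICD10CM"),
  vocabulary_name = c("OMOP Standardized Vocabularies", "SNOMED CT", "ICD-10-CM"),
  vocabulary_version = c("v5.0 EXAMPLE", "EXAMPLE SNOMED", "EXAMPLE ICD10CM"),
  stringsAsFactors = FALSE
)

# The overall vocabulary release is recorded against vocabulary_id "None"
vocabVersion <- vocabulary$vocabulary_version[vocabulary$vocabulary_id == "None"]
print(vocabVersion)
```

Recording this value alongside a codelist makes it possible to tell later which vocabulary release the codes were drawn from.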
The simplest approach to identifying potential codes is to take a high-level code and include all its descendants.
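# schema holding the OMOP vocabulary tables (same variable as used for the cdm reference)
vocabularyDatabaseSchema <- Sys.getenv("vocabulary_schema")

# concept 4182210 and its descendants, with their names, domains and vocabularies
codesFromDescendants <- tbl(
  db,
  sql(paste0(
    "SELECT * FROM ",
    vocabularyDatabaseSchema,
    ".concept_ancestor"
  ))
) |>
  filter(ancestor_concept_id == 4182210L) |>
  select("descendant_concept_id") |>
  rename("concept_id" = "descendant_concept_id") |>
  left_join(
    tbl(db, sql(paste0(
      "SELECT * FROM ",
      vocabularyDatabaseSchema,
      ".concept"
    ))),
    by = "concept_id"
  ) |>
  select(
    "concept_id", "concept_name",
    "domain_id", "vocabulary_id"
  ) |>
  collect()
codesFromDescendants |>
  glimpse()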
#> Rows: 151
#> Columns: 4
#> $ concept_id <int> 35610098, 4043241, 4139421, 37116466, 4046089, 44782559,…
#> $ concept_name <chr> "Predominantly cortical dementia", "Familial Alzheimer's…
#> $ domain_id <chr> "Condition", "Condition", "Condition", "Condition", "Con…
#> $ vocabulary_id <chr> "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SNOME…
This appears to pick up most of the relevant codes. However, it misses codes that are not descendants of 4182210. For example, “Wandering due to dementia” (37312577; https://athena.ohdsi.org/search-terms/terms/37312577) and “Anxiety due to dementia” (37312031; https://athena.ohdsi.org/search-terms/terms/37312031) are not picked up.
To also capture terms like these, we can use CodelistGenerator.
First, let’s do a simple search for a single keyword of “dementia”, including descendants of the identified codes.
dementiaCodes1 <- getCandidateCodes(
  cdm = cdm,
  keywords = "dementia",
  domains = "Condition",
  includeDescendants = TRUE
)
dementiaCodes1 |>
  glimpse()
#> Rows: 187
#> Columns: 6
#> $ concept_id <int> 374326, 374888, 375791, 376085, 376094, 376095, 37694…
#> $ found_from <chr> "From initial search", "From initial search", "From i…
#> $ concept_name <chr> "Arteriosclerotic dementia with depression", "Dementi…
#> $ domain_id <chr> "Condition", "Condition", "Condition", "Condition", "…
#> $ vocabulary_id <chr> "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SN…
#> $ standard_concept <chr> "standard", "standard", "standard", "standard", "stan…
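To make the idea behind this search concrete, here is a simplified sketch on toy data: match the keyword against concept names within the chosen domain, then add descendants of the matches. This is our illustration of the general approach, not the package's actual implementation. The tables, concept IDs, and the "From descendants" label are invented for the example.

```r
# Toy stand-ins for the OMOP `concept` and `concept_ancestor` tables
concept <- data.frame(
  concept_id = 1:5,
  concept_name = c(
    "Dementia", "Vascular dementia", "Alzheimer's disease",
    "Wandering due to dementia", "Dementia screening"
  ),
  domain_id = c("Condition", "Condition", "Condition", "Condition", "Procedure"),
  stringsAsFactors = FALSE
)
concept_ancestor <- data.frame(
  ancestor_concept_id = c(1L, 1L, 1L),
  descendant_concept_id = c(1L, 2L, 3L)
)

# Step 1: case-insensitive keyword match, restricted to the Condition domain
isMatch <- grepl("dementia", tolower(concept$concept_name), fixed = TRUE) &
  concept$domain_id == "Condition"
matched <- concept[isMatch, ]
matched$found_from <- rep("From initial search", nrow(matched))

# Step 2: add descendants of the matched concepts that were not already found
descIds <- concept_ancestor$descendant_concept_id[
  concept_ancestor$ancestor_concept_id %in% matched$concept_id
]
isExtra <- concept$concept_id %in% setdiff(descIds, matched$concept_id) &
  concept$domain_id == "Condition"
extra <- concept[isExtra, ]
extra$found_from <- rep("From descendants", nrow(extra))

toyCandidates <- rbind(matched, extra)
print(toyCandidates)
```

In this toy data, "Wandering due to dementia" is found by the keyword even though it is not a descendant of "Dementia", while "Alzheimer's disease" is found only through the hierarchy. Combining the two steps covers both cases.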
What is the difference between this code list and the one from 4182210 and its descendants?
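The `codeComparison` object used below is not defined in this text. CodelistGenerator exports a `compareCodelists()` function that presumably created it from `codesFromDescendants` and `dementiaCodes1`, though that call is our assumption. A minimal base R sketch of the same kind of comparison, on two invented toy code lists:

```r
# Two toy code lists; IDs and names are placeholders
codelist1 <- data.frame(
  concept_id = c(1L, 2L),
  concept_name = c("Concept A", "Concept B"),
  stringsAsFactors = FALSE
)
codelist2 <- data.frame(
  concept_id = c(1L, 2L, 3L),
  concept_name = c("Concept A", "Concept B", "Concept C"),
  stringsAsFactors = FALSE
)

# Flag membership, then take a full outer join on the concept columns
codelist1$in1 <- TRUE
codelist2$in2 <- TRUE
toyComparison <- merge(
  codelist1, codelist2,
  by = c("concept_id", "concept_name"),
  all = TRUE
)

# Label each concept by the list(s) it appears in
toyComparison$codelist <- ifelse(
  !is.na(toyComparison$in1) & !is.na(toyComparison$in2), "Both",
  ifelse(!is.na(toyComparison$in1), "Only codelist 1", "Only codelist 2")
)
toyComparison <- toyComparison[, c("concept_id", "concept_name", "codelist")]
print(table(toyComparison$codelist))
```

The result has one row per concept and a `codelist` column with the same kind of labels ("Both", "Only codelist 2") that appear in the output that follows.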
codeComparison |>
  group_by(codelist) |>
  tally()
#> # A tibble: 2 × 2
#>   codelist            n
#>   <chr>           <int>
#> 1 Both              151
#> 2 Only codelist 2    36
What are these extra codes picked up by CodelistGenerator?
codeComparison |>
  filter(codelist == "Only codelist 2") |>
  glimpse()
#> Rows: 36
#> Columns: 3
#> $ concept_id <int> 4041685, 4043378, 4044415, 4046091, 4092747, 4187091, 425…
#> $ concept_name <chr> "Amyotrophic lateral sclerosis with dementia", "Frontotem…
#> $ codelist <chr> "Only codelist 2", "Only codelist 2", "Only codelist 2", …
Perhaps we want to see what ICD10CM codes map to our candidate code list. We can get these by running
icdMappings <- getMappings(
  cdm = cdm,
  candidateCodelist = dementiaCodes1,
  nonStandardVocabularies = "ICD10CM"
)
icdMappings |>
  glimpse()
#> Rows: 191
#> Columns: 7
#> $ standard_concept_id <int> 372610, 374341, 374888, 374888, 374888, 374…
#> $ standard_concept_name <chr> "Postconcussion syndrome", "Huntington's ch…
#> $ standard_vocabulary_id <chr> "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SN…
#> $ non_standard_concept_id <int> 45571706, 35207314, 1568088, 1568089, 37402…
#> $ non_standard_concept_name <chr> "Postconcussional syndrome", "Huntington's …
#> $ non_standard_concept_code <chr> "F07.81", "G10", "F02", "F02.8", "F02.811",…
#> $ non_standard_vocabulary_id <chr> "ICD10CM", "ICD10CM", "ICD10CM", "ICD10CM",…
# the same lookup, this time for Read codes
readMappings <- getMappings(
  cdm = cdm,
  candidateCodelist = dementiaCodes1,
  nonStandardVocabularies = "Read"
)
readMappings |>
  glimpse()
#> Rows: 93
#> Columns: 7
#> $ standard_concept_id <int> 372610, 372610, 372610, 372610, 372610, 372…
#> $ standard_concept_name <chr> "Postconcussion syndrome", "Postconcussion …
#> $ standard_vocabulary_id <chr> "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SN…
#> $ non_standard_concept_id <int> 45446542, 45446553, 45453190, 45459905, 455…
#> $ non_standard_concept_name <chr> "Post-concussion syndrome", "[X]Post-trauma…
#> $ non_standard_concept_code <chr> "E2A2.00", "Eu06212", "E2A2.11", "E2A2.12",…
#> $ non_standard_vocabulary_id <chr> "READ", "READ", "READ", "READ", "READ", "RE…
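Both calls above look up non-standard source codes that map to the standard concepts in the candidate code list. In the OMOP CDM such links are recorded as "Maps to" relationships from the non-standard concept to the standard one. A minimal sketch of that join on toy tables, assuming a simplified layout (the table names, IDs, and codes are invented, and this is not the package's actual implementation):

```r
# Standard concepts in a toy candidate code list
standard <- data.frame(
  standard_concept_id = c(10L, 20L),
  standard_concept_name = c("Standard concept A", "Standard concept B"),
  stringsAsFactors = FALSE
)

# Toy "Maps to" links from non-standard to standard concepts
mapsTo <- data.frame(
  non_standard_concept_id = c(101L, 102L, 201L, 301L),
  standard_concept_id = c(10L, 10L, 20L, 30L),
  relationship_id = "Maps to",
  stringsAsFactors = FALSE
)

# Toy non-standard (source vocabulary) concepts
nonStandard <- data.frame(
  non_standard_concept_id = c(101L, 102L, 201L, 301L),
  non_standard_concept_code = c("X1", "X1.1", "Y2", "Z3"),
  non_standard_vocabulary_id = c("ICD10CM", "ICD10CM", "Read", "ICD10CM"),
  stringsAsFactors = FALSE
)

# Keep links that point at our standard concepts, then attach the source codes
toyMappings <- merge(
  standard,
  mapsTo[mapsTo$relationship_id == "Maps to", ],
  by = "standard_concept_id"
)
toyMappings <- merge(toyMappings, nonStandard, by = "non_standard_concept_id")

# Restrict to the vocabulary of interest
toyMappings <- toyMappings[toyMappings$non_standard_vocabulary_id == "ICD10CM", ]
print(toyMappings)
```

One standard concept can have several source codes mapped to it, which is why the mapping tables above have more rows than distinct standard concepts.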