
Working with the socialrisk Package

Wyatt P. Bensken

2023-02-15

Introduction

The goal of socialrisk is to create an efficient way to identify social risk from administrative health care data using ICD-10 diagnosis codes.

Load Sample Data

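We've created a sample dataset of ICD-10 administrative data, i10_wide, which is available once the package is loaded. Each row is one visit, with up to five diagnosis codes in the columns dx1 through dx5.

library(socialrisk)  # makes the i10_wide sample data available

i10_wide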
#>    patient_id    sex date_of_serv    dx1    dx2    dx3    dx4    dx5 visit_type
#> 1        1001   male   2020-02-14   E876   Z560  Z6372   Z654   E440         ip
#> 2        1001   male   2021-05-15   J189   Z644   A408    I10   G309         ip
#> 3        1001   male   2021-01-10   I119   Z628    I10   <NA>   <NA>         ot
#> 4        1001   male   2021-04-02   G309   K731   Z591   <NA>   <NA>         ot
#> 5        1001   male   2021-05-06   E039    I10   J189   <NA>   <NA>         ot
#> 6        1001   male   2021-06-04   J189   Z604   F329   <NA>   <NA>         ot
#> 7        1001   male   2021-10-01  E0800   G309    I10   <NA>   <NA>         ot
#> 8        1001   male   2021-11-05  I6011    I10   F329   R930   <NA>         ot
#> 9        1001   male   2022-02-01   M546   G309    I10  I6011   <NA>         ot
#> 10       1001   male   2022-03-15  E0800    I10   J189   F329   <NA>         ot
#> 11       1002 female   2020-01-09   G459   Z598   E840   <NA>   <NA>         ip
#> 12       1002 female   2020-03-23   E840   Z591   <NA>   <NA>   <NA>         ot
#> 13       1002 female   2020-09-07   E119   Z558   <NA>   <NA>   <NA>         ot
#> 14       1002 female   2020-12-05   E840   E119   <NA>   <NA>   <NA>         ot
#> 15       1002 female   2022-03-25   F419   E119   G459   <NA>   <NA>         ot
#> 16       1003   male   2020-02-15  F3010  F1910    I10 G40909   R296         ip
#> 17       1003   male   2020-03-31  F3010   Z562   E109   <NA>   <NA>         ot
#> 18       1003   male   2020-12-31   K762   R569   Z576   <NA>   <NA>         ot
#> 19       1003   male   2021-12-22   E109   R569  F1910  F4310   <NA>         ot
#> 20       1003   male   2021-12-25 G40909  F1910   R569   <NA>   <NA>         ot
#> 21       1003   male   2022-08-28   K762   Z564   <NA>   <NA>   <NA>         ot
#> 22       1003   male   2022-09-05   E109   K762  F4310   <NA>   <NA>         ot
#> 23       1004 female   2021-01-09 C50111  F1020   F330   <NA>   <NA>         ot
#> 24       1004 female   2021-04-15 C50111   F330   <NA>   <NA>   <NA>         ot
#> 25       1004 female   2021-06-08   F329 C50111  F1020   <NA>   <NA>         ot
#> 26       1005 female   2020-01-27  K4000   G839  R1030   R251 G43909         ip
#> 27       1005 female   2020-11-13 G43909  K4000   G839   <NA>   <NA>         ot
#> 28       1005 female   2021-12-07    J22   G839 G43909   <NA>   <NA>         ot
#> 29       1005 female   2021-12-26  B2790    J22   G839   <NA>   <NA>         ot
#>    hcpcs icd_version
#> 1  E2201          10
#> 2  E2201          10
#> 3  E2201          10
#> 4  E2201          10
#> 5  E2201          10
#> 6  E2201          10
#> 7  E2201          10
#> 8  E2201          10
#> 9  E2201          10
#> 10 E2201          10
#> 11 E0159          10
#> 12 E0159          10
#> 13 E0159          10
#> 14 E0159          10
#> 15 E0159          10
#> 16 E1353          10
#> 17 E1353          10
#> 18 E1353          10
#> 19 E1353          10
#> 20 E1353          10
#> 21 E1353          10
#> 22 E1353          10
#> 23 A7047          10
#> 24 A7047          10
#> 25 A7047          10
#> 26 K0669          10
#> 27 K0669          10
#> 28  <NA>          10
#> 29  <NA>          10

Preparing the Data

We use the built-in clean_data() function, specifying the dataset, the patient ID variable, the current data format (wide or long), and the prefix of the diagnosis variables.

data <- clean_data(dat = i10_wide,
                   id = patient_id,
                   style = "wide",
                   prefix_dx = "dx")
#> # A tibble: 10 × 2
#>    patient_id dx   
#>    <fct>      <chr>
#>  1 1001       E876 
#>  2 1001       Z560 
#>  3 1001       Z6372
#>  4 1001       Z654 
#>  5 1001       E440 
#>  6 1001       J189 
#>  7 1001       Z644 
#>  8 1001       A408 
#>  9 1001       I10  
#> 10 1001       G309
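
To make the wide-to-long step concrete, here is a minimal base-R sketch of the kind of reshaping the output above suggests: one row per patient and diagnosis code. It is not the package's implementation. The objects wide, dx_cols, and long are hypothetical names, the two visits are copied from the i10_wide printout (first three diagnosis columns only), and dropping empty diagnosis slots is an assumption rather than documented clean_data() behavior.

```r
# Two visits copied from the i10_wide printout (dx1 to dx3 only).
wide <- data.frame(
  patient_id = c(1001, 1002),
  dx1 = c("I119", "E840"),
  dx2 = c("Z628", "Z591"),
  dx3 = c("I10", NA),
  stringsAsFactors = FALSE
)

# Find the diagnosis columns by their shared prefix, as prefix_dx = "dx" would.
dx_cols <- grep("^dx", names(wide), value = TRUE)

# Stack the diagnosis columns into a single dx column.
# unlist() walks the columns one after another, so the patient IDs
# are repeated once per diagnosis column to stay aligned.
long <- data.frame(
  patient_id = rep(wide$patient_id, times = length(dx_cols)),
  dx = unlist(wide[dx_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Assumption: empty diagnosis slots are dropped.
long <- long[!is.na(long$dx), ]

# Group each patient's codes together (order() keeps ties in place).
long <- long[order(long$patient_id), ]
rownames(long) <- NULL

print(long)
```

The result has the same two-column shape (patient_id, dx) as the tibble returned by clean_data(), which is the shape the socialrisk() calls below take as input.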

Social Risk

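Now we can run the socialrisk() function on the cleaned data, once for each available taxonomy. Each call returns one row per patient with an overall indicator (any_social_risk), a count of domains flagged (number_domains), and one 0/1 column per domain.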

Centers for Medicare and Medicaid Services (CMS)

cms <- socialrisk(dat = data, id = patient_id, dx = dx, taxonomy = "cms")
#> # A tibble: 5 × 12
#>   patient_id any_social_risk number_domains z55_education z56_employment
#>   <fct>                <dbl>          <dbl>         <dbl>          <dbl>
#> 1 1001                     1              7             0              1
#> 2 1002                     1              2             1              0
#> 3 1003                     1              2             0              1
#> 4 1004                     0              0             0              0
#> 5 1005                     0              0             0              0
#> # … with 7 more variables: z57_occupation <dbl>, z59_housing <dbl>,
#> #   z60_social <dbl>, z62_upbringing <dbl>, z63_family <dbl>,
#> #   z64_psychosocial <dbl>, z65_psychosocial_other <dbl>
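
The CMS column names pair each domain with a three-character ICD-10 category (Z55, Z56, and so on), which suggests flagging by code prefix. The sketch below illustrates that idea in base R for patient 1002, using the diagnosis codes from the i10_wide printout. It is an assumption about the logic inferred from the column names, not the package's code, and cms_prefixes, codes_1002, and flags are hypothetical names.

```r
# All diagnosis codes recorded for patient 1002 in the sample data.
codes_1002 <- c("G459", "Z598", "E840", "Z591", "E119", "Z558", "F419")

# Domain names taken from the output columns above; each maps to the
# ICD-10 category embedded in its name (assumed prefix matching).
cms_prefixes <- c(
  z55_education          = "Z55",
  z56_employment         = "Z56",
  z57_occupation         = "Z57",
  z59_housing            = "Z59",
  z60_social             = "Z60",
  z62_upbringing         = "Z62",
  z63_family             = "Z63",
  z64_psychosocial       = "Z64",
  z65_psychosocial_other = "Z65"
)

# A domain is flagged 1 if any code starts with its prefix, else 0.
flags <- vapply(
  cms_prefixes,
  function(prefix) as.numeric(any(startsWith(codes_1002, prefix))),
  numeric(1)
)

number_domains  <- sum(flags)
any_social_risk <- as.numeric(number_domains > 0)

print(flags)
print(c(any_social_risk = any_social_risk, number_domains = number_domains))
```

For patient 1002 this flags z55_education (from Z558) and z59_housing (from Z598 and Z591), giving two domains, which agrees with the row for 1002 in the output above.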

Missouri Hospital Association

mha <- socialrisk(dat = data, id = patient_id, dx = dx, taxonomy = "mha")
#> # A tibble: 5 × 8
#>   patient_id any_social_risk number_domains employment family housing
#>   <fct>                <dbl>          <dbl>      <dbl>  <dbl>   <dbl>
#> 1 1001                     1              5          1      1       1
#> 2 1002                     1              2          0      0       1
#> 3 1003                     1              1          1      0       0
#> 4 1004                     0              0          0      0       0
#> 5 1005                     0              0          0      0       0
#> # … with 2 more variables: psychosocial <dbl>, ses <dbl>

SIREN - UCSF

siren <- socialrisk(dat = data, id = patient_id, dx = dx, taxonomy = "siren")
#> Note: The SIREN Compendium assigns multiple domains to each code, resulting in non-mutally exclusive groups.
#> # A tibble: 5 × 19
#>   patient_id any_social_risk number_domains access education employment finances
#>   <fct>                <dbl>          <dbl>  <dbl>     <dbl>      <dbl>    <dbl>
#> 1 1001                     1              5      0         0          1        0
#> 2 1002                     1              6      1         1          0        1
#> 3 1003                     1              1      0         0          1        0
#> 4 1004                     0              0      0         0          0        0
#> 5 1005                     0              0      0         0          0        0
#> # … with 12 more variables: food <dbl>, housing <dbl>, immigration <dbl>,
#> #   incarceration <dbl>, language <dbl>, race_eth <dbl>, safety <dbl>,
#> #   soc_connect <dbl>, stress <dbl>, transportation <dbl>, utilities <dbl>,
#> #   veteran <dbl>
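
Because every taxonomy returns one row per patient, the results can be lined up by patient_id. The following sketch shows one way to do that with base R's merge(). To stay self-contained it re-enters the number_domains values from the three printouts above instead of using the cms, mha, and siren objects. The names cms_n, mha_n, siren_n, and comparison are hypothetical.

```r
ids <- c(1001, 1002, 1003, 1004, 1005)

# number_domains per patient, copied from the three outputs above.
cms_n   <- data.frame(patient_id = ids, cms   = c(7, 2, 2, 0, 0))
mha_n   <- data.frame(patient_id = ids, mha   = c(5, 2, 1, 0, 0))
siren_n <- data.frame(patient_id = ids, siren = c(5, 6, 1, 0, 0))

# Join the three tables on patient_id, one pair at a time.
comparison <- Reduce(
  function(x, y) merge(x, y, by = "patient_id"),
  list(cms_n, mha_n, siren_n)
)

# TRUE when all three taxonomies agree on whether any risk is present.
comparison$agree_any <-
  (comparison$cms > 0) == (comparison$mha > 0) &
  (comparison$mha > 0) == (comparison$siren > 0)

print(comparison)
```

In this sample the taxonomies agree on which patients have any social risk, while the domain counts differ because each taxonomy groups the codes differently. The SIREN counts are not directly comparable to the others, since its domains are not mutually exclusive.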
