The package provides functions to tidy a summarised result into a data frame that is easier to use in subsequent calculations. To that end, the split functions, described in "split and unite functions", let you work with name-level columns. For the estimates there is the pivotEstimates function, and for the settings there is addSettings. Finally, the tidy method combines the split and pivot functionality in a single function.
First, let’s load relevant libraries and create a mock summarised result table.
library(visOmopResults)
library(dplyr)
result <- mockSummarisedResult()
result |> glimpse()
#> Rows: 126
#> Columns: 13
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ cdm_name <chr> "mock", "mock", "mock", "mock", "mock", "mock", "mock…
#> $ group_name <chr> "cohort_name", "cohort_name", "cohort_name", "cohort_…
#> $ group_level <chr> "cohort1", "cohort1", "cohort1", "cohort1", "cohort1"…
#> $ strata_name <chr> "overall", "age_group &&& sex", "age_group &&& sex", …
#> $ strata_level <chr> "overall", "<40 &&& Male", ">=40 &&& Male", "<40 &&& …
#> $ variable_name <chr> "number subjects", "number subjects", "number subject…
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ estimate_name <chr> "count", "count", "count", "count", "count", "count",…
#> $ estimate_type <chr> "integer", "integer", "integer", "integer", "integer"…
#> $ estimate_value <chr> "9337847", "4006478", "2868369", "7818476", "9065176"…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
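The strata_name and strata_level columns above form a name-level pair: several names, and their matching levels, are packed into single strings joined by " &&& ". As a rough illustration of what splitting such a pair involves, here is a base R sketch for one row. It is not the package's implementation, and the object names (nms, lvls, split_pair) are only illustrative.

```r
# One name-level pair, copied from the glimpse() output above
strata_name  <- "age_group &&& sex"
strata_level <- "<40 &&& Male"

# Split both strings on the literal " &&& " separator
nms  <- strsplit(strata_name,  " &&& ", fixed = TRUE)[[1]]
lvls <- strsplit(strata_level, " &&& ", fixed = TRUE)[[1]]

# Pair each name with its level as a named character vector
split_pair <- setNames(lvls, nms)
split_pair
```

Each name would become a column and each level its value, which is the shape the tidy output further down has for age_group and sex.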
The function pivotEstimates adds one column of estimate values for each combination of the columns given in pivotEstimatesBy. For instance, in the following example we pivot the estimates by the columns variable_name, variable_level, and estimate_name.
result |>
  pivotEstimates(pivotEstimatesBy = c("variable_name", "variable_level", "estimate_name")) |>
  glimpse()
#> Rows: 18
#> Columns: 15
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ cdm_name <chr> "mock", "mock", "mock", "mock", "mo…
#> $ group_name <chr> "cohort_name", "cohort_name", "coho…
#> $ group_level <chr> "cohort1", "cohort1", "cohort1", "c…
#> $ strata_name <chr> "overall", "age_group &&& sex", "ag…
#> $ strata_level <chr> "overall", "<40 &&& Male", ">=40 &&…
#> $ additional_name <chr> "overall", "overall", "overall", "o…
#> $ additional_level <chr> "overall", "overall", "overall", "o…
#> $ `number subjects_NA_count` <int> 9337847, 4006478, 2868369, 7818476,…
#> $ age_NA_mean <dbl> 30.49621, 27.51317, 19.64153, 84.40…
#> $ age_NA_sd <dbl> 3.3287556, 4.6797953, 3.8420378, 7.…
#> $ Medications_Amoxiciline_count <int> 21944, 70846, 27309, 44353, 34557, …
#> $ Medications_Amoxiciline_percentage <dbl> 12.759029, 81.434293, 99.356778, 49…
#> $ Medications_Ibuprofen_count <int> 2795, 1362, 94596, 12537, 66965, 25…
#> $ Medications_Ibuprofen_percentage <dbl> 30.713166, 8.628628, 59.166925, 83.…
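Conceptually, this is a long-to-wide reshape in which the character estimate_value column is also converted to a proper type. The sketch below mimics that idea with tidyr::pivot_wider on a small table. It is an illustration under that assumption, not how pivotEstimates is implemented, and the values in the toy table are made up.

```r
library(dplyr)
library(tidyr)

# Toy long-format table; the values are invented for illustration only
long <- tibble(
  strata_level   = c("overall", "overall", "Male", "Male"),
  estimate_name  = c("mean", "sd", "mean", "sd"),
  estimate_value = c("30.5", "3.3", "27.5", "4.7")
)

# One new column per estimate_name, then convert the strings to numbers
wide <- long |>
  pivot_wider(names_from = "estimate_name", values_from = "estimate_value") |>
  mutate(across(all_of(c("mean", "sd")), as.numeric))

wide
```

The result has one row per strata_level and one numeric column per estimate, which is why the row count drops after pivoting.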
The argument nameStyle customises the names of the new columns. It uses the syntax of the glue package, so expressions inside curly braces are evaluated. For instance:
result |>
  pivotEstimates(
    pivotEstimatesBy = "estimate_name",
    nameStyle = "{toupper(estimate_name)}"
  ) |>
  glimpse()
#> Rows: 72
#> Columns: 14
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ cdm_name <chr> "mock", "mock", "mock", "mock", "mock", "mock", "mock…
#> $ group_name <chr> "cohort_name", "cohort_name", "cohort_name", "cohort_…
#> $ group_level <chr> "cohort1", "cohort1", "cohort1", "cohort1", "cohort1"…
#> $ strata_name <chr> "overall", "age_group &&& sex", "age_group &&& sex", …
#> $ strata_level <chr> "overall", "<40 &&& Male", ">=40 &&& Male", "<40 &&& …
#> $ variable_name <chr> "number subjects", "number subjects", "number subject…
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ COUNT <int> 9337847, 4006478, 2868369, 7818476, 9065176, 2211710,…
#> $ MEAN <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ SD <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ PERCENTAGE <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
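To see how such a template is evaluated, it can help to call glue directly. The snippet below is a small sketch that applies the same template to a plain character vector; it assumes nothing about how pivotEstimates uses glue internally, and new_names is just an illustrative object name.

```r
library(glue)

# The estimate names that appear in the mock result
estimate_name <- c("count", "mean", "sd", "percentage")

# glue() evaluates the R expression inside the braces for each element
new_names <- as.character(glue("{toupper(estimate_name)}"))
new_names
```

Any R expression that returns a string could appear inside the braces, so the same mechanism allows prefixes, suffixes, or combinations of several columns.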
The function addSettings adds a new column for each of the settings in the summarised result, if any:
mockSummarisedResult() |>
  addSettings() |>
  glimpse()
#> Rows: 126
#> Columns: 16
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ cdm_name <chr> "mock", "mock", "mock", "mock", "mock", "mock", "mock…
#> $ group_name <chr> "cohort_name", "cohort_name", "cohort_name", "cohort_…
#> $ group_level <chr> "cohort1", "cohort1", "cohort1", "cohort1", "cohort1"…
#> $ strata_name <chr> "overall", "age_group &&& sex", "age_group &&& sex", …
#> $ strata_level <chr> "overall", "<40 &&& Male", ">=40 &&& Male", "<40 &&& …
#> $ variable_name <chr> "number subjects", "number subjects", "number subject…
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ estimate_name <chr> "count", "count", "count", "count", "count", "count",…
#> $ estimate_type <chr> "integer", "integer", "integer", "integer", "integer"…
#> $ estimate_value <chr> "2703410", "3101646", "4285343", "2451643", "6496595"…
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ result_type <chr> "mock_summarised_result", "mock_summarised_result", "…
#> $ package_name <chr> "visOmopResults", "visOmopResults", "visOmopResults",…
#> $ package_version <chr> "0.3.0", "0.3.0", "0.3.0", "0.3.0", "0.3.0", "0.3.0",…
Finally, the tidy method combines the splitting of name-level columns with the pivoting of estimates and settings. By default, it splits the group, strata, and additional columns, pivots the estimates by the column "estimate_name", and also pivots the settings.
result <- mockSummarisedResult()
result |>
  tidy() |>
  glimpse()
#> Rows: 72
#> Columns: 14
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
#> $ cdm_name <chr> "mock", "mock", "mock", "mock", "mock", "mock", "mock"…
#> $ cohort_name <chr> "cohort1", "cohort1", "cohort1", "cohort1", "cohort1",…
#> $ age_group <chr> "overall", "<40", ">=40", "<40", ">=40", "overall", "o…
#> $ sex <chr> "overall", "Male", "Male", "Female", "Female", "Male",…
#> $ variable_name <chr> "number subjects", "number subjects", "number subjects…
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ count <int> 3397666, 5378334, 1665180, 7493291, 1764428, 6818035, …
#> $ mean <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ sd <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ percentage <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ result_type <chr> "mock_summarised_result", "mock_summarised_result", "m…
#> $ package_name <chr> "visOmopResults", "visOmopResults", "visOmopResults", …
#> $ package_version <chr> "0.3.0", "0.3.0", "0.3.0", "0.3.0", "0.3.0", "0.3.0", …
Which column pairs are split can be customised with the split arguments, while pivotEstimatesBy and nameStyle control how the estimates are pivoted. If pivotEstimatesBy is NULL or character(), the estimates are left unmodified. Settings are always pivoted if present.
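As a short sketch of the case described above, the call below sets pivotEstimatesBy to NULL so that the name-level columns are split and the settings added while the estimates stay in long format. It relies on the behaviour described here for the package version used in this article (0.3.0); other versions may differ, and long_tidy is only an illustrative object name.

```r
library(visOmopResults)
library(dplyr)

# Split name-level columns and add settings, but keep estimates in long
# format (behaviour as described for version 0.3.0 of the package)
long_tidy <- mockSummarisedResult() |>
  tidy(pivotEstimatesBy = NULL)

long_tidy |>
  glimpse()
```

Keeping the estimates long can be convenient when the next step filters or groups by estimate_name rather than working with one column per estimate.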