At the surface, counting sounds pretty simple, right? You just want
to know how many occurrences of something there are. Well -
unfortunately, it’s not that easy. And in clinical reports, there’s
quite a bit of nuance that goes into the different types of frequency
tables that need to be created. Fortunately, we’ve added a good bit of
flexibility into group_count()
to help you get what you
need when creating these reports, whether you’re creating a demographics
table, adverse events, or lab results.
Let’s start with a basic example. This table demonstrates the distribution of subject disposition across treatment groups. Additionally, we’re sorting by descending total occurrences using the “Total” group.
<- tplyr_table(adsl, TRT01P, where = SAFFL == "Y") %>%
t add_total_group() %>%
add_treat_grps(Treated = c("Xanomeline Low Dose", "Xanomeline High Dose")) %>%
add_layer(
group_count(DCDECOD) %>%
set_order_count_method("bycount") %>%
set_ordering_cols(Total)
%>%
) build() %>%
arrange(desc(ord_layer_1)) %>%
select(starts_with("row"), var1_Placebo, `var1_Xanomeline Low Dose`,
`var1_Xanomeline High Dose`, var1_Treated, var1_Total)
kable(t)
row_label1 | var1_Placebo | var1_Xanomeline Low Dose | var1_Xanomeline High Dose | var1_Treated | var1_Total |
---|---|---|---|---|---|
COMPLETED | 58 ( 67.4%) | 25 ( 29.8%) | 27 ( 32.1%) | 52 ( 31.0%) | 110 ( 43.3%) |
ADVERSE EVENT | 8 ( 9.3%) | 44 ( 52.4%) | 40 ( 47.6%) | 84 ( 50.0%) | 92 ( 36.2%) |
WITHDRAWAL BY SUBJECT | 9 ( 10.5%) | 10 ( 11.9%) | 8 ( 9.5%) | 18 ( 10.7%) | 27 ( 10.6%) |
STUDY TERMINATED BY SPONSOR | 2 ( 2.3%) | 2 ( 2.4%) | 3 ( 3.6%) | 5 ( 3.0%) | 7 ( 2.8%) |
PROTOCOL VIOLATION | 2 ( 2.3%) | 1 ( 1.2%) | 3 ( 3.6%) | 4 ( 2.4%) | 6 ( 2.4%) |
LACK OF EFFICACY | 3 ( 3.5%) | 0 ( 0.0%) | 1 ( 1.2%) | 1 ( 0.6%) | 4 ( 1.6%) |
DEATH | 2 ( 2.3%) | 1 ( 1.2%) | 0 ( 0.0%) | 1 ( 0.6%) | 3 ( 1.2%) |
PHYSICIAN DECISION | 1 ( 1.2%) | 0 ( 0.0%) | 2 ( 2.4%) | 2 ( 1.2%) | 3 ( 1.2%) |
LOST TO FOLLOW-UP | 1 ( 1.2%) | 1 ( 1.2%) | 0 ( 0.0%) | 1 ( 0.6%) | 2 ( 0.8%) |
Another exceptionally important consideration within count layers is whether you should be using distinct counts, non-distinct counts, or some combination of both. Adverse event tables are a perfect example. Often, you’re concerned about how many subjects had an adverse event in particular instead of just the number of occurrences of that adverse event. Similarly, the number occurrences of an event isn’t necessarily relevant when compared to the total number of adverse events that occurred. For this reason, what you likely want to look at is instead the number of subjects who experienced an event compared to the total number of subjects in that treatment group.
‘Tplyr’ allows you to focus on these distinct counts and distinct
percents within some grouping variable, like subject. Additionally, you
can mix and match with the distinct counts with non-distinct counts in
the same row too. The set_distinct_by()
function sets the
variables used to calculate the distinct occurrences of some value using
the specified distinct_by
variables.
<- tplyr_table(adae, TRTA) %>%
t add_layer(
group_count(AEDECOD) %>%
set_distinct_by(USUBJID) %>%
set_format_strings(f_str("xxx (xx.xx%) [xxx]", distinct_n, distinct_pct, n))
%>%
) build() %>%
head()
kable(t)
row_label1 | var1_Placebo | var1_Xanomeline High Dose | var1_Xanomeline Low Dose | ord_layer_index | ord_layer_1 |
---|---|---|---|---|---|
ACTINIC KERATOSIS | 0 ( 0.00%) [ 0] | 1 ( 2.38%) [ 1] | 0 ( 0.00%) [ 0] | 1 | 1 |
ALOPECIA | 1 ( 4.76%) [ 1] | 0 ( 0.00%) [ 0] | 0 ( 0.00%) [ 0] | 1 | 2 |
BLISTER | 0 ( 0.00%) [ 0] | 1 ( 2.38%) [ 2] | 5 (11.90%) [ 8] | 1 | 3 |
COLD SWEAT | 1 ( 4.76%) [ 3] | 0 ( 0.00%) [ 0] | 0 ( 0.00%) [ 0] | 1 | 4 |
DERMATITIS ATOPIC | 1 ( 4.76%) [ 1] | 0 ( 0.00%) [ 0] | 0 ( 0.00%) [ 0] | 1 | 5 |
DERMATITIS CONTACT | 0 ( 0.00%) [ 0] | 0 ( 0.00%) [ 0] | 1 ( 2.38%) [ 2] | 1 | 6 |
You may have seen tables before like the one above. This display
shows the number of subjects who experienced an adverse event, the
percentage of subjects within the given treatment group who experienced
that event, and then the total number of occurrences of that event.
Using set_distinct_by()
triggered the derivation of
distinct_n
and distinct_pct
in addition to the
n
and pct
created within
group_count
. The display of the values is then controlled
by the f_str()
call in
set_format_strings()
.
Certain summary tables present counts within groups. One example could be in a disposition table where a disposition reason of “Other” summarizes what those other reasons were. A very common example is an Adverse Event table that displays counts for body systems, and then the events within those body systems. This is again a nuanced situation - there are two variables being summarized: The body system counts, and the advert event counts.
One way to approach this would be creating two summaries. One
summarizing the body system, and another summarizing the preferred terms
by body system, and then merging the two together. But we don’t want you
to have to do that. Instead, we handle this complexity for you. This is
done in group_count()
by submitting two target variables
with dplyr::vars()
. The first variable should be your
grouping variable that you want summarized, which we refer to as the
“Outside” variable, and the second should have the narrower scope, which
we call the “Inside” variable.
The example below demonstrates how to do a nested summary. Look at
the first row - here row_label1
and row_label2
are both “CARDIAC DISORDERS”. This line is the summary for
AEBODSYS.
In the rows below that, row_label1
continues on with the value “CARDIAC DISORDERS”, but
row_label2
changes. These are the summaries for
AEDECOD
.
tplyr_table(adae, TRTA) %>%
add_layer(
group_count(vars(AEBODSYS, AEDECOD))
%>%
) build() %>%
head() %>%
kable()
row_label1 | row_label2 | var1_Placebo | var1_Xanomeline High Dose | var1_Xanomeline Low Dose | ord_layer_index | ord_layer_1 | ord_layer_2 |
---|---|---|---|---|---|---|---|
SKIN AND SUBCUTANEOUS TISSUE DISORDERS | SKIN AND SUBCUTANEOUS TISSUE DISORDERS | 47 (100.0%) | 111 (100.0%) | 118 (100.0%) | 1 | 1 | Inf |
SKIN AND SUBCUTANEOUS TISSUE DISORDERS | ACTINIC KERATOSIS | 0 ( 0.0%) | 1 ( 0.9%) | 0 ( 0.0%) | 1 | 1 | 1 |
SKIN AND SUBCUTANEOUS TISSUE DISORDERS | ALOPECIA | 1 ( 2.1%) | 0 ( 0.0%) | 0 ( 0.0%) | 1 | 1 | 2 |
SKIN AND SUBCUTANEOUS TISSUE DISORDERS | BLISTER | 0 ( 0.0%) | 2 ( 1.8%) | 8 ( 6.8%) | 1 | 1 | 3 |
SKIN AND SUBCUTANEOUS TISSUE DISORDERS | COLD SWEAT | 3 ( 6.4%) | 0 ( 0.0%) | 0 ( 0.0%) | 1 | 1 | 4 |
SKIN AND SUBCUTANEOUS TISSUE DISORDERS | DERMATITIS ATOPIC | 1 ( 2.1%) | 0 ( 0.0%) | 0 ( 0.0%) | 1 | 1 | 5 |
This accomplishes what we needed, but it’s not exactly the presentation you might hope for. We have a solution for this as well.
tplyr_table(adae, TRTA) %>%
add_layer(
group_count(vars(AEBODSYS, AEDECOD)) %>%
set_nest_count(TRUE) %>%
set_indentation("--->")
%>%
) build() %>%
head() %>%
kable()
row_label1 | var1_Placebo | var1_Xanomeline High Dose | var1_Xanomeline Low Dose | ord_layer_index | ord_layer_1 | ord_layer_2 |
---|---|---|---|---|---|---|
SKIN AND SUBCUTANEOUS TISSUE DISORDERS | 47 (100.0%) | 111 (100.0%) | 118 (100.0%) | 1 | 1 | Inf |
—>ACTINIC KERATOSIS | 0 ( 0.0%) | 1 ( 0.9%) | 0 ( 0.0%) | 1 | 1 | 1 |
—>ALOPECIA | 1 ( 2.1%) | 0 ( 0.0%) | 0 ( 0.0%) | 1 | 1 | 2 |
—>BLISTER | 0 ( 0.0%) | 2 ( 1.8%) | 8 ( 6.8%) | 1 | 1 | 3 |
—>COLD SWEAT | 3 ( 6.4%) | 0 ( 0.0%) | 0 ( 0.0%) | 1 | 1 | 4 |
—>DERMATITIS ATOPIC | 1 ( 2.1%) | 0 ( 0.0%) | 0 ( 0.0%) | 1 | 1 | 5 |
By using set_nest_count()
, this triggers ‘Tplyr’ to drop
row_label1, and indent all of the AEDECOD values within row_label2. The
columns are renamed appropriately as well. The default indentation used
will be 3 spaces, but as you can see here - you can set the indentation
however you like. This let’s you use tab strings for different
language-specific output types, stick with spaces, indent wider or
smaller - whatever you wish. All of the existing order variables remain,
so this has no impact on your ability to sort the table.
There’s a lot more to counting! So be sure to check out our vignettes on sorting, shift tables, and denominators.