table_categorical() builds publication-ready categorical
tables suitable for APA-style reporting in social science and data
science research. With by, it produces grouped
cross-tabulation tables with chi-squared \(p\)-values, effect sizes, confidence
intervals, and multi-level headers. Without by, it produces
one-way frequency-style tables for the selected variables. Export to gt,
tinytable, flextable, Excel, or Word. This vignette walks through the
main features.
For grouped tables, provide a data frame, one or more selected variables, and a grouping variable:
table_categorical(
sochealth,
select = c(smoking, physical_activity, dentist_12m),
by = education
)
#> Categorical table by education
#>
#> Variable │ Lower secondary n Lower secondary % Upper secondary n
#> ───────────────────┼─────────────────────────────────────────────────────────
#> smoking │
#> No │ 179 69.6 415
#> Yes │ 78 30.4 112
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │
#> No │ 177 67.8 310
#> Yes │ 84 32.2 229
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> dentist_12m │
#> No │ 113 43.3 174
#> Yes │ 148 56.7 365
#>
#> Variable │ Upper secondary % Tertiary n Tertiary % Total n
#> ───────────────────┼────────────────────────────────────────────────────
#> smoking │
#> No │ 78.7 332 84.9 926
#> Yes │ 21.3 59 15.1 249
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │
#> No │ 57.5 163 40.8 650
#> Yes │ 42.5 237 59.2 550
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> dentist_12m │
#> No │ 32.3 67 16.8 354
#> Yes │ 67.7 333 83.2 846
#>
#> Variable │ Total % p Cramer's V
#> ───────────────────┼────────────────────────────
#> smoking │ <.001 .14
#> No │ 78.8
#> Yes │ 21.2
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │ <.001 .21
#> No │ 54.2
#> Yes │ 45.8
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> dentist_12m │ <.001 .22
#> No │ 29.5
#> Yes │ 70.5
The output argument defaults to "default", which prints a styled
ASCII table to the console. Use output = "data.frame" to
get a plain numeric data frame suitable for further processing.
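As a cross-check, the p and Cramer's V values for smoking by education can be recomputed in base R from the counts printed above. This is a minimal sketch, not the package's internal code: it assumes an unweighted Pearson chi-squared test without continuity correction and the usual formula V = sqrt(chi2 / (n * (min(rows, cols) - 1))).

```r
# Counts copied from the printed table (smoking by education).
counts <- matrix(
  c(179, 415, 332,
     78, 112,  59),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    smoking = c("No", "Yes"),
    education = c("Lower secondary", "Upper secondary", "Tertiary")
  )
)

# Pearson chi-squared test without continuity correction.
test <- chisq.test(counts, correct = FALSE)

# Cramer's V derived from the chi-squared statistic.
n <- sum(counts)
cramers_v <- sqrt(unname(test$statistic) / (n * (min(dim(counts)) - 1)))

# Column percentages, comparable to the "%" columns of the table.
col_pct <- round(100 * prop.table(counts, margin = 2), 1)

test$p.value
cramers_v
col_pct
```

Under these assumptions the results should agree, up to rounding, with the p and Cramer's V columns that table_categorical() reports for smoking by education.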
Omit by to build a frequency-style table for the
selected variables:
table_categorical(
sochealth,
select = c(smoking, physical_activity),
output = "default"
)
#> Categorical table
#>
#> Variable │ n %
#> ─────────────────────┼───────────────
#> smoking │
#> No │ 926 78.8
#> Yes │ 249 21.2
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │
#> No │ 650 54.2
#> Yes │ 550 45.8
table_categorical() supports several output formats. The
table below summarizes the options:
| Format | Description |
|---|---|
| "default" | Styled ASCII table in the console (default) |
| "data.frame" | Wide data frame, one row per modality |
| "long" | Long data frame, one row per modality x group |
| "gt" | Formatted gt table |
| "tinytable" | Formatted tinytable |
| "flextable" | Formatted flextable |
| "excel" | Excel file (requires excel_path) |
| "clipboard" | Copy to clipboard |
| "word" | Word document (requires word_path) |
The "gt" format produces a table with APA-style borders,
column spanners, and proper alignment:
Use output = "data.frame" for a wide numeric data frame
(one row per modality), or output = "long" for a long
format (one row per modality x group):
table_categorical(
sochealth,
select = smoking,
by = education,
output = "data.frame"
)
#> Variable Level Lower secondary n Lower secondary % Upper secondary n
#> 1 smoking No 179 69.6 415
#> 2 smoking Yes 78 30.4 112
#> Upper secondary % Tertiary n Tertiary % Total n Total % p
#> 1 78.7 332 84.9 926 78.8 2.012877e-05
#> 2 21.3 59 15.1 249 21.2 2.012877e-05
#> Cramer's V
#> 1 0.1356677
#> 2 0.1356677
By default, table_categorical() uses variable names as
row headers. Use the labels argument to provide
human-readable labels. Two forms are accepted (matching
table_continuous() and
table_continuous_lm()):
data – the recommended form. Only listed columns are
relabelled; others fall back to the column name.
select – the legacy spicy < 0.11.0 form, kept for
backward compatibility.
table_categorical() picks the association measure per
row variable based on the variable type
(assoc_measure = "auto", the default):
by) -> phi, tau_b, V.
When the chosen measures differ across rows, the column header
collapses to "Effect size" and an APA-style
Note. line documents which measure was used for each
variable.
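For a 2 x 2 table, phi is the square root of chi-squared divided by n, which is also what Cramer's V reduces to in that case. The sketch below recomputes it in base R from the smoking by sex counts printed later in this vignette. It illustrates the statistic only and makes no claim about how the package computes it; it assumes a Pearson chi-squared test without Yates' continuity correction.

```r
# Counts copied from the smoking by sex table in this vignette.
counts <- matrix(
  c(475, 451,
    131, 118),
  nrow = 2, byrow = TRUE,
  dimnames = list(smoking = c("No", "Yes"), sex = c("Female", "Male"))
)

# Pearson chi-squared without Yates' continuity correction.
chi2 <- unname(chisq.test(counts, correct = FALSE)$statistic)

# Phi for a 2 x 2 table (unsigned, because it is derived from chi-squared).
phi <- sqrt(chi2 / sum(counts))

round(chi2, 3)
round(phi, 4)
```

Under these assumptions the values should be close to the statistic and Phi reported for smoking in the broom::glance() output further down.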
Override with a single string for uniform application, or with a named vector to mix measures per row:
# Uniform: same measure for every row variable
table_categorical(
sochealth,
select = smoking,
by = education,
assoc_measure = "lambda",
output = "tinytable"
)
# Per-row: pick the right measure for each variable.
# `smoking` x `education` is 2x3 (binary x ordered) -> Cramer's V;
# `self_rated_health` x `education` is ordered x ordered -> Tau-b.
# The mixed result collapses the header to "Effect size" and adds an
# APA `Note.` line documenting the per-row measure.
table_categorical(
sochealth,
select = c(smoking, self_rated_health),
by = education,
assoc_measure = c(
smoking = "cramer_v",
self_rated_health = "tau_b"
),
output = "tinytable"
)
Add confidence intervals with assoc_ci = TRUE. In
rendered formats (gt, tinytable,
flextable, word), the CI is shown inline:
pkgdown_dark_gt(
table_categorical(
sochealth,
select = c(smoking, physical_activity),
by = education,
assoc_ci = TRUE,
output = "gt"
)
)
In data formats ("data.frame", "long",
"excel", "clipboard"), separate
CI lower and CI upper columns are added:
table_categorical(
sochealth,
select = smoking,
by = education,
assoc_ci = TRUE,
output = "data.frame"
)
#> Variable Level Lower secondary n Lower secondary % Upper secondary n
#> 1 smoking No 179 69.6 415
#> 2 smoking Yes 78 30.4 112
#> Upper secondary % Tertiary n Tertiary % Total n Total % p
#> 1 78.7 332 84.9 926 78.8 2.012877e-05
#> 2 21.3 59 15.1 249 21.2 2.012877e-05
#> Cramer's V CI lower CI upper
#> 1 0.1356677 0.07909264 0.1913716
#> 2 0.1356677 0.07909264 0.1913716
Pass survey weights with the weights argument. Use
rescale = TRUE so the total weighted N matches the
unweighted N:
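To illustrate what rescaling means, here is a small base R sketch. It assumes the common convention of multiplying each weight by n / sum(weights) so that the weighted total equals the number of rows; this is an assumption for illustration, not a description of the package internals, and the weights below are made up.

```r
# Hypothetical survey weights for six respondents (illustrative values only).
w <- c(0.5, 1.2, 2.0, 0.8, 1.5, 3.0)
smoker <- c("No", "Yes", "No", "No", "Yes", "No")

# Rescale so the weights sum to the unweighted sample size.
w_rescaled <- w * length(w) / sum(w)

# Weighted counts before and after rescaling.
raw_counts <- tapply(w, smoker, sum)
rescaled_counts <- tapply(w_rescaled, smoker, sum)

# The weighted total now equals the number of rows,
# while the weighted proportions are unchanged.
sum(w_rescaled)
raw_counts / sum(raw_counts)
rescaled_counts / sum(rescaled_counts)
```

Rescaling of this kind changes the reported n values but not the percentages, since every weight is multiplied by the same constant.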
By default, rows with missing values are dropped
(drop_na = TRUE). Set drop_na = FALSE to
display them as a “(Missing)” category:
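The difference between the two settings can be pictured with base R's table(). This sketch uses a toy vector and only mimics the idea of an explicit "(Missing)" category; it does not call table_categorical().

```r
# Toy vector with missing answers (illustrative values only).
smoking <- c("No", "Yes", NA, "No", NA, "Yes", "No")

# Missing values dropped, comparable in spirit to drop_na = TRUE.
dropped <- table(smoking)

# Missing values kept as an explicit category, comparable to drop_na = FALSE.
with_missing <- table(
  factor(
    ifelse(is.na(smoking), "(Missing)", smoking),
    levels = c("No", "Yes", "(Missing)")
  )
)

dropped
with_missing
```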
Use levels_keep to display only specific modalities. The
order you specify controls the display order, which is useful for
placing “(Missing)” first to highlight missingness:
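As a rough picture of selecting and reordering modalities, the base R sketch below subsets a named vector of toy counts in a chosen order. The vector keep is a hypothetical stand-in for the value passed to levels_keep; whether percentages are recomputed after filtering is not covered here.

```r
# Toy counts per modality (illustrative values only).
counts <- c(No = 3, Yes = 2, "(Missing)" = 2)

# Keep two modalities and put "(Missing)" first.
keep <- c("(Missing)", "Yes")
kept <- counts[keep]

kept
```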
Control the number of digits for percentages, p-values, and the association measure:
pkgdown_dark_gt(
table_categorical(
sochealth,
select = smoking,
by = education,
percent_digits = 2,
p_digits = 4,
v_digits = 3,
output = "gt"
)
)
p_digits drives both the displayed precision of the
p column and the small-p threshold
(p_digits = 3 -> <.001,
p_digits = 4 -> <.0001), matching
table_continuous() and
table_continuous_lm().
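The rule described above can be sketched in base R. format_p() is a hypothetical helper written for this illustration, not a spicy function; it assumes APA-style formatting without a leading zero and a threshold of 10^(-digits).

```r
# Format p-values APA-style: no leading zero, with a "<" floor label
# for values below 10^(-digits).
format_p <- function(p, digits = 3) {
  threshold <- 10^(-digits)
  shown <- sub("^0", "", formatC(p, format = "f", digits = digits))
  floor_label <- paste0(
    "<",
    sub("^0", "", formatC(threshold, format = "f", digits = digits))
  )
  ifelse(p < threshold, floor_label, shown)
}

format_p(c(2.012877e-05, 0.713), digits = 3)
format_p(2.012877e-05, digits = 4)
```

With digits = 3 the small p-value from the smoking by education example falls below the threshold and is shown as a floor label, while .713 is printed in full.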
By default (align = "decimal") numeric columns are
aligned on the decimal mark, the standard scientific-publication
convention used by SPSS, SAS, LaTeX siunitx, and the native
primitives of gt::cols_align_decimal() and
tinytable::style_tt(align = "d"). Engines without a native
primitive (flextable, word,
clipboard, ASCII print) get the alignment via leading /
trailing space padding, with flextable / word
switching the body font to Consolas so character widths
match.
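The space-padding approach can be illustrated with a short base R sketch. align_decimal() is a hypothetical helper, not part of spicy, and it assumes a monospaced font and "." as the decimal mark.

```r
# Pad strings with spaces so that their decimal marks line up.
align_decimal <- function(x) {
  parts <- strsplit(x, ".", fixed = TRUE)
  # Text before and after the decimal mark ("" when absent).
  int_part <- vapply(
    parts, function(p) if (length(p) >= 1) p[1] else "", character(1)
  )
  frac_part <- vapply(
    parts, function(p) if (length(p) >= 2) p[2] else "", character(1)
  )
  has_mark <- grepl(".", x, fixed = TRUE)

  # Leading spaces align the integer parts, trailing spaces the fractions.
  left_pad <- strrep(" ", max(nchar(int_part)) - nchar(int_part))
  right_pad <- strrep(" ", max(nchar(frac_part)) - nchar(frac_part))

  paste0(left_pad, int_part, ifelse(has_mark, ".", " "), frac_part, right_pad)
}

cells <- c("926", "78.8", ".14", "<.001")
aligned <- align_decimal(cells)
writeLines(aligned)
```

Every padded string has the same width, so the decimal marks only line up when each character occupies the same horizontal space, which is why a fixed-width font matters for this fallback.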
Pass align = "auto" to revert to the legacy uniform
right-alignment used in spicy < 0.11.0:
table_categorical(
sochealth,
select = c(smoking, physical_activity),
by = sex,
align = "auto"
)
#> Categorical table by sex
#>
#> Variable │ Female n Female % Male n Male % Total n Total % p
#> ───────────────────┼────────────────────────────────────────────────────────────
#> smoking │ .713
#> No │ 475 78.4 451 79.3 926 78.8
#> Yes │ 131 21.6 118 20.7 249 21.2
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │ .832
#> No │ 334 53.9 316 54.5 650 54.2
#> Yes │ 286 46.1 264 45.5 550 45.8
#>
#> Variable │ Phi
#> ───────────────────┼─────
#> smoking │ .01
#> No │
#> Yes │
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌
#> physical_activity │ .01
#> No │
#> Yes │
"center" and "right" apply literal
alignment.
table_categorical() returns an object that can be
coerced to a plain data.frame / tbl_df
(stripping the spicy formatting attributes) or piped into
broom::tidy() / broom::glance() for any
downstream tidyverse-stats workflow:
out <- table_categorical(
sochealth,
select = c(smoking, physical_activity),
by = sex
)
#> Categorical table by sex
#>
#> Variable │ Female n Female % Male n Male % Total n Total % p
#> ───────────────────┼────────────────────────────────────────────────────────────
#> smoking │ .713
#> No │ 475 78.4 451 79.3 926 78.8
#> Yes │ 131 21.6 118 20.7 249 21.2
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
#> physical_activity │ .832
#> No │ 334 53.9 316 54.5 650 54.2
#> Yes │ 286 46.1 264 45.5 550 45.8
#>
#> Variable │ Phi
#> ───────────────────┼─────
#> smoking │ .01
#> No │
#> Yes │
#> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌
#> physical_activity │ .01
#> No │
#> Yes │
# One row per (variable x level x group) with broom-style columns
# (outcome, level, group, n, proportion). The synthetic Total
# margin is excluded so each observation is counted once.
broom::tidy(out)
#> # A tibble: 8 × 5
#> outcome level group n proportion
#> <chr> <chr> <chr> <int> <dbl>
#> 1 smoking No Female 475 0.784
#> 2 smoking No Male 451 0.793
#> 3 smoking Yes Female 131 0.216
#> 4 smoking Yes Male 118 0.207
#> 5 physical_activity No Female 334 0.539
#> 6 physical_activity No Male 316 0.545
#> 7 physical_activity Yes Female 286 0.461
#> 8 physical_activity Yes Male 264 0.455
# One row per outcome with the omnibus chi-squared test and the
# chosen association measure (test_type, statistic, df, p.value,
# assoc_type, assoc_value, assoc_ci_lower / assoc_ci_upper, n_total).
broom::glance(out)
#> # A tibble: 2 × 10
#> outcome test_type statistic df p.value assoc_type assoc_value
#> <chr> <chr> <dbl> <int> <dbl> <chr> <dbl>
#> 1 physical_activity chi_squared 0.0452 1 0.832 Phi 0.00614
#> 2 smoking chi_squared 0.136 1 0.713 Phi 0.0107
#> # ℹ 3 more variables: assoc_ci_lower <dbl>, assoc_ci_upper <dbl>, n_total <int>
For Excel export, provide a file path:
table_categorical(
sochealth,
select = c(smoking, physical_activity, dentist_12m),
by = education,
output = "excel",
excel_path = "my_table.xlsx"
)
For Word, use output = "word":
table_categorical(
sochealth,
select = c(smoking, physical_activity, dentist_12m),
by = education,
output = "word",
word_path = "my_table.docx"
)
You can also copy directly to the clipboard with output = "clipboard", for pasting into a spreadsheet or a text editor.