Tutorial: tbl_regression

Last Updated: September 13, 2020

Introduction

The tbl_regression() function takes a regression model object in R and returns a formatted table of regression model results that is publication-ready. It is a simple way to summarize and present your analysis results using R! Like tbl_summary(), tbl_regression() creates highly customizable analytic tables with sensible defaults.

This vignette will walk a reader through the tbl_regression() function, and the various functions available to modify and make additions to an existing formatted regression table.

Behind the scenes: tbl_regression() uses broom::tidy() to perform the initial model formatting, and can accommodate many different model types (e.g. lm(), glm(), survival::coxph(), survival::survreg(), and others are known to work with {gtsummary}). It is also possible to specify your own function to tidy the model results if needed.
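
As a minimal sketch of supplying your own tidier, the example below passes a small wrapper function through the tidy_fun= argument. The name my_tidy is hypothetical, and the sketch assumes the {broom} package is installed; the wrapper simply forwards whatever arguments tbl_regression() supplies to broom::tidy().

```r
library(gtsummary)

# Hypothetical custom tidier: forwards the model and any arguments supplied
# by tbl_regression() (e.g. confidence level, exponentiate) to broom::tidy()
my_tidy <- function(x, ...) {
  broom::tidy(x, ...)
}

# logistic regression on the trial data shipped with {gtsummary}
m <- glm(response ~ age + stage, data = trial, family = binomial)

# use the custom tidier in place of the default one
tbl_custom <- tbl_regression(m, exponentiate = TRUE, tidy_fun = my_tidy)
tbl_custom
```

A wrapper like this is mainly useful as a starting point: the body can be changed to compute different standard errors or confidence limits, as long as it returns a data frame in the shape broom::tidy() would.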

Setup

Before going through the tutorial, install and load {gtsummary}.

# install.packages("gtsummary")
library(gtsummary)

Example data set

In this vignette we’ll be using the trial data set which is included in the {gtsummary} package.

Variable   Class      Label
trt        character  Chemotherapy Treatment
age        numeric    Age
marker     numeric    Marker Level (ng/mL)
stage      factor     T Stage
grade      factor     Grade
response   integer    Tumor Response
death      integer    Patient Died
ttdeath    numeric    Months to Death/Censor

Includes a mix of continuous, dichotomous, and categorical variables.
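
The classes and labels listed above can be checked directly in R. The short sketch below assumes the labels are stored in each column's "label" attribute, which is where tbl_regression() appears to pick them up for display; the vector name vars is only for illustration.

```r
library(gtsummary)

# variables from the trial data set that are used in this vignette
vars <- c("trt", "age", "marker", "stage", "grade", "response", "death", "ttdeath")

# class of each column (first class only, for readability)
var_classes <- sapply(trial[vars], function(x) class(x)[1])
var_classes

# variable labels, read from each column's "label" attribute
var_labels <- sapply(trial[vars], function(x) attr(x, "label"))
var_labels
```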

Basic Usage

The default output from tbl_regression() is meant to be publication ready.

# build logistic regression model
m1 <- glm(response ~ age + stage, trial, family = binomial)

# view raw model results
summary(m1)$coefficients
#>                Estimate Std. Error    z value   Pr(>|z|)
#> (Intercept) -1.48622424 0.62022844 -2.3962530 0.01656365
#> age          0.01939109 0.01146813  1.6908683 0.09086195
#> stageT2     -0.54142643 0.44000267 -1.2305071 0.21850725
#> stageT3     -0.05953479 0.45042027 -0.1321761 0.89484501
#> stageT4     -0.23108633 0.44822835 -0.5155549 0.60616530
tbl_regression(m1, exponentiate = TRUE)
Characteristic  OR¹   95% CI¹     p-value
Age             1.02  1.00, 1.04  0.091
T Stage
    T1
    T2          0.58  0.24, 1.37  0.2
    T3          0.94  0.39, 2.28  0.9
    T4          0.79  0.33, 1.90  0.6
¹ OR = Odds Ratio, CI = Confidence Interval

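Note the sensible defaults in this basic usage (each can be customized later): the estimate column is headed OR because the coefficients were exponentiated, variable labels (Age, T Stage) are shown in place of variable names, the reference level of a categorical variable (T1) gets its own row, estimates and p-values are rounded, and a footnote defines the abbreviations.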

Customize Output

There are four primary ways to customize the output of the regression model table.

  1. Modify tbl_regression() function input arguments
  2. Add additional data/information to a summary table with add_*() functions
  3. Modify summary table appearance with {gtsummary} functions
  4. Modify table appearance with {gt} package functions

Modifying function arguments

The tbl_regression() function includes many arguments for modifying the appearance.

Argument          Description
label=            modify variable labels in table
exponentiate=     exponentiate model coefficients
include=          names of variables to include in output. Default is all variables
show_single_row=  By default, categorical variables are printed on multiple rows. If a variable is dichotomous and you wish to print the regression coefficient on a single row, include the variable name(s) here.
conf.level=       confidence level of confidence interval
intercept=        indicates whether to include the intercept
estimate_fun=     function to round and format coefficient estimates
pvalue_fun=       function to round and format p-values
tidy_fun=         function to specify/customize tidier function
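
To make the arguments above concrete, here is a small sketch that combines several of them in one call. The model m_args, the label text "Patient Age", and the 90% confidence level are illustrative choices, not recommendations.

```r
library(gtsummary)

# logistic regression with a dichotomous predictor (trt) and a factor (grade)
m_args <- glm(response ~ age + trt + grade, data = trial, family = binomial)

tbl_args <- tbl_regression(
  m_args,
  exponentiate = TRUE,                    # report odds ratios
  label = list(age ~ "Patient Age"),      # hypothetical replacement label
  include = c("age", "trt"),              # leave grade out of the table
  show_single_row = "trt",                # print dichotomous trt on one row
  conf.level = 0.90,                      # 90% confidence intervals
  pvalue_fun = function(x) style_pvalue(x, digits = 2)
)
tbl_args
```

Note that include= only controls which variables are displayed; grade is still part of the fitted model, so the reported estimates remain adjusted for it.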

{gtsummary} functions to add information

The {gtsummary} package has built-in functions for adding to results from tbl_regression(). The following functions add columns and/or information to the regression table.

Function                  Description
add_global_p()            adds the global p-value for categorical variables
add_glance_source_note()  adds statistics from broom::glance() as source note
add_vif()                 adds column of the variance inflation factors (VIF)
add_q()                   adds a column of q-values to control for multiple comparisons
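
A brief sketch of chaining some of these functions is shown below. It assumes the {car} package is installed, which add_global_p() is believed to rely on for its tests, and that {broom} is available for the source note.

```r
library(gtsummary)

m1 <- glm(response ~ age + stage, data = trial, family = binomial)

tbl_added <- tbl_regression(m1, exponentiate = TRUE) %>%
  add_global_p() %>%            # one global p-value per variable
  add_q() %>%                   # q-values adjusting for multiple testing
  add_glance_source_note()      # model statistics from broom::glance()
tbl_added
```

Calling add_q() after add_global_p() means the adjustment is applied to the global p-values rather than to the per-level p-values.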

{gtsummary} functions to format table

The {gtsummary} package comes with functions specifically made to modify and format summary tables.

Function                  Description
modify_header()           update column headers
modify_footnote()         update column footnote
modify_spanning_header()  update spanning headers
modify_caption()          update table caption/title
bold_labels()             bold variable labels
bold_levels()             bold variable levels
italicize_labels()        italicize variable labels
italicize_levels()        italicize variable levels
bold_p()                  bold significant p-values
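
The sketch below applies a few of these formatting functions to a regression table. The header and caption text are placeholders chosen for illustration.

```r
library(gtsummary)

m1 <- glm(response ~ age + stage, data = trial, family = binomial)

tbl_fmt <- tbl_regression(m1, exponentiate = TRUE) %>%
  modify_header(label = "**Variable**") %>%                        # rename first column
  modify_caption("**Logistic regression of tumor response**") %>%  # placeholder caption
  bold_labels() %>%                                                # bold variable labels
  italicize_levels()                                               # italicize factor levels
tbl_fmt
```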

{gt} functions to format table

The {gt} package is packed with many great functions for modifying table output—too many to list here. Review the package’s website for a full listing.

To use the {gt} package functions with {gtsummary} tables, the regression table must first be converted into a {gt} object. To this end, use the as_gt() function after modifications have been completed with {gtsummary} functions.

m1 %>%
  tbl_regression(exponentiate = TRUE) %>%
  as_gt() %>%
  gt::tab_source_note(gt::md("*This data is simulated*"))
Characteristic  OR¹   95% CI¹     p-value
Age             1.02  1.00, 1.04  0.091
T Stage
    T1
    T2          0.58  0.24, 1.37  0.2
    T3          0.94  0.39, 2.28  0.9
    T4          0.79  0.33, 1.90  0.6
This data is simulated
¹ OR = Odds Ratio, CI = Confidence Interval

Example

There are formatting options available, such as adding bold and italics to text. In the example below,
- Coefficients are exponentiated to give odds ratios
- Global p-values for Stage are reported
- Large p-values are rounded to two decimal places
- P-values less than 0.10 are bold
- Variable labels are bold
- Variable levels are italicized

# format results into data frame with global p-values
m1 %>%
  tbl_regression(
    exponentiate = TRUE,
    pvalue_fun = ~ style_pvalue(.x, digits = 2)
  ) %>%
  add_global_p() %>%
  bold_p(t = 0.10) %>%
  bold_labels() %>%
  italicize_levels()
Characteristic  OR¹   95% CI¹     p-value
Age             1.02  1.00, 1.04  0.087
T Stage                           0.62
    T1
    T2          0.58  0.24, 1.37
    T3          0.94  0.39, 2.28
    T4          0.79  0.33, 1.90
¹ OR = Odds Ratio, CI = Confidence Interval

Univariate Regression

The tbl_uvregression() function produces a table of univariate regression models. The function is a wrapper for tbl_regression(), and as a result, accepts nearly identical function arguments. The function’s results can be modified in similar ways to tbl_regression().

trial %>%
  select(response, age, grade) %>%
  tbl_uvregression(
    method = glm,
    y = response,
    method.args = list(family = binomial),
    exponentiate = TRUE,
    pvalue_fun = ~ style_pvalue(.x, digits = 2)
  ) %>%
  add_global_p() %>% # add global p-value
  add_nevent() %>% # add number of events of the outcome
  add_q() %>% # adjusts global p-values for multiple testing
  bold_p() %>% # bold p-values under a given threshold (default 0.05)
  bold_p(t = 0.10, q = TRUE) %>% # now bold q-values under the threshold of 0.10
  bold_labels()
#> add_q: Adjusting p-values with
#> `stats::p.adjust(x$table_body$p.value, method = "fdr")`
Characteristic  N    Event N  OR¹   95% CI¹     p-value  q-value²
Age             183  58       1.02  1.00, 1.04  0.091    0.18
Grade           193  61                         0.93     0.93
    I
    II                        0.95  0.45, 2.00
    III                       1.10  0.52, 2.29
¹ OR = Odds Ratio, CI = Confidence Interval
² False discovery rate correction for multiple testing
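
The same pattern carries over to other model types. As a sketch, the example below fits univariate Cox proportional hazards models, assuming the {survival} package is installed; the outcome is given as a Surv() object in the y= argument, and every remaining column in the data frame is treated as a covariate.

```r
library(gtsummary)
library(survival)

# keep only the outcome columns and the covariates of interest
tbl_uv_cox <- tbl_uvregression(
  trial[c("ttdeath", "death", "age", "grade")],
  method = coxph,                  # model fitting function
  y = Surv(ttdeath, death),        # time-to-event outcome
  exponentiate = TRUE              # report hazard ratios
)
tbl_uv_cox
```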

Setting Default Options

The {gtsummary} regression functions and their related functions have sensible defaults for rounding and formatting results. If you would like to change these defaults, use the {gtsummary} themes function set_gtsummary_theme(). The package includes pre-specified themes, and you can also create your own. Themes control baseline behavior such as how p-values and coefficients are rounded, default headers, and confidence levels. For details on creating a theme and setting personal defaults, visit the themes vignette.
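
As a rough sketch of a personal theme, the example below builds a named list and passes it to set_gtsummary_theme(). The element names "pkgwide-str:theme_name" and "pkgwide-fn:pvalue_fun" are taken from memory of the themes vignette and should be checked against it; the theme name itself is made up.

```r
library(gtsummary)

# Hypothetical personal theme: a named list of theme elements
my_theme <- list(
  "pkgwide-str:theme_name" = "My personal defaults",
  # round large p-values to two decimal places everywhere in the package
  "pkgwide-fn:pvalue_fun" = function(x) style_pvalue(x, digits = 2)
)
set_gtsummary_theme(my_theme)

# tables built while the theme is active use the new defaults
m1 <- glm(response ~ age + stage, data = trial, family = binomial)
tbl_themed <- tbl_regression(m1, exponentiate = TRUE)
tbl_themed

# restore the package defaults
reset_gtsummary_theme()
```

Because a theme stays active for the rest of the session, calling reset_gtsummary_theme() afterwards avoids surprising changes in later tables.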

Supported Models

Below is a listing of known and tested models supported by tbl_regression(). If a model follows a standard format and has a tidier, it’s likely to be supported as well, even if not listed below.

Model                  Details
biglm::bigglm()
biglmm::bigglm()
brms::brm()            broom.mixed package required
cmprsk::crr()          Limited support. It is recommended to use tidycmprsk::crr() instead.
fixest::feglm()        May fail with R <= 4.0.
fixest::femlm()        May fail with R <= 4.0.
fixest::feNmlm()       May fail with R <= 4.0.
fixest::feols()        May fail with R <= 4.0.
gam::gam()
geepack::geeglm()
glmmTMB::glmmTMB()     broom.mixed package required
lavaan::lavaan()       Limited support for categorical variables
lfe::felm()
lme4::glmer.nb()       broom.mixed package required
lme4::glmer()          broom.mixed package required
lme4::lmer()           broom.mixed package required
logitr::logitr()       Requires logitr >= 0.8.0
MASS::glm.nb()
MASS::polr()
mgcv::gam()            Use default tidier broom::tidy() for smooth terms only, or gtsummary::tidy_gam() to include parametric terms
mice::mira             Limited support. If mod is a mira object, use tidy_plus_plus(mod, tidy_fun = function(x, ...) mice::pool(x) %>% mice::tidy(...))
multgee::nomLORgee()   Experimental support. Use tidy_multgee() as tidy_fun.
multgee::ordLORgee()   Experimental support. Use tidy_multgee() as tidy_fun.
nnet::multinom()
ordinal::clm()         Limited support for models with nominal predictors.
ordinal::clmm()        Limited support for models with nominal predictors.
parsnip::model_fit     Supported as long as the type of model and the engine is supported.
plm::plm()
rstanarm::stan_glm()   broom.mixed package required
stats::aov()           Reference rows are not relevant for such models.
stats::glm()
stats::lm()
stats::nls()           Limited support
survey::svycoxph()
survey::svyglm()
survey::svyolr()
survival::clogit()
survival::coxph()
survival::survreg()
tidycmprsk::crr()
VGAM::vglm()           Limited support. It is recommended to use tidy_parameters() as tidy_fun.
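
Models in the listing are generally passed to tbl_regression() in the same way as the logistic model used earlier. As a small sketch with stats::lm(), the example below fits a linear model on the trial data; the object names m_lm and tbl_lm are arbitrary.

```r
library(gtsummary)

# linear regression of marker level on age and grade
m_lm <- lm(marker ~ age + grade, data = trial)

# no exponentiation for a linear model; coefficients are shown as estimated
tbl_lm <- tbl_regression(m_lm)
tbl_lm
```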