The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Regression tables with huxreg

Huxtable includes the function huxreg to build a table of regressions.

You call huxreg with a list of models. These models can be of any class which has a tidy method defined in the broom package. The method should return a list of regression coefficients with names term, estimate, std.error and p.value. That covers most standard regression packages.

Let’s start by running some regressions to predict a diamond’s price.

data(diamonds, package = "ggplot2")
diamonds <- diamonds[1:100,]

lm1 <- lm(price ~ carat + depth, diamonds)
lm2 <- lm(price ~ depth + factor(color, ordered = FALSE), diamonds)
lm3 <- lm(log(price) ~ carat + depth, diamonds)

Now, we use huxreg to display the regression output side by side.

huxreg(lm1, lm2, lm3)
(1)(2)(3)
(Intercept)981.607    900.067 6.269 ***
(720.175)   (2431.815)(0.782)   
carat4328.324 ***     3.531 ***
(136.755)        (0.149)   
depth-27.785 *  -6.804 -0.019    
(11.656)   (39.293)(0.013)   
factor(color, ordered = FALSE)E        449.490         
        (239.388)        
factor(color, ordered = FALSE)F        391.705         
        (290.880)        
factor(color, ordered = FALSE)G        583.111         
        (308.513)        
factor(color, ordered = FALSE)H        126.916         
        (256.367)        
factor(color, ordered = FALSE)I        -47.220         
        (253.092)        
factor(color, ordered = FALSE)J        -123.430         
        (269.157)        
N100        100     100        
R20.912    0.123 0.854    
logLik-675.703    -790.788 6.822    
AIC1359.405    1599.576 -5.644    
*** p < 0.001; ** p < 0.01; * p < 0.05.

The basic output includes estimates, standard errors and summary statistics.

Some of those variable names are hard to read. We can change them by providing a named vector of variables in the coefs argument.

color_names <- grep("factor", names(coef(lm2)), value = TRUE)
names(color_names) <- gsub(".*)(.)", "Color: \\1", color_names)

huxreg(lm1, lm2, lm3, coefs = c("Carat" = "carat", "Depth" = "depth", color_names))
(1)(2)(3)
Carat4328.324 ***     3.531 ***
(136.755)        (0.149)   
Depth-27.785 *  -6.804 -0.019    
(11.656)   (39.293)(0.013)   
Color: E        449.490         
        (239.388)        
Color: F        391.705         
        (290.880)        
Color: G        583.111         
        (308.513)        
Color: H        126.916         
        (256.367)        
Color: I        -47.220         
        (253.092)        
Color: J        -123.430         
        (269.157)        
N100        100     100        
R20.912    0.123 0.854    
logLik-675.703    -790.788 6.822    
AIC1359.405    1599.576 -5.644    
*** p < 0.001; ** p < 0.01; * p < 0.05.

Or, since the output from huxreg is just a huxtable, we could just edit its contents directly.

diamond_regs <- huxreg(lm1, lm2, lm3)
diamond_regs[seq(8, 18, 2), 1] <- paste("Color:", LETTERS[5:10])

# prints the same as above

Of course, we aren’t limited to just changing names. We can also make our table prettier. Let’s put our footnote in italic, add a caption, and highlight the cell background of significant coefficients. All of these are just standard huxtable commands.

suppressPackageStartupMessages(library(dplyr))

diamond_regs |> 
      map_background_color(-1, -1, by_regex(
        "\\*" = "yellow"
      )) |> 
      set_italic(final(1), 1) |> 
      set_caption("Linear regressions of diamond prices")
Linear regressions of diamond prices
(1)(2)(3)
(Intercept)981.607    900.067 6.269 ***
(720.175)   (2431.815)(0.782)   
carat4328.324 ***     3.531 ***
(136.755)        (0.149)   
depth-27.785 *  -6.804 -0.019    
(11.656)   (39.293)(0.013)   
Color: E        449.490         
        (239.388)        
Color: F        391.705         
        (290.880)        
Color: G        583.111         
        (308.513)        
Color: H        126.916         
        (256.367)        
Color: I        -47.220         
        (253.092)        
Color: J        -123.430         
        (269.157)        
N100        100     100        
R20.912    0.123 0.854    
logLik-675.703    -790.788 6.822    
AIC1359.405    1599.576 -5.644    
*** p < 0.001; ** p < 0.01; * p < 0.05.

By default, standard errors are shown below coefficient estimates. To display them in a column to the right, use error_pos = "right":

huxreg(lm1, lm3, error_pos = "right")
(1)(2)
(Intercept)981.607    (720.175)6.269 ***(0.782)
carat4328.324 ***(136.755)3.531 ***(0.149)
depth-27.785 *  (11.656)-0.019    (0.013)
N100             100             
R20.912         0.854         
logLik-675.703         6.822         
AIC1359.405         -5.644         
*** p < 0.001; ** p < 0.01; * p < 0.05.

This will give column headings a column span of 2.

To display standard errors in the same cell as estimates, use error_pos = "same":

huxreg(lm1, lm3, error_pos = "same")
(1)(2)
(Intercept)981.607 (720.175)6.269 *** (0.782)
carat4328.324 *** (136.755)3.531 *** (0.149)
depth-27.785 * (11.656)-0.019 (0.013)
N100     100     
R20.912 0.854 
logLik-675.703 6.822 
AIC1359.405 -5.644 
*** p < 0.001; ** p < 0.01; * p < 0.05.

You can change the default column headings by naming the model arguments:

huxreg("Price" = lm1, "Log price" = lm3)
PriceLog price
(Intercept)981.607    6.269 ***
(720.175)   (0.782)   
carat4328.324 ***3.531 ***
(136.755)   (0.149)   
depth-27.785 *  -0.019    
(11.656)   (0.013)   
N100        100        
R20.912    0.854    
logLik-675.703    6.822    
AIC1359.405    -5.644    
*** p < 0.001; ** p < 0.01; * p < 0.05.

To display a particular row of summary statistics, use the statistics parameter. This should be a character vector. Valid values are anything returned from your models by broom::glance:

gl <- as_hux(broom::glance(lm1))

gl |> 
      restack_down(cols = 3, on_remainder = "fill") |> 
      set_bold(odds, everywhere)
r.squaredadj.r.squaredsigma
0.912   0.91    211       
statisticp.valuedf
504       5.65e-522       
logLikAICBIC
-676       1.36e+031.37e+03
deviancedf.residualnobs
4.33e+0697100

Another value you can use is "nobs", which returns the number of observations from the regression. If the statistics vector has names, these will be used for row headings:

huxreg(lm1, lm3, statistics = c("N. obs." = "nobs", 
      "R squared" = "r.squared", "F statistic" = "statistic",
      "P value" = "p.value"))
(1)(2)
(Intercept)981.607    6.269 ***
(720.175)   (0.782)   
carat4328.324 ***3.531 ***
(136.755)   (0.149)   
depth-27.785 *  -0.019    
(11.656)   (0.013)   
N. obs.100        100        
R squared0.912    0.854    
F statistic504.082    283.881    
P value0.000    0.000    
*** p < 0.001; ** p < 0.01; * p < 0.05.

By default, huxreg displays significance stars. You can alter the symbols used and significance levels with the stars parameter, or set stars = NULL to turn off significance stars completely.

huxreg(lm1, lm3, stars = c(`*` = 0.1, `**` = 0.05, `***` = 0.01)) # a little boastful?
(1)(2)
(Intercept)981.607    6.269 ***
(720.175)   (0.782)   
carat4328.324 ***3.531 ***
(136.755)   (0.149)   
depth-27.785 ** -0.019    
(11.656)   (0.013)   
N100        100        
R20.912    0.854    
logLik-675.703    6.822    
AIC1359.405    -5.644    
*** p < 0.01; ** p < 0.05; * p < 0.1.

You aren’t limited to displaying standard errors of the estimates. If you prefer, you can display t statistics or p values, using the error_format option. Any column from tidy can be used by putting it in curly brackets:

# Another useful column: p.value
huxreg(
        lm1, lm3, 
        error_format = "[{statistic}]", 
        note         = "{stars}. T statistics in brackets."
      )
(1)(2)
(Intercept)981.607    6.269 ***
[1.363]   [8.016]   
carat4328.324 ***3.531 ***
[31.650]   [23.773]   
depth-27.785 *  -0.019    
[-2.384]   [-1.499]   
N100        100        
R20.912    0.854    
logLik-675.703    6.822    
AIC1359.405    -5.644    
*** p < 0.001; ** p < 0.01; * p < 0.05. T statistics in brackets.

Here we also changed the footnote, using note. If note contains the string "{stars}" it will be replaced by a description of the significance stars used. If you don’t want a footnote, just set note = NULL.

Alternatively, you can display confidence intervals. Use ci_level to set the confidence level for the interval, then use {conf.low} and {conf.high} in error_format:

huxreg(lm1, lm3, ci_level = .99, error_format = "({conf.low} -- {conf.high})")
(1)(2)
(Intercept)981.607    6.269 ***
(-910.629 -- 2873.844)   (4.214 -- 8.324)   
carat4328.324 ***3.531 ***
(3969.004 -- 4687.643)   (3.140 -- 3.921)   
depth-27.785 *  -0.019    
(-58.411 -- 2.842)   (-0.052 -- 0.014)   
N100        100        
R20.912    0.854    
logLik-675.703    6.822    
AIC1359.405    -5.644    
*** p < 0.001; ** p < 0.01; * p < 0.05.

To change number formatting, set the number_format parameter. This works the same as the number_format property for a huxtable - if it is numeric, numbers will be rounded to that many decimal places; if it is character, it will be taken as a format to the base R sprintf function. huxreg tries to be smart and to format summary statistics like nobs as integers.

huxreg(lm1, lm3, number_format = 2)
(1)(2)
(Intercept)981.61    6.27 ***
(720.17)   (0.78)   
carat4328.32 ***3.53 ***
(136.75)   (0.15)   
depth-27.78 *  -0.02    
(11.66)   (0.01)   
N100       100       
R20.91    0.85    
logLik-675.70    6.82    
AIC1359.41    -5.64    
*** p < 0.001; ** p < 0.01; * p < 0.05.

Lastly, if you want to bold all significant coefficients, set the parameter bold_signif to a maximum significance level:

huxreg(lm1, lm3, bold_signif = 0.05)
(1)(2)
(Intercept)981.607    6.269 ***
(720.175)   (0.782)   
carat4328.324 ***3.531 ***
(136.755)   (0.149)   
depth-27.785 *  -0.019    
(11.656)   (0.013)   
N100        100        
R20.912    0.854    
logLik-675.703    6.822    
AIC1359.405    -5.644    
*** p < 0.001; ** p < 0.01; * p < 0.05.

Altering data

Sometimes, you want to report different statistics for a model. For example, you might want to use robust standard errors.

One way to do this is to pass a tidy-able test object into huxreg. The function coeftest in the “lmtest” package has tidy methods defined:

library(lmtest)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## Attaching package: 'lmtest'
## The following object is masked from 'package:git2r':
## 
##     reset
library(sandwich)
lm_robust <- coeftest(lm1, vcov = vcovHC, save = TRUE)
huxreg("Normal SEs" = lm1, "Robust SEs" = lm_robust)
Normal SEsRobust SEs
(Intercept)981.607    981.607    
(720.175)   (1117.654)   
carat4328.324 ***4328.324 ***
(136.755)   (293.929)   
depth-27.785 *  -27.785    
(11.656)   (17.995)   
N100        100        
R20.912    0.912    
logLik-675.703    -675.703    
AIC1359.405    1359.405    
*** p < 0.001; ** p < 0.01; * p < 0.05.

If that is not possible, you can compute statistics yourself and add them to your model using the tidy_override function:

lm_fixed <- tidy_override(lm1, p.value = c(0.5, 0.2, 0.06))
huxreg("Normal p values" = lm1, "Supplied p values" = lm_fixed)
Normal p valuesSupplied p values
(Intercept)981.607    981.607 
(720.175)   (720.175)
carat4328.324 ***4328.324 
(136.755)   (136.755)
depth-27.785 *  -27.785 
(11.656)   (11.656)
N100        100     
R20.912    0.912 
logLik-675.703    -675.703 
AIC1359.405    1359.405 
*** p < 0.001; ** p < 0.01; * p < 0.05.

You can override any statistics returned by tidy or glance.

If you want to completely replace the output of tidy, use the tidy_replace() function. For example, here’s how to print different coefficients for a multinomial model.

mnl <- nnet::multinom(gear ~ mpg, mtcars)
## # weights:  9 (4 variable)
## initial  value 35.155593 
## iter  10 value 23.131901
## final  value 23.129234 
## converged
tidied <- broom::tidy(mnl)
models <- list()
models[["4 gears"]] <- tidy_replace(mnl, tidied[tidied$y.level == 4, ])
models[["5 gears"]] <- tidy_replace(mnl, tidied[tidied$y.level == 5, ])
huxreg(models, statistics = "AIC")
4 gears5 gears
(Intercept)-9.502 **-7.691 *
(3.262)  (3.232) 
mpg0.475 **0.358 *
(0.168)  (0.168) 
AIC54.258   54.258  
*** p < 0.001; ** p < 0.01; * p < 0.05.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.