The first table in a medical paper usually describes the study population. In an RCT it typically compares baseline characteristics between the randomized groups, while an observational study often compares the exposed with the unexposed. In this vignette I will show how I use the Gmisc functions to quickly generate such a descriptive table.

We will use the mtcars dataset and compare cars with automatic transmission to those with manual transmission. We start by loading the dataset and labelling the variables.

data(mtcars)
# For labelling we use the label()
# function from the Hmisc package
library(Hmisc)

label(mtcars$mpg) <- "Gas"
units(mtcars$mpg) <- "Miles/(US) gallon"

label(mtcars$wt) <- "Weight"
units(mtcars$wt) <- "10<sup>3</sup> kg" # NB: ?mtcars documents wt in 1000 lbs, so "kg" is not the true unit

mtcars$am <- factor(mtcars$am, levels=0:1, labels=c("Automatic", "Manual"))
label(mtcars$am) <- "Transmission"

mtcars$gear <- factor(mtcars$gear)
label(mtcars$gear) <- "Gears"

# Make up some data for making it slightly more interesting
mtcars$col <- factor(sample(c("red", "black", "silver"),
                     size=NROW(mtcars), replace=TRUE))
label(mtcars$col) <- "Car color"

The basics of getDescriptionStatsBy

The function getDescriptionStatsBy is a simple way to do basic descriptive statistics. Mandatory inputs are the first two, x and by:

library(Gmisc)
getDescriptionStatsBy(x = mtcars$mpg, 
                      by = mtcars$am)
## Gas 
##     Automatic            Manual              
## Gas "17.1 (&plusmn;3.8)" "24.4 (&plusmn;6.2)"

If we prefer the median, we can specify the function used for continuous variables through continuous_fn:

getDescriptionStatsBy(x = mtcars$mpg, 
                      by = mtcars$am,
                      continuous_fn = describeMedian)
## Gas 
##     Automatic            Manual              
## Gas "17.3 (14.9 - 19.2)" "22.8 (21.0 - 30.4)"
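The median and quartiles above can be cross-checked with base R. This is a minimal sketch, assuming the interval reported by describeMedian is the interquartile range as computed by quantile() with its default settings. It works on a fresh copy of the data (the name cars is introduced here only for illustration) so that it does not disturb the labelled mtcars object.

```r
# Cross-check of the median (IQR) output using base R only
cars <- datasets::mtcars
am <- factor(cars$am, levels = 0:1, labels = c("Automatic", "Manual"))

# One row per group: 25th percentile, median, 75th percentile
mpg_quartiles <- t(sapply(split(cars$mpg, am),
                          quantile, probs = c(0.25, 0.5, 0.75)))
print(round(mpg_quartiles, 1))
```

The "50%" column should agree with the medians printed by getDescriptionStatsBy (17.3 and 22.8).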

As I always have the same by-variable and several customizations, I often write a small wrapper:

getTable1Stats <- function(x, digits = 0, ...){
  getDescriptionStatsBy(x = x, 
                        by = mtcars$am,
                        digits = digits,
                        continuous_fn = describeMedian,
                        header_count = TRUE,
                        ...)
  
}
getTable1Stats(mtcars$mpg)
## Gas 
##     Automatic<br /> No. 19 Manual<br /> No. 13
## Gas "17 (15 - 19)"         "23 (21 - 30)"

The dot-argument (...) is useful for adding additional customization:

getTable1Stats(mtcars$mpg, use_units = TRUE)
## Gas 
##     Automatic<br /> No. 19 Manual<br /> No. 13 units              
## Gas "17 (15 - 19)"         "23 (21 - 30)"      "Miles/(US) gallon"
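The wrapper above hardcodes mtcars$am as the by-variable. If the same pattern is needed for several groupings, one option is a small factory function that returns a wrapper bound to a given by-variable. The sketch below is only an illustration: makeTable1Stats and getStatsByAm are hypothetical names, not part of Gmisc, and the arguments passed to getDescriptionStatsBy are the same ones used in the wrapper above.

```r
library(Gmisc)

# Hypothetical helper (not part of Gmisc): returns a wrapper bound to `by`
makeTable1Stats <- function(by, digits = 0) {
  force(by)      # evaluate now, so the wrapper keeps this exact grouping
  force(digits)
  function(x, ...) {
    getDescriptionStatsBy(x = x,
                          by = by,
                          digits = digits,
                          continuous_fn = describeMedian,
                          header_count = TRUE,
                          ...)
  }
}

# Usage on a fresh copy of the data
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = 0:1, labels = c("Automatic", "Manual"))
getStatsByAm <- makeTable1Stats(by = cars$am)
getStatsByAm(cars$mpg)
```

Binding the by-variable once keeps each call short, at the cost of one extra function per grouping.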

Combining with htmlTable

Lists are convenient and allow a lot of customization. Together with mergeDesc, they make it easy to generate a Table 1:

t1 <- list()
t1[["Gas"]] <-
  getTable1Stats(mtcars$mpg)
  
t1[["Weight&dagger;"]] <-
  getTable1Stats(mtcars$wt)

t1[["Color"]] <- 
  getTable1Stats(mtcars$col)

# If we want to use the labels set in the beginning
# we add an element without a name
t1 <- c(t1,
        list(getTable1Stats(mtcars$gear)))

htmlTable(mergeDesc(t1),
          css.rgroup = "",
          caption  = "Basic descriptive statistics from the mtcars dataset",
          tfoot = "&dagger; The weight is in 10<sup>3</sup> kg")

Basic descriptive statistics from the mtcars dataset

             Automatic      Manual
             No. 19         No. 13
Gas          17 (15 - 19)   23 (21 - 30)
Weight†      4 (3 - 4)      2 (2 - 3)
Color
  black      6 (32%)        6 (46%)
  red        10 (53%)       3 (23%)
  silver     3 (16%)        4 (31%)
Gears
  3          15 (79%)       0 (0%)
  4          4 (21%)        8 (62%)
  5          0 (0%)         5 (38%)

† The weight is in 10³ kg
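
The Gears rows can be reproduced with base R, which is a quick way to confirm that the percentages are calculated within each transmission group (column percentages). A minimal sketch on a fresh copy of the data; the Color rows are not checked because that variable is drawn at random.

```r
cars <- datasets::mtcars
am <- factor(cars$am, levels = 0:1, labels = c("Automatic", "Manual"))
gear <- factor(cars$gear)

# Counts per cell, then percentages within each transmission group
counts <- table(Gears = gear, Transmission = am)
col_pct <- round(100 * prop.table(counts, margin = 2))
print(counts)
print(col_pct)
```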

Even though p-values are discouraged in Table 1, they are not uncommon. I have therefore added basic statistics consisting of Fisher's exact test for proportions and the Wilcoxon rank sum test for continuous values. Note that the Wilcoxon test only works with two groups.
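
For reference, both tests are available in base R's stats package. The sketch below runs them directly on a fresh copy of the complete data. It assumes that the package's p-values correspond to these standard functions, which is not verified here, so small differences in options (for example exact versus approximate p-values) are possible.

```r
cars <- datasets::mtcars
am <- factor(cars$am, levels = 0:1, labels = c("Automatic", "Manual"))

# Categorical variable: Fisher's exact test on the gears-by-transmission table
p_gear <- fisher.test(table(factor(cars$gear), am))$p.value

# Continuous variable: Wilcoxon rank sum test of weight between the two groups.
# exact = FALSE requests the normal approximation, since wt contains ties.
p_wt <- wilcox.test(cars$wt[am == "Automatic"],
                    cars$wt[am == "Manual"],
                    exact = FALSE)$p.value

print(c(gear = p_gear, wt = p_wt))
```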

# A little more advanced input
mtcars$mpg_w_missing <- mtcars$mpg
mtcars$mpg_w_missing[sample(1:NROW(mtcars), size=5)] <- NA
# Note: wt_w_missing is created here, but the table below uses the complete mtcars$wt
mtcars$wt_w_missing <- mtcars$wt
mtcars$wt_w_missing[sample(1:NROW(mtcars), size=8)] <- NA

t1 <- list()
t1[["Gas"]] <-
  getTable1Stats(mtcars$mpg_w_missing, statistics=TRUE)
  
t1[["Weight&dagger;"]] <-
  getTable1Stats(mtcars$wt, statistics=TRUE)

t1[["Color"]] <- 
  getTable1Stats(mtcars$col, statistics=TRUE)

# If we want to use the labels set in the beginning
# we add an element without a name
t1 <- c(t1,
        list(getTable1Stats(mtcars$gear, statistics=TRUE)))

htmlTable(mergeDesc(t1),
          css.rgroup = "",
          caption  = "Basic descriptive statistics from the mtcars dataset",
          tfoot = "&dagger; The weight is in 10<sup>3</sup> kg")

Basic descriptive statistics from the mtcars dataset

                 Automatic      Manual         P-value
                 No. 19         No. 13
Gas
  Median (IQR)   17 (15 - 19)   26 (20 - 30)   0.003
  Missing        3 (16%)        2 (15%)
Weight†          4 (3 - 4)      2 (2 - 3)      < 0.0001
Color
  black          6 (32%)        6 (46%)        0.25
  red            10 (53%)       3 (23%)
  silver         3 (16%)        4 (31%)
Gears
  3              15 (79%)       0 (0%)         < 0.0001
  4              4 (21%)        8 (62%)
  5              0 (0%)         5 (38%)

† The weight is in 10³ kg
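
The Missing row can be understood with a short base R sketch: count the NA values per group and divide by the group size. The seed below is an arbitrary assumption added for reproducibility. The code above draws the missing positions without a seed, so the counts here will generally differ from those in the table.

```r
cars <- datasets::mtcars
am <- factor(cars$am, levels = 0:1, labels = c("Automatic", "Manual"))

set.seed(42)  # arbitrary seed, an assumption added here for reproducibility
mpg_w_missing <- cars$mpg
mpg_w_missing[sample(NROW(cars), size = 5)] <- NA

# Missing counts per group, and the same counts as a percentage of group size
n_missing <- tapply(is.na(mpg_w_missing), am, sum)
pct_missing <- round(100 * n_missing / as.vector(table(am)))
print(n_missing)
print(pct_missing)
```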