Introduction to forestploter

Alimu Dayimu

2022-03-01

Forest plots are commonly used in medical research publications, especially in meta-analyses. They can also be used to report the coefficients and confidence intervals (CIs) of regression models.

Many packages can be used to draw a forest plot. The most popular one is forestplot. Some packages specialise in meta-analysis, such as meta, metafor and rmeta. Other packages, such as ggforestplot, use ggplot2 to draw a forest plot, but they are not available on CRAN yet.

The main differences between forestploter and the other packages are:

Basic forest plot

The layout of the forest plot is determined by the dataset provided.

Text in the forest plot

The first step is to provide a data.frame to be used in the forest plot. The column names of the data are drawn as the header, and the contents of the data are displayed in the plot. One or more blank columns, containing only spaces, should be provided to draw the confidence intervals. The space available for drawing the CI is determined by the width of this column; increase the number of spaces in the column to give the CI more room.
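As a small illustration of the blank-column idea above, the sketch below builds placeholder strings of different widths using base R only. The helper name `blank_column` is hypothetical and not part of forestploter; it simply wraps the `paste(rep(" ", n), collapse = " ")` idiom used in this document, which joins n spaces with a space separator and therefore yields 2n - 1 characters.

```r
# Hypothetical helper (not part of forestploter): build a blank placeholder
# string whose width reserves room for the CI.
blank_column <- function(n_spaces) {
  paste(rep(" ", n_spaces), collapse = " ")
}

narrow <- blank_column(10)
wide   <- blank_column(20)

# n spaces joined by a space separator give 2n - 1 characters
nchar(narrow)  # 19
nchar(wide)    # 39
```

A wider string reserves more horizontal room for the CI column once the data is passed to the plotting function.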

First we need to get the data ready to plot.

library(grid)
library(forestploter)

# Read the provided example data
dt <- read.csv(system.file("extdata", "example_data.csv", package = "forestploter"))

# Keep needed columns
dt <- dt[,1:6]

# indent the subgroup if there is a number in the placebo column
dt$Subgroup <- ifelse(is.na(dt$Placebo), 
                      dt$Subgroup,
                      paste0("   ", dt$Subgroup))

# Replace NA with an empty string; otherwise NA is shown as the text "NA".
dt$Treatment <- ifelse(is.na(dt$Treatment), "", dt$Treatment)
dt$Placebo <- ifelse(is.na(dt$Placebo), "", dt$Placebo)

# Standard error, back-calculated from the upper CI limit on the log scale
dt$se <- (log(dt$hi) - log(dt$est))/1.96

# Add blank column for the forest plot to display CI.
# Adjust the column width with space. 
dt$` ` <- paste(rep(" ", 20), collapse = " ")

# Create confidence interval column to display
dt$`HR (95% CI)` <- ifelse(is.na(dt$se), "",
                             sprintf("%.2f (%.2f to %.2f)",
                                     dt$est, dt$low, dt$hi))
head(dt)
#>          Subgroup Treatment Placebo      est        low       hi        se
#> 1    All Patients       781     780 1.869694 0.13245636 3.606932 0.3352463
#> 2             Sex                         NA         NA       NA        NA
#> 3            Male       535     548 1.449472 0.06834426 2.830600 0.3414741
#> 4          Female       246     232 2.275120 0.50768005 4.042560 0.2932884
#> 5             Age                         NA         NA       NA        NA
#> 6          <65 yr       297     333 1.509242 0.67029394 2.348190 0.2255292
#>                                                   HR (95% CI)
#> 1                                         1.87 (0.13 to 3.61)
#> 2                                                            
#> 3                                         1.45 (0.07 to 2.83)
#> 4                                         2.28 (0.51 to 4.04)
#> 5                                                            
#> 6                                         1.51 (0.67 to 2.35)
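The `se` column in the output above comes from the expression `(log(hi) - log(est))/1.96`. As a rough worked check, the sketch below applies the same formula to the values printed for the first row. It assumes the upper limit is that of a 95% interval that is symmetric on the log scale; this is an assumption of the illustration, not a statement about how the example data were generated.

```r
# Values copied from row 1 ("All Patients") of the printed data
est <- 1.869694
hi  <- 3.606932

# Same expression as used for dt$se
se <- (log(hi) - log(est)) / 1.96
round(se, 4)

# Reversing the calculation recovers the upper limit
hi_back <- exp(log(est) + 1.96 * se)
all.equal(hi_back, hi)
```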

The data we have above will be the basic layout of the forest plot. The example below shows how to draw a simple forest plot by applying a theme. A footnote was added as a demonstration.

# Define theme
tm <- forest_theme(base_size = 10,
                   refline_col = "red",
                   footnote_col = "#636363",
                   footnote_fontface = "italic")

p <- forest(dt[,c(1:3, 8:9)],
            est = dt$est,
            lower = dt$low, 
            upper = dt$hi,
            sizes = dt$se,
            ci_column = 4,
            ref_line = 1,
            arrow_lab = c("Placebo Better", "Treatment Better"),
            xlim = c(0, 4),
            ticks_at = c(0.5, 1, 2, 3),
            footnote = "This is the demo data. Please feel free to change\nanything you want.",
            theme = tm)

# Print plot
plot(p)
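For readers who want to experiment without the bundled example file, here is a minimal sketch that uses a small hand-made data frame. The data values and the names `toy` and `p_min` are hypothetical and chosen only for illustration; the `forest()` arguments are the same ones used in the example above. The code is skipped if forestploter is not installed.

```r
library(grid)

if (requireNamespace("forestploter", quietly = TRUE)) {
  library(forestploter)

  # Hypothetical toy data, not taken from the package
  toy <- data.frame(
    Study = c("Study A", "Study B", "Study C"),
    est   = c(0.80, 1.20, 1.05),
    low   = c(0.60, 0.90, 0.70),
    hi    = c(1.10, 1.60, 1.50)
  )

  # Blank column that reserves space for the CI
  toy$` ` <- paste(rep(" ", 20), collapse = " ")

  # Text column showing the estimate and interval
  toy$`HR (95% CI)` <- sprintf("%.2f (%.2f to %.2f)",
                               toy$est, toy$low, toy$hi)

  # Display the label, the blank CI column and the text column
  p_min <- forest(toy[, c(1, 5, 6)],
                  est = toy$est,
                  lower = toy$low,
                  upper = toy$hi,
                  ci_column = 2,
                  ref_line = 1)

  plot(p_min)
}
```

Note that `ci_column` refers to the position of the blank column in the data frame passed to `forest()`, not in the full data.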

Editing forest plot

The package provides functions to modify the forest plot after it has been created. The functions below edit various aspects of the plot:

# Change text color in row 3
g <- edit_plot(p, row = 3, gp = gpar(col = "red", fontface = "italic"))

# Bold grouping text
g <- edit_plot(g,
               row = c(2, 5, 10, 13, 17, 20),
               gp = gpar(fontface = "bold"))

# Edit background of row 5
g <- edit_plot(g, row = 5, which = "background",
               gp = gpar(fill = "darkolivegreen1"))

# Insert text at top
g <- insert_text(g,
                 text = "Treatment group",
                 col = 2:3,
                 part = "header",
                 gp = gpar(fontface = "bold"))

# Add underline at the bottom of the header
g <- add_underline(g, part = "header")


# Insert text
g <- insert_text(g,
                 text = "This is a long text. Age and gender summarised above.\nBMI is next",
                 row = 10,
                 just = "left",
                 gp = gpar(cex = 0.6, col = "green", fontface = "italic"))

plot(g)

The add_text function simply puts text in the plot without adding any rows to it. Adding a blank row to the data before drawing the forest plot and then using add_text to write text into that row has the same effect as insert_text.
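The difference between the two functions can be sketched by comparing the number of rows in the resulting plot objects. The example below uses hypothetical toy data with a blank second row, and it assumes that `add_text` accepts `row`, `col`, `just` and `gp` arguments analogous to those of `insert_text`; check the package documentation for the version you have installed. The code is skipped if forestploter is not installed.

```r
library(grid)

if (requireNamespace("forestploter", quietly = TRUE)) {
  library(forestploter)

  # Hypothetical toy data; row 2 is left blank on purpose
  toy <- data.frame(Group = c("A", "", "B"),
                    ` ` = paste(rep(" ", 20), collapse = " "),
                    check.names = FALSE)
  est <- c(0.8, NA, 1.2)
  low <- c(0.6, NA, 0.9)
  hi  <- c(1.1, NA, 1.6)

  p0 <- forest(toy, est = est, lower = low, upper = hi,
               ci_column = 2, ref_line = 1)

  # add_text writes into the existing blank row
  g_add <- add_text(p0, text = "Note in row 2",
                    row = 2, col = 1, just = "left",
                    gp = gpar(fontface = "italic"))

  # insert_text creates a new row for the text
  g_ins <- insert_text(p0, text = "Inserted note",
                       row = 2, just = "left",
                       gp = gpar(fontface = "italic"))

  # Compare the number of rows in the underlying layouts
  c(original = nrow(p0), add_text = nrow(g_add), insert_text = nrow(g_ins))
}
```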

Multiple CI columns

To draw CIs in multiple columns, one only needs to provide a vector with the positions of the columns in the data where they should be drawn. As seen in the example below, the CIs are drawn in columns 3 and 5. The first and second elements of est, lower and upper are drawn in column 3 and column 5, respectively.
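A minimal sketch of the ungrouped case, with one set of estimates per CI column, might look as follows. The data values, column names and the object names `toy2` and `p_two` are hypothetical; only the `forest()` arguments already shown in this document are used. The code is skipped if forestploter is not installed.

```r
library(grid)

if (requireNamespace("forestploter", quietly = TRUE)) {
  library(forestploter)

  blank <- paste(rep(" ", 20), collapse = " ")

  # Hypothetical toy data: columns 3 and 5 are blank and will hold the CIs
  toy2 <- data.frame(Subgroup    = c("Male", "Female"),
                     `n`         = c("120", "130"),
                     `Outcome A` = blank,
                     `n `        = c("115", "128"),
                     `Outcome B` = blank,
                     check.names = FALSE)

  p_two <- forest(toy2,
                  est   = list(c(0.9, 1.1), c(1.3, 0.8)),  # column 3, column 5
                  lower = list(c(0.7, 0.8), c(1.0, 0.6)),
                  upper = list(c(1.2, 1.5), c(1.7, 1.1)),
                  ci_column = c(3, 5),
                  ref_line = 1)

  plot(p_two)
}
```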

For a more complex example, one may want to draw CIs by group. The solution is simple: provide all the values sequentially to est, lower and upper. That is, the first n elements of est, lower and upper are considered to belong to the same group, and likewise for the next n elements, where n is the length of ci_column. As shown in the example below, est_gp1 and est_gp2 are drawn in column 3 and column 5 as usual and are considered group 1, while est_gp3 and est_gp4 are considered group 2.
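The ordering rule described above can be written out in a few lines of base R. The sketch below does not call forestploter; it only reproduces the bookkeeping to show which list element lands in which column and group. The variable names are illustrative.

```r
ci_column <- c(3, 5)
est_names <- c("est_gp1", "est_gp2", "est_gp3", "est_gp4")

n   <- length(ci_column)      # number of CI columns
idx <- seq_along(est_names)   # position of each element in the list

mapping <- data.frame(
  element = est_names,
  column  = ci_column[(idx - 1) %% n + 1],  # cycle through the CI columns
  group   = (idx - 1) %/% n + 1             # every n elements form one group
)

mapping
```

Here est_gp1 and est_gp2 map to columns 3 and 5 in group 1, and est_gp3 and est_gp4 map to the same columns in group 2.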

This is an example of multiple CI columns and groups:

dt <- read.csv(system.file("extdata", "example_data.csv", package = "forestploter"))
# indent the subgroup if there is a number in the placebo column
dt$Subgroup <- ifelse(is.na(dt$Placebo), 
                      dt$Subgroup,
                      paste0("   ", dt$Subgroup))

# Replace NA with an empty string; otherwise NA is shown as the text "NA".
dt$`n` <- ifelse(is.na(dt$Treatment), "", dt$Treatment)
dt$`n ` <- ifelse(is.na(dt$Placebo), "", dt$Placebo)

# Add two blank columns for the CIs
dt$`CVD outcome` <- paste(rep(" ", 20), collapse = " ")
dt$`COPD outcome` <- paste(rep(" ", 20), collapse = " ")

# Set-up theme
tm <- forest_theme(base_size = 10,
                   refline_lty = "solid",
                   ci_pch = c(15, 18),
                   ci_col = c("#377eb8", "#4daf4a"),
                   footnote_col = "blue",
                   legend_name = "Group",
                   legend_value = c("Trt 1", "Trt 2"),
                   vertline_lty = c("dashed", "dotted"),
                   vertline_col = c("#d6604d", "#bababa"))

p <- forest(dt[,c(1, 19, 21, 20, 22)],
            est = list(dt$est_gp1,
                       dt$est_gp2,
                       dt$est_gp3,
                       dt$est_gp4),
            lower = list(dt$low_gp1,
                         dt$low_gp2,
                         dt$low_gp3,
                         dt$low_gp4), 
            upper = list(dt$hi_gp1,
                         dt$hi_gp2,
                         dt$hi_gp3,
                         dt$hi_gp4),
            ci_column = c(3, 5),
            ref_line = 1,
            vert_line = c(0.5, 2),
            nudge_y = 0.2,
            theme = tm)

plot(p)

Saving plot

One can use the base method or the ggsave function to save the plot. When using ggsave, do not omit the plot argument. The width and height should be tuned to get the desired plot. You can also set autofit=TRUE in the print or plot function to fit the plot to the device automatically, but the result may not be as compact as it should be.

# Base method
png('rplot.png', res = 300, width = 7.5, height = 7.5, units = "in")
plot(p)
dev.off()

# ggsave function
ggplot2::ggsave(filename = "rplot.png", plot = p,
                dpi = 300,
                width = 7.5, height = 7.5, units = "in")
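If plots are saved repeatedly, the base method can be wrapped in a small function. The helper below, `save_plot_pdf`, is a hypothetical convenience function and not part of forestploter. It writes to a PDF device instead of PNG, which is an assumption made here to keep the sketch free of resolution settings. It calls `plot()` explicitly because objects are not auto-printed inside functions.

```r
# Hypothetical helper (not part of forestploter): save a plottable object to PDF.
save_plot_pdf <- function(p, file, width = 7.5, height = 7.5) {
  pdf(file, width = width, height = height)  # size in inches
  on.exit(dev.off(), add = TRUE)             # close the device even on error
  plot(p)                                    # explicit call; no auto-printing here
  invisible(file)
}

# Usage with the forest plot object `p` created earlier:
# save_plot_pdf(p, "rplot.pdf")
```

The width and height still need to be tuned by hand, as with the other methods.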