Simplevis

David Hodge

2021-07-01

library(dplyr)
library(simplevis)
library(palmerpenguins)
library(ggplot2)

Purpose

simplevis is a package of ggplot2 wrapper functions that aims to make visualisation easier with less brainpower required.

Visualisation family types

simplevis supports the following families of visualisation type:

bar

plot_data <- storms %>%
  group_by(year) %>%
  summarise(wind = mean(wind))

gg_bar(plot_data, 
       x_var = year, 
       y_var = wind)

point

gg_point(iris, 
         x_var = Sepal.Width, 
         y_var = Sepal.Length)

line

plot_data <- storms %>%
  group_by(year) %>%
  summarise(wind = mean(wind))

gg_line(plot_data, 
        x_var = year, 
        y_var = wind)

hbar (i.e. horizontal bar)

plot_data <- ggplot2::diamonds %>%
  group_by(cut) %>%
  summarise(price = mean(price))

gg_hbar(plot_data, 
        x_var = price, 
        y_var = cut)

density

gg_density(penguins, 
           x_var = body_mass_g)

boxplot

gg_boxplot(storms, 
           x_var = year, 
           y_var = wind)

sf (short for simple features map)

gg_sf(example_sf_point, 
      borders = nz)

Colouring, facetting, neither or both

Each visualisation family generally has 4 functions.

The function name specifies whether a visualisation is coloured by a variable (*_col()), facetted by a variable (*_facet()), both (*_col_facet()), or neither (*()).

Colouring by a variable means that different values of a selected variable have different colours. Facetting means that different values of a selected variable have their own facet.

A *() function such as gg_point() requires only a dataset, an x variable and a y variable.

gg_point(penguins, 
         x_var = bill_length_mm, 
         y_var = body_mass_g)

A *_col() function such as gg_point_col() requires only a dataset, an x variable, a y variable, and a colour variable.

gg_point_col(penguins, 
             x_var = bill_length_mm, 
             y_var = body_mass_g, 
             col_var = sex)

A *_facet() function such as gg_point_facet() requires only a dataset, an x variable, a y variable, and a facet variable.

gg_point_facet(penguins, 
               x_var = bill_length_mm, 
               y_var = body_mass_g, 
               facet_var = species)

A *_col_facet() function such as gg_point_col_facet() requires only a dataset, an x variable, a y variable, a colour variable, and a facet variable.

gg_point_col_facet(penguins, 
                   x_var = bill_length_mm, 
                   y_var = body_mass_g, 
                   col_var = sex, 
                   facet_var = species)

Data is generally plotted with a stat of identity, which means data is plotted as is.

For boxplot, there is a default stat of boxplot, which means the y_var will be transformed to boxplot statistics.

For density, the stat is density: the x_var is transformed to a density estimate, and the density prefixed arguments inform that calculation.

Generally, an x_var and a y_var are required. However, y_var is not required for density*() functions, and neither x_var nor y_var is required for gg_sf*() (or leaflet_sf*()) functions.
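To make "boxplot statistics" concrete, the sketch below uses ggplot2 directly (not simplevis) to show the five summary values that ggplot2's boxplot stat computes per group. It assumes that gg_boxplot() relies on this same ggplot2 stat, which is not spelled out here; the object names p and box_stats are illustrative only.

```r
library(ggplot2)

# A plain ggplot2 boxplot: the boxplot stat summarises y within each x group.
p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot()

# layer_data() returns the data after the stat has been applied.
# ymin/ymax are the whisker ends, lower/upper the hinges, middle the median.
box_stats <- layer_data(p)[, c("ymin", "lower", "middle", "upper", "ymax")]

box_stats
```

With a stat of identity, by contrast, no such summarising step happens and the values supplied are drawn as they are.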

Titles

Default titles are:

gg_point_col(penguins, 
             x_var = bill_length_mm, 
             y_var = body_mass_g, 
             col_var = species)

You can customise titles with the title, subtitle, x_title, y_title, col_title and caption arguments.

gg_point_col(penguins, 
             x_var = bill_length_mm, 
             y_var = body_mass_g, 
             col_var = species, 
             title = "Adult penguin mass by bill length and species",
             subtitle = "Palmer station, Antarctica",
             x_title = "Bill length (mm)", 
             y_title = "Body mass (g)",
             col_title = "Penguin species",
             caption = "Source: Gorman KB, Williams TD, Fraser WR (2014)")

You can also request no x_title using x_title = "", and likewise for y_title and col_title.

Colour palettes

Change the colour palette by supplying a vector of colours to the pal argument.

gg_point(iris, 
         x_var = Sepal.Width, 
         y_var = Sepal.Length, 
         pal = "#e7298a")
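The example above passes a single colour. For a *_col() function, a vector with several colours can be supplied instead. The following is a minimal sketch that assumes pal is matched to the levels of col_var in order, one colour per level; the hex codes and the name species_pal are illustrative choices.

```r
library(simplevis)
library(palmerpenguins)

# One colour per species level (Adelie, Chinstrap, Gentoo); illustrative hex codes.
species_pal <- c("#1b9e77", "#d95f02", "#7570b3")

gg_point_col(penguins, 
             x_var = bill_length_mm, 
             y_var = body_mass_g, 
             col_var = species, 
             pal = species_pal)
```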

Scales

simplevis makes it easy to apply scale transformations.

These use consistent prefixes based on x_*, y_*, col_* or facet_*, and as such the autocomplete can help identify what you need.

Some examples of transformations available are:

plot_data <- storms %>%
  group_by(year, status) %>%
  summarise(wind = mean(wind))

gg_bar_col(plot_data, 
           x_var = year, 
           y_var = wind, 
           col_var = status,
           position = "stack",
           x_pretty_n = 4,
           x_labels = function(x) stringr::str_sub(x, 3, 4),
           y_labels = scales::comma_format(accuracy = 0.1), 
           y_zero = TRUE, 
           y_pretty_n = 10,
           y_gridlines_minor = TRUE,
           y_expand = ggplot2::expansion(mult = c(0.025, 0.025)))

gg_point_col(penguins, 
             x_var = bill_length_mm, 
             y_var = body_mass_g, 
             col_var = sex, 
             col_na = FALSE)
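The x_labels and y_labels arguments in the bar example above each take a function that turns axis break values into label strings. The sketch below calls the same two functions on their own to show what they return; the names year_label and wind_label and the input values are illustrative only.

```r
library(stringr)
library(scales)

# Keep only the 3rd and 4th characters, i.e. the two-digit year.
year_label <- function(x) stringr::str_sub(x, 3, 4)
year_label(c(1975, 2000, 2015))
# "75" "00" "15"

# Format numbers with a thousands separator, rounded to one decimal place.
wind_label <- scales::comma_format(accuracy = 0.1)
wind_label(c(45, 1234.56))
# "45.0" "1,234.6"
```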

bar*() and hbar*() plots support a "stack" position as well as the default "dodge".

plot_data <- penguins %>% 
  group_by(sex, species) %>% 
  summarise(count = n())

gg_hbar_col(plot_data, 
            x_var = count, 
            y_var = species, 
            col_var = sex,
            position = "stack")

sf maps

simplevis provides simple feature (sf) maps (i.e. maps with point, line or polygon features).

These functions work in the same way as the ggplot2 graph functions, with a few noteworthy differences: the data must be an sf object, and no x_var or y_var is required.

A couple of example sf objects are provided with the package for learning purposes: example_sf_point and example_sf_polygon.

gg_sf_col(example_sf_point, 
          col_var = trend_category)

The borders argument allows the user to provide an sf object as context to the map (e.g. a coastline or administrative boundaries). An sf object of the New Zealand coastline has been provided with the package for learning purposes.

gg_sf_col(example_sf_point, 
          col_var = trend_category,
          borders = nz)
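To map your own data, it first needs to be an sf object. The following is a minimal sketch using sf::st_as_sf() to build one from a data frame of coordinates. The sites, coordinates and categories are made up for illustration, and whether the gg_sf*() functions need the data and borders to share a coordinate reference system is not covered here.

```r
library(sf)

# Hypothetical monitoring sites (made-up coordinates and categories).
sites <- data.frame(
  site = c("A", "B", "C"),
  trend_category = c("Improving", "Worsening", "Indeterminate"),
  lon = c(174.78, 172.64, 170.50),
  lat = c(-41.29, -43.53, -45.87)
)

# Convert to an sf POINT object; crs = 4326 is WGS 84 longitude/latitude.
sites_sf <- st_as_sf(sites, coords = c("lon", "lat"), crs = 4326)

class(sites_sf)
```

An object built this way could then be passed to gg_sf_col() with col_var = trend_category, in the same way as example_sf_point.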

leaflet wrappers

As a bonus, simplevis also provides leaflet_sf() and leaflet_sf_col() functions, which work in a similar way.

leaflet_sf_col(example_sf_point, 
               col_var = trend_category)

Supported variable classes

The variable classes supported by the different families of functions are outlined below.

A stat of identity means the values are plotted as they are. A stat of boxplot means boxplot statistics are calculated from the data, and these are plotted.

tibble::tribble(
  ~family, ~data, ~x_var, ~y_var, ~col_var, ~facet_var, ~stat,
  "bar", "tibble or data.frame", "Any*", "Numeric", "Categorical or numeric", "Categorical", "identity",
  "hbar", "tibble or data.frame", "Numeric", "Any*", "Categorical or numeric", "Categorical", "identity",
  "line", "tibble or data.frame", "Any", "Numeric", "Categorical or numeric", "Categorical", "identity",
  "point", "tibble or data.frame", "Any", "Numeric", "Categorical or numeric", "Categorical", "identity",
  "density", "tibble or data.frame", "Numeric", NA, "Categorical",  "Categorical", "density",
  "boxplot", "tibble or data.frame", "Any*", "Numeric", "Categorical", "Categorical", "boxplot (or identity)",
  "sf", "sf", NA, NA, "Categorical or numeric", "Categorical", "identity",
  ) %>% 
  DT::datatable()

Where:

Working with the pipe

penguins %>% 
  gg_density_col(x_var = body_mass_g, 
                 col_var = species)

Output objects and adding layers

All gg_* and leaflet_* wrapper functions produce ggplot or leaflet objects.

This means layers can be added to their output in the same way as to any ggplot2 or leaflet object.

Note you need to add all aesthetics to any additional layers.

gg_point_col(penguins, 
             x_var = bill_length_mm, 
             y_var = body_mass_g, 
             col_var = species) +
  geom_smooth(aes(x = bill_length_mm, y = body_mass_g, col = species))

It also means you can facet by more than one variable by adding your own facet layer, provided that you are not using a position of "stack".

plot_data <- penguins %>% 
  group_by(species, sex, island) %>% 
  summarise(body_mass_g = mean(body_mass_g, na.rm = TRUE))

gg_bar(plot_data, 
       x_var = sex, 
       y_var = body_mass_g, 
       width = 0.66, 
       x_na = FALSE, 
       y_pretty_n = 3) +
  facet_grid(rows = vars(species), 
             cols = vars(island), 
             labeller = as_labeller(snakecase::to_sentence_case))

ggplotly interactive visualisation

All ggplot objects can be converted into interactive html objects using ggplotly. You can simply wrap the plot object in plotly::ggplotly().

The plotly_camera function removes plotly widgets other than the camera to keep things tidy.

plot <- gg_point_col(penguins, 
                     x_var = bill_length_mm, 
                     y_var = body_mass_g, 
                     col_var = species) 

plotly::ggplotly(plot) %>% 
  plotly_camera()

simplevis also offers more customisability for tooltips (i.e. hover values) in ggplotly.

A variable can be supplied to the text_var argument of the gg_* function. This variable is then used in the ggplotly tooltip when tooltip = "text" is added to the ggplotly function.

simplevis provides a mutate_text function, which produces a variable that is a string of variable names and values for a tooltip. Note this function converts column names to sentence case using the snakecase::to_sentence_case function.

The mutate_text function uses all variables in the dataset by default, but a subset can be used if desired.

plot_data <- penguins %>% 
  mutate_text()

plot <- gg_point_col(plot_data, 
                     x_var = bill_length_mm, 
                     y_var = body_mass_g, 
                     col_var = species, 
                     text_var = text, 
                     font_family = "Helvetica")

plotly::ggplotly(plot, tooltip = "text") %>% 
  plotly_camera()
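If you want full control over the tooltip wording, a text column can also be built by hand. The following is a minimal sketch using dplyr and paste0(); it is not how mutate_text is implemented, and the labels chosen are illustrative. It relies on plotly rendering "<br>" as a line break in hover text.

```r
library(dplyr)
library(palmerpenguins)

# Build a tooltip string per row: "Name: value" pairs separated by line breaks.
plot_data <- penguins %>% 
  mutate(text = paste0(
    "Species: ", species, "<br>",
    "Bill length (mm): ", bill_length_mm, "<br>",
    "Body mass (g): ", body_mass_g
  ))

head(plot_data$text, 2)
```

The resulting column can be passed to text_var = text and shown with tooltip = "text", as in the example above.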

Further information

For further information, see the articles on the website.