---
title: "Narrating Business Charts with ggmemo"
output: rmarkdown::html_vignette
description: >
  Learn how to annotate ggplot2 charts with callout labels and change
  annotations. Covers callouts, percent change, absolute change,
  percentage points, multiple annotations, time series, customization,
  and common mistakes.
vignette: >
  %\VignetteIndexEntry{Narrating Business Charts with ggmemo}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4.5,
  dpi = 150
)
```

A ggplot2 chart shows data. An annotated chart tells a story — it
directs the reader's attention to the data points that matter and
quantifies what changed. ggmemo adds that storytelling layer with
two functions:

* `annotate_callout()` points at a data row with an arrow and label.
* `annotate_change()` draws a color-coded arrow between two rows and
  labels the delta.

Both return standard ggplot2 layers that you add to a plot with `+`.

This vignette takes a quarterly revenue dataset from a bare chart to
a fully narrated one.

## The data

```{r setup}
library(ggplot2)
library(ggmemo)

revenue <- data.frame(
  quarter = factor(c("Q1", "Q2", "Q3", "Q4"),
                   levels = c("Q1", "Q2", "Q3", "Q4")),
  revenue = c(120, 145, 132, 158)
)
```

## A chart without annotations

This bar chart is accurate, but it doesn't guide the reader. Which
quarter matters? Is the trend good or bad? The viewer has to figure
that out on their own.

```{r bare-chart}
p <- ggplot(revenue, aes(x = quarter, y = revenue)) +
  geom_col(fill = "steelblue", width = 0.6) +
  labs(title = "2024 Quarterly Revenue ($K)", x = NULL, y = NULL) +
  theme_minimal()
p
```

## Calling out a key data point

`annotate_callout()` points at a specific row in your data with an
arrow and label. You identify the row with a filter expression — the
same syntax you use in `dplyr::filter()`:

```{r callout}
p +
  annotate_callout(
    revenue,
    where = quarter == "Q4",
    label = "Record quarter",
    position = "top-left"
  )
```

The `position` argument controls where the label sits relative to
the data point. Options are `"top-right"` (the default),
`"top-left"`, `"bottom-right"`, and `"bottom-left"`.

## Showing the change between two points

`annotate_change()` draws a color-coded arrow between two rows and
labels the midpoint with the computed delta. By default the arrow is
dark green for increases and dark red for decreases:

```{r change-percent}
p +
  annotate_change(
    revenue,
    from = quarter == "Q1",
    to = quarter == "Q4",
    value = revenue
  )
```

### Format options

The `format` argument controls how the delta is displayed. The
default is `"percent"`. Other options:

**Absolute difference** — shows the raw numeric change:

```{r change-absolute}
p +
  annotate_change(
    revenue,
    from = quarter == "Q1",
    to = quarter == "Q4",
    value = revenue,
    format = "absolute"
  )
```
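
For reference, the two numbers behind the Q1 to Q4 labels can be
reproduced by hand. This is a base R sketch that assumes the percent
format is the usual relative change, `(to - from) / from`. The variable
names are illustrative, and ggmemo's exact rounding and label text may
differ:

```r
# Values copied from the revenue data frame above
q1 <- 120
q4 <- 158

absolute_change <- q4 - q1               # raw difference, in $K
percent_change  <- (q4 - q1) / q1 * 100  # relative change, in percent

absolute_change           # 38
round(percent_change, 1)  # 31.7
```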

**Percentage points** — useful when the data is already expressed as
a rate or percentage (e.g., savings rate, market share). Using
`"percent"` on rate data gives a misleading percent-of-percent;
`"points"` gives the straightforward difference:

```{r change-points}
rates <- data.frame(
  year = 2020:2023,
  rate = c(3.5, 8.1, 5.4, 3.7)
)

ggplot(rates, aes(x = year, y = rate)) +
  geom_line() +
  geom_point() +
  annotate_change(
    rates,
    from = year == 2020,
    to = year == 2021,
    value = rate,
    format = "points"
  ) +
  labs(title = "Unemployment Rate", y = "Rate (%)") +
  theme_minimal()
```
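
The difference between the two formats is easy to check by hand for
the 2020 to 2021 change. The following is a base R sketch, independent
of ggmemo, with illustrative variable names:

```r
rate_2020 <- 3.5
rate_2021 <- 8.1

# Difference in percentage points: simple subtraction of the two rates
points_change <- rate_2021 - rate_2020

# Relative change: the difference expressed as a share of the starting rate
relative_change <- (rate_2021 - rate_2020) / rate_2020 * 100

round(points_change, 1)    # 4.6
round(relative_change, 1)  # 131.4
```

Both numbers are arithmetically correct, but "+131%" next to a rate
that peaked at 8.1% is easy to misread, which is why `"points"` is
usually the clearer label for rate data.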

## Putting it all together

You can combine both functions on one chart. The callout names a
moment; the change annotation quantifies what happened:

```{r combined}
ggplot(revenue, aes(x = quarter, y = revenue)) +
  geom_col(fill = "steelblue", width = 0.6) +
  annotate_callout(
    revenue,
    where = quarter == "Q4",
    label = "Record quarter",
    position = "top-left"
  ) +
  annotate_change(
    revenue,
    from = quarter == "Q1",
    to = quarter == "Q4",
    value = revenue
  ) +
  labs(title = "2024 Quarterly Revenue ($K)", x = NULL, y = NULL) +
  theme_minimal()
```

### Multiple change annotations

You can stack several `annotate_change()` calls to show
quarter-over-quarter movement across the full series:

```{r multiple}
ggplot(revenue, aes(x = quarter, y = revenue)) +
  geom_col(fill = "grey70", width = 0.6) +
  annotate_change(revenue, from = quarter == "Q1",
                  to = quarter == "Q2", value = revenue) +
  annotate_change(revenue, from = quarter == "Q2",
                  to = quarter == "Q3", value = revenue) +
  annotate_change(revenue, from = quarter == "Q3",
                  to = quarter == "Q4", value = revenue) +
  labs(title = "Quarter-over-Quarter Changes", x = NULL, y = NULL) +
  theme_minimal()
```
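
To anticipate which arrows will show an increase and which a decrease,
the three quarter-over-quarter deltas can be computed directly. This is
a base R sketch, assuming the percent format is the relative change
from the earlier quarter; `revenue_k` and `qoq_percent` are
illustrative names:

```r
# Revenue values in quarter order, copied from the data frame above
revenue_k <- c(120, 145, 132, 158)

# Change from each quarter to the next, relative to the earlier quarter
qoq_percent <- diff(revenue_k) / head(revenue_k, -1) * 100

round(qoq_percent, 1)  # 20.8 -9.0 19.7 (Q1 to Q2, Q2 to Q3, Q3 to Q4)
```

Only the Q2 to Q3 change is negative, so with the default palette that
is the one arrow expected to be drawn in the "down" color.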

### Time series

ggmemo works with Date x-axes. Here's a savings rate time series
with a callout at the all-time low and a change annotation showing
the recovery. Note the use of `nudge` to manually position the
callout label — this overrides the automatic heuristic, which can
miss on wide data frames with many numeric columns (see
[Nudge](#nudge) below):

```{r time-series}
ggplot(economics, aes(x = date, y = psavert)) +
  geom_line(colour = "grey40") +
  annotate_callout(
    economics,
    where = date == as.Date("2005-07-01"),
    label = "All-time low",
    nudge = c(365, 1)
  ) +
  annotate_change(
    economics,
    from = date == as.Date("2005-07-01"),
    to = date == as.Date("2012-12-01"),
    value = psavert,
    format = "points"
  ) +
  labs(
    title = "U.S. Personal Savings Rate",
    subtitle = "Recovery after the 2005 low",
    x = NULL, y = "Savings rate (%)"
  ) +
  theme_minimal()
```
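
Rather than hard-coding the date of the low, the row can be looked up
first. The sketch below uses only base R and the `economics` dataset
that ships with ggplot2; `low` is an illustrative name:

```r
library(ggplot2)  # provides the economics dataset

# Row with the smallest personal savings rate in the series
low <- economics[which.min(economics$psavert), c("date", "psavert")]
low
```

This is a quick way to confirm the date used in the filters above
before annotating it.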

## Customization

### Colors

`annotate_change()` uses dark green for increases, dark red for
decreases, and grey for no change by default. You can supply your
own palette with the `colors` argument — a named vector with
entries `up`, `down`, and `flat`:

```{r custom-colors}
p +
  annotate_change(
    revenue,
    from = quarter == "Q1",
    to = quarter == "Q4",
    value = revenue,
    colors = c(up = "#1B9E77", down = "#D95F02", flat = "#7570B3")
  )
```
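
The names in the `colors` vector correspond to the direction of the
change. As an illustration only (this is not ggmemo's internal code,
and `pick_colour()` is a hypothetical helper), the lookup can be
pictured like this:

```r
my_colors <- c(up = "#1B9E77", down = "#D95F02", flat = "#7570B3")

# Hypothetical helper: choose a palette entry from the sign of the delta
pick_colour <- function(delta, colors) {
  direction <- if (delta > 0) "up" else if (delta < 0) "down" else "flat"
  unname(colors[direction])
}

pick_colour(158 - 120, my_colors)  # "#1B9E77" (increase)
pick_colour(132 - 145, my_colors)  # "#D95F02" (decrease)
pick_colour(0, my_colors)          # "#7570B3" (no change)
```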

### Arrow styling

`annotate_change()` supports `arrow_type`, `arrow_pad`, and
`curvature` for controlling the arrow shape. `annotate_callout()`
accepts `arrow = NULL` to drop the arrow entirely and show just the
label:

```{r arrow-styling}
p +
  annotate_callout(
    revenue,
    where = quarter == "Q4",
    label = "Record quarter",
    position = "top-left",
    arrow = NULL
  ) +
  annotate_change(
    revenue,
    from = quarter == "Q3",
    to = quarter == "Q4",
    value = revenue,
    arrow_type = "closed",
    arrow_pad = 0.2,
    curvature = -0.3
  )
```

### Label styling

Both functions accept `...`, which passes additional arguments
through to the underlying ggplot2 layer. Use this to override
defaults like text size, background fill, or text colour:

```{r custom-style}
p +
  annotate_callout(
    revenue,
    where = quarter == "Q4",
    label = "Record quarter",
    position = "top-left",
    size = 5,
    fill = "lightyellow",
    colour = "grey30"
  )
```

### Nudge {#nudge}

`annotate_callout()` automatically computes how far to offset the
label from the data point based on the data ranges. This works well
for simple two-column data frames. For wider data frames with many
numeric columns (like `ggplot2::economics`), the heuristic may pick
the wrong column's range and produce a label that's too far away or
too close.

You can override the heuristic by passing `nudge = c(x, y)` in
data units:

```{r nudge}
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line() +
  annotate_callout(
    economics,
    where = date == as.Date("2009-10-01"),
    label = "Peak unemployment",
    nudge = c(800, 1000)
  ) +
  theme_minimal()
```

Alternatively, you can pass a two-column subset of the data so the
heuristic has less to guess:

```r
annotate_callout(
  economics[, c("date", "unemploy")],
  where = date == as.Date("2009-10-01"),
  label = "Peak unemployment"
)
```
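
When choosing `nudge` values for a Date axis, it helps to remember that
R stores dates as a count of days, so an x offset in data units is
presumably a number of days. The base R sketch below shows the
arithmetic; the assumption that the offset is simply added to the date
is inferred from the "data units" wording above, not from ggmemo's
source:

```r
# Dates are stored as days since 1970-01-01
as.numeric(as.Date("1970-01-02"))  # 1

# An x nudge of 365 on the savings-rate chart is one year
as.Date("2005-07-01") + 365        # "2006-07-01"

# An x nudge of 800 on the unemployment chart is a little over two years
as.Date("2009-10-01") + 800        # "2011-12-10"
```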

## Common mistakes

These are the most common issues that come up when getting started
with ggmemo.

### Character columns need `factor()`

If your x-axis column is a character vector (common after
`read.csv()`), `annotate_change()` will error and suggest converting
it. When you do, always specify `levels` to preserve the order in
your data — plain `factor()` sorts alphabetically:

```r
# Alphabetical — probably not what you want
data$month <- factor(data$month)

# Preserves data order
data$month <- factor(data$month, levels = unique(data$month))
```
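
The difference is easy to see on a small character vector. A base R
sketch with made-up month labels:

```r
month_labels <- c("Jan", "Feb", "Mar")

# Default: levels are the sorted unique values
levels(factor(month_labels))
# "Feb" "Jan" "Mar"

# Explicit levels: order of first appearance in the data
levels(factor(month_labels, levels = unique(month_labels)))
# "Jan" "Feb" "Mar"
```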

### Date-like strings need `as.Date()`

CSV files store dates as strings. If your date column looks like
`"2024-01-15"` but is class `character`, convert it before plotting:

```r
data$date <- as.Date(data$date)
```

ggmemo will detect date-like strings and suggest this in the error
message.
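
`as.Date()` parses ISO-style strings such as `"2024-01-15"` without
extra arguments. Other layouts need an explicit `format`. A base R
sketch with made-up data (`raw` is an illustrative name):

```r
# Dates exported as month/day/year strings
raw <- data.frame(date = c("01/15/2024", "02/15/2024"))

# Tell as.Date() how to read them
raw$date <- as.Date(raw$date, format = "%m/%d/%Y")

class(raw$date)   # "Date"
format(raw$date)  # "2024-01-15" "2024-02-15"
```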

### Use `colour`, not `color`

ggplot2 uses British spelling internally. American `color` works in
most contexts, but it can produce a "Duplicated aesthetics" warning
when the function already sets a default `colour`. Using `colour`
avoids the warning:

```r
annotate_callout(..., colour = "red")
```

### `size` controls text, not the label box

The `size` argument sets text size (in mm, matching ggplot2
conventions). To adjust the padding around the text inside the label
box, use `label.padding`:

```r
annotate_callout(..., size = 5, label.padding = unit(0.4, "lines"))
```
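
Because `size` is in millimetres rather than points, a conversion is
sometimes useful when matching a font size from a style guide. ggplot2
exports the constant `.pt` (points per millimetre) for this. A short
sketch, with `size_mm` as an illustrative name:

```r
library(ggplot2)  # exports the .pt conversion constant

size_mm <- 5
size_mm * .pt  # about 14.2, the point size of size = 5 text

12 / .pt       # about 4.2, the size value for 12 pt text
```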

## Real-world example: NBA Finals scoring breakdown

Most of the examples above use small, two-column data frames. Here's
a more complex chart, a grouped stacked bar chart of quarter-by-quarter
scoring in an NBA Finals game, that shows how `annotate_change()`
works when the annotation data differs from the plot data.

Brunson scored 52% of the Knicks' fourth-quarter points after
contributing just 20% in Q3. To annotate that shift, we pass a
separate two-row data frame to `annotate_change()` with the x
positions and values we want to compare, use a custom `label`, and
set `expand_y = FALSE` so the arrow doesn't push the y-axis beyond
the bars. We also override the default green with the same orange
used for Brunson's bars so the annotation feels integrated:

```{r nba-clutch}
scoring <- data.frame(
  Quarter = rep(c("Q1", "Q2", "Q3", "Q4"), 4),
  Team = rep(c("Knicks", "Knicks", "Spurs", "Spurs"), each = 4),
  Player = rep(c("Brunson", "Rest of Knicks", "Wemby", "Rest of Spurs"), each = 4),
  Points = c(
    5, 7, 5, 13,
    23, 20, 20, 12,
    8, 9, 7, 4,
    16, 17, 19, 15
  )
)

scoring$Quarter <- factor(scoring$Quarter, levels = c("Q1", "Q2", "Q3", "Q4"))
scoring$Player <- factor(scoring$Player,
  levels = c("Rest of Spurs", "Wemby", "Rest of Knicks", "Brunson"))

# Offset each team's bars left or right of the quarter's integer position
scoring$x_pos <- as.numeric(scoring$Quarter) +
  ifelse(scoring$Team == "Knicks", -0.2, 0.2)

# One row per quarter and team, holding the total shown above each stack
team_totals <- aggregate(Points ~ Quarter + Team + x_pos, data = scoring, FUN = sum)

ggplot(scoring, aes(x = x_pos, y = Points, fill = Player)) +
  geom_col(width = 0.35) +
  geom_text(
    data = team_totals,
    aes(x = x_pos, y = Points, label = Points, fill = NULL),
    vjust = -0.4, size = 5, fontface = "bold"
  ) +
  scale_fill_manual(
    values = c(
      "Brunson" = "#E86A00",
      "Rest of Knicks" = "#FDCB8B",
      "Wemby" = "#6D6D6D",
      "Rest of Spurs" = "#B8B8B8"
    ),
    name = NULL
  ) +
  scale_x_continuous(breaks = 1:4, labels = c("Q1", "Q2", "Q3", "Q4")) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.12))) +
  annotate_change(
    data.frame(x_pos = c(2.8, 3.8), Points = c(5, 13)),
    from = x_pos == 2.8,
    to = x_pos == 3.8,
    value = Points,
    format = "percent",
    expand_y = FALSE,
    label = "20% → 52% of team pts",
    colors = c(up = "#E86A00", down = "#B22222", flat = "#808080")
  ) +
  coord_cartesian(clip = "off") +
  labs(
    title = "Quarter-by-Quarter Scoring — Knicks vs Spurs",
    x = NULL, y = "Points"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    legend.position = "bottom",
    panel.grid.major.x = element_blank(),
    plot.margin = margin(10, 20, 10, 10)
  )
```

Key techniques used here:

* **Separate annotation data** — the plot uses the full `scoring`
  data frame, but `annotate_change()` receives a minimal two-row
  frame with just the x positions and values to compare.
* **Custom label** — instead of the auto-computed delta, we pass a
  descriptive `label` that tells the story directly.
* **`expand_y = FALSE`** — prevents the arrow from adding extra
  space above the bars.
* **Custom colors** — `colors = c(up = "#E86A00", ...)` matches the
  arrow to Brunson's bar color so the annotation looks intentional.
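
The numbers in the custom label can be reproduced from the Knicks rows
of `scoring`. The base R sketch below copies those values into a small
illustrative data frame (`knicks` is not part of the chart code) and
also shows what a relative change on the raw points would be, assuming
the percent format is `(to - from) / from`:

```r
# Knicks points in Q3 and Q4, copied from the scoring data above
knicks <- data.frame(
  quarter = c("Q3", "Q4"),
  brunson = c(5, 13),
  rest    = c(20, 12)
)

# Brunson's share of the team's points in each quarter
knicks$share <- knicks$brunson / (knicks$brunson + knicks$rest) * 100
knicks$share  # 20 52

# Relative change in Brunson's raw points from Q3 to Q4
raw_change <- (13 - 5) / 5 * 100
raw_change    # 160
```

Under that assumption, an auto-computed label would describe Brunson's
raw scoring rather than his share of the team's points, which is why
the example supplies its own `label`.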

## What ggmemo doesn't do

ggmemo is focused on two tasks: callout annotations and change
annotations for business charts. For other annotation needs, these
packages are worth knowing:

* **Label overlap avoidance**: [ggrepel](https://ggrepel.slowkow.com/)
  automatically repositions text labels to avoid overlaps.
* **Precise NPC positioning**: [ggpp](https://docs.r4photobiology.info/ggpp/)
  supports normalized parent coordinates and is the package ggmemo
  builds on.
* **Interactive annotations**: [plotly](https://plotly.com/r/) and
  [ggiraph](https://davidgohel.github.io/ggiraph/) support hover
  tooltips, click events, and zoom.
* **Themes and styling**: ggthemes, hrbrthemes, and bbplot provide
  polished chart themes for business and publication contexts.
