# Playing around with dumbbell plots

Dumbbell plots can be used to show the difference between two values, such as two groups or two points in time, for each of several categories. Bob Rudis demonstrated a beautiful application of such plots using ggplot2-based methods.

In this post, I will explain and comment on his code, and make a few changes.

pacman::p_load(tidyverse, ggalt)

Let’s make up some data. Tip: a convenient way is to type the data into Excel, copy it to the clipboard, and then paste it into R as a tribble (see below). For the last step, there is an RStudio add-in available, “datapasta”; use its menu item “Paste as tribble”.

This procedure will result in a data frame like this:

d <- tibble::tribble(
~country, ~last_year, ~this_year,
"Region A",       0.37,       0.82,
"Region B",       0.41,       0.84,
"Region D",       0.44,       0.79,
"Region E",       0.52,       0.87,
"Region F",       0.58,       0.92,
"Region C",       0.47,       0.79,
"Region G",       0.63,       0.92,
"Region J",       0.55,       0.86,
"Region H",       0.47,       0.76,
"Region I",       0.94,       0.72
)

Let’s add the difference between last year and this year as an extra column, in case we may need it later…

d %>%
mutate(diff = this_year - last_year) -> d 
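Because the values are proportions, the raw difference can be hard to read at a glance. Here is a minimal sketch in base R (so it runs without extra packages) of how one could inspect the new column and format it as signed percentages. It uses three rows copied from the data above; the names `d_small` and `diff_label` are hypothetical and not used elsewhere in this post.

```r
# Three rows copied from the example data; d_small is a hypothetical name
# chosen so that the full data frame d is not overwritten.
d_small <- data.frame(
  country   = c("Region A", "Region H", "Region I"),
  last_year = c(0.37, 0.47, 0.94),
  this_year = c(0.82, 0.76, 0.72),
  stringsAsFactors = FALSE
)

# Same computation as in the mutate() call above
d_small$diff <- d_small$this_year - d_small$last_year

# Signed percentage labels, e.g. "+45%" (diff_label is an illustrative column)
d_small$diff_label <- sprintf("%+.0f%%", 100 * d_small$diff)
d_small$diff_label            # "+45%" "+29%" "-22%"

# Which regions got worse?
d_small$country[d_small$diff < 0]   # "Region I"
```

Rounding inside `sprintf()` also hides small floating-point artifacts that can appear when subtracting decimals.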

To convince ggplot to plot the qualitative categories of country in the right order, we convert the column to a factor:

d$country <- factor(d$country, levels = rev(d$country))

We rev()erse the levels so that the last category sits at the origin of the axis and the first category sits at its far end, i.e., at the top of the diagram. See the diagram to better understand this cryptic explanation.

Now let’s build up the plot:

d %>%
  ggplot() +
  geom_dumbbell(aes(y = country, x = last_year, xend = this_year),
    colour = "grey60", size = 5,
    colour_x = "#F7BC08", colour_xend = "#395B74") -> gg1

gg1

The workhorse is, obviously, geom_dumbbell(); its parameters are the starting point (x) and the end point (xend) of the dumbbell, as well as the grouping variable (y). In addition, colors can be specified.

Now, some aesthetic choices. Let’s use a clean white background with light grid lines, via theme_minimal():

gg2 <- gg1 + theme_minimal()

gg2

Now, some decoration.

gg2 +
  geom_text(data = filter(d, country == "Region A"),
    aes(x = this_year, y = country),
    label = "This year", fontface = "bold",
    color = "#395B74", vjust = -2) +
  geom_text(data = filter(d, country == "Region A"),
    aes(x = last_year, y = country),
    label = "Last year", fontface = "bold",
    color = "#F7BC08", vjust = -2) +
  labs(x = "Satisfaction level", y = "",
    title = "Customer Satisfaction") +
  theme(title = element_text(size = rel(1.4))) -> gg3

gg3

Now let’s adjust the axes.

gg3 +
  scale_x_continuous(expand = c(0, 0.1),
    labels = scales::percent,
    breaks = c(.25, .5, .75, 1)) +
  coord_cartesian(xlim = c(.2, 1.2)) -> gg4

gg4

Finally, let’s add the change or difference value to the right of the plot.

gg4 +
  geom_text(aes(y = country, label = diff), x = 1.2, hjust = 1) +
  annotate(x = 1.2, y = "Region A", label = "Diff", geom = "text",
    vjust = -2, fontface = "bold", hjust = 1) -> gg5

gg5

Oh, finally-finally, let’s highlight “interesting” groups, i.e., those where the change is negative.

gg5 +
  annotate(geom = "rect", xmin = .25, xmax = 1,
    ymin = as.numeric(d$country[d$country == "Region I"]) - 0.3,
    ymax = as.numeric(d$country[d$country == "Region I"]) + 0.3,
    alpha = .3,
    fill = "firebrick")
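The reversed factor levels and the as.numeric() lookup in the last step both rely on how R stores factors: each level has an integer code, and a discrete axis places level 1 at position 1, level 2 at position 2, and so on. The following base R sketch shows those codes for three rows of the data, and how one might compute the highlight band for every region with a negative change instead of hard-coding "Region I". The names `d_demo`, `pos`, and `highlight` are hypothetical and only serve this illustration.

```r
# Three rows copied from the example data; d_demo is a hypothetical name
d_demo <- data.frame(
  country   = c("Region A", "Region H", "Region I"),
  last_year = c(0.37, 0.47, 0.94),
  this_year = c(0.82, 0.76, 0.72),
  stringsAsFactors = FALSE
)
d_demo$diff <- d_demo$this_year - d_demo$last_year

# Same trick as above: reversed levels put the first row on top of the axis
d_demo$country <- factor(d_demo$country, levels = rev(d_demo$country))

levels(d_demo$country)       # "Region I" "Region H" "Region A"
as.numeric(d_demo$country)   # 3 2 1  (Region A is drawn at y = 3, i.e. on top)

# Axis positions of all regions whose value went down
pos <- as.numeric(d_demo$country[d_demo$diff < 0])

# Band of +/- 0.3 around each position, as in the annotate() call above
highlight <- data.frame(ymin = pos - 0.3, ymax = pos + 0.3)
highlight
```

Values computed this way could replace the two hard-coded "Region I" lookups in the ymin and ymax arguments, so that the highlighting follows the data if the numbers change.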