In this post, we look at some ways of displaying the association between two quantitative variables, aka correlation. Our goal is to present a rich visual representation of that correlation.
Let’s take the dataset flights as an example.
data(flights, package = "nycflights13")
library(tidyverse)
library(viridis)
flights %>%
  # keep delays under 100 minutes; filter() also drops rows with missing delays
  filter(arr_delay < 100, dep_delay < 100) %>%
  ggplot(aes(x = dep_delay, y = arr_delay, color = origin)) +
  geom_point(alpha = .01) +
  geom_smooth(se = FALSE, color = "grey20") +
  geom_rug() +
  facet_wrap(~origin) +
  scale_color_viridis_d()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
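The plot shows the association; to put a number on it, one could also compute a correlation coefficient per airport. Here is a minimal sketch, assuming the same filter as in the plot. The helper name cor_by_origin and its max_delay argument are made up for illustration and are not part of any package.

```r
library(dplyr)

# Hypothetical helper (not from any package): Pearson correlation between
# departure and arrival delay, computed separately for each origin airport.
cor_by_origin <- function(df, max_delay = 100) {
  df %>%
    # same filter as in the plot; rows with missing delays are dropped too
    filter(arr_delay < max_delay, dep_delay < max_delay) %>%
    group_by(origin) %>%
    summarise(r = cor(dep_delay, arr_delay), n = n())
}

# Usage, assuming the nycflights13 package is installed
if (requireNamespace("nycflights13", quietly = TRUE)) {
  print(cor_by_origin(nycflights13::flights))
}
```

Keep in mind that Pearson's r only captures the linear part of the association, so it complements the smoother rather than replacing it.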
Points are not the only geom that makes sense here. Let’s try some more, e.g., geom_hex().
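flights %>%
  filter(arr_delay < 100, dep_delay < 100) %>%
  ggplot(aes(x = dep_delay, y = arr_delay)) +
  # geom_hex() needs the hexbin package; it maps the bin count to fill
  geom_hex() +
  geom_smooth(se = FALSE, color = "grey20") +
  geom_rug() +
  facet_wrap(~origin) +
  # fill is a continuous count here, so a discrete color scale would have no effect
  scale_fill_viridis_c()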
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
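Another geom in the same spirit is geom_bin2d(), the rectangular counterpart of geom_hex(), which does not need the hexbin package. The following is only a sketch under the same filtering assumptions; the function name plot_delay_bins and the choice of 50 bins are illustrative and not part of the analysis above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: 2D binned counts of departure vs. arrival delay,
# one panel per origin airport.
plot_delay_bins <- function(df, bins = 50) {
  df %>%
    filter(arr_delay < 100, dep_delay < 100) %>%
    ggplot(aes(x = dep_delay, y = arr_delay)) +
    geom_bin2d(bins = bins) +   # count observations in rectangular bins
    facet_wrap(~origin) +
    scale_fill_viridis_c()      # continuous fill scale for the counts
}

# Usage, assuming the nycflights13 package is installed
if (requireNamespace("nycflights13", quietly = TRUE)) {
  p <- plot_delay_bins(nycflights13::flights)
  print(p)
}
```

Binning sidesteps the overplotting that forced alpha = .01 in the scatterplot, at the cost of hiding individual flights.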