Load packages

library(tidyverse)  # data wrangling
library(mosaic)  # funplot
library(plotly)  # interactive plots
library(scatterplot3d)  # nomen est omen
library(rsm)  # 3d scatterplots

Motivation

Linear models are a standard way of predicting or explaining some data. Visualizing data is not only of didactical value but provides heuristical value too, as demonstrated by Anscombe’s Quartet.

Visualizing linear models in 2D is straightforward, but visualizing linear models with more than one predictor is much less so. The aim of this post is to demonstrate some ways do visualize linear models with more than one predictor, using popular R packages. We will focus on 3D examples, that is, two predictors.

Setup

Some data

data(mtcars)

Some packages

library(tidyverse)
library(mosaic)

Baseline model

For the sake of simplicity, let’s start with a 2D model.

lm0 <- lm(mpg ~ hp, data = mtcars)

$m o s a i c$ `

plotModel(lm0)

$g g f or μ l a$ `

gf_point(mpg ~ hp, data = mtcars) %>% 
  gf_lm()

`ggplot2`

f0 <- function(x) {coef(lm0)[1] + coef(lm0)[2] * x}

ggplot(mtcars) +
  aes(x = hp, y = mpg) +
  geom_point() +
  stat_function(fun = f0)

`makeFun`

Extracting the function itself from a linear model can be achieved like this:

f0 <- makeFun(lm0)
f0
#> function (hp, ..., transformation = function (x) 
#> x) 
#> return(transformation(predict(model, newdata = data.frame(hp = hp), 
#>     ...)))
#> <environment: 0x7f8528a74eb0>
#> attr(,"coefficients")
#> (Intercept)          hp 
#> 30.09886054 -0.06822828

Then again:

ggplot(mtcars) +
  aes(x = hp, y = mpg) +
  geom_point() +
  stat_function(fun = f0)

Our (initial) linear model for 3D

lm1 <- lm(mpg ~ hp * disp, data = mtcars)

Note that this notation is synonymous to this one:

lm1 <- lm(mpg ~ hp + disp + hp:disp, data = mtcars)

`gf_function_2d()`

Let’s briefly check the range of the predictors:

preds <- c("hp", "disp")

mtcars %>% 
  dplyr::select(one_of(preds)) %>% 
  summarise_all(funs(min, max))
#>   hp_min disp_min hp_max disp_max
#> 1     52     71.1    335      472

We’ll use this information to gauge the limits of the following graph:

f1 <- makeFun(lm1)

p1 <- gf_function_2d(f1, xlim = c(50, 300), ylim = c(50, 500)) %>% 
  gf_labs(x = "hp", y = "disp") %>% 
  gf_refine(scale_fill_viridis_c())
p1

Computing a grid and plotting it

First we devise seom combinations of hp and disp values. Then we predict the mpg values based on those values using lm1.

hp_disp_grid <- expand_grid(hp = seq(50, 300, by = 10), disp = seq(50, 500, by= 10))


grid2 <- hp_disp_grid %>% 
  mutate(mpg_pred_lm1 = predict(lm1, newdata = data.frame(hp, disp)))

Check

gf_density(~mpg_pred_lm1, data = grid2)

Now add the predicted value to the mtcars dataframe too:

mtcars <- mtcars %>% 
  mutate(pred_lm1 = predict(lm1))

ggplot(aes(x = hp, y = disp), data = grid2) +
  geom_raster(aes(fill = mpg_pred_lm1)) +
  scale_fill_viridis_c() +
  geom_point(data = mtcars, aes(color = pred_lm1)) +
  scale_color_viridis_c()

`plotFun` as tile plot (heatmap)

mosaic provides a convenient function to plot linear models with predictors, plotFun().

plotFun(f1(hp, disp) ~ hp & disp)

`plotFun` as 3d surface plot

plotFun(f1(hp, disp) ~ hp & disp, surface = TRUE)

Plotly surface

For plotly surface, we need the data in wide format, ie., we need to spread them out, from long (tidy, standard) format to the wide format. In addition, there should be no more name (key) column, and it ought to be a matrix:

grid_wide <- grid2 %>% 
  pivot_wider(names_from = disp, values_from = mpg_pred_lm1) %>% 
  select(-1) %>% 
  as.matrix


rownames(grid_wide) <- seq(50, 300, by = 10)

xaxis <- list(
  title = "hp"
  #range = c(50, 300)
)

yaxis <- list(
  title = "disp"
  #range = c(50, 500)
)

zaxis <- list(
  title = "mpg"
  #range = c(10, 30)
)

p <- plot_ly(z = ~grid_wide, 
             x = seq(50, 300, by = 10),
             y = seq(50, 500, by = 10)) %>% 
  add_surface(
  contours = list(
    z = list(
      show=TRUE,
      usecolormap=TRUE,
      highlightcolor="#ff0000"
      #project=list(z=TRUE)
      )
    )
  ) %>%
    layout(
    title = "lm1 as 3d plane",
    scene = list(
      xaxis = xaxis,
      yaxis = yaxis,
      zaxis = zaxis
    ))

p

`persp()`

x <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 25)
y <- seq(min(mtcars$disp), max(mtcars$disp), length.out = 25)
z <- outer(x, y, function(x,y) {predict(lm1, data.frame(hp = x, disp = y))})
persp(x,y,z , theta = c(45, 135, 225, 315), col = "lightblue")

`contour()`

From package rsm we can make use of contour():

contour(lm1, hp ~ disp, image = TRUE)
contour(lm1, hp ~ disp, image = FALSE)

Scatterplot 3d

First plot the scatterplot:

d <- mtcars %>% 
  select(hp, disp, mpg)
lm1a <- lm(d$mpg ~ d$hp + d$disp)



p3 <- scatterplot3d(d)

Now add the regression plane:

p3$plane3d(lm1a)
#> Error in segments(x, z1, x + y.max * yx.f, z2 + yz.f * y.max, lty = ltya, : plot.new has not been called yet

For some reason, this code does not run in (my) rmarkdown setup, but it worked (on my machine) in a ordinary R script.

Some ways for plotting 3D linear models

Load packages

Motivation

Setup

Baseline model

mosaicmosaic`

ggforμlaggforμla`

ggplot2

makeFun