Load packages

library(tidyverse)

Motivation

The (simple) linear regression is a standard tool in data analysis and statistics. Its properties are well known, though the applied analyst may not know them in detail, which is fine. However, if one wishes to understand the internals more deeply, the question may arise how to derive the coefficients of the linear regression. Here’s one way.

This approach uses only simple calculus and derivatives, no matrix algebra, and covers only the simple case of one predictor.

There are many similar sources and tutorials around, for example this or here.

Here’s the workhorse

Simple linear regression is defined such that

${\hat{y}}_{i} = b_{0} + b_{1} x_{i}$

That’s the regression line.
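To make the formula concrete, here is a minimal sketch that evaluates the regression line by hand. It assumes the built-in `mtcars` data with `hp` as predictor and `mpg` as outcome (the example used further down in this post), and it borrows the coefficients from `lm()`, since they have not been derived yet at this point.

```r
# Assumption: mtcars with hp as predictor and mpg as outcome
fit <- lm(mpg ~ hp, data = mtcars)

b0 <- unname(coef(fit)["(Intercept)"])
b1 <- unname(coef(fit)["hp"])

# Regression line evaluated by hand: y_hat_i = b0 + b1 * x_i
y_hat <- b0 + b1 * mtcars$hp

# Should agree with the predictions that lm() reports
all.equal(y_hat, unname(predict(fit)))
```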

The regression line is optimal in the sense that it minimizes the sum of the squared vertical distances between the data points and the line. The following plot, taken from this source, shows these distances:

library(ggplot2)

d <- mtcars
fit <- lm(mpg ~ hp, data = d)

d$predicted <- predict(fit)   # Save the predicted values
d$residuals <- residuals(fit) # Save the residual values

ggplot(d, aes(x = hp, y = mpg)) +
  geom_smooth(method = "lm", se = FALSE, color = "lightgrey") +  # Plot regression line
  geom_segment(aes(xend = hp, yend = predicted), alpha = .2) +  # alpha to fade lines
  geom_point() +
  geom_point(aes(y = predicted), shape = 1) +
  theme_bw()  # Add theme for cleaner look

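In other words, the coefficients $b_{0}$ and $b_{1}$ are chosen so that the sum of the squared residuals is as small as possible. This sum is the cost function, $S$:

$S = \sum (y_{i} - {\hat{y}}_{i})^{2}$

Substituting the ${\hat{y}}_{i}$ values: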

$S = \sum (y_{i} - b_{0} - b_{1} x_{i})^{2}$
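As a quick sanity check, the cost function can be written down directly in R. The sketch below is only an illustration: the helper name `sse` is made up for this example, the data are `mtcars` with `mpg ~ hp`, and the step sizes used to nudge the coefficients are arbitrary.

```r
# Hypothetical helper: cost function S for given coefficients
sse <- function(b0, b1, x, y) {
  sum((y - b0 - b1 * x)^2)
}

x <- mtcars$hp
y <- mtcars$mpg

fit <- lm(mpg ~ hp, data = mtcars)
b <- unname(coef(fit))  # b[1] = intercept, b[2] = slope

s_min     <- sse(b[1], b[2], x, y)         # cost at the lm() coefficients
s_shifted <- sse(b[1] + 0.5, b[2], x, y)   # nudge the intercept (arbitrary step)
s_tilted  <- sse(b[1], b[2] + 0.01, x, y)  # nudge the slope (arbitrary step)

c(s_min = s_min, s_shifted = s_shifted, s_tilted = s_tilted)
```

Any nudge away from the fitted coefficients should increase $S$, which is what "minimizes the cost function" means in practice.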

To minimize a function, we take its first derivative and set it to zero. For functions of more than one variable, we take the partial derivative with respect to each parameter in turn. Let’s begin with $b_{0}$:

$\frac{\partial S}{\partial b_{0}} = \frac{\partial}{\partial b_{0}} [\sum (y_{i} - b_{0} - b_{1} x_{i})^{2}]$

Arg, now what? Chain rule to the rescue!

Differentiate the outer function (the square) first, leaving the inner function untouched.
Differentiate the inner function.
Multiply the two results.

$\frac{\partial S}{\partial b_{0}} = \sum 2 (y_{i} - b_{0} - b_{1} x_{i}) \cdot (- 1) = \sum - 2 (y_{i} - b_{0} - b_{1} x_{i})$

Note that the derivative of the inner function with respect to $b_{0}$ is simply $- 1$.
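If the chain rule step feels shaky, the derivative can be checked numerically. The following sketch compares the analytic expression with a central finite difference. The evaluation point (`b0 = 25`, `b1 = -0.05`) and the step size `h` are arbitrary assumptions, and the data are again `mtcars` with `hp` and `mpg`.

```r
x <- mtcars$hp
y <- mtcars$mpg

# Cost function S
S <- function(b0, b1) sum((y - b0 - b1 * x)^2)

# Analytic partial derivative w.r.t. b0, as obtained from the chain rule
dS_db0 <- function(b0, b1) sum(-2 * (y - b0 - b1 * x))

# Arbitrary evaluation point (not the fitted coefficients) and step size
b0 <- 25
b1 <- -0.05
h  <- 1e-4

# Central finite difference approximation of the same derivative
numeric_grad  <- (S(b0 + h, b1) - S(b0 - h, b1)) / (2 * h)
analytic_grad <- dS_db0(b0, b1)

c(numeric = numeric_grad, analytic = analytic_grad)
```

The two numbers should agree up to rounding error, because $S$ is quadratic in $b_{0}$.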

Set to zero to get the minimum.

$\sum - 2 (y_{i} - b_{0} - b_{1} x_{i}) = 0$

Now pull the -2 out of the summation and divide both sides of the equation by -2:

$\sum (y_{i} - b_{0} - b_{1} x_{i}) = 0$

Split the summation term by term:

$\sum y_{i} - \sum b_{0} - b_{1} \sum x_{i} = 0$

Note that summing a constant over $n$ observations gives $n$ times the constant: $\sum k = n k$.

Applying this rule to $\sum b_{0}$, we get:

$\sum y_{i} - n b_{0} - b_{1} \sum x_{i} = 0$

Now solve for $b_{0}$ :

$b_{0} = \frac{\sum y_{i} - b_{1} \sum x_{i}}{n}$

Since $\frac{\sum y_{i}}{n} = \bar{y}$ and $\frac{\sum x_{i}}{n} = \bar{x}$, that’s simply an expression in the means of $y$ and $x$!

$b_{0} = \bar{y} - b_{1} \bar{x}$
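A short check of the intercept formula in R. This is a sketch under stated assumptions: it uses `mtcars` with `mpg ~ hp`, and it borrows the slope from `lm()` because $b_{1}$ has not been derived yet at this point.

```r
x <- mtcars$hp
y <- mtcars$mpg
fit <- lm(mpg ~ hp, data = mtcars)

# Slope borrowed from lm(), because b1 has not been derived yet
b1 <- unname(coef(fit)["hp"])

# Intercept from the formula b0 = mean(y) - b1 * mean(x)
b0 <- mean(y) - b1 * mean(x)

c(by_hand = b0, lm = unname(coef(fit)["(Intercept)"]))

# Consequence: the fitted line passes through the point of means
b0 + b1 * mean(x)  # should equal mean(y)
```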

Finding $b_{1}$

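Get the partial derivative w.r.t. $b_{1}$. The chain rule works as before, but this time the derivative of the inner function is $- x_{i}$:

$\frac{\partial S}{\partial b_{1}} = \sum - 2 x_{i} (y_{i} - b_{0} - b_{1} x_{i})$

Setting this to zero gives: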

$\sum - 2 x_{i} (y_{i} - b_{0} - b_{1} x_{i}) = 0$

Divide by -2:

$\sum x_{i} (y_{i} - b_{0} - b_{1} x_{i}) = 0$
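Both zero-derivative conditions can be verified on a fitted model, because the term in parentheses is just the residual. A minimal sketch, assuming `mtcars` with `mpg ~ hp` and the coefficients found by `lm()`:

```r
x <- mtcars$hp
y <- mtcars$mpg
fit <- lm(mpg ~ hp, data = mtcars)

# Residuals: y_i - b0 - b1 * x_i at the fitted coefficients
e <- unname(residuals(fit))

# Condition from the derivative w.r.t. b0: residuals sum to zero
sum(e)

# Condition from the derivative w.r.t. b1: x-weighted residuals sum to zero
sum(x * e)
```

Both sums should be zero up to floating-point rounding.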

Distribute $x_{i}$ over the terms in the parentheses:

$\sum (x_{i} y_{i} - b_{0} x_{i} - b_{1} x_{i}^{2}) = 0$

As we already know $b_{0}$, let’s substitute it back in:

$\sum (x_{i} y_{i} - (\bar{y} - b_{1} \bar{x}) x_{i} - b_{1} x_{i}^{2}) = 0$

Split into two sums, one without $b_{1}$ and one with it:

$\sum (x_{i} y_{i} - \bar{y} x_{i}) + \sum (b_{1} \bar{x} x_{i} - b_{1} x_{i}^{2}) = 0$

We would like to isolate $b_{1}$, so let’s factor it out of the second sum:

$\sum (x_{i} y_{i} - \bar{y} x_{i}) + b_{1} \sum (\bar{x} x_{i} - x_{i}^{2}) = 0$

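To finally isolate $b_{1}$, subtract the first sum from both sides and divide by the second. Multiplying numerator and denominator by $- 1$ flips the signs in the denominator: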

$b_{1} = \frac{\sum (x_{i} y_{i} - \bar{y} x_{i})}{\sum (x_{i}^{2} - \bar{x} x_{i})}$

Yeah! Here we go, there we have $b_{1}$ .
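To see that the derivation holds up, here is a small sketch that computes both coefficients by hand and compares them with `lm()`. It assumes `mtcars` with `hp` as predictor and `mpg` as outcome. The last line uses the identity $b_{1} = \text{cov}(x, y) / \text{var}(x)$, which is a standard equivalent form but is not derived in this post.

```r
x <- mtcars$hp
y <- mtcars$mpg

# Slope from the formula derived above
b1 <- sum(x * y - mean(y) * x) / sum(x^2 - mean(x) * x)

# Intercept from b0 = mean(y) - b1 * mean(x)
b0 <- mean(y) - b1 * mean(x)

# Compare with the coefficients that lm() reports
fit <- lm(mpg ~ hp, data = mtcars)
rbind(by_hand = c(b0 = b0, b1 = b1), lm = unname(coef(fit)))

# Equivalent form of the slope (standard identity, not derived here)
b1_cov <- cov(x, y) / var(x)
```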

Simple derivation of linear regression coefficients
