Simple derivation of linear regression coefficients

Load packages

library(tidyverse)

Motivation

Simple linear regression is a standard tool in data analysis and statistics. Its properties are well known, though the applied analyst does not always know them in detail, which is ok. However, if one wishes to understand the internals more deeply, the question may arise how to derive the coefficients of the linear regression. Here’s one way.

This approach uses only simple calculus (derivatives), no matrix algebra, and covers only the simple case of one predictor.

There are many similar sources and tutorials around, for example this or here.

Here’s the workhorse

Simple linear regression is defined such that

$$\hat{y}_i = b_0 + b_1 x_i$$

That’s the regression line.

The regression line is optimal in the sense that it minimizes the sum of the squared vertical distances between the observed points and the line; for example, as plotted with code from this source:

library(ggplot2)

d <- mtcars
fit <- lm(mpg ~ hp, data = d)

d$predicted <- predict(fit)   # Save the predicted values
d$residuals <- residuals(fit) # Save the residual values

ggplot(d, aes(x = hp, y = mpg)) +
  geom_smooth(method = "lm", se = FALSE, color = "lightgrey") +  # Plot regression line
  geom_segment(aes(xend = hp, yend = predicted), alpha = .2) +  # alpha to fade lines
  geom_point() +
  geom_point(aes(y = predicted), shape = 1) +
  theme_bw()  # Add theme for cleaner look

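In other words, the coefficients $b_0$ and $b_1$ are chosen so that the residuals minimize this cost function, S (all sums run over the observations $i = 1, \dots, n$):

$$S = \sum (y_i - \hat{y}_i)^2$$

Substituting the $\hat{y}_i$ values:

$$S = \sum (y_i - b_0 - b_1 x_i)^2$$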
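As a rough sketch, the cost function can be written as a small R function and evaluated at a few candidate lines. The helper name `sse` and the nudged coefficient values are illustrative choices, not part of the derivation; the data are the `mtcars` example (`mpg` on `hp`) from the plot.

```r
# Cost function S: sum of squared residuals for a candidate line (b0, b1).
# `sse` is a hypothetical helper name introduced for illustration.
sse <- function(b0, b1, x, y) {
  sum((y - b0 - b1 * x)^2)
}

x <- mtcars$hp
y <- mtcars$mpg

fit <- lm(mpg ~ hp, data = mtcars)
b <- unname(coef(fit))  # b[1] is the intercept, b[2] the slope

s_fit     <- sse(b[1], b[2], x, y)         # cost at the fitted coefficients
s_shifted <- sse(b[1] + 1, b[2], x, y)     # cost after nudging the intercept
s_tilted  <- sse(b[1], b[2] + 0.01, x, y)  # cost after nudging the slope

c(fit = s_fit, shifted = s_shifted, tilted = s_tilted)
```

Any nudge away from the fitted coefficients should increase S, which is what "minimizes" means here.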

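To minimize a function, we take the first derivative and set it to zero. For functions with more than one variable, we take the partial derivative with respect to each parameter. Let’s begin with $b_0$:

$$\frac{\partial S}{\partial b_0} = \frac{\partial}{\partial b_0} \left[ \sum (y_i - b_0 - b_1 x_i)^2 \right]$$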

Arg, now what? Chain rule to the rescue!

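  • Differentiate the outer function first (the square brings down a factor of 2).
  • Multiply by the inner function, left unchanged.
  • Differentiate the inner function and multiply with what we already have.

$$\frac{\partial S}{\partial b_0} = \sum -2 (y_i - b_0 - b_1 x_i)$$

Note that the derivative of the inner function with respect to $b_0$ is simply $-1$, which is where the minus sign comes from.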
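One way to convince yourself that the chain rule was applied correctly is to compare the analytic derivative with a finite-difference approximation. The following is only a sketch: the trial values for `b0` and `b1` and the step size `h` are arbitrary assumptions, and `mtcars` is again used as example data.

```r
x <- mtcars$hp
y <- mtcars$mpg

# Cost function S as a function of the two coefficients
S <- function(b0, b1) sum((y - b0 - b1 * x)^2)

# Arbitrary trial values (assumptions for illustration, not the solution)
b0 <- 10
b1 <- 0.05

# Analytic partial derivative w.r.t. b0 from the chain rule
analytic <- sum(-2 * (y - b0 - b1 * x))

# Central finite difference in the b0 direction
h <- 1e-4
numeric <- (S(b0 + h, b1) - S(b0 - h, b1)) / (2 * h)

c(analytic = analytic, numeric = numeric)
```

The two numbers should agree up to rounding error.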

Set to zero to get the minimum.

$$\sum -2 (y_i - b_0 - b_1 x_i) = 0$$

Now pull out the $-2$ from the summation and divide both sides of the equation by $-2$:

$$\sum (y_i - b_0 - b_1 x_i) = 0$$
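This condition says that the residuals of the fitted line sum to zero. A quick check in R, assuming the `mtcars` example and taking the coefficients from `lm()`:

```r
fit <- lm(mpg ~ hp, data = mtcars)
b <- unname(coef(fit))  # b[1] is the intercept, b[2] the slope

# Sum of (y_i - b0 - b1 * x_i) at the fitted coefficients
resid_sum <- sum(mtcars$mpg - b[1] - b[2] * mtcars$hp)
resid_sum  # zero up to floating-point error
```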

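Distribute the summation over the terms:

$$\sum y_i - \sum b_0 - \sum b_1 x_i = 0$$

Note that the sum of a constant is n times the constant: $\sum_{i=1}^{n} k = nk$.

Using this identity for $b_0$, and pulling the constant $b_1$ out of its sum, we get:

$$\sum y_i - n b_0 - b_1 \sum x_i = 0$$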

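Now solve for $b_0$:

$$b_0 = \frac{\sum y_i - b_1 \sum x_i}{n}$$

Since $\sum y_i / n = \bar{y}$ and $\sum x_i / n = \bar{x}$, that’s simply the means of y and x!

$$b_0 = \bar{y} - b_1 \bar{x}$$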

Finding b1

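Get the partial derivative w.r.t. $b_1$:

$$\frac{\partial S}{\partial b_1} = \frac{\partial}{\partial b_1} \left[ \sum (y_i - b_0 - b_1 x_i)^2 \right]$$

By the chain rule again, where the derivative of the inner function is now $-x_i$, that gives us, set to zero:

$$\sum -2 x_i (y_i - b_0 - b_1 x_i) = 0$$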

Divide by -2:

$$\sum x_i (y_i - b_0 - b_1 x_i) = 0$$
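In words, this condition says that the residuals, weighted by the predictor values, sum to zero. A small sketch of that check, again assuming the `mtcars` example and coefficients from `lm()`:

```r
fit <- lm(mpg ~ hp, data = mtcars)
b <- unname(coef(fit))  # b[1] is the intercept, b[2] the slope

x <- mtcars$hp
y <- mtcars$mpg

# Sum of x_i * (y_i - b0 - b1 * x_i) at the fitted coefficients
weighted_resid_sum <- sum(x * (y - b[1] - b[2] * x))
weighted_resid_sum  # zero up to floating-point error
```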

Distribute $x_i$:

$$\sum (x_i y_i - b_0 x_i - b_1 x_i^2) = 0$$

As we already know $b_0$, let’s substitute that back in:

$$\sum \left( x_i y_i - (\bar{y} - b_1 \bar{x}) x_i - b_1 x_i^2 \right) = 0$$

Split into two sums:

$$\sum (x_i y_i - \bar{y} x_i) + \sum (b_1 \bar{x} x_i - b_1 x_i^2) = 0$$

We would like to isolate $b_1$, so let’s factor it out of the second sum:

$$\sum (x_i y_i - \bar{y} x_i) + b_1 \sum (\bar{x} x_i - x_i^2) = 0$$

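To finally isolate $b_1$, subtract the first sum and divide by the second; flipping the signs in the denominator absorbs the resulting minus sign:

$$b_1 = \frac{\sum (x_i y_i - \bar{y} x_i)}{\sum (x_i^2 - \bar{x} x_i)}$$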

Yeah! Here we go, there we have $b_1$.
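To see the two formulas at work, here is a minimal sketch that computes $b_1$ and $b_0$ by hand and compares them with the coefficients from `lm()`. It assumes the `mtcars` example (`mpg` on `hp`); the object names `by_hand` and `by_lm` are made up for illustration.

```r
x <- mtcars$hp
y <- mtcars$mpg

# Slope: sum(x_i * y_i - ybar * x_i) / sum(x_i^2 - xbar * x_i)
b1 <- sum(x * y - mean(y) * x) / sum(x^2 - mean(x) * x)

# Intercept: ybar - b1 * xbar
b0 <- mean(y) - b1 * mean(x)

by_hand <- c(b0 = b0, b1 = b1)
by_lm <- unname(coef(lm(mpg ~ hp, data = mtcars)))

# Side-by-side comparison, then a numerical equality check
rbind(by_hand = by_hand, by_lm = by_lm)
all.equal(unname(by_hand), by_lm)
```

The same slope also comes out of `cov(x, y) / var(x)`, which is an equivalent way of writing the formula for $b_1$.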