The mean minimizes the sum of squares

Load packages

library(tidyverse)

Properties of the arithmetic mean

The stuff presented here is far from new, that’s all well-known and basic. See here for a source.

The arithmetic mean has a number of properties …

Residuals cancel out

… such as that the residuals cancel out, i.e, the sum of the deviations from the mean (the residuals) sum up to zero:

\[\sum (x_i - \bar{x}) = \sum x_i - \sum \bar{x} = n \cdot \bar{x} - n \cdot \bar{x} = 0\]

The sum of the squared residuals is a minimum

Take all residuals, square them, then sum them up. The resulting number will be the smallest number with respect to all alternative values to the mean (such as the median or some arbitrary number).

Suppose we would like to know which number \(a\) will minimize the residuals:

\[\text{min}_a : \qquad f(x) = \sum (x_i - a)^2\]

Multiply out the binomial formula:

\[= \sum (x_i^2 - a x_i + a^2)\]

Unlock the sum:

\[ = \sum x_i^2 - \sum 2ax_i + \sum a^2\]

Pull the constant in front of the sum and simplify the last sum:

\[= \sum x_i^2 - 2a\sum x_i + n a^2\]

To find a minimum, differentiate with respect to a:

\[f'(x)= 0 - 2\cdot1 \sum x_i + 2na\]

Set to zero:

\[f'(x) = - 2\cdot1 \sum x_i + 2na = 0\]

Solve for \(a\):

\[a = \frac{\sum x_i}{n}\]

And that’s the mean of the values.

Since \(f''(x) = 2n > 0\), we have a minimum.

Done. 😄.