# Correlation cannot be more extreme than +1/-1, proof using Cauchy-Schwarz inequality

library(tidyverse)

# The correlation coefficient cannot exceed an absolute value of 1

This is well known. But why is that the case? How can we prove it? This post gives one explanation using the Cauchy-Schwarz inequality.

Here’s one version of the definition of correlation:

$r = \frac{\sum(\Delta x \Delta y)}{\sqrt{\sum (\Delta x)^2} \sqrt{\sum (\Delta y)^2}}$

where $$\Delta x$$ and $$\Delta y$$ are the differences between each observation and its variable's mean, that is: $$\Delta x_i = x_i - \bar{x}$$, and similarly for $$\Delta y_i$$.

To ease notation, let’s proceed with the understanding that $$x$$ stands for the differences, i.e. $$\Delta x$$ (and similarly for $$y$$):

$r = \frac{\sum(xy)}{\sqrt{\sum x^2} \sqrt{\sum y^2}}$
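As a quick sanity check, the formula can be computed by hand and compared with R's built-in `cor()`. This is only a minimal sketch in base R: the data are simulated, and the names `x_raw`, `y_raw`, `r_manual` and `r_builtin` are made up for illustration.

```r
# Hypothetical example data; any two numeric vectors of equal length would do
set.seed(42)
x_raw <- rnorm(20)
y_raw <- 0.5 * x_raw + rnorm(20)

# Center both variables, so that x and y hold the differences from the mean
x <- x_raw - mean(x_raw)
y <- y_raw - mean(y_raw)

# Correlation computed directly from the formula above
r_manual <- sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))

# Built-in Pearson correlation on the uncentered data
r_builtin <- cor(x_raw, y_raw)

# The two values should agree up to floating-point rounding
all.equal(r_manual, r_builtin)
```

Note that `cor()` centers the data itself, which is why it is called on the raw vectors here.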

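Now, we claim that the absolute value of $$r$$ is at most 1, which covers both bounds, $$r \le 1$$ and $$r \ge -1$$:

$|r| = \frac{\left|\sum(xy)\right|}{\sqrt{\sum x^2} \sqrt{\sum y^2}} \le 1$

Let’s multiply the inequality by the denominator of the LHS, which is positive as long as neither variable is constant:

$\left|\sum(xy)\right| \le \sqrt{\sum x^2} \cdot \sqrt{\sum y^2}$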

The Cauchy-Schwarz inequality states that

$\big| \langle x,y\rangle \big| \leq \|x\| \cdot \|y\|$

In words, the absolute value of the inner product $$\langle x,y\rangle$$ is less than or equal to the product of the vector norms.
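To make the inequality concrete, here is a small numerical check in base R. It assumes the inner product is the sum of elementwise products and the norm is the Euclidean norm; the vectors and variable names are arbitrary and chosen only for illustration.

```r
# Hypothetical vectors, chosen only for illustration
x <- c(1, -2, 3, 0.5)
y <- c(-1, 4, 2, -3)

inner_product <- sum(x * y)   # <x, y>
norm_x <- sqrt(sum(x^2))      # ||x||
norm_y <- sqrt(sum(y^2))      # ||y||

# Cauchy-Schwarz: |<x, y>| is at most ||x|| * ||y||
abs(inner_product) <= norm_x * norm_y

# Equality holds when one vector is a multiple of the other
z <- -3 * x
all.equal(abs(sum(x * z)), norm_x * sqrt(sum(z^2)))
```

The second check hints at when a correlation hits the bounds: the centered vectors have to be exact multiples of each other.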

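With the standard inner product $$\langle x,y\rangle = \sum xy$$ and the Euclidean norm $$\|x\| = \sqrt{\sum x^2}$$, the Cauchy-Schwarz inequality reads:

$\left|\sum xy\right| \le \sqrt{\sum x^2} \cdot \sqrt{\sum y^2}$

This gives the inequality we wanted to prove in the first place, and because it bounds the absolute value of the sum, it covers both $$r \le 1$$ and $$r \ge -1$$.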
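The result can also be illustrated empirically, although a simulation is of course no substitute for the proof. The sketch below uses base R only; `cor_from_formula()` is a hypothetical helper written for this illustration, and the sample size and number of replications are arbitrary.

```r
set.seed(1)

# Hypothetical helper: correlation from centered values, following the formula in this post
cor_from_formula <- function(x_raw, y_raw) {
  x <- x_raw - mean(x_raw)
  y <- y_raw - mean(y_raw)
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# 1000 simulated pairs of samples, each of size 10
r_values <- replicate(1000, cor_from_formula(rnorm(10), rnorm(10)))

# All simulated correlations should fall within [-1, 1]
range(r_values)

# Perfectly linear data reach the bounds (up to floating-point rounding)
x_lin <- 1:10
r_pos <- cor_from_formula(x_lin, 2 * x_lin + 1)
r_neg <- cor_from_formula(x_lin, -2 * x_lin + 1)
c(r_pos, r_neg)
```

The helper would divide by zero for a constant vector, which matches the assumption that neither variable is constant.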

Here’s a quite nice intuition on the Cauchy-Schwarz inequality.