This note proves Exercise 7 in Section 3.7 of An Introduction to Statistical Learning: for simple linear regression, \(R^2 = Cor(X, Y)^2\).
For \(n\) observations \((x_i, y_i), i = 1, \dots, n\), let:

\[
ss_{xx} = \sum_{i=1}^n (x_i - \bar x)^2 = \sum_{i=1}^n x_i^2 - n {\bar x}^2
\tag{1}\label{eq1}
\]
Substituting \(x\) with \(y\), we get:

\[
ss_{yy} = \sum_{i=1}^n (y_i - \bar y)^2 = \sum_{i=1}^n y_i^2 - n {\bar y}^2
\tag{2}\label{eq2}
\]
And:

\[
ss_{xy} = \sum_{i=1}^n (x_i - \bar x)(y_i - \bar y) = \sum_{i=1}^n x_i y_i - n \bar x \bar y
\tag{3}\label{eq3}
\]
For the correlation between \(X\) and \(Y\), also denoted \(Cor(X, Y)\):

\[
Cor(X, Y) = r = \frac{ss_{xy}}{\sqrt{ss_{xx}\,ss_{yy}}}
\tag{4}\label{eq4}
\]
Here \(\bar x = \frac{\sum_{i=1}^n x_i}{n}\), \(\bar y = \frac{\sum_{i=1}^n y_i}{n}\).
See Correlation Coefficient and Least Squares Fitting for the detailed reasoning.
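As a quick numerical sanity check of eq\(\eqref{eq1}\)–\(\eqref{eq4}\) (a sketch; the sample arrays are arbitrary), the closed form for \(r\) agrees with NumPy's built-in correlation coefficient:

```python
import numpy as np

# Arbitrary sample data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

n = len(x)
xbar, ybar = x.mean(), y.mean()

# eq (1)-(3): the ss quantities, in centered form
ss_xx = np.sum((x - xbar) ** 2)
ss_yy = np.sum((y - ybar) ** 2)
ss_xy = np.sum((x - xbar) * (y - ybar))

# The expanded forms give the same values
assert np.isclose(ss_xx, np.sum(x ** 2) - n * xbar ** 2)
assert np.isclose(ss_xy, np.sum(x * y) - n * xbar * ybar)

# eq (4): r = ss_xy / sqrt(ss_xx * ss_yy)
r = ss_xy / np.sqrt(ss_xx * ss_yy)
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```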
For the fitted regression line \(y = a + bx\), \(a\) and \(b\) are the values that minimize \(RSS\), where

\[
RSS = \sum_{i=1}^n (y_i - a - b x_i)^2
\tag{5}\label{eq5}
\]
Setting the partial derivatives to zero, we have:

\[
\frac{\partial RSS}{\partial a} = -2 \sum_{i=1}^n (y_i - a - b x_i) = 0
\]

\[
\frac{\partial RSS}{\partial b} = -2 \sum_{i=1}^n x_i (y_i - a - b x_i) = 0
\]
These lead to:

\[
n a + b \sum_{i=1}^n x_i = \sum_{i=1}^n y_i
\tag{6}\label{eq6}
\]

\[
a \sum_{i=1}^n x_i + b \sum_{i=1}^n x_i^2 = \sum_{i=1}^n x_i y_i
\tag{7}\label{eq7}
\]
Since \(\sum_{i=1}^n x_i = n \bar x\), eq\(\eqref{eq7}\) can be written as:

\[
a n \bar x + b \sum_{i=1}^n x_i^2 = \sum_{i=1}^n x_i y_i
\tag{8}\label{eq8}
\]
From eq\(\eqref{eq6}\) we have:

\[
a = \bar y - b \bar x
\tag{9}\label{eq9}
\]
Substituting eq\(\eqref{eq1}\), \(\eqref{eq3}\) and \(\eqref{eq9}\) into eq\(\eqref{eq8}\), we get:

\[
(\bar y - b \bar x) n \bar x + b \sum_{i=1}^n x_i^2 = \sum_{i=1}^n x_i y_i
\]

\[
b \left( \sum_{i=1}^n x_i^2 - n {\bar x}^2 \right) = \sum_{i=1}^n x_i y_i - n \bar x \bar y
\]

\[
b = \frac{ss_{xy}}{ss_{xx}}
\tag{10}\label{eq10}
\]
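The closed forms \(a = \bar y - b \bar x\) and \(b = ss_{xy} / ss_{xx}\) can be checked against a library least-squares fit (a sketch with arbitrary data; `np.polyfit` with degree 1 returns the fitted slope and intercept):

```python
import numpy as np

# Arbitrary sample data, for illustration only
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.8, 5.1, 6.9, 9.2])

xbar, ybar = x.mean(), y.mean()
ss_xx = np.sum((x - xbar) ** 2)
ss_xy = np.sum((x - xbar) * (y - ybar))

b = ss_xy / ss_xx    # eq (10)
a = ybar - b * xbar  # eq (9)

# Least-squares fit of a degree-1 polynomial: coefficients are [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
assert np.isclose(b, slope)
assert np.isclose(a, intercept)
```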
Substituting eq\(\eqref{eq9}\) and \(\eqref{eq10}\) into eq\(\eqref{eq5}\), we get:

\[
RSS = \sum_{i=1}^n \bigl( (y_i - \bar y) - b (x_i - \bar x) \bigr)^2
= ss_{yy} - 2 b\, ss_{xy} + b^2 ss_{xx}
= ss_{yy} - \frac{ss_{xy}^2}{ss_{xx}}
\tag{11}\label{eq11}
\]
Substituting eq\(\eqref{eq11}\) into equation (3.17) of An Introduction to Statistical Learning, and noting that \(TSS = \sum_{i=1}^n (y_i - \bar y)^2 = ss_{yy}\), we have:

\[
R^2 = \frac{TSS - RSS}{TSS}
= 1 - \frac{ss_{yy} - ss_{xy}^2 / ss_{xx}}{ss_{yy}}
= \frac{ss_{xy}^2}{ss_{xx}\,ss_{yy}}
\]
With eq\(\eqref{eq4}\), we get \(R^2 = Cor(X, Y)^2\).
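Putting it all together, a short numeric sketch (arbitrary data again) confirms that the \(R^2\) of the least-squares fit equals the squared correlation:

```python
import numpy as np

# Arbitrary sample data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.9, 4.2, 5.8, 8.3, 9.7, 12.1])

xbar, ybar = x.mean(), y.mean()
ss_xx = np.sum((x - xbar) ** 2)
ss_yy = np.sum((y - ybar) ** 2)
ss_xy = np.sum((x - xbar) * (y - ybar))

b = ss_xy / ss_xx    # eq (10)
a = ybar - b * xbar  # eq (9)

rss = np.sum((y - a - b * x) ** 2)  # ISLR eq (3.16)
tss = ss_yy                         # TSS = sum of (y_i - ybar)^2
r_squared = 1 - rss / tss           # ISLR eq (3.17)

# R^2 equals the squared correlation, as proved above
assert np.isclose(r_squared, np.corrcoef(x, y)[0, 1] ** 2)
```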
Here \(RSS\), \(R^2\), \(TSS\) and \(Cor(X, Y)\) are defined in Equations (3.16)–(3.18) of An Introduction to Statistical Learning. \(ss\) and \(r\) are defined in Correlation Coefficient and Least Squares Fitting.
Other references: