

What is Regression Analysis?

Description


Regression analysis underlies so many statistical techniques that explanations of it tend to be either too general or too specific. If one had to explain it in a single sentence to a curious person, it could be described as a method for discovering relationships between variables.

This useful and astonishing method of analysis was born from the ideas of Francis Galton, the father of eugenics.

While studying heredity, Galton came across data on the heights of fathers and their sons and noticed that, in general, tall fathers tended to have tall sons and short fathers tended to have short sons. This relationship was already common knowledge, but Galton focused on a different phenomenon: over generations, heights regress toward the mean.

Sons of tall fathers were indeed tall, but tended to be shorter than their fathers; sons of short fathers were indeed short, but tended to be taller than their fathers. This makes sense logically: otherwise, over generations, heights would drift ever further from the mean, diverging without bound or shrinking toward $0$.

Of course, height does not regress to the mean in every individual case, because there are unavoidable errors such as environmental factors or mutations. Nevertheless, the evident linear relationship must have persuaded Galton that ‘height is inherited.’

So, even if the prediction is not exact, shouldn’t it be possible to roughly predict a son’s height, with some error, just from the father’s height? If the father’s height $x$ and the son’s height $y$ are related by $y = a + b x$, then substituting a father’s height for $x$ gives a guess for his son’s height. The guess will not match perfectly, but on average it should fall reasonably close.
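To make the idea of $y = a + b x$ concrete, here is a minimal sketch in Python. It is not Galton's data or method: the heights are synthetic, and the population mean, the slope of 0.5, and the noise levels are assumptions chosen purely for illustration. The helper names `fit_line` and `predict` are likewise hypothetical. The line is fitted by ordinary least squares, one common way to choose $a$ and $b$.

```python
import random


def fit_line(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


def predict(a, b, x):
    """Guess the son's height from the father's height x."""
    return a + b * x


# Synthetic data (an assumption for illustration, not real measurements).
rng = random.Random(0)   # fixed seed so the run is reproducible
MEAN_HEIGHT = 175.0      # assumed population mean, in cm
TRUE_SLOPE = 0.5         # assumed: sons inherit half the father's deviation

fathers = [rng.gauss(MEAN_HEIGHT, 7.0) for _ in range(1000)]
sons = [
    MEAN_HEIGHT + TRUE_SLOPE * (f - MEAN_HEIGHT) + rng.gauss(0.0, 6.0)
    for f in fathers
]

a, b = fit_line(fathers, sons)
print(f"fitted line: y = {a:.2f} + {b:.3f} x")
print(f"father 190 cm -> predicted son {predict(a, b, 190.0):.1f} cm")
print(f"father 160 cm -> predicted son {predict(a, b, 160.0):.1f} cm")
```

Under these assumptions the fitted slope $b$ lies between $0$ and $1$, so the predicted son of a tall father is taller than average but shorter than his father, and the reverse holds for a short father. That is the regression toward the mean described above.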

This is how regression analysis came about. Today it is applied in an incredibly broad range of fields, most of which have nothing to do with generational change, so the term ‘regression’ has lost its original meaning. It is enough to know the etymology and move on.