Cumulative Average Formula Derivation 📂Lemmas

Formula

Given a sample $x_{1} , \cdots , x_{n}$ with sample mean $y_{n}$, when a new observation $x_{n+1}$ arrives, the sample mean $y_{n+1}$ of all $n+1$ values is as follows. $$ y_{n+1} = {{ n } \over {n + 1}} y_{n} + {{1} \over {n+1}} x_{n+1} $$

Description

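The cumulative average is also called the running average. It is sometimes loosely called a moving average as well, although that term more often refers to an average taken over a fixed-size window of recent data.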

It’s a mistake almost anyone makes at least once in middle school: when scoring my exams, I used to take the average I had already calculated and simply average it with the next subject’s score. (At the time, I had no idea I would go on to major in statistics.) For example, let’s say I scored 90 in Korean and 80 in English, and math hadn’t been graded yet. My average at that point was $$ y_{2} = (90+80)/2 = 85 $$ If I then received 70 in math, the overall average should have been $$ y_{3} = (90+80+70)/3 = 80 $$ but I simply calculated $$ (85 + 70)/2 = 77.5 $$ I wasn’t exactly sure why, but once I noticed that adding everything up and dividing gives a different result from folding in one subject at a time, I always started over, adding every score from the beginning and dividing.

The formula introduced in this post is exactly what prevents such foolishness. Instead of recalculating from all the numbers, you only need the weighted average of the existing average and the new data, with weights $n$ and $1$ respectively. In the example above, $$ y_{3} = (2 \cdot 85 + 1 \cdot 70)/3 = 80 $$
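The grade example can be checked with a few lines of Python. This is only a minimal sketch: the function name `update_mean` and the variable names are made up for illustration, and the scores are the ones from the example above.

```python
def update_mean(prev_mean, n, new_value):
    """Mean of n + 1 values, given the mean of the first n and the new value."""
    return (n * prev_mean + new_value) / (n + 1)


scores = [90, 80, 70]  # Korean, English, math

mean_two = sum(scores[:2]) / 2                  # average before math is graded
naive = (mean_two + scores[2]) / 2              # the mistaken calculation
updated = update_mean(mean_two, 2, scores[2])   # weights n = 2 and 1
full = sum(scores) / len(scores)                # recomputed from scratch

print(mean_two, naive, updated, full)
```

The weighted update agrees with the average recomputed from scratch, while the naive version gives the new score as much weight as the two earlier scores combined.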

It may seem trivial, but surprisingly many non-specialists are unaware of this formula and make similar mistakes. It also tends to appear without any explanation in fields that deal with data streams, such as artificial intelligence. There, the cumulative average formula mainly serves to efficiently update learning-related quantities each time new data arrives.
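For the streaming setting, one possible sketch is a small object that stores only the current count and mean, so past data never has to be kept. The class name `RunningMean` and the sample stream are hypothetical, chosen only for illustration.

```python
class RunningMean:
    """Keeps a cumulative average without storing the past values."""

    def __init__(self):
        self.count = 0   # n, the number of values seen so far
        self.mean = 0.0  # y_n, the mean of those values

    def update(self, x):
        # y_{n+1} = (n * y_n + x_{n+1}) / (n + 1)
        self.mean = (self.count * self.mean + x) / (self.count + 1)
        self.count += 1
        return self.mean


stream = [4.0, 8.0, 6.0, 2.0]  # made-up data arriving one value at a time
rm = RunningMean()
history = [rm.update(x) for x in stream]
print(history)
```

Each update costs a constant amount of work and memory regardless of how many values have already arrived, which is why this form suits data streams.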

Derivation

$$ \begin{align*} y_{n+1} =& {{ x_{1} + \cdots + x_{n} + x_{n+1} } \over {n + 1}} \\ =& {{ n {{x_{1} + \cdots + x_{n}} \over {n}} + x_{n+1} } \over {n + 1}} \\ =& {{ n y_{n} + x_{n+1} } \over {n + 1}} \end{align*} $$
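As a side note not in the original derivation, the last line can be rearranged into an equivalent "old average plus correction" form, which is how the update is often written in practice:

$$ \begin{align*} y_{n+1} =& {{ n y_{n} + x_{n+1} } \over {n + 1}} \\ =& {{ (n+1) y_{n} - y_{n} + x_{n+1} } \over {n + 1}} \\ =& y_{n} + {{ 1 } \over {n + 1}} \left( x_{n+1} - y_{n} \right) \end{align*} $$

With the grades from the description, $y_{3} = 85 + (70 - 85)/3 = 80$, matching the result obtained by adding everything up.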