
Definition of Median in Basic Statistics


Definition 1

Given $n$ quantitative observations sorted in ascending order, $x_{1} \le \cdots \le x_{n}$, the value located in the middle of the data is called the median $m$. If $n$ is odd, $m := x_{(n+1)/2}$. If $n$ is even, any value $m$ that satisfies the following is considered a median. $$ x_{1} \le \cdots \le x_{ \lceil {{ n } \over { 2 }} \rceil } \le m \le x_{ \lceil {{ n } \over { 2 }} \rceil + 1} \le \cdots \le x_{n} $$

Here, $\lceil \cdot \rceil : \mathbb{R} \to \mathbb{Z}$ is the ceiling function.
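As a minimal sketch of the definition, the following Python function returns the range of values that qualify as a median. The name `median_interval` is hypothetical and chosen only for illustration. It assumes the 1-based index $\lceil n/2 \rceil$ for the lower of the two middle values, which coincides with $(n+1)/2$ when $n$ is odd.

```python
import math

def median_interval(data):
    """Return (low, high): every m with low <= m <= high is a median."""
    xs = sorted(data)                 # the definition assumes ascending order
    n = len(xs)
    if n == 0:
        raise ValueError("median is undefined for empty data")
    k = math.ceil(n / 2)              # 1-based index of the lower middle value
    low = xs[k - 1]                   # shift to Python's 0-based indexing
    # Odd n: a single middle value. Even n: the next value bounds the range.
    high = low if n % 2 == 1 else xs[k]
    return low, high

print(median_interval([1, 2, 5, 8, 9]))      # (5, 5)
print(median_interval([1, 2, 2, 4, 7, 81]))  # (2, 4)
```

When `low == high` the median is unique. Otherwise every value in the closed range is a median in the sense of the definition.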

Explanation

The median is a measure of center that, compared to the mean, is less sensitive to outliers, but it is not guaranteed to be unique. As the definition states, when the number of observations is even, every value between the two middle observations is a median, so there can be infinitely many. In practice, a single value is chosen by convention, namely the midpoint of the two middle observations. $$ m := \left( x_{\lceil {{ n } \over { 2 }} \rceil} + x_{ \lceil {{ n } \over { 2 }} \rceil + 1} \right) / 2 $$

For example, if the given data is $$ 1,2,5,8,9 $$ then the number of observations is odd, so the middle value $m = 5$ is the median. If the data is $$ 1,2,2,4,7,81 $$ then every $m$ with $2 \le m \le 4$ is a median, but by convention it is reduced to $m = (2+4)/2 = 3$. Here the large outlier $81$ pulls the mean up to $97/6 \approx 16.17$, while the median is unaffected by how large that outlier is.
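The two examples can be checked with Python's standard `statistics` module. This is only an illustrative check, not part of the cited textbook. The list `tamed`, in which the outlier is replaced by $8$, is an assumption added here to show how the mean reacts while the median does not.

```python
import statistics

odd_data = [1, 2, 5, 8, 9]
even_data = [1, 2, 2, 4, 7, 81]

print(statistics.median(odd_data))           # 5
print(statistics.median(even_data))          # 3.0, midpoint of 2 and 4
print(statistics.median_low(even_data))      # 2, lower middle value
print(statistics.median_high(even_data))     # 4, upper middle value
print(round(statistics.mean(even_data), 2))  # 16.17

# Hypothetical variant: replace the outlier 81 with 8.
tamed = [1, 2, 2, 4, 7, 8]
mean_tamed = statistics.mean(tamed)      # the mean drops from about 16.17 to 4
median_tamed = statistics.median(tamed)  # the median is still 3.0
print(mean_tamed, median_tamed)
```

`median_low` and `median_high` return the two endpoints of the range of admissible medians, while `median` applies the midpoint convention.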


  1. Mendenhall. (2012). Introduction to Probability and Statistics (13th Edition): p55.