Definition of Median in Basic Statistics 📂Data Science

Definition 1

Given $n$ quantitative data arranged in ascending order, the value located in the middle of the data is called the median $m$. If $n$ is odd, $m := x_{(n+1)/2}$ is used, and if $n$ is even, any value $m$ that satisfies the following is considered a median.
$$
x_{1} \le \cdots \le x_{ \lfloor {{ n+1 } \over { 2 }} \rfloor } \le m \le x_{ \lfloor {{ n+1 } \over { 2 }} \rfloor + 1} \le \cdots \le x_{n}
$$

Here, $\lfloor \cdot \rfloor : \mathbb{R} \to \mathbb{Z}$ is the floor function, so for even $n$ the two middle indices are $n/2$ and $n/2 + 1$.
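The definition can be read as describing an interval of admissible medians: a single point when $n$ is odd, and the closed interval between the two middle values $x_{n/2}$ and $x_{n/2+1}$ when $n$ is even. The following is a minimal sketch of that reading; the function name `median_interval` is hypothetical and chosen only for illustration.

```python
def median_interval(data):
    """Return (low, high) such that every m with low <= m <= high is a median."""
    xs = sorted(data)  # the definition assumes ascending order
    n = len(xs)
    if n == 0:
        raise ValueError("median is undefined for empty data")
    if n % 2 == 1:
        # odd n: the single middle value x_{(n+1)/2}, shifted to 0-based indexing
        mid = xs[(n + 1) // 2 - 1]
        return mid, mid
    # even n: x_{n/2} <= m <= x_{n/2 + 1} in 1-based notation
    return xs[n // 2 - 1], xs[n // 2]


print(median_interval([1, 2, 5, 8, 9]))      # (5, 5)
print(median_interval([1, 2, 2, 4, 7, 81]))  # (2, 4)
```

When both endpoints coincide, the median is unique; otherwise every value between them satisfies the definition.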

Explanation

The median serves as a measure of center which, compared to the mean, is less sensitive to outliers but is not guaranteed to be unique. As mentioned in the definition, if the number of samples is even, there can be infinitely many values that qualify as a median. In practice, the choice is simply narrowed down to one value, the midpoint of the two middle data points (for even $n$, these are $x_{n/2}$ and $x_{n/2+1}$):
$$
m := \left( x_{\lfloor {{ n+1 } \over { 2 }} \rfloor} + x_{ \lfloor {{ n+1 } \over { 2 }} \rfloor + 1} \right) / 2
$$
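As an illustration of how the choice gets narrowed down, Python's standard `statistics` module offers the midpoint convention as `median`, along with `median_low` and `median_high`, which return the smaller and larger of the two middle values. The sample data here is only an assumed example with an even number of points.

```python
import statistics

data = [1, 2, 2, 4, 7, 81]  # even number of samples, so the median is not unique

mid = statistics.median(data)        # midpoint of the two middle values
low = statistics.median_low(data)    # smaller of the two middle values
high = statistics.median_high(data)  # larger of the two middle values

print(low, mid, high)  # 2 3.0 4
```

All three results lie in the interval allowed by the definition; the midpoint is just the most common convention.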

For example, if the given data is $1, 2, 5, 8, 9$, the number of samples is odd, so the middle value $m = 5$ is the median. If the data is $1, 2, 2, 4, 7, 81$, then every $m$ with $2 \le m \le 4$ is a median, but it is usually reduced to $m = (2+4)/2 = 3$. Here, because of a large outlier like $81$, the mean rises to $97/6 \approx 16.17$, but the median stays at $3$, unaffected by how large the outlier is.
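The effect of the outlier can be checked numerically. The sketch below compares the data above with a hypothetical variant in which $81$ is replaced by $8$ (that second data set is an assumption made for illustration, not part of the original example).

```python
import statistics

with_outlier = [1, 2, 2, 4, 7, 81]
without_outlier = [1, 2, 2, 4, 7, 8]  # hypothetical: outlier 81 replaced by 8

mean_with = statistics.mean(with_outlier)
mean_without = statistics.mean(without_outlier)
median_with = statistics.median(with_outlier)
median_without = statistics.median(without_outlier)

print(f"mean:   {mean_with:.2f} vs {mean_without:.2f}")      # mean:   16.17 vs 4.00
print(f"median: {median_with:.2f} vs {median_without:.2f}")  # median: 3.00 vs 3.00
```

The mean moves from about $16.17$ to $4$ when the outlier is tamed, while the median is $3$ in both cases.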


  1. Mendenhall. (2012). Introduction to Probability and Statistics (13th Edition): p55.