Three Representative Values of Statistics: Mode, Median, Average 📂Mathematical Statistics

Overview

Measures of central tendency are statistics that summarize a data set by identifying its central position. With thousands or millions of data points, examining each one individually is rarely practical or necessary. What matters is understanding what the data represent, and measures of central tendency condense that information into a single value. The three most commonly used measures are the mode, the median, and the mean.

  • (0) Mode: The value that appears most frequently in a data set
  • (1) Median: The middle value of the sorted data set, that is, a value such that at least half of the observations are less than or equal to it and at least half are greater than or equal to it
  • (2) Mean: The value obtained by dividing the sum of all the observations in the data set by the number of observations

Examples

For instance, let’s say we roll a die $10$ times and get $1,1,2,3,3,4,6,6,6,6$. Then, the mode is the most frequently occurring value $6$, the median is the value between $3$ and $4$, which is $\displaystyle {{3 + 4} \over {2} } = 3.5$, and the mean is calculated as $\displaystyle {{38} \over {10}} = 3.8$.
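The same three values can be reproduced with Python's standard `statistics` module. This is only a minimal sketch of the arithmetic above; the variable names are chosen here for illustration.

```python
from statistics import mean, median, multimode

rolls = [1, 1, 2, 3, 3, 4, 6, 6, 6, 6]  # the ten die rolls from the text

modes = multimode(rolls)   # list of all most frequent values
middle = median(rolls)     # averages the two middle values when n is even
average = mean(rolls)      # sum of the observations divided by their count

print(modes, middle, average)  # [6] 3.5 3.8
```

`multimode` is used rather than `mode` because it returns every tied value, which makes it visible when a data set has more than one mode.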

  • (0): The mode is often used for qualitative data that have no numerical value or no meaningful order, such as favorite political parties or favorite numbers.
  • (1): The median is used for data where order matters, such as income or grades, because it differentiates data based on rank.
  • (2): The mean is the most commonly used measure of central tendency, but it is sensitive to outliers, and it can misrepresent the data even when the sample is large. For economic indicators, especially at the national level, it is more common to use quantiles, such as the top 10% of earners or low-income groups. As income inequality widens, the mean becomes less representative, and distinguishing between the median and the mean becomes critical.
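The sensitivity of the mean to outliers can be seen in a small sketch. The income figures below are hypothetical, invented purely for illustration, and are not taken from any real data set.

```python
from statistics import mean, median

# Hypothetical monthly incomes (arbitrary units), invented for illustration.
incomes = [210, 230, 250, 260, 280, 300, 320]
with_outlier = incomes + [5000]  # one very high earner joins the sample

mean_before, median_before = mean(incomes), median(incomes)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)  # mean is about 264.3, median is 260
print(mean_after, median_after)    # mean jumps to 856.25, median moves to 270.0
```

After the single extreme value is added, the mean exceeds every one of the original seven incomes, while the median shifts by only 10 units.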

If it is already intuitively clear why these statistics matter, that is great. Otherwise, it may help to look at their mathematical properties and the proofs behind them. Those properties also explain why the items here are numbered 0, 1, 2 rather than 1, 2, 3.
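One common way to read the 0, 1, 2 numbering, offered here as an interpretation rather than as the author's stated derivation, is as the exponent $p$ in the loss $\sum_{i} |x_i - c|^{p}$. The mode minimizes the number of mismatches ($p = 0$), the median minimizes the sum of absolute deviations ($p = 1$), and the mean minimizes the sum of squared deviations ($p = 2$). The sketch below checks this numerically on the die rolls; the helper `loss` and the candidate grid are assumptions made for the illustration.

```python
from statistics import mean, median, mode

data = [1, 1, 2, 3, 3, 4, 6, 6, 6, 6]  # die rolls from the example above

def loss(c, p):
    """Total deviation of the data from a candidate center c."""
    if p == 0:
        return sum(1 for x in data if x != c)   # number of mismatches
    return sum(abs(x - c) ** p for x in data)   # p = 1 or p = 2

centers = {0: mode(data), 1: median(data), 2: mean(data)}

# Candidate centers 0.00, 0.01, ..., 7.00 on a fine grid.
candidates = [i / 100 for i in range(701)]

for p, center in centers.items():
    best = min(loss(c, p) for c in candidates)
    # Each statistic should do at least as well as every candidate.
    assert loss(center, p) <= best + 1e-9
```

A grid search is not a proof, but it shows that no candidate between 0 and 7 beats the mode, median, or mean under its own loss.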

Definitions

Let’s denote a random variable as $X$ and its probability density function as $f(x)$.

  • (0’) Mode: $\displaystyle \operatorname*{arg\,max}_{x} f(x)$
  • (1’) Median: The value $m$ satisfying $\displaystyle \int_{- \infty}^{m} f(t) dt = {{1} \over {2}}$
  • (2’) Mean: $\displaystyle \int_{-\infty}^{\infty} x f(x) dx$

Description

These measures of central tendency can be defined for probability distributions as well as for samples. For readers familiar with mathematical statistics, the equations above may be the clearest way to understand them.

  • (0’): The mode, simply put, is the value of $x$ at which the probability density function $f(x)$ reaches its maximum.
  • (1’): The median is the value of $x$ at which the cumulative probability $\displaystyle \int_{-\infty}^{x} f(t) dt$ reaches $0.5$, so that half of the probability lies on each side.
  • (2’): The mean is the expected value of the distribution.
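The three definitions can be approximated numerically on a grid. The sketch below assumes an example density $f(x) = x e^{-x}$ on $x \ge 0$ (a gamma density, chosen here only for illustration) and uses a simple midpoint rule. The grid size and the cutoff at $40$ are arbitrary assumptions.

```python
import math

def density(x):
    """Assumed example density f(x) = x * exp(-x) on x >= 0."""
    return x * math.exp(-x)

# Midpoint grid on [0, 40]; the tail beyond 40 is negligible for this density.
n, upper = 40_000, 40.0
h = upper / n
xs = [(i + 0.5) * h for i in range(n)]
fs = [density(x) for x in xs]

# (0') Mode: grid point where the density is largest.
mode_x = xs[max(range(n), key=fs.__getitem__)]

# (2') Mean: midpoint-rule approximation of the integral of x * f(x).
mean_x = sum(x * f for x, f in zip(xs, fs)) * h

# (1') Median: first grid point at which the accumulated area reaches 1/2.
area = 0.0
median_x = None
for x, f in zip(xs, fs):
    area += f * h
    if area >= 0.5:
        median_x = x
        break

print(mode_x, median_x, mean_x)
```

For this density the printed values should be close to $1$, $1.68$, and $2$ respectively.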

Let’s take a look at where these measures of central tendency are typically located in the following diagrams. The red arrow represents the mode, the blue arrow represents the median, and the green arrow represents the mean.

Non-Unimodal Distributions

[Figure 20181031_172502.png: a non-unimodal density with the mode (red), median (blue), and mean (green) marked]

Being non-unimodal means that the distribution has multiple peaks, as shown above. Such distributions are not the usual focus of statistics or probability theory, but they are frequently seen in college exam results: not everyone competes equally in a course, since some students study while others play, and the scores cluster into separate groups. The mode sits at the highest peak, while the median and mean follow no fixed order and may fall between the peaks, where few observations lie.
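The exam-score situation can be sketched with a small data set. The scores below are hypothetical, invented purely for illustration, with a smaller group clustered near 35 and a larger group clustered near 85.

```python
from statistics import mean, median, mode

# Hypothetical exam scores for 30 students, invented for illustration.
scores = [30] * 3 + [35] * 6 + [40] * 3 + [60] + [80] * 4 + [85] * 9 + [90] * 4

print(len(scores))             # 30
print(mode(scores))            # 85
print(median(scores))          # 80.0
print(round(mean(scores), 2))  # 64.17
```

The mode lands on the higher peak, but the mean of about 64 falls in the gap between the two groups, near a score that almost no student actually received.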

Unimodal and Right-Skewed Distributions

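[Figure 20181031_172508.png: a unimodal, right-skewed density with the mode (red), median (blue), and mean (green) marked]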

Distributions such as the exponential, chi-squared, and F-distributions follow this pattern. This type typically follows the order $\text{mode} \le \text{median} \le \text{mean}$, because the long right tail pulls the mean furthest to the right.
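As a worked check of this ordering, consider the exponential distribution with rate $\lambda > 0$ (this parameterization is assumed here for illustration), whose density is $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$. Applying the three definitions gives:

$$
\begin{aligned}
\text{mode} &= 0 \quad \text{since } f \text{ is decreasing on } x \ge 0 \\
\text{median} &= m \quad \text{where } \int_{0}^{m} \lambda e^{-\lambda t} dt = 1 - e^{-\lambda m} = {{1} \over {2}} \implies m = {{\ln 2} \over {\lambda}} \\
\text{mean} &= \int_{0}^{\infty} x \lambda e^{-\lambda x} dx = {{1} \over {\lambda}}
\end{aligned}
$$

Since $0 \le \ln 2 \approx 0.693 \le 1$, the ordering $\displaystyle 0 \le {{\ln 2} \over {\lambda}} \le {{1} \over {\lambda}}$ holds for every rate $\lambda$.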

Unimodal and Symmetrical Distributions

[Figure 20181031_172515.png: a unimodal, symmetric density with the mode (red), median (blue), and mean (green) marked]

Distributions like the normal distribution fall under this category. In such cases, the mode, median, and mean all coincide. Because many things in the world approximately follow a normal distribution, people without statistical training often confuse these measures. For example, a student ranked 15th out of 30 will commonly think of that grade as average. Strictly speaking, that rank describes the median, and it matches the mean only if the scores are distributed symmetrically, which is not guaranteed.
