Order Statistics
Theorem
Let a random sample $X_{1} , \cdots , X_{n}$ be drawn from a continuous probability distribution with probability density function $f(x)$ and support $\mathcal{S} = (a,b)$. If we arrange the sample in increasing order as $Y_{1} < \cdots < Y_{n}$, then the joint and marginal probability density functions of the order statistics are as follows:
[1] Joint: $$ g \left( y_{1} , \cdots , y_{n} \right) = \begin{cases} n! f (y_{1}) \cdots f (y_{n}) &, a < y_{1} < \cdots < y_{n} < b \\ 0 & , \text{elsewhere} \end{cases} $$
[2] Marginal: writing the cumulative distribution function of the underlying distribution as $F$, $$ g (y_{k}) = \begin{cases} {{ n! } \over { (k-1)! (n-k)! }} \left[ F (y_{k}) \right]^{k-1} \left[ 1 - F(y_{k}) \right]^{n-k} f(y_{k}) & , a < y_{k} < b \\ 0 & , \text{elsewhere} \end{cases} $$
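To get a feel for formula [2], here is a minimal Monte Carlo sketch in Python, assuming a Uniform$(0,1)$ sample so that $F(y) = y$ and $f(y) = 1$ on $(0,1)$; in that case $g(y_{k})$ reduces to the Beta$(k, n-k+1)$ density. The particular values of $n$, $k$, and the repetition count are arbitrary illustrative choices.

```python
import numpy as np
from math import factorial

# Monte Carlo check of formula [2], assuming a Uniform(0,1) sample,
# where F(y) = y and f(y) = 1 on (0, 1).
rng = np.random.default_rng(0)
n, k = 5, 2                    # illustrative sample size and order index (1-based)
reps = 200_000

samples = rng.uniform(size=(reps, n))
y_k = np.sort(samples, axis=1)[:, k - 1]     # k-th smallest value of each sample

def g(y):
    """Theoretical marginal density of Y_k from [2] in the uniform case."""
    c = factorial(n) / (factorial(k - 1) * factorial(n - k))
    return c * y ** (k - 1) * (1 - y) ** (n - k)

# Compare an empirical histogram density with g at the bin centers.
hist, edges = np.histogram(y_k, bins=50, range=(0, 1), density=True)
centers = (edges[:-1] + edges[1:]) / 2
print("max |empirical - theoretical| =", np.abs(hist - g(centers)).max())
```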
Explanation
At first glance the formulas might seem complex, but they are not hard to grasp once you understand their intuitive meaning. The joint probability density function in [1] carries the factor $n!$ because that is the number of permutations of $n$ random variables, whereas the marginal probability density function in [2] carries the multinomial coefficient $\displaystyle {{ n! } \over { (k-1)! 1! (n-k)! }}$, which counts the ways to single out $Y_{k}$ itself, the $k-1$ variables smaller than $y_{k}$, and the $n-k$ variables greater than it. Writing the factors in the order of $\left\{ Y_{i} \right\}$ without omitting anything, the formula takes the following shape: $$ g (y_{k}) = {{ n! } \over { (k-1)! 1! (n-k)! }} \left[ F (y_{k}) \right]^{k-1} f(y_{k}) \left[ 1 - F(y_{k}) \right]^{n-k} $$

Order statistics, as the name implies, are statistics that have been put in order. Given an assumption about the probability distribution of a random sample, they let us predict the probability of observing the greatest value, the second greatest, the smallest, or a value in the median range. In particular, [2] gives the probability density functions of the minimum and maximum directly: $$ Y_{1} = \min \left\{ X_{1} , \cdots , X_{n} \right\} \implies g_{1} (y_{1}) = n f(y_{1}) \left[ 1- F(y_{1}) \right]^{n-1} \\ Y_{n} = \max \left\{ X_{1} , \cdots , X_{n} \right\} \implies g_{n} (y_{n}) = n f(y_{n}) \left[ F(y_{n}) \right]^{n-1} $$

As a real-life application, consider the water level of a reservoir. If a heavy rain exceeds its capacity or the dam breaks, the consequences are severe. The water level can be recorded as time series data, with both a mean and a standard deviation to compute, but such summary statistics are useless in an emergency flood. Focusing on the highest water level from the start, however, provides a more stable and rational basis for determining the size and construction of the reservoir. If the thought crosses your mind, ‘Reservoirs hardly ever overflow, do they?’, then the point has already been made: the reason they don’t overflow is precisely that such factors were considered in advance.
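The minimum and maximum formulas are just as easy to verify numerically. The sketch below assumes an Exponential$(1)$ sample, so $f(y) = e^{-y}$ and $F(y) = 1 - e^{-y}$; integrating $g_{1}$ and $g_{n}$ gives the CDFs $G_{1}(t) = 1 - e^{-nt}$ (so $Y_{1}$ is Exponential$(n)$) and $G_{n}(t) = (1 - e^{-t})^{n}$, which are compared against empirical frequencies.

```python
import numpy as np

# Check the min/max formulas, assuming an Exponential(1) sample:
#   g_1(y) = n exp(-n y)                       (Y_1 ~ Exponential(n))
#   g_n(y) = n exp(-y) (1 - exp(-y))**(n - 1)
rng = np.random.default_rng(1)
n, reps = 4, 200_000
x = rng.exponential(size=(reps, n))
y1, yn = x.min(axis=1), x.max(axis=1)

# Compare empirical and theoretical CDFs at a few illustrative points.
for t in (0.2, 0.5, 1.0, 2.0):
    print(f"t={t}:",
          f"min {np.mean(y1 <= t):.4f} vs {1 - np.exp(-n * t):.4f},",
          f"max {np.mean(yn <= t):.4f} vs {(1 - np.exp(-t)) ** n:.4f}")
```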
Proof
[1]
Strategy: realizing that there are $n!$ permutations of $n$ items is essentially all there is to it.
Transformation of random variables: if a transformation has $k$ inverse branches $w_{1i} , \cdots , w_{ni}$ with Jacobians $J_{i}$, the joint probability density function of the transformed multivariate random variable $Y = ( Y_{1} , \cdots , Y_{n} )$ is as follows: $$ g(y_{1},\cdots,y_{n}) = \sum_{i=1}^{k} f \left[ w_{1i}(y_{1},\cdots , y_{n}) , \cdots , w_{ni}(y_{1},\cdots , y_{n}) \right] \left| J_{i} \right| $$
Sorting $X_{1} , \cdots , X_{n}$ into $Y_{1} < \cdots < Y_{n}$ can be inverted in $n!$ ways, one for each permutation of the sample, and since each of these transformations merely reorders the coordinates, its Jacobian is $\pm 1$. Because the sample is iid with joint density $f(y_{1}) \cdots f(y_{n})$, it follows that $$ \begin{align*} g \left( y_{1} , \cdots , y_{n} \right) =& \sum_{i=1}^{n!} | \pm 1 | f (y_{1}) \cdots f (y_{n}) \\ =& n! f (y_{1}) \cdots f (y_{n}) \end{align*} $$
■
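The factor $n!$ can be spot-checked numerically as well. The following sketch assumes $n = 2$ Uniform$(0,1)$ variables, in which case $g(y_{1}, y_{2}) = 2$ on $0 < y_{1} < y_{2} < 1$, so integrating over the box $\{ y_{1} < 0.3 ,\ y_{2} > 0.7 \}$ gives $2 \times 0.3 \times 0.3 = 0.18$.

```python
import numpy as np

# Spot check of the n! factor in [1], assuming n = 2 Uniform(0,1) variables:
# g(y1, y2) = 2 on 0 < y1 < y2 < 1, so the probability of the box
# {y1 < 0.3, y2 > 0.7} is 2 * 0.3 * 0.3 = 0.18.
rng = np.random.default_rng(2)
x = rng.uniform(size=(500_000, 2))
y = np.sort(x, axis=1)                              # (Y_1, Y_2) for each sample
print(np.mean((y[:, 0] < 0.3) & (y[:, 1] > 0.7)))   # approximately 0.18
```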
[2]
Strategy: similarly, it is virtually over once you realize that the number of ways to split $n$ elements into three groups of sizes $a$, $b$, and $n-a-b$ is $\displaystyle {{ n! } \over { a! b! (n-a-b)! }}$. It suffices to take $a = k-1$ and $b = 1$ here.
Each of the $k-1$ variables falling below $y_{k}$ does so with probability $F(y_{k})$, each of the $n-k$ variables exceeding $y_{k}$ does so with probability $1 - F(y_{k})$, and the one variable equal to $Y_{k}$ contributes the density $f(y_{k})$. Multiplying by the multinomial coefficient that counts the ways to assign these roles gives $$ g (y_{k}) = {{ n! } \over { (k-1)! 1! (n-k)! }} \left[ F (y_{k}) \right]^{k-1} f(y_{k}) \left[ 1 - F(y_{k}) \right]^{n-k} $$
■
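As a sanity check that [2] really defines a density, it can be integrated symbolically. The sketch below again assumes the Uniform$(0,1)$ case, where the integral reduces to the Beta function $B(k, n-k+1)$ and everything cancels to $1$; the values $n = 5$, $k = 2$ are arbitrary.

```python
import sympy as sp

# Symbolic check that the density in [2] integrates to 1, assuming
# Uniform(0,1) so that F(y) = y and f(y) = 1 on (0, 1).
y, n, k = sp.symbols('y n k', positive=True)
g = (sp.factorial(n) / (sp.factorial(k - 1) * sp.factorial(n - k))
     * y ** (k - 1) * (1 - y) ** (n - k))
print(sp.integrate(g.subs({n: 5, k: 2}), (y, 0, 1)))  # prints 1
```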