F-distribution
Definition 1
The continuous probability distribution $F \left( r_{1} , r_{2} \right)$, which has the following probability density function for degrees of freedom $r_{1}, r_{2} > 0$, is called the F-distribution. $$ f(x) = {{ 1 } \over { B \left( r_{1}/2 , r_{2} / 2 \right) }} \left( {{ r_{1} } \over { r_{2} }} \right)^{r_{1} / 2} x^{r_{1} / 2 - 1} \left( 1 + {{ r_{1} } \over { r_{2} }} x \right)^{-(r_{1} + r_{2}) / 2} \qquad , x \in (0, \infty) $$
- $B(r_{1} / 2, r_{2}/2)$ refers to the beta function.
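As a quick sanity check of the definition, the density above can be compared against a library implementation. The following is a minimal sketch assuming NumPy and SciPy are available; `f_pdf` is just a hypothetical helper transcribing the formula verbatim.

```python
import numpy as np
from scipy.special import beta as B
from scipy.stats import f

def f_pdf(x, r1, r2):
    """Transcription of the F(r1, r2) density given in the definition."""
    return (1 / B(r1 / 2, r2 / 2)) * (r1 / r2) ** (r1 / 2) \
        * x ** (r1 / 2 - 1) * (1 + (r1 / r2) * x) ** (-(r1 + r2) / 2)

x = np.linspace(0.01, 5, 200)
r1, r2 = 5.0, 10.0
assert np.allclose(f_pdf(x, r1, r2), f.pdf(x, r1, r2))  # agrees with SciPy
```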
Basic Properties
Moment Generating Function
- [1]: The F-distribution does not have a moment-generating function.
Mean and Variance
- [2]: If $X \sim F ( r_{1} , r_{2})$, then $$ \begin{align*} E(X) =& {{ r_{2} } \over { r_{2} - 2 }} & \qquad , r_{2} > 2 \\ \operatorname{Var}(X) =& {{ 2 r_{2}^{2} (r_{1} + r_{2} - 2) } \over { r_{1} (r_{2} -2)^{2} (r_{2} - 4) }} & \qquad , r_{2} > 4 \end{align*} $$
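A minimal numerical check of [2], assuming SciPy: for $r_{2} > 4$ the closed-form mean and variance should agree with `scipy.stats.f.stats`.

```python
from scipy.stats import f

r1, r2 = 5, 10  # r2 > 4, so both the mean and the variance exist
mean, var = f.stats(r1, r2, moments='mv')

assert abs(mean - r2 / (r2 - 2)) < 1e-12
assert abs(var - 2 * r2**2 * (r1 + r2 - 2)
           / (r1 * (r2 - 2)**2 * (r2 - 4))) < 1e-12
```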
Theorem
Let two random variables $U,V$ be independent with $U \sim \chi^{2} ( r_{1})$ and $V \sim \chi^{2} ( r_{2})$.
$k$th Moment
- [a]: If $r_{2} > 2k$, then the $k$th moment of $\displaystyle F := {{ U / r_{1} } \over { V / r_{2} }}$ exists and is given by $$ E F^{k} = \left( {{ r_{2} } \over { r_{1} }} \right)^{k} E U^{k} E V^{-k} $$
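The moment formula in [a] can be checked numerically. The sketch below, assuming SciPy, evaluates $E U^{k}$ and $E V^{-k}$ by the chi-squared moment formula (quoted later in the proof) and compares the product with SciPy's generic moment computation.

```python
import numpy as np
from scipy.special import gamma
from scipy.stats import f

def f_moment(k, r1, r2):
    """k-th moment of F(r1, r2) as (r2/r1)^k E U^k E V^{-k}; needs r2 > 2k."""
    EUk = 2**k * gamma(r1 / 2 + k) / gamma(r1 / 2)    # E U^k,  U ~ chi2(r1)
    EVmk = 2**(-k) * gamma(r2 / 2 - k) / gamma(r2 / 2)  # E V^-k, V ~ chi2(r2)
    return (r2 / r1) ** k * EUk * EVmk

r1, r2 = 5, 20
for k in (1, 2, 3):  # r2 > 2k in each case
    assert np.isclose(f_moment(k, r1, r2), f.moment(k, r1, r2), rtol=1e-5)
```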
Derived from the Chi-Squared Distribution
- [b]: $${{ U / r_{1} } \over { V / r_{2} }} \sim F \left( r_{1} , r_{2} \right)$$
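Theorem [b] is easy to confirm by simulation. A minimal sketch assuming SciPy, using a Kolmogorov-Smirnov test:

```python
import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(0)
r1, r2, n = 5, 10, 100_000
U = chi2.rvs(r1, size=n, random_state=rng)
V = chi2.rvs(r2, size=n, random_state=rng)

# A large p-value means the scaled ratio is consistent with F(r1, r2)
print(kstest((U / r1) / (V / r2), 'f', args=(r1, r2)))
```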
Derived from the Beta Distribution
- [c]: If a random variable $X \sim F \left( r_{1}, r_{2} \right)$ follows the F-distribution with degrees of freedom $r_{1} , r_{2}$, then $Y$ defined below follows the beta distribution $\text{Beta} \left( {{ r_{1} } \over { 2 }} , {{ r_{2} } \over { 2 }} \right)$. $$ Y := {{ \left( r_{1} / r_{2} \right) X } \over { 1 + \left( r_{1} / r_{2} \right) X }} \sim \text{Beta} \left( {{ r_{1} } \over { 2 }} , {{ r_{2} } \over { 2 }} \right) $$
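Since the map $x \mapsto (r_{1}/r_{2}) x / (1 + (r_{1}/r_{2}) x)$ is increasing, [c] reduces to a deterministic identity between distribution functions, which can be verified directly. A minimal sketch assuming SciPy:

```python
import numpy as np
from scipy.stats import f, beta

r1, r2 = 4.0, 8.0
x = np.linspace(0.01, 20, 200)
y = (r1 / r2) * x / (1 + (r1 / r2) * x)  # the monotone map X -> Y

# P(X <= x) must equal P(Y <= y) under Beta(r1/2, r2/2)
assert np.allclose(f.cdf(x, r1, r2), beta.cdf(y, r1 / 2, r2 / 2))
```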
Derived from the t-Distribution
- [d]: If a random variable $X \sim t(\nu)$ follows the t-distribution with degrees of freedom $\nu > 0$, then its square $Y$ follows the F-distribution $F (1,\nu)$. $$ Y := X^{2} \sim F (1,\nu) $$
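Similarly, [d] reduces to the CDF identity $P \left( X^{2} \le y \right) = P \left( -\sqrt{y} \le X \le \sqrt{y} \right)$, which must equal the $F(1, \nu)$ CDF. A minimal sketch assuming SciPy:

```python
import numpy as np
from scipy.stats import t, f

nu = 7.0
y = np.linspace(0.01, 30, 200)

# P(X^2 <= y) = P(-sqrt(y) <= X <= sqrt(y)) for X ~ t(nu)
lhs = t.cdf(np.sqrt(y), nu) - t.cdf(-np.sqrt(y), nu)
assert np.allclose(lhs, f.cdf(y, 1, nu))
```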
Reciprocity
- [e]: If $X \sim F \left( r_{1}, r_{2} \right)$, then the distribution of its reciprocal is as follows. $$ {{ 1 } \over { X }} \sim F \left( r_{2}, r_{1} \right) $$
- $\chi^{2} \left( r \right)$ is a chi-squared distribution with degrees of freedom $r$.
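The reciprocal relationship [e] likewise reduces to a CDF identity, $P \left( 1/X \le y \right) = P \left( X \ge 1/y \right)$. A minimal sketch assuming SciPy:

```python
import numpy as np
from scipy.stats import f

r1, r2 = 5.0, 10.0
y = np.linspace(0.05, 20, 200)

# P(1/X <= y) = P(X >= 1/y) = survival function of F(r1, r2) at 1/y
assert np.allclose(f.sf(1 / y, r1, r2), f.cdf(y, r2, r1))
```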
Explanation
Just as the t-distribution is often called the Student t-distribution, the F-distribution is also referred to as the Snedecor F-distribution, named after the statistician George Snedecor.
The probability density function of the F-distribution may look intimidating at first glance, but in practice there is little need to manipulate the formula itself; what matters most is its relationship with the chi-squared distribution. Just as the chi-squared distribution can be used for goodness-of-fit tests, the F-distribution can be used to compare the variances of two populations. As theorem [b] shows directly, the F-distribution arises as a ratio of scaled chi-squared variables, so if such a statistic deviates too far from $1$, one may infer that the two population variances differ.
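For instance, to compare the variances of two normal populations one forms $F = S_{1}^{2} / S_{2}^{2}$, which under $H_{0}: \sigma_{1}^{2} = \sigma_{2}^{2}$ follows $F(n_{1} - 1, n_{2} - 1)$. A minimal sketch with simulated data, assuming SciPy; the two-sided p-value construction used here is one common convention, not the only one.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)
x = rng.normal(0, 1.0, size=30)  # sample from population 1
y = rng.normal(0, 1.5, size=40)  # sample from population 2, larger true variance

F = np.var(x, ddof=1) / np.var(y, ddof=1)  # ratio of sample variances
dfn, dfd = len(x) - 1, len(y) - 1
p = 2 * min(f.cdf(F, dfn, dfd), f.sf(F, dfn, dfd))  # two-sided p-value
print(f"F = {F:.3f}, p = {p:.4f}")
```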
Proof
[1]
The existence of a moment-generating function for a random variable would require the $k$th moment to exist for every $k \in \mathbb{N}$. However, by theorem [a], the $k$th moment of the F-distribution exists only when $k < r_{2} / 2$, so a moment-generating function cannot exist.
■
[2]
Substitute $k = 1$ and $k = 2$ into the moment formula of [a] and simplify.
■
[a]
Substituting $t = {{ r_{1} } \over { r_{2} }} x$ gives $dt = {{ r_{1} } \over { r_{2} }} dx$, so $$ \begin{align*} E F^{k} =& \int_{0}^{\infty} x^{k} {{ 1 } \over { B \left( r_{1}/2 , r_{2} / 2 \right) }} \left( {{ r_{1} } \over { r_{2} }} \right)^{r_{1} / 2} x^{r_{1} / 2 - 1} \left( 1 + {{ r_{1} } \over { r_{2} }} x \right)^{-(r_{1} + r_{2}) / 2} dx \\ =& {{ 1 } \over { B \left( r_{1}/2 , r_{2} / 2 \right) }} \left( {{ r_{1} } \over { r_{2} }} \right)^{r_{1} / 2} \int_{0}^{\infty} x^{k + r_{1} / 2 - 1} \left( 1 + {{ r_{1} } \over { r_{2} }} x \right)^{-(r_{1} + r_{2}) / 2} dx \\ =& {{ 1 } \over { B \left( r_{1}/2 , r_{2} / 2 \right) }} \left( {{ r_{1} } \over { r_{2} }} \right)^{r_{1} / 2} \int_{0}^{\infty} \left( {{ r_{2} } \over { r_{1} }} t \right)^{k + r_{1} / 2 - 1} \left( 1 + t \right)^{-(r_{1} + r_{2}) / 2} {{ r_{2} } \over { r_{1} }} dt \\ =& {{ 1 } \over { B \left( r_{1}/2 , r_{2} / 2 \right) }} \left( {{ r_{1} } \over { r_{2} }} \right)^{r_{1} / 2} \left( {{ r_{2} } \over { r_{1} }} \right)^{k + r_{1} / 2}\int_{0}^{\infty} t^{k + r_{1} / 2 - 1} \left( 1 + t \right)^{-r_{1}/2 - r_{2}/ 2} dt \\ =& {{ 1 } \over { B \left( r_{1}/2 , r_{2} / 2 \right) }} \left( {{ r_{2} } \over { r_{1} }} \right)^{k }\int_{0}^{\infty} t^{(r_{1}/2 + k) - 1} \left( 1 + t \right)^{-(r_{1}/2+k) - (r_{2}/ 2-k)} dt \end{align*} $$
Representation of the beta function as a definite integral: $$ B(p,q)=\int_{0}^{\infty}\frac{ t^{p-1} }{ (1+t)^{p+q}}dt $$
Relationship between the beta and gamma functions: $$ B(p,q) = {{\Gamma (p) \Gamma (q)} \over {\Gamma (p+q) }} $$
$$ \begin{align*} EF^{k} =& {{ 1 } \over { B \left( r_{1}/2 , r_{2} / 2 \right) }} \left( {{ r_{2} } \over { r_{1} }} \right)^{k } B \left( {{ r_{1} } \over { 2 }} + k, {{ r_{2} } \over { 2 }} - k \right) \\ =& \left( {{ r_{2} } \over { r_{1} }} \right)^{k } {{ \Gamma (r_{1}/2 + r_{2}/2) } \over { \Gamma (r_{1}/2 ) \Gamma ( r_{2}/2) }} {{ \Gamma (r_{1}/2 + k) \Gamma ( r_{2}/2 - k) } \over { \Gamma (r_{1}/2 +k + r_{2}/2 - k) }} \\ =& \left( {{ r_{2} } \over { r_{1} }} \right)^{k } {{ 1 } \over { \Gamma (r_{1}/2 ) \Gamma ( r_{2}/2) }} {{ \Gamma (r_{1}/2 + k) \Gamma ( r_{2}/2 - k) } \over { 1 }} \\ =& \left( {{ r_{2} } \over { r_{1} }} \right)^{k } {{ \Gamma (r_{1}/2 + k) 2^{k}} \over { \Gamma (r_{1}/2 ) }} {{ 2^{-k} \Gamma ( r_{2}/2 - k) } \over { \Gamma ( r_{2}/2) }} \end{align*} $$
Moment of the chi-squared distribution: Let’s say $X \sim \chi^{2} (r)$. If $k > - r/ 2$, then the $k$th moment exists $$ E X^{k} = {{ 2^{k} \Gamma (r/2 + k) } \over { \Gamma (r/2) }} $$
Identifying the two factors above as the chi-squared moments $E U^{k}$ and $E V^{-k}$ gives $$ E F^{k} = \left( {{ r_{2} } \over { r_{1} }} \right)^{k } E U^{k} E V^{-k} $$
■
[b]
This follows directly from the joint density function of $U$ and $V$ via a change of variables.
■
[c]
This follows directly from a change of variables.
■
[d]
Write $X = Z / \sqrt{V / \nu}$ with $Z \sim N(0,1)$ and $V \sim \chi^{2}(\nu)$ independent. Then $X^{2} = \left( Z^{2} / 1 \right) / \left( V / \nu \right)$ is a ratio of scaled chi-squared variables, since $Z^{2} \sim \chi^{2}(1)$, and theorem [b] applies.
■
[e]
Since taking the reciprocal swaps the numerator and denominator, this is immediate from theorem [b]. From a practical statistician's point of view, it is in fact more natural to define the F-distribution via theorem [b] and derive the probability density function from it.
■