Pythagorean Winning Percentage Derivation

Formulas

Consider a team in a sports league. Suppose the team's points scored $S$ and points allowed $A$ are independent random variables, each following a Weibull distribution:
$$
\begin{align*} S & \sim \text{Weibull} \left( \alpha_{S} , \beta , \gamma \right) \\ A & \sim \text{Weibull} \left( \alpha_{A} , \beta , \gamma \right) \end{align*}
$$
Then, for a shape parameter $\gamma > 0$ and with the location parameter taken to be $\beta = 0$, the team's expected winning percentage $p_{\gamma}$ is
$$
p_{\gamma} = {{ \mu_{S}^{\gamma} } \over { \mu_{S}^{\gamma} + \mu_{A}^{\gamma} }}
$$
Here, $\mu_{S} := E (S)$ and $\mu_{A} := E (A)$ are the expected points scored and the expected points allowed, respectively.
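As a quick illustration of the formula, the following Python sketch evaluates $p_{\gamma}$ from the two expected values. The function name `pythagorean_win_pct` and the default exponent `gamma=2.0` (the classic Pythagorean exponent) are assumptions made for this example and are not fixed by the statement above.

```python
def pythagorean_win_pct(mean_scored: float, mean_allowed: float,
                        gamma: float = 2.0) -> float:
    """Return mu_S^gamma / (mu_S^gamma + mu_A^gamma).

    mean_scored  : expected points scored, mu_S (assumed > 0)
    mean_allowed : expected points allowed, mu_A (assumed > 0)
    gamma        : exponent (Weibull shape parameter), must be > 0
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if mean_scored <= 0 or mean_allowed <= 0:
        raise ValueError("expected values must be positive")
    s = mean_scored ** gamma
    a = mean_allowed ** gamma
    return s / (s + a)


if __name__ == "__main__":
    # A hypothetical team scoring 5 and allowing 4 points per game on average.
    print(pythagorean_win_pct(5.0, 4.0, gamma=2.0))  # 25 / 41, about 0.6098
```

A team with equal expected points scored and allowed gets exactly 0.5 for any exponent, which is a handy sanity check.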

Derivation [1]

Strategy: This is a statistical derivation of the Pythagorean winning percentage. It follows directly from the joint probability density function of $S$ and $A$. Throughout, $\Gamma$ denotes the gamma function.


Mean and variance of the Weibull distribution: The three-parameter Weibull distribution with scale parameter $\alpha > 0$, location parameter $\beta \in \mathbb{R}$, and shape parameter $\gamma > 0$ is the probability distribution with the following probability density function.
$$
f(x) = {{ \gamma } \over { \alpha }} \left( {{ x-\beta } \over { \alpha }} \right)^{\gamma-1} e^{- \left( (x - \beta) / \alpha \right)^{\gamma}} \qquad , x \ge \beta
$$
If $X \sim \text{Weibull} (\alpha, \beta, \gamma)$, its mean and variance are as follows.
$$
\begin{align*} E(X) =& \alpha \Gamma \left( 1 + {{ 1 } \over { \gamma }} \right) + \beta \\ \operatorname{Var} (X) =& \alpha^{2} \left[ \Gamma \left( 1 + {{ 2 } \over { \gamma }} \right) - \Gamma \left( 1 + {{ 1 } \over { \gamma }} \right)^{2} \right] \end{align*}
$$
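The mean and variance formulas can be checked numerically. The sketch below is only an illustration: the helper names `weibull_mean`, `weibull_var`, and `sample_weibull` are hypothetical, and the parameter values in the demo are arbitrary. It relies on the standard library's `random.weibullvariate(alpha, beta)`, whose first argument is the scale and whose second is the shape, and adds the location $\beta$ as a shift.

```python
import math
import random
import statistics


def weibull_mean(alpha: float, beta: float, gamma: float) -> float:
    """E(X) = alpha * Gamma(1 + 1/gamma) + beta."""
    return alpha * math.gamma(1.0 + 1.0 / gamma) + beta


def weibull_var(alpha: float, gamma: float) -> float:
    """Var(X) = alpha^2 * [Gamma(1 + 2/gamma) - Gamma(1 + 1/gamma)^2].

    The location parameter beta only shifts X, so it does not appear here.
    """
    g1 = math.gamma(1.0 + 1.0 / gamma)
    g2 = math.gamma(1.0 + 2.0 / gamma)
    return alpha ** 2 * (g2 - g1 ** 2)


def sample_weibull(alpha: float, beta: float, gamma: float,
                   n: int, seed: int = 0) -> list:
    """Draw n samples from the three-parameter Weibull(alpha, beta, gamma)."""
    rng = random.Random(seed)
    # weibullvariate(scale, shape) samples the beta = 0 case; add beta to shift.
    return [beta + rng.weibullvariate(alpha, gamma) for _ in range(n)]


if __name__ == "__main__":
    alpha, beta, gamma = 2.0, 0.5, 1.5  # arbitrary illustrative values
    xs = sample_weibull(alpha, beta, gamma, n=100_000, seed=1)
    print("mean:", statistics.fmean(xs), "vs", weibull_mean(alpha, beta, gamma))
    print("var :", statistics.variance(xs), "vs", weibull_var(alpha, gamma))
```

With $\gamma = 1$ the distribution reduces to a shifted exponential, so the formulas give mean $\alpha + \beta$ and variance $\alpha^{2}$.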

Write $\mu_{S}$ and $\mu_{A}$ for the population means of $S$ and $A$, respectively. By the formula for the mean of the Weibull distribution,
$$
\begin{align*} \mu_{S} =& E \left( S \right) = \alpha_{S} \Gamma \left( 1 + \gamma^{-1} \right) + \beta \\ \mu_{A} =& E \left( A \right) = \alpha_{A} \Gamma \left( 1 + \gamma^{-1} \right) + \beta \end{align*}
$$
so the scale parameters $\alpha_{S}$ and $\alpha_{A}$ can be written in terms of the means as
$$
\begin{align*} \alpha_{S} =& {{ \mu_{S} - \beta } \over { \Gamma \left( 1 + \gamma^{-1} \right) }} \\ \alpha_{A} =& {{ \mu_{A} - \beta } \over { \Gamma \left( 1 + \gamma^{-1} \right) }} \end{align*}
$$
To simplify the derivation, define $\alpha$ as follows.
$$
{{ 1 } \over { \alpha^{\gamma} }} = {{ 1 } \over { \alpha_{S}^{\gamma} }} + {{ 1 } \over { \alpha_{A}^{\gamma} }} = {{ \alpha_{S}^{\gamma} + \alpha_{A}^{\gamma} } \over { \alpha_{S}^{\gamma} \alpha_{A}^{\gamma} }}
$$
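The two reparameterizations in this step are easy to mirror in code. The following is a minimal sketch with hypothetical helper names (`scale_from_mean`, `combined_scale`): the first inverts the mean formula to recover a scale parameter, and the second computes the auxiliary $\alpha$ defined by $1/\alpha^{\gamma} = 1/\alpha_{S}^{\gamma} + 1/\alpha_{A}^{\gamma}$.

```python
import math


def scale_from_mean(mu: float, beta: float, gamma: float) -> float:
    """Invert mu = alpha * Gamma(1 + 1/gamma) + beta for alpha."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if mu <= beta:
        raise ValueError("the mean must exceed the location parameter beta")
    return (mu - beta) / math.gamma(1.0 + 1.0 / gamma)


def combined_scale(alpha_s: float, alpha_a: float, gamma: float) -> float:
    """Return alpha with 1/alpha^gamma = 1/alpha_S^gamma + 1/alpha_A^gamma."""
    return (alpha_s ** -gamma + alpha_a ** -gamma) ** (-1.0 / gamma)


if __name__ == "__main__":
    gamma, beta = 1.8, 0.0                   # arbitrary illustrative values
    a_s = scale_from_mean(5.0, beta, gamma)  # scale for points scored
    a_a = scale_from_mean(4.0, beta, gamma)  # scale for points allowed
    print(a_s, a_a, combined_scale(a_s, a_a, gamma))
```

Because both scales are divided by the same factor $\Gamma \left( 1 + \gamma^{-1} \right)$, the ratio $\alpha_{S} / \alpha_{A}$ equals $(\mu_{S} - \beta) / (\mu_{A} - \beta)$, which is what lets the gamma function drop out at the end of the derivation.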

Now we compute the expected winning percentage. In most sports, a win is the event that the points scored $S$ exceed the points allowed $A$, so the expected winning percentage is $P \left( S > A \right)$. Let $f_{S}$ and $f_{A}$ be the probability density functions of $S$ and $A$, respectively. Since $S$ and $A$ are assumed to be independent, their joint probability density function is the product $f_{S} f_{A}$.
$$
\begin{align*}
& P \left( S > A \right)
\\ =& \int_{\beta}^{\infty} \int_{\beta}^{x} f_{S} (x) f_{A} (y) dy dx
\\ =& \int_{\beta}^{\infty} \int_{\beta}^{x} {{ \gamma } \over { \alpha_{S} }} \left( {{ x-\beta } \over { \alpha_{S} }} \right)^{\gamma-1} e^{- \left( (x - \beta) / \alpha_{S} \right)^{\gamma}} {{ \gamma } \over { \alpha_{A} }} \left( {{ y-\beta } \over { \alpha_{A} }} \right)^{\gamma-1} e^{- \left( (y - \beta) / \alpha_{A} \right)^{\gamma}} dy dx
\\ =& \int_{0}^{\infty} \int_{0}^{x} {{ \gamma } \over { \alpha_{S} }} \left( {{ x } \over { \alpha_{S} }} \right)^{\gamma-1} e^{- \left( x / \alpha_{S} \right)^{\gamma}} {{ \gamma } \over { \alpha_{A} }} \left( {{ y } \over { \alpha_{A} }} \right)^{\gamma-1} e^{- \left( y / \alpha_{A} \right)^{\gamma}} dy dx
\\ =& \int_{0}^{\infty} {{ \gamma } \over { \alpha_{S} }} \left( {{ x } \over { \alpha_{S} }} \right)^{\gamma-1} e^{- \left( x / \alpha_{S} \right)^{\gamma}} \left[ \int_{0}^{x} {{ \gamma } \over { \alpha_{A} }} \left( {{ y } \over { \alpha_{A} }} \right)^{\gamma-1} e^{- \left( y / \alpha_{A} \right)^{\gamma}} dy \right] dx
\\ =& \int_{0}^{\infty} {{ \gamma } \over { \alpha_{S} }} \left( {{ x } \over { \alpha_{S} }} \right)^{\gamma-1} e^{- \left( x / \alpha_{S} \right)^{\gamma}} \left[ 1 - e^{- \left( x / \alpha_{A} \right)^{\gamma}} \right] dx
\\ =& 1 + \int_{0}^{\infty} {{ \gamma } \over { \alpha_{S} }} \left( {{ x } \over { \alpha_{S} }} \right)^{\gamma-1} e^{- \left( x / \alpha_{S} \right)^{\gamma}} \left[ - e^{- \left( x / \alpha_{A} \right)^{\gamma}} \right] dx
\\ =& 1 - \int_{0}^{\infty} {{ \gamma } \over { \alpha_{S} }} \left( {{ x } \over { \alpha_{S} }} \right)^{\gamma-1} \exp \left( - x^{\gamma} \left( {{ 1 } \over { \alpha_{S}^{\gamma} }} + {{ 1 } \over { \alpha_{A}^{\gamma} }} \right) \right) dx
\\ =& 1 - \int_{0}^{\infty} {{ \gamma } \over { \alpha_{S} }} \left( {{ x } \over { \alpha_{S} }} \right)^{\gamma-1} \exp \left( - \left( {{ x } \over { \alpha }} \right)^{\gamma} \right) dx
\\ =& 1 - {{ \alpha^{\gamma} } \over { \alpha_{S}^{\gamma} }} \int_{0}^{\infty} {{ \gamma } \over { \alpha }} \left( {{ x } \over { \alpha }} \right)^{\gamma-1} e^{- \left( x / \alpha \right)^{\gamma} } dx
\\ =& 1 - {{ \alpha^{\gamma} } \over { \alpha_{S}^{\gamma} }} \cdot 1
\\ =& 1 - {{ 1 } \over { \alpha_{S}^{\gamma} }} {{ \alpha_{S}^{\gamma} \alpha_{A}^{\gamma} } \over { \alpha_{S}^{\gamma} + \alpha_{A}^{\gamma} }}
\\ =& 1 - {{ \alpha_{A}^{\gamma} } \over { \alpha_{S}^{\gamma} + \alpha_{A}^{\gamma} }}
\\ =& {{ \alpha_{S}^{\gamma} } \over { \alpha_{S}^{\gamma} + \alpha_{A}^{\gamma} }}
\\ =& {{ \left( \mu_{S} - \beta \right)^{\gamma} } \over { \left( \mu_{S} - \beta \right)^{\gamma} + \left( \mu_{A} - \beta \right)^{\gamma} }}
\end{align*}
$$
In this chain, the third equality shifts both variables by $\beta$, the inner integral is the Weibull cumulative distribution function $1 - e^{- \left( x / \alpha_{A} \right)^{\gamma}}$, and each integral replaced by $1$ is the integral of a Weibull density over its whole support. The last equality substitutes the expressions for $\alpha_{S}$ and $\alpha_{A}$ in terms of $\mu_{S}$ and $\mu_{A}$, where the common factor $\Gamma \left( 1 + \gamma^{-1} \right)^{\gamma}$ cancels.

Here, $\beta$ is the smallest value that both the points scored and the points allowed can take, so it is reasonable to set $\beta = 0$, which gives the following result.
$$
P \left( S > A \right) = {{ \mu_{S}^{\gamma} } \over { \mu_{S}^{\gamma} + \mu_{A}^{\gamma} }}
$$
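The closed form can be sanity-checked by simulation. The sketch below is an illustration under assumed, arbitrary parameter values, with hypothetical function names: it draws independent Weibull samples for $S$ and $A$ with a common $\beta$ and $\gamma$ and compares the observed frequency of $S > A$ with $\alpha_{S}^{\gamma} / \left( \alpha_{S}^{\gamma} + \alpha_{A}^{\gamma} \right)$.

```python
import math
import random


def win_prob_closed_form(alpha_s: float, alpha_a: float, gamma: float) -> float:
    """P(S > A) = alpha_S^gamma / (alpha_S^gamma + alpha_A^gamma)."""
    s = alpha_s ** gamma
    a = alpha_a ** gamma
    return s / (s + a)


def win_prob_monte_carlo(alpha_s: float, alpha_a: float, beta: float,
                         gamma: float, n: int = 200_000, seed: int = 0) -> float:
    """Estimate P(S > A) from n independent simulated games."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n):
        # random.weibullvariate takes (scale, shape); beta shifts both draws.
        s = beta + rng.weibullvariate(alpha_s, gamma)
        a = beta + rng.weibullvariate(alpha_a, gamma)
        if s > a:
            wins += 1
    return wins / n


if __name__ == "__main__":
    alpha_s, alpha_a, beta, gamma = 5.0, 4.0, 0.0, 1.8  # illustrative values
    print("closed form:", win_prob_closed_form(alpha_s, alpha_a, gamma))
    print("simulation :", win_prob_monte_carlo(alpha_s, alpha_a, beta, gamma))
    # With beta = 0 the means are alpha * Gamma(1 + 1/gamma), and the common
    # gamma-function factor cancels, so means can replace the scales.
    g = math.gamma(1.0 + 1.0 / gamma)
    print("via means  :", win_prob_closed_form(alpha_s * g, alpha_a * g, gamma))
```

The simulated frequency should agree with the closed form up to Monte Carlo error, which shrinks like $1 / \sqrt{n}$. Changing `beta` leaves the estimate essentially unchanged, since the same shift is applied to both $S$ and $A$.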


  1. Miller. (2005). A Derivation of the Pythagorean Won-Loss Formula in Baseball. https://doi.org/10.48550/arXiv.math/0509698