Random Variables and Probability Distributions in Mathematical Statistics
Definition 1
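Let us assume that a probability $P$ is defined on the sample space $\Omega$.
- A function $X : \Omega \to \mathbb{R}$ whose domain is the sample space is called a Random Variable. The range $X(\Omega)$ of a random variable is also called its Space.
- A function $F_X : \mathbb{R} \to [0,1]$ that satisfies the following is called the Cumulative Distribution Function (cdf) of $X$.
$$
F_X(x) = P(X \le x) = P\left( \{ \omega \in \Omega : X(\omega) \le x \} \right)
$$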
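As a minimal sketch of these two definitions, assume a fair coin tossed twice and let the random variable count the heads. The names `omega`, `P`, `X`, and `cdf` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin (an assumed toy example).
omega = list(product("HT", repeat=2))

# Probability on the sample space: every outcome is equally likely.
P = {w: Fraction(1, len(omega)) for w in omega}

def X(w):
    """Random variable: a function from the sample space to the reals."""
    return w.count("H")

# The space of X is its range X(omega).
space = {X(w) for w in omega}

def cdf(x):
    """F_X(x) = P({w in omega : X(w) <= x})."""
    return sum(P[w] for w in omega if X(w) <= x)

print(sorted(space))             # [0, 1, 2]
print(cdf(0), cdf(1), cdf(2))    # 1/4 3/4 1
```

Exact fractions are used so that the cdf values can be compared with hand calculations without rounding.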
Discrete
- D1: If the space of the random variable $X$ is a countable set, then $X$ is called a Discrete Random Variable and is said to follow a discrete probability distribution.
- D2: The following function $p_X : \mathbb{R} \to [0,1]$ is called the Probability Mass Function (pmf) of the discrete random variable $X$.
$$
p_X(x) := P(X = x)
$$
- D3: $S_X := \{ x \in \mathbb{R} : p_X(x) > 0 \}$ is called the Support of $X$.
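To make D2 and D3 concrete, here is a small sketch that assumes $X$ is the sum of two fair six-sided dice. The example and the names `pmf` and `support` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed toy example: X is the sum of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))
counts = Counter(a + b for a, b in outcomes)

def pmf(x):
    """p_X(x) = P(X = x); zero for any x outside the space of X."""
    return Fraction(counts.get(x, 0), len(outcomes))

# Support: the points where the pmf is strictly positive.
# Scanning the integers 0..14 is enough for this example.
support = [x for x in range(0, 15) if pmf(x) > 0]

print(pmf(7))     # 1/6
print(pmf(13))    # 0
print(support)    # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

The pmf is defined on all of the reals but is nonzero only on the support, which is why sums over the support already add up to 1.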
Continuous
- C1: If the cumulative distribution function $F_X$ of the random variable $X$ is continuous at all $x \in \mathbb{R}$, then $X$ is called a Continuous Random Variable and is said to follow a continuous probability distribution.
- C2: A function $f_X : \mathbb{R} \to [0, \infty)$ that satisfies the following is called the Probability Density Function (pdf) of the continuous random variable $X$, and $X$ is said to be Absolutely Continuous.
$$
F_X(x) = \int_{-\infty}^{x} f_X(t) \, dt
$$
- C3: $S_X := \{ x \in \mathbb{R} : f_X(x) > 0 \}$ is called the Support of $X$.
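The relation in C2 between the pdf and the cdf can be checked numerically. The sketch below assumes an exponential density with rate 1, whose support is $(0, \infty)$, and approximates the integral with a midpoint rule. The function names and the truncation point `lower` are assumptions made for this example.

```python
import math

def pdf(x):
    """Assumed example density: exponential with rate 1, support (0, inf)."""
    return math.exp(-x) if x > 0 else 0.0

def cdf_numeric(x, lower=-5.0, steps=100_000):
    """Approximate F_X(x), the integral of the pdf from -inf to x.

    The density is zero below 0, so truncating the lower limit at
    `lower` loses nothing in this particular example.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    return sum(pdf(lower + (i + 0.5) * h) for i in range(steps)) * h

x = 2.0
print(cdf_numeric(x))      # close to 1 - exp(-2)
print(1 - math.exp(-x))    # closed form for comparison
```

The two printed values should agree to roughly three decimal places; the small gap comes from the midpoint rule straddling the jump of the density at 0.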
Explanation
Support, or support set, is, simply put, the set that marks the region we are interested in. It is not a commonly used term, but it conveys exactly what probability theory wants to express. Probability theory has no interest in what cannot happen, and a pmf or pdf value of $0$ at a point means that the point carries no probability. Thus $S_X$ can be seen as 'a really important set' or 'a set we must know', allowing us to direct our limited energy not towards the entirety of $\mathbb{R}$ but towards $S_X$.
Even in high school, teachers emphatically state that 'a random variable is a function'. However, genuinely conceptualizing and treating random variables as functions requires a higher level of abstraction. The definitions introduced here are not yet mathematically rigorous, and even so, describing the concept of probability with sets and functions is not an easy task. Don't despair if you don't understand immediately, and don't gloss over it if you think you do.
From the definitions, one can notice an essential difference between discrete and continuous random variables, and this difference carries over into the form of their formulas. At the undergraduate level it can be confusing, but it is crucial to understand that a Jacobian factor appears only when transforming continuous random variables.
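For contrast, here is a sketch of the discrete side of that remark. It assumes $X$ is a fair six-sided die and $g(x) = 2x$; both are illustrative choices. Under an injective $g$, the pmf of $Y = g(X)$ is obtained by relabeling the points, with no derivative factor.

```python
from fractions import Fraction

# Assumed toy example: X is a fair six-sided die and g(x) = 2x is injective.
p_X = {x: Fraction(1, 6) for x in range(1, 7)}

def g(x):
    return 2 * x

# For a discrete X and injective g, p_Y(g(x)) = p_X(x):
# probability mass is simply relabeled, with no Jacobian factor.
p_Y = {g(x): p for x, p in p_X.items()}

print(p_Y[4])               # 1/6, the same mass as p_X[2]
print(sum(p_Y.values()))    # 1
```

Stretching the points apart by a factor of 2 does not change any individual mass, whereas a density would have to be halved to keep its integral equal to 1.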
Theorem
For a continuous random variable $X$ with the support $S_X$ and a differentiable injective function $g$, if we define a random variable $Y$ as $Y := g(X)$, then the probability density function of $Y$ is derived as follows with respect to $y \in S_Y$.
$$
f_Y(y) = f_X\left( g^{-1}(y) \right) \left| \frac{dx}{dy} \right|
$$
[ NOTE: In fact, since $g$ is not assumed to be bijective, the existence of the inverse function $g^{-1}$ is not always guaranteed. ]
- Here, $S_Y$ is the support of $Y$, and $\frac{dx}{dy}$ means $\frac{d g^{-1}(y)}{dy}$.
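The change-of-variables density $f_X(g^{-1}(y)) \left| \frac{d g^{-1}(y)}{dy} \right|$ can be sanity-checked by simulation. The sketch below assumes $X$ is uniform on $(0, 1)$ and $g(x) = -\log x$, which is decreasing, so the formula should give $f_Y(y) = e^{-y}$ on $(0, \infty)$. All function names, the seed, and the interval $(a, b]$ are choices made for this example.

```python
import math
import random

def f_X(x):
    """Assumed example: X is uniform on (0, 1), so S_X = (0, 1)."""
    return 1.0 if 0 < x < 1 else 0.0

def g(x):
    return -math.log(x)    # differentiable, injective, decreasing on S_X

def g_inv(y):
    return math.exp(-y)    # inverse of g on S_Y = (0, inf)

def dx_dy(y, h=1e-6):
    """Central-difference approximation of d g_inv(y) / dy."""
    return (g_inv(y + h) - g_inv(y - h)) / (2 * h)

def f_Y(y):
    """Density of Y = g(X) from the change-of-variables formula."""
    return f_X(g_inv(y)) * abs(dx_dy(y))

# Monte Carlo check: the share of samples of Y in (a, b] should be close to
# the integral of f_Y over (a, b], approximated here by the midpoint rule.
random.seed(0)
n = 200_000
a, b = 0.5, 1.5
samples = [g(1.0 - random.random()) for _ in range(n)]   # 1 - U lies in (0, 1]
empirical = sum(a < y <= b for y in samples) / n

steps = 1_000
width = (b - a) / steps
integral = sum(f_Y(a + (i + 0.5) * width) for i in range(steps)) * width

print(empirical, integral)    # both near exp(-0.5) - exp(-1.5)
```

Dropping the absolute value in `f_Y` would make the density negative here, because $g$ is decreasing and $\frac{dx}{dy} < 0$; this is the situation handled by the second case of the proof.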
Proof
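$g$ is injective and continuous, so it is either increasing or decreasing. Let's consider the two cases.
Case 1. If $g$ is increasing
$$
F_Y(y) = P\left( g(X) \le y \right) = P\left( X \le g^{-1}(y) \right) = F_X\left( g^{-1}(y) \right)
$$
According to the fundamental theorem of calculus, the probability density function of $Y$ is
$$
f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} \int_{-\infty}^{g^{-1}(y)} f_X(t) \, dt = f_X\left( g^{-1}(y) \right) \frac{d g^{-1}(y)}{dy} = f_X\left( g^{-1}(y) \right) \frac{dx}{dy}
$$
Since $g$ is increasing, $\frac{dx}{dy} = \left| \frac{dx}{dy} \right|$, and therefore
$$
f_Y(y) = f_X\left( g^{-1}(y) \right) \left| \frac{dx}{dy} \right|
$$
Case 2. If $g$ is decreasing
$$
F_Y(y) = P\left( g(X) \le y \right) = P\left( X \ge g^{-1}(y) \right) = 1 - F_X\left( g^{-1}(y) \right)
$$
where the last equality uses $P\left( X = g^{-1}(y) \right) = 0$ for a continuous $X$. Similarly, $f_Y(y) = - f_X\left( g^{-1}(y) \right) \frac{dx}{dy}$. Since $g$ is decreasing, $\frac{dx}{dy} = - \left| \frac{dx}{dy} \right|$, and therefore
$$
f_Y(y) = f_X\left( g^{-1}(y) \right) \left| \frac{dx}{dy} \right|
$$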
■
Rigorous Definition
- Random Variables and Probability Distributions defined by Measure Theory
- Cumulative Distribution Function defined by Measure Theory
- Discrete Probability Distribution in Measure Theory
- Absolute Continuity in Measure Theory
Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): pp. 32-41.