Differences Between Credible Intervals and Confidence Intervals

Theorem

The difference between a credible interval and a confidence interval can essentially be regarded as the difference between the Bayesian and frequentist approaches.

  • Confidence Interval (Frequentist): The parameter is a fixed constant, and it is the confidence interval that is randomly determined.
  • Credible Interval (Bayesian): The parameter is a random variable with a distribution, and the credible interval is determined from its posterior distribution.

Confidence Interval

In classical statistics, a $95 \%$ confidence interval $[a , b]$ for a parameter $\mu$ means that, if $100$ confidence intervals were constructed by the same method, about $95$ of them would satisfy $\mu \in [a,b]$. Although this is often abbreviated as $P ( \mu \in [a,b] ) = 95 \%$, its interpretation is, contrary to common sense, quite subtle. The very same formula can be read with two different nuances:

  1. The probability that $\mu$ is included in $[a,b]$ is $95 \% $. (X)
  2. The probability that $[a,b]$ contains $\mu$ is $95 \% $. (O)

Of course, the two sentences say the same thing, but the reader should feel the difference in nuance rather than in the words themselves. That is, it is not that the probability of $\mu$ lying in the confidence interval $[a , b]$ is $95 \%$; rather, of the $n$ confidence intervals $[a , b]$ produced by the same construction, about $95 \%$ will contain $\mu$. The simulation sketch below makes this reading concrete.
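As a minimal illustration, here is a simulation sketch in Python. The normal model, the known $\sigma$, and all the specific numbers are assumptions chosen for this example, not something from the original discussion: it constructs $100$ intervals by the same method and counts how many capture the fixed $\mu$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
mu, sigma, n = 10.0, 2.0, 30          # fixed parameter, known sd, sample size (all hypothetical)
z = stats.norm.ppf(0.975)             # two-sided 95% critical value

covered = 0
for _ in range(100):
    sample = rng.normal(mu, sigma, size=n)
    half = z * sigma / np.sqrt(n)     # half-width of the known-variance z-interval
    a, b = sample.mean() - half, sample.mean() + half   # [a, b] is random; mu is not
    covered += (a < mu < b)           # did this interval capture the fixed mu?

print(f"{covered} of 100 intervals contain mu")   # typically about 95
```

Note where the randomness lives: $a$ and $b$ change from run to run while $\mu$ never moves, and each interval either captures it or misses it.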

The parameter $\mu$ we’re curious about is a constant with an unknown distribution, but since the upper and lower limits of the confidence interval, $a,b$, have that distribution, we can consider the case where it’s both $a < \mu$ and $ \mu < b$. This subtle difference makes the confidence interval conceptually mismatched with our intuition, hence even explaining such a straightforward concept accompanies cluttered explanations about parts whether or not they’re included.

Since there’s no defined distribution for the parameter $\mu$ from the outset, saying $p ( \mu \in [a,b] )$ doesn’t make sense. If someone asks the value of $P \left( 0 \le X \le 2 \right)$ without mentioning the distribution of $X$, it would be quite perplexing, mirroring the inaccurate interpretation of confidence intervals.

From the frequentist perspective, the parameter $\mu$ exists as a fixed constant, and what varies with the sample is the confidence interval itself. One constructs a confidence interval on the assumption that the current sample resembles the population, and regards it as likely to be similar to the confidence intervals not yet constructed.

Credible Interval

Given data $y$, if a subset $C \subset \Theta$ of the parameter space $\Theta$ satisfies $P ( \theta \in C | y ) \ge 1 - \alpha$ for the significance level $\alpha$, then $C$ is called a $100(1 - \alpha) \%$ credible interval for $\theta$.

In contrast to confidence intervals, credible intervals start from a definition stated directly as a probability about $\theta$. This is possible because a posterior distribution $p ( \theta | y)$ is properly assumed for the parameter $\theta$. It therefore matches the intuitive reading of interval estimates that statistics students accept at first glance, as the sketch below shows.
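As a minimal sketch of the definition above, assume a binomial likelihood with a $\text{Beta}(1,1)$ (uniform) prior on a success probability $\theta$; the data and the prior here are hypothetical choices for illustration. The posterior is again a Beta distribution, so an equal-tailed $95 \%$ credible interval comes straight from its quantiles.

```python
from scipy import stats

# Hypothetical data: 7 successes in 20 trials, with a Beta(1, 1) prior on theta.
successes, trials = 7, 20
posterior = stats.beta(1 + successes, 1 + trials - successes)   # conjugate Beta posterior

# Equal-tailed interval: P(theta in C | y) = 0.95 holds by construction.
C = (posterior.ppf(0.025), posterior.ppf(0.975))
print(f"95% credible interval for theta: [{C[0]:.3f}, {C[1]:.3f}]")
```

Here the probability statement is about $\theta$ itself, exactly as the definition requires; no appeal to hypothetical repeated samples is needed.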

Bayesians don’t concern themselves with samples yet to be collected. It’s nothing but the best answer given, considering the assumption of the posterior distribution and the samples collected so far. Therefore, if $100$ credence intervals were made, there’s no need for a cluttered explanation stating that about $95 \% $ of them include the parameter $\theta$ within $C$.