Epsilon-Delta Argument
Definition [1]
Let $I$ be an interval containing $a \in \mathbb{R}$, and suppose that $f$ is a function defined on $I \setminus \left\{ a \right\}$. If for every $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$ 0 < | x - a | < \delta \implies | f(x) - L | < \varepsilon $$
is satisfied, then we say that $f(x)$ converges to $L \in \mathbb{R}$ as $x$ approaches $a$. This is denoted as follows:
$$ \lim\limits_{x \to a} f(x) = L \qquad \text{or} \qquad f(x) \to L \quad \text{as } x \to a $$
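The definition can be probed numerically. The following is a minimal Python sketch, not a proof: the helper name `delta_works`, the sampling scheme, and the sample function `f` are all assumptions made for illustration. It samples points with $0 < | x - a | < \delta$ and reports whether any of them violates $| f(x) - L | < \varepsilon$. Finite sampling can only expose a bad $\delta$; it can never certify a good one.

```python
def delta_works(f, a, L, eps, delta, samples=1000):
    """Return False if a sampled x with 0 < |x - a| < delta has |f(x) - L| >= eps.

    A True result only means that no counterexample was found among the samples.
    """
    for k in range(1, samples + 1):
        # Offsets lie strictly between 0 and delta, so x never equals a.
        offset = delta * k / (samples + 1)
        for x in (a - offset, a + offset):
            if abs(f(x) - L) >= eps:
                return False
    return True


# Hypothetical example: f is deliberately undefined at a = 1,
# because the definition never evaluates f(a).
def f(x):
    if x == 1:
        raise ValueError("f is not defined at a = 1")
    return (x * x - 1) / (x - 1)  # equals x + 1 for x != 1


print(delta_works(f, a=1, L=2, eps=0.1, delta=0.1))
print(delta_works(f, a=1, L=2, eps=0.1, delta=0.5))
```

With $\varepsilon = 0.1$, the first call should find no violation for $\delta = 0.1$, while the second should find one for the overly generous $\delta = 0.5$.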
Explanation
The term “epsilon-delta” comes, as you can see, from the epsilon $\varepsilon$ and delta $\delta$ that appear in the definition. The notation was first used by Cauchy, the “father of analysis”, with $\varepsilon$ standing for error ($\varepsilon$rror) and $\delta$ for distance ($\delta$istance).
As you can see, the statement is complicated and far from intuitive, which is why it is hard to grasp at first. As with the limit of a sequence, there is a reason for redefining the limit this way and a reason the definition is so complicated, but understanding those reasons and truly understanding epsilon-delta are two different matters. In fact, understanding epsilon-delta is not enough; it only becomes useful once you are used to it.
To get a grip on it, imagine a shooting game. In this game, you have a gun $f$ and are shooting at a specified target $L$ from a set position $a$, where whether you hit the target or not is judged within an allowed error $\varepsilon$. Of course, if you couldn’t move at all from $a$, you wouldn’t be able to hit the target. Let’s assume that, given an error $\varepsilon$, the shooter can determine how far they need to move to hit the target and can thus propose an allowable distance $\delta$.
The given gun is $f(x) := 2x$, and with it, aiming at $x$ and shooting results in hitting $2x$. To judge if this gun is proper, one might test if it can hit $L=0$ from $a=0$. But can a gun with scattered hit points really hit the target? Let’s examine a few cases in practice.
**Case 1. $\varepsilon = 12$**
The first allowable error is given generously as $\varepsilon = 12$. Since only $| f(x) - L | < \varepsilon$ needs to be satisfied, any shot from $x$ with $| f(x) | < 12$ counts as a hit. Because $| f(x) | = 2 | x |$, this holds exactly when $| x | < 6$. Written as a formula:
$$ | x | < 6 \implies | f(x) | < 12 $$
This shows that, with an allowable error of $\varepsilon = 12$, we can identify an allowable distance $\delta = 6$ in which the gun $f$ can hit the target $L = 0$ from $a = 0$. Of course, a smaller distance would also work, but there is no need to make it harder than necessary.
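Case 1 can be spot-checked in Python. This is a rough sketch under assumptions: the helper `find_miss` and the evenly spaced sample points are made up for illustration, and a finite sample cannot replace the inequality above. It searches for a shot $x$ with $| x | < \delta$ that misses, that is, one with $| f(x) | \ge 12$.

```python
def f(x):
    return 2 * x  # the gun from the text


def find_miss(delta, eps, steps=1000):
    """Return a sampled x with |x| < delta and |f(x)| >= eps, or None if none is found."""
    for k in range(-steps + 1, steps):
        x = delta * k / steps  # evenly spaced points strictly inside (-delta, delta)
        if abs(f(x)) >= eps:
            return x
    return None


print(find_miss(delta=6, eps=12))  # delta = 6 from Case 1
print(find_miss(delta=5, eps=12))  # a smaller distance
print(find_miss(delta=7, eps=12))  # a larger distance
```

For $\delta = 6$ and $\delta = 5$ the search should come back empty, whereas $\delta = 7$ admits shots such as $x = 6.5$ that land at $13$, outside the allowed error.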
**Case 2. $\varepsilon = 6$**
The second allowable error is given as $\varepsilon = 6$. Just as before, all it takes is $| f(x) | = 2 | x | < 6$, so the allowable distance $\delta = 3$ can be presented.
**Case 3. $\varepsilon > 0$**
As we have seen, no matter how the allowable error $\varepsilon > 0$ is set, we can present an allowable distance $\delta = \varepsilon / 2$ to hit the target. Being able to specify $\delta$ for every $\varepsilon> 0$ essentially means the following:
$$ \forall \varepsilon > 0 , \exists \delta > 0 : | x - 0 | < \delta \implies | f(x) - 0 | < \varepsilon $$
Rewritten in familiar terms, it becomes $\lim_{x \to 0} 2x = 0$. So far, we have demonstrated that $2x \to 0$ as $x \to 0$. The shooting analogy is no longer needed, but to reiterate, it means that with the gun $f$ you can hit the target $L = 0$ from $a = 0$.
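For reference, the whole argument fits in one line. Given any $\varepsilon > 0$, choose $\delta = \varepsilon / 2$; then for every $x$ with $| x - 0 | < \delta$,
$$ | f(x) - 0 | = | 2x | = 2 | x | < 2 \delta = 2 \cdot \frac{\varepsilon}{2} = \varepsilon $$
Plugging in $\varepsilon = 12$ and $\varepsilon = 6$ recovers $\delta = 6$ and $\delta = 3$ from Cases 1 and 2.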
While reading that $\delta (12) = 6$ exists, that $\delta (6) = 3$ exists, and so on, you might have felt some understanding. By following the explanation you have, almost without noticing, proved $\lim_{x \to 0} 2x = 0$, but an analogy like this is easy to forget because it does not hang together tightly. Now, let’s consider why epsilon-delta is difficult.
Intuition: Using epsilon-delta feels different from the sense of ‘approaching infinitely close’ in $x \to a$ and $f(x) \to L$
In fact, this is the real reason for using epsilon-delta, but for now, it might not be ‘convincing’ why the existence of $\delta$ amounts to something like $\lim_{x \to a} f(x) = L$. If this is the only obstacle, it doesn’t mean you failed to understand epsilon-delta; it’s just unfamiliar. In $| x - a | < \delta$ and $| f(x) - L | < \varepsilon$, neither $\delta$ nor $\varepsilon$ is thought of as a ‘large number’. Each is perceived as a sufficiently small positive number that ‘suppresses’ $| x - a |$ or $| f(x) - L |$, ultimately leading to the following informal thought process:
$$ | x - a | < \delta \implies \lim_{\delta \to 0} | x - a | = 0 \implies x \to a $$
$$ | f(x) - L | < \varepsilon \implies \lim_{\varepsilon \to 0} | f(x) - L | = 0 \implies f(x) \to L $$
Terminology: The phrase ‘$\delta$ exists’ doesn’t quite resonate
In reality, this isn’t about literally creating $\delta$ but rather about presenting it in relation to $\varepsilon$. If you’ve managed to express $\delta$ as a function $\delta = \delta ( \varepsilon )$ of $\varepsilon$, then since the existence of $\varepsilon > 0$ is already assumed, $\delta$ exists as well.
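One way to picture ‘presenting $\delta$ in relation to $\varepsilon$’ is as a function that takes $\varepsilon$ and returns $\delta$. The sketch below assumes the example $f(x) = 2x$, $a = 0$, $L = 0$ from the explanation above; the names `delta_for` and `no_miss_found` are hypothetical, and the sampled check is only a sanity test, not a proof.

```python
def f(x):
    return 2 * x


def delta_for(eps):
    """The rule delta = eps / 2 found in Case 3."""
    return eps / 2


def no_miss_found(eps, steps=1000):
    """Sample x with 0 < |x| < delta_for(eps) and check |f(x)| < eps at each one."""
    delta = delta_for(eps)
    for k in range(1, steps):
        x = delta * k / steps  # strictly between 0 and delta
        if abs(f(x)) >= eps or abs(f(-x)) >= eps:
            return False
    return True


for eps in (12, 6, 1, 0.01):
    print(eps, delta_for(eps), no_miss_found(eps))
```

Whatever positive $\varepsilon$ is handed in, the rule hands back a positive $\delta$, which is all that ‘$\delta$ exists’ asks for.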
Order: The condition is $|x - a| < \delta \implies | f(x) - L | < \varepsilon$, but the order of thought is opposite
This is a really confusing part because the form of $\implies$ might make it seem like the order should go from front to back. However, as made clear by “for all $\varepsilon > 0$”, $| f(x) - L | < \varepsilon$ must be considered first, before $| x - a | < \delta$. Until you know what $\varepsilon$ is, there is no point in pondering $\delta$.
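In practice this reversed order shows up as scratch work: start from the conclusion and solve backward for $| x - a |$. As a sketch, take the same $f(x) = 2x$ but an arbitrary point $a$, with $L = 2a$ (this generalization is only an illustration and is not part of the original example):
$$ | f(x) - L | = | 2x - 2a | = 2 | x - a | < \varepsilon \iff | x - a | < \frac{\varepsilon}{2} $$
Only after this computation is $\delta = \varepsilon / 2$ announced, and the written proof then runs in the forward direction, from $| x - a | < \delta$ to $| f(x) - L | < \varepsilon$.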
Re-reading the explanation with these three reasons in mind will be helpful. Once you have understood it, a few odd points might stand out: the explanation cares only about $| x - a | < \delta$ instead of $0 < | x - a | < \delta$, it stops mentioning whether $f$ is a proper gun, it claims you cannot hit the target without moving from $a$, and so on. These are merely the result of twisting the analogy in various ways to make epsilon-delta as intuitive as possible. What matters is not to dwell on such irrelevant parts but to concentrate on the logically necessary proofs.
[1] William R. Wade, An Introduction to Analysis (4th Edition, 2010), p. 68