Drift in the ARIMA Model

Explanation

When analyzing time series, one often comes across a coefficient called Drift.

[Figure: 20190716_155512.png]

In the case above, the coefficient is too small relative to its standard error to matter. However, if the coefficient is significant and you need to write the fitted model as a formula, you have to understand what drift is. Unfortunately, good explanations of what drift actually is are hard to find domestically, and without formulas it can only be described as something ‘similar to a constant term’.

If you are not interested in formulas and the distinction is not crucial to your analysis, it is enough to think of drift as a constant term or an average; in that case, skip ahead to the second analysis below. If you are majoring in the field, though, it is strongly recommended that you follow the derivation at least once.

Mathematical Explanation

Let’s start with the ARIMA model $\nabla^{d} Y_{t} = \sum_{i = 1}^{p} \phi_{i} \nabla^{d} Y_{t-i} + e_{t} - \sum_{i = 1}^{q} \theta_{i} e_{t-i}$ . Or rather, let’s start with something simpler that involves no differencing, the ARMA model.

$\begin{align} \displaystyle Y_{t} - \sum_{i = 1}^{p} \phi_{i} Y_{t-i} = e_{t} - \sum_{i = 1}^{q} \theta_{i} e_{t-i} \end{align}$

To simplify the notation, define the following polynomials in the backshift operator $B$ , where $B Y_{t} = Y_{t-1}$ .

$\phi (B) := \left( 1 - \phi_{1} B - \cdots - \phi_{p} B^{p} \right)$

$\theta (B) := \left( 1 - \theta_{1} B - \cdots - \theta_{q} B^{q} \right)$

With this notation, $(1)$ becomes

$\phi (B) Y_{t} = \theta (B) e_{t}$

This is just another representation of the ARMA model. Now add a constant term $c$ as follows.

$\phi (B) \left( Y_{t} - c \right) = \theta (B) e_{t}$

Here, the $c$ added to equation $(1)$ is what we refer to as the Drift. Looking at the formula, you can see why drift is described as a ‘constant term’ or an ‘average’: the model is the same ARMA model, applied to $Y_{t}$ after subtracting $c$ . Of course, this is the ARMA model; in the ARIMA model, which includes differencing, drift appears differently. For convenience, consider only a single difference, $d=1$ , that is, $ARIMA(p,1,q)$ .

$\begin{align} \phi (B) \nabla \left( Y_{t} - a - b t \right) = \theta (B) e_{t} \end{align}$

In equation $(2)$ , $a$ is called the Intercept and $b$ is called the Drift. Expanding the difference $\nabla = 1 - B$ gives

$\begin{align*} & \phi (B) (1 - B) \left( Y_{t} - a - b t \right) = \theta (B) e_{t} \\ \iff & \phi (B) \left[ \left( Y_{t} - a - b t \right) - \left( Y_{t-1} - a - b (t-1) \right) \right] = \theta (B) e_{t} \\ \iff & \phi (B) \left[ Y_{t} - Y_{t-1} - a + a - bt + b(t-1) \right] = \theta (B) e_{t} \\ \iff & \phi (B) \left( \nabla Y_{t} - b \right) = \theta (B) e_{t} \end{align*}$
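
To see concretely why $b$ behaves like an average, it may help to write out the simplest case of the last line of the derivation. The sketch below assumes $p = 1$ , $q = 0$ , $|\phi_{1}| < 1$ , $E \left[ e_{t} \right] = 0$ , and a stationary differenced series so that $E \left[ \nabla Y_{t} \right] = E \left[ \nabla Y_{t-1} \right]$ ; these choices are made here for illustration and are not taken from the analysis above.

```latex
\begin{align*}
& (1 - \phi_{1} B) \left( \nabla Y_{t} - b \right) = e_{t} \\
\iff & \nabla Y_{t} = (1 - \phi_{1}) b + \phi_{1} \nabla Y_{t-1} + e_{t} \\
\implies & E \left[ \nabla Y_{t} \right] = (1 - \phi_{1}) b + \phi_{1} E \left[ \nabla Y_{t-1} \right] \\
\implies & E \left[ \nabla Y_{t} \right] = b
\end{align*}
```

Under these assumptions, $b$ is the average increase of $Y_{t}$ per time step, which is why it shows up as the mean of the differenced series.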

In the end, $b$ plays the same role that $c$ plays in the ARMA model with a constant term, so it makes sense to call it Drift. Although analyses requiring more than two differences are rare, the same argument generalizes to any natural number $d$ . [ NOTE: Readers with some mathematical intuition may notice that differencing makes the constant term $a$ in $(2)$ disappear, much as differentiation eliminates a constant. ]
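
As a rough illustration of that generalization, take $d = 2$ and subtract a quadratic trend instead of a linear one. The coefficient names $\beta_{0}, \beta_{1}, \beta_{2}$ are hypothetical, introduced only for this sketch.

```latex
\begin{align*}
\nabla t^{2} &= t^{2} - (t-1)^{2} = 2t - 1 \\
\nabla^{2} t^{2} &= (2t - 1) - \left( 2(t-1) - 1 \right) = 2 \\
\nabla^{2} \left( Y_{t} - \beta_{0} - \beta_{1} t - \beta_{2} t^{2} \right) &= \nabla^{2} Y_{t} - 2 \beta_{2}
\end{align*}
```

The constant and linear terms vanish after two differences, and only the highest-order coefficient survives (up to the factor $2$ ), mirroring the role of $b$ when $d = 1$ .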

Practice

[Figure: 20190716_155837.png]

Returning to the analysis: as expected, fitting $ARIMA(1,1,2)$ to the data and fitting $ARIMA(1,0,2)$ to the differenced data give the same result. The only difference is that the coefficient labeled drift in $ARIMA(1,1,2)$ is labeled mean in $ARIMA(1,0,2)$ . It is better to understand the formula, but even without it you can guess that drift is something ‘similar to a constant term or an average’.

Code

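library(TSA)       # provides the oil.price data set
library(forecast)  # provides auto.arima()
data("oil.price")
# Fit the original series
out <- auto.arima(oil.price); summary(out)
# Fit the differenced series
out <- auto.arima(diff(oil.price)); summary(out)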
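
The drift/mean correspondence can also be checked without any extra packages. The following is a minimal sketch on simulated data, not the oil.price series: the sample size, seed, and true drift are made up, and the model is simplified to $ARIMA(0,1,0)$ . It assumes that drift can be fitted in base R's `arima()` by passing a time index as `xreg`, since differencing that index leaves a constant.

```r
# Simulated random walk with drift; all values here are hypothetical.
set.seed(1)
n <- 500
b_true <- 0.5
y <- cumsum(b_true + rnorm(n))

# ARIMA(0,1,0) with drift: the time index becomes a constant after differencing,
# so its regression coefficient plays the role of the drift b.
time_index <- seq_len(n)
fit_drift <- arima(y, order = c(0, 1, 0), xreg = time_index)

# ARIMA(0,0,0) with a mean, fitted to the differenced series.
fit_mean <- arima(diff(y), order = c(0, 0, 0), include.mean = TRUE)

drift_hat <- unname(coef(fit_drift)[1])
mean_hat <- unname(coef(fit_mean)["intercept"])

# The two estimates should agree up to optimizer tolerance.
print(c(drift = drift_hat, mean = mean_hat, sample_mean = mean(diff(y))))
```

Both estimates should also be close to the sample mean of `diff(y)`, which is the 'average' reading of drift in its most direct form.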