Selecting an ARMA Model Using EACF in R
Practice 1
PACF is very helpful in determining the order of $AR(p)$, while ACF aids in determining the order of $MA(q)$.
Let’s look at an example directly. The ma1.2.s data comes from an $MA(1)$ model and the ar1.s data comes from an $AR(1)$ model; both are included in the TSA package. With TSA loaded, the acf() and pacf() functions draw a correlogram over the lags $k$ as follows.
Looking at the diagram and concluding that the data follows an ARMA model merely because many bars exceed the blue line (the significance bound) is perilous. The reason is the duality between the two families: a stationary $AR(p)$ can be written as $MA(\infty)$, and an invertible $MA(q)$ can be written as $AR(\infty)$. Indeed, the ACF of $AR(1)$ (top left) and the PACF of $MA(1)$ (bottom right) decay gradually in magnitude instead of cutting off. Conversely, the PACF of $AR(1)$ and the ACF of $MA(1)$ drop sharply after $k=1$.
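The decay and cutoff patterns described above can also be checked against theoretical values rather than sample estimates. The following is a minimal sketch using ARMAacf() and ARMAtoMA() from base R's stats package; the coefficients 0.6 and 0.5 are assumptions chosen for illustration and are not the values behind ar1.s or ma1.2.s.

```r
# Theoretical ACF/PACF for an AR(1) and an MA(1); coefficients are assumed
# for illustration and are not taken from the TSA data sets.
phi   <- 0.6   # AR(1): Y_t = phi * Y_{t-1} + e_t
theta <- 0.5   # MA(1): Y_t = e_t + theta * e_{t-1} (R's sign convention)

# AR(1): the ACF decays geometrically, the PACF is zero after lag 1
ar_acf  <- ARMAacf(ar = phi, lag.max = 5)[-1]          # drop lag 0
ar_pacf <- ARMAacf(ar = phi, lag.max = 5, pacf = TRUE)

# MA(1): the ACF is zero after lag 1, the PACF decays gradually
ma_acf  <- ARMAacf(ma = theta, lag.max = 5)[-1]        # drop lag 0
ma_pacf <- ARMAacf(ma = theta, lag.max = 5, pacf = TRUE)

# MA(infinity) weights of the AR(1): psi_j = phi^j, which never reach zero
psi <- ARMAtoMA(ar = phi, lag.max = 5)

print(round(rbind(ar_acf, ar_pacf, ma_acf, ma_pacf), 3))
print(round(psi, 3))
```

The rows ar_pacf and ma_acf are zero beyond lag 1, while ar_acf and ma_pacf shrink slowly, which mirrors the four panels of the correlogram.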
The problem is there’s no way to know beforehand if real data is $AR(p)$ or $MA(q)$.
$$
\begin{align}
Y_{t} := 0.6 Y_{t-1} + e_{t} + 0.3 e_{t-1}
\end{align}
$$
For instance, arma11.s is sample data from the $ARMA(1,1)$ model defined in equation $(1)$, but when we draw its ACF and PACF, they appear as follows.
arma11.s is known to follow $ARMA(1,1)$, yet in the diagram above it looks unmistakably like an $AR(1)$ model. In other words, anyone identifying the model from the ACF and PACF alone might settle on an $AR(p)$ model even though the data came from an $ARMA(p,q)$ model.
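One way to see why this happens is to compute the theoretical PACF of equation $(1)$ and compare it with the approximate 95% bound $\pm 1.96/\sqrt{n}$ that the correlogram draws. The sketch below uses ARMAacf() from base R; the sample size n = 100 is an assumption made for illustration, so substitute length(arma11.s) once the data is loaded.

```r
# Theoretical ACF and PACF of Y_t = 0.6 Y_{t-1} + e_t + 0.3 e_{t-1}.
# R writes the MA part with a plus sign, so ma = 0.3 matches equation (1).
rho   <- ARMAacf(ar = 0.6, ma = 0.3, lag.max = 6)[-1]           # lags 1..6
alpha <- ARMAacf(ar = 0.6, ma = 0.3, lag.max = 6, pacf = TRUE)  # lags 1..6

n     <- 100               # assumed sample size, for illustration only
bound <- 1.96 / sqrt(n)    # approximate 95% limit under white noise

print(round(rbind(ACF = rho, PACF = alpha), 3))
print(abs(alpha) > bound)  # which theoretical PACF values exceed the limit
```

Under these assumptions only the first lag of the PACF stands clearly outside the bound, the second sits near its edge, and the rest fall inside it. In a sample of this size, a PACF that tails off this quickly is hard to tell apart from one that cuts off after lag 1.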
Typically, the ACF and PACF of an $ARMA(p,q)$ model behave as summarized in the table above. The question mark in ‘gradually decrease?’ is attached to the PACF of $ARMA(p,q)$ precisely because, as with arma11.s, there are cases where it drops sharply instead. This table appears in many introductions to time series, but it is not a necessary and sufficient condition. If both the ACF and PACF decrease gradually, one may suspect an $ARMA(p,q)$ model, but the ACF and PACF of an $ARMA(p,q)$ model do not always both decrease gradually.
In contrast, the EACF of arma11.s points to $ARMA(1,1)$, the model it was actually sampled from.
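To read the EACF, look at the table of x and o symbols that eacf() prints: following Cryer and Chan, the upper-left vertex of the triangle of o's suggests the orders $(p,q)$. The sketch below runs eacf() on arma11.s with a smaller table than the default and inspects the value it returns. The component names symbol and eacf are an assumption about the structure of the returned list, so confirm them with str() before relying on them.

```r
library(TSA)   # provides eacf() and the arma11.s data set

data(arma11.s)

# eacf() prints a table of "x" (significant) and "o" (not significant);
# rows are AR orders 0..ar.max and columns are MA orders 0..ma.max.
res <- eacf(arma11.s, ar.max = 4, ma.max = 6)

# The value is returned invisibly. The component names used below
# (symbol, eacf) are assumed; str() shows what is actually there.
str(res, max.level = 1)

# Row 2, column 2 of the symbol matrix corresponds to p = 1, q = 1
print(res$symbol[2, 2])
```

Keeping ar.max and ma.max small makes the triangle easier to spot by eye; the defaults produce a wider table.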
As another example, let’s look at lynx from R’s built-in data. lynx is annual data on the number of lynx trapped in Canada from 1821 to 1934.
Based on the ACF and PACF alone, lynx appears to follow an $AR(2)$ model, but it could in fact follow an $ARMA(2,2)$ model, as shown below.
Indeed, a quick check with the auto.arima() function from the forecast package shows that $AR(2)$ is not the model it selects. In this way, the difference between professionals and amateurs, between those who have studied properly and those who have not, is not whether one can produce an analysis at all, but whether one can glance at the ACF, PACF, and EACF, quickly sense that something is off, and identify the problem. Even an intuitive reading of the figures is greatly beneficial, and theoretical understanding adds far more value.
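As a rough cross-check that does not depend on automatic selection, the two candidates can be fitted directly and compared. This is a sketch with base R's arima(), called as stats::arima so that it behaves the same whether or not TSA (which masks arima) is loaded. Using AIC as the yardstick is an assumption made here; a lower AIC marks the preferred fit among the candidates, not proof of the true model.

```r
# lynx ships with base R (datasets package), so no data() call is needed
fit_ar2    <- stats::arima(lynx, order = c(2, 0, 0))  # AR(2) with a mean term
fit_arma22 <- stats::arima(lynx, order = c(2, 0, 2))  # ARMA(2,2) with a mean term

# Collect the AIC of each candidate; lower is better
aic <- c(AR2 = AIC(fit_ar2), ARMA22 = AIC(fit_arma22))
print(aic)
print(names(which.min(aic)))  # name of the candidate with the lower AIC
```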
Code
Below is the example code.
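library(TSA)  # acf(), eacf(), and the ar1.s, ma1.2.s, arma11.s data sets

# ACF and PACF of the simulated AR(1) and MA(1) series in a 2x2 grid
# (dev.new() replaces win.graph(), which exists only on Windows)
dev.new(width=6, height=6); par(mfrow=c(2,2))
data(ar1.s); acf(ar1.s, main="ACF of AR(1)"); pacf(ar1.s, main="PACF of AR(1)")
data(ma1.2.s); acf(ma1.2.s, main="ACF of MA(1)"); pacf(ma1.2.s, main="PACF of MA(1)")

# ACF, PACF, and EACF of the simulated ARMA(1,1) series
dev.new(width=6, height=3); par(mfrow=c(1,2))
data(arma11.s); acf(arma11.s, main="ACF of ARMA(1,1)"); pacf(arma11.s, main="PACF of ARMA(1,1)")
eacf(arma11.s)

# The same diagnostics for the lynx data
dev.new(width=6, height=3); par(mfrow=c(1,2))
data(lynx); acf(lynx, main="ACF of lynx"); pacf(lynx, main="PACF of lynx")
eacf(lynx)

# Automatic order selection with the forecast package
library(forecast)
out <- auto.arima(lynx); summary(out)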
See Also
Cryer. (2008). Time Series Analysis: With Applications in R (2nd Edition): p117.