Residual Analysis of ARIMA Models

Explanation

Like regression analysis, time series analysis also involves residual analysis. An ARIMA model assumes that its residuals are white noise, so they should show no remaining trend (linearity), constant variance (homoscedasticity), independence, and normality. These checks are generally applied less strictly than in regression analysis, with one exception: independence is checked rigorously. After all, the purpose of time series analysis is to capture autocorrelation; if the residuals are still autocorrelated, the model has left structure unexplained and the analysis is incomplete.
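To make the independence requirement concrete, the sketch below contrasts simulated white noise with a simulated AR(1) series using the Ljung-Box test from base R. The seed, the series length of 200, the AR coefficient 0.7, and the lag of 10 are arbitrary choices made for illustration only; they do not come from the example that follows.

```r
# Illustrative only: simulated data, not the WWWusage example.
set.seed(1)
wn  <- rnorm(200)                                   # white noise: independent draws
ar1 <- arima.sim(model = list(ar = 0.7), n = 200)   # strongly autocorrelated series

# Ljung-Box test; the null hypothesis is "no autocorrelation up to lag 10".
p_wn  <- Box.test(wn,  lag = 10, type = "Ljung-Box")$p.value
p_ar1 <- Box.test(ar1, lag = 10, type = "Ljung-Box")$p.value

print(c(white_noise = p_wn, ar1 = p_ar1))
```

A very small p-value for the AR(1) series signals leftover autocorrelation, which is exactly what the residuals of a well-fitted model should not show.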

Practice

In R, the tsdiag() function provides a simple residual analysis.

[Figure: WWWusage series with the fitted values drawn as a red line]

Take the built-in data set WWWusage as an example. WWWusage is a time series of the number of users connected to the Internet through a server, recorded every minute. The model fitted here is $ARIMA(1,1,1)$, and the red solid line is its fit. Passing the fitted model to tsdiag() produces the following three plots.

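[Figure: tsdiag() output with three panels: Standardized Residuals, ACF of Residuals, p values for Ljung-Box statistic]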

  • (1) Standardized Residuals: This is the familiar standardized residual plot from regression, drawn against time rather than as a scatter plot. Use it to check visually for leftover trend (linearity) and changing spread (homoscedasticity).
  • (2) ACF of Residuals: The ACF of the residuals should show no pattern and no significant spikes at nonzero lags. The autocorrelation at lag 0 is the correlation of the series with itself, which is always $\rho = 1$, so a spike there is nothing to worry about.
  • (3) p values for Ljung-Box Statistics: The Ljung-Box test is repeated for lags $k = 1, 2, \cdots$ and the p-values are plotted. The null hypothesis is that the residuals are uncorrelated up to lag $k$, so if a p-value drops below the blue line, the fitted $ARMA(p,q)$ structure is judged inadequate at significance level $\alpha = 5 \%$. According to the plot, the model fits well.
  • (4) Normality Test: Not shown in the plots and less critical, but if time allows and you want a more thorough analysis, consider the Shapiro-Wilk test or the Jarque-Bera test. In theory normality should always be checked; in practice the check is often omitted, because outliers are common and frequently cause it to fail. Fortunately, in this example normality is also satisfied, as shown below.
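The first three checks can also be computed by hand, which shows what each panel is measuring. The following is a minimal sketch using base R only; it is not the internal code of tsdiag(). The Ljung-Box statistic for lag $h$ is $Q = n(n+2)\sum_{k=1}^{h} \hat{\rho}_k^2 / (n-k)$. Passing fitdf = 2 (one AR plus one MA coefficient) to adjust the degrees of freedom, and stopping at lag 10, are choices made for this sketch.

```r
# Sketch with base R only; stats::arima is called explicitly so no package is needed.
fit <- stats::arima(WWWusage, order = c(1, 1, 1))
res <- residuals(fit)
n   <- length(res)

# (1) Standardized residuals: residuals divided by the estimated innovation s.d.
std_res <- res / sqrt(fit$sigma2)

# (2) Sample ACF of the residuals. Element 1 is lag 0, which is always 1.
r       <- acf(res, lag.max = 20, plot = FALSE)$acf[, 1, 1]
bound   <- qnorm(0.975) / sqrt(n)        # approximate 95% limits for white noise
outside <- which(abs(r[-1]) > bound)     # nonzero lags outside the limits

# (3) Ljung-Box p-values, one per lag. fitdf = 2 is an assumption of this sketch:
# it removes one degree of freedom for each estimated AR/MA coefficient.
fitdf <- 2
lags  <- (fitdf + 1):10
pvals <- sapply(lags, function(k)
  Box.test(res, lag = k, type = "Ljung-Box", fitdf = fitdf)$p.value)

print(outside)
print(round(setNames(pvals, lags), 3))
```

Lags at or below fitdf are skipped because the adjusted test needs at least one remaining degree of freedom.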

[Figure: output of the Shapiro-Wilk test on the residuals]
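The normality check in item (4) can be sketched in base R as follows. shapiro.test() is part of the stats package. The Jarque-Bera statistic is computed by hand here from the sample skewness $S$ and kurtosis $K$ as $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$, compared against a $\chi^2$ distribution with 2 degrees of freedom, so that no extra package is assumed; a packaged implementation may be more convenient.

```r
# Sketch with base R only; variable names are illustrative.
fit <- stats::arima(WWWusage, order = c(1, 1, 1))
res <- as.numeric(residuals(fit))
n   <- length(res)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal.
sw <- shapiro.test(res)

# Jarque-Bera statistic from the sample skewness and kurtosis.
z    <- res - mean(res)
m2   <- mean(z^2)
skew <- mean(z^3) / m2^1.5
kurt <- mean(z^4) / m2^2
jb   <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
jb_p <- pchisq(jb, df = 2, lower.tail = FALSE)

print(c(shapiro_p = sw$p.value, jarque_bera_p = jb_p))
```

For either test, a p-value above the chosen significance level means normality of the residuals is not rejected.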

Code

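library(TSA)  # used here for arima(), fitted() and tsdiag()
# Fit ARIMA(1,1,1) to WWWusage and print the estimates
out <- arima(WWWusage, order = c(1, 1, 1)); out
# dev.new() replaces the Windows-only win.graph(); series with the fit in red
dev.new(width = 6, height = 3); plot(WWWusage, main = 'WWWusage'); lines(fitted(out), col = 'red')
# Three diagnostic plots: standardized residuals, residual ACF, Ljung-Box p-values
dev.new(width = 6, height = 6); tsdiag(out)
# Shapiro-Wilk normality test on the residuals
shapiro.test(residuals(out))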