**Time series modeling carefully collects and studies past observations, expressed as a time series, in order to develop a model that describes the inherent structure of the series. One of the most frequently used stochastic time series models is the Autoregressive Integrated Moving Average (ARIMA) model, whose popularity is mainly due to its flexibility in representing many varieties of time series with simplicity.**

**Autoregressive Integrated Moving Average (ARIMA) process for univariate time series**

ARIMA^{4} is a class of generalized models that captures the temporal structure in time series data. For this purpose, ARIMA combines an Autoregressive (AR) process and a Moving Average (MA) process to build a composite model of the time series. In particular, ARIMA forecasts the next values using autoregression with a set of parameters fitted to the model, and then applies a moving average with another set of parameters. In the autoregression, the variable of interest 𝑦_{𝑡} is forecasted using a linear combination of its past values 𝑦_{𝑡-1}, 𝑦_{𝑡-2}, …, 𝑦_{𝑡-p}. The autoregressive term is written as:

𝑦_{𝑡} = c + *α*_{1}𝑦_{𝑡-1} + *α*_{2}𝑦_{𝑡-2} + … + *α*_{p}𝑦_{𝑡-p} + *ε*_{𝑡}

where c is a constant, *α*_{i} (i = 1, 2, …, p) are the model parameters that need to be estimated, 𝑦_{𝑡-i} (i = 1, 2, …, p) are the lagged values of 𝑦_{𝑡}, and *ε*_{𝑡} is white noise.
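To make the recursion concrete, the AR(p) equation above can be simulated directly. The sketch below uses plain Python with illustrative values for c and the *α* coefficients (all chosen here, not taken from any dataset), and sets the noise term to zero so the recursion itself is visible:

```python
# Deterministic AR(2) sketch: y_t = c + a1*y_{t-1} + a2*y_{t-2} + eps_t,
# with eps_t = 0 so only the autoregressive recursion drives the series.
c, a1, a2 = 1.0, 0.5, 0.25   # illustrative parameters (assumptions)
y = [2.0, 3.0]               # two initial observations
for t in range(2, 10):
    y.append(c + a1 * y[t - 1] + a2 * y[t - 2])

# Since the AR coefficients sum to less than 1 (a stationary case),
# the series settles toward the stationary mean c / (1 - a1 - a2) = 4.
print(round(y[-1], 3))
```

With noise added back in, the series would fluctuate around that same mean instead of converging to it.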

The moving average term 𝑦_{𝑡} can be expressed based on the past forecast errors (rather than using past values):

𝑦_{𝑡} = *u* + *ϴ*_{1}*ε*_{𝑡-1} + *ϴ*_{2}*ε*_{𝑡-2} + … + *ϴ*_{q}*ε*_{𝑡-q} + *ε*_{𝑡}

where *u* is a constant, *ϴ*_{i} (i = 1, 2, …, q) are the model parameters, *ε*_{𝑡-i} (i = 1, 2, …, q) are the random shocks at time period t-i, and *ε*_{𝑡} is white noise.
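Because the MA part depends only on past shocks, it can be computed directly once the shock sequence is known. A minimal sketch, with illustrative values for *u*, the *ϴ* coefficients, and the shocks:

```python
# MA(2) sketch: y_t = u + th1*eps_{t-1} + th2*eps_{t-2} + eps_t,
# evaluated over a given sequence of past forecast errors (shocks).
u, th1, th2 = 0.0, 0.5, 0.25      # illustrative parameters (assumptions)
eps = [1.0, 2.0, 3.0, 4.0]        # illustrative shock sequence
y = [u + eps[t] + th1 * eps[t - 1] + th2 * eps[t - 2]
     for t in range(2, len(eps))]
# y[0] = 3 + 0.5*2 + 0.25*1 = 4.25;  y[1] = 4 + 0.5*3 + 0.25*2 = 6.0
print(y)  # [4.25, 6.0]
```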

Overall, the autoregressive (AR), moving average (MA) and integration (I) components are combined to form a class of time series models called ARIMA (with 𝑦’_{𝑡} representing the differenced time series), which is expressed as:

𝑦’_{𝑡} = *c* + *α*_{1}𝑦’_{𝑡-1} + *α*_{2}𝑦’_{𝑡-2} + … + *α*_{p}𝑦’_{𝑡-p} + *ϴ*_{1}*ε*_{𝑡-1} + *ϴ*_{2}*ε*_{𝑡-2} + … + *ϴ*_{q}*ε*_{𝑡-q} + *ε*_{𝑡}

An important prerequisite is to check whether a time series is stationary (constant mean and variance), e.g. through plotting and unit root testing with the augmented Dickey-Fuller^{1} or Phillips-Perron^{2} test. If the time series is not stationary, it can be made stationary by differencing^{3}.
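Differencing is the "I" in ARIMA: d-th order differencing subtracts consecutive values d times. As a sketch (illustrative data), a series with a linear trend has a drifting mean and is therefore non-stationary, but its first difference is constant:

```python
# First-order differencing: y'_t = y_t - y_{t-1}.
# A linear-trend series has a mean that drifts over time, but one
# round of differencing removes the trend entirely.
y = [3 + 2 * t for t in range(6)]             # 3, 5, 7, 9, 11, 13
diff = [y[t] - y[t - 1] for t in range(1, len(y))]
print(diff)  # [2, 2, 2, 2, 2]
```

In practice, the unit root tests themselves would be run with a statistics library rather than by hand, and d is chosen as the smallest number of differences that yields a stationary series.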

The best parameters are found using the Box-Jenkins method^{4}, a three-step approach that consists in:

- Model identification: ensuring that the variables are stationary, and selecting the orders based on the Autocorrelation Function (ACF)^{5} for the MA terms and the Partial Autocorrelation Function (PACF)^{5} for the AR terms.
- Parameter estimation: finding the parameters (*α* and *ϴ*) that best fit the ARIMA model, based on e.g. maximum likelihood^{6} or nonlinear least squares^{7}. Among candidate models, the best-suited model is the one with the best AIC or BIC value^{8}.
- Model checking: verifying that the residuals are white noise with constant mean and variance over time. If these assumptions are not satisfied, a more appropriate model needs to be fitted.
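The ACF values used in the identification step can be sketched in a few lines: the lag-h sample autocorrelation is the covariance between the series and its lag-h shifted copy, divided by the variance (plain Python, illustrative series):

```python
# Sample autocorrelation at lag h: covariance of (y_t, y_{t-h}),
# both taken around the sample mean, normalized by the variance.
def acf(y, h):
    n = len(y)
    m = sum(y) / n
    var = sum((v - m) ** 2 for v in y)
    cov = sum((y[t] - m) * (y[t - h] - m) for t in range(h, n))
    return cov / var

y = [2.0, 3.0, 3.0, 3.25, 3.375, 3.5, 3.59, 3.67]  # illustrative data
print(acf(y, 0))  # 1.0 by construction
```

For an MA(q) process the ACF cuts off after lag q, while for an AR(p) process the PACF cuts off after lag p, which is what makes these plots useful for choosing the orders.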

If all the assumptions are satisfied, future values can be forecasted according to the model. The ARIMA model has been generalized by Box and Jenkins to deal with seasonality.

**Seasonal Autoregressive Integrated Moving Average (SARIMA) process for univariate time series**

Seasonal ARIMA (SARIMA)^{10} deals with a seasonal component in univariate time series. In addition to the autoregression (AR), differencing (I) and moving average (MA), SARIMA accounts for the seasonal component of the time series through additional parameters and the period of the seasonality. The SARIMA model is hence represented as SARIMA(p,d,q)(P,D,Q)m, where P defines the order of the seasonal AR term, D the order of the seasonal integration term, Q the order of the seasonal MA term, and m the number of time steps in a seasonal period.
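Seasonal differencing (the D term) subtracts the value one full season back, 𝑦_{𝑡} - 𝑦_{𝑡-m}. A sketch with an illustrative, purely periodic series of period m = 4 shows that one seasonal difference removes the pattern entirely:

```python
# Seasonal differencing at lag m: y_t - y_{t-m}.
# A series that exactly repeats every m steps differences to zero.
m = 4
pattern = [10.0, 12.0, 15.0, 11.0]   # one illustrative season
y = pattern * 3                      # three repeated seasons
sdiff = [y[t] - y[t - m] for t in range(m, len(y))]
print(sdiff)  # all zeros: the seasonal component is removed
```

Real seasonal data would leave a non-seasonal remainder after this step, which the ordinary (p,d,q) part of the model then captures.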

**Vector Autoregressive Integrated Moving Average (VARMA) process for multivariate time series**

Contrary to the ARIMA model, which is fitted to a univariate time series, VARMA(p,q)^{10} deals with multiple time series that may influence each other. Each variable is regressed on p lags of itself and of all the other variables, and likewise on q lags of the error terms. Given k time series expressed as a vector V_{𝑡} = [𝑦_{1,𝑡}, 𝑦_{2,𝑡}, …, 𝑦_{k,𝑡}], the VARMA(p,q) model is defined by combining the VAR and MA models:

V_{𝑡} = c + *α*_{1}V_{𝑡-1} + … + *α*_{p}V_{𝑡-p} + *ϴ*_{1}*ε*_{𝑡-1} + … + *ϴ*_{q}*ε*_{𝑡-q} + *ε*_{𝑡}

where c is a vector of constants, *α*_{i} (i = 1, 2, …, p) and *ϴ*_{j} (j = 1, 2, …, q) are k × k matrices of model parameters, the terms *α*_{i}V_{𝑡-i} capture the lagged values and the cross-variable dependencies, *ε*_{𝑡-j} are vectors of random shocks at time t-j, and *ε*_{𝑡} is a white noise vector with zero mean and constant covariance matrix.
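The cross-series influence is easiest to see in the pure VAR part. The sketch below iterates a VAR(1) system for k = 2 with an illustrative coefficient matrix (noise set to zero so the dynamics are deterministic); the off-diagonal entries are what let each series depend on the other's past:

```python
# VAR(1) sketch for k = 2 series: V_t = c + A @ V_{t-1} (+ eps_t, here 0).
# Off-diagonal entries of A encode cross-series dependence.
c = [1.0, 0.5]                       # illustrative constants vector
A = [[0.5, 0.2],
     [0.1, 0.4]]                     # illustrative coefficient matrix
v = [0.0, 0.0]                       # initial state of both series
for _ in range(50):
    v = [c[0] + A[0][0] * v[0] + A[0][1] * v[1],
         c[1] + A[1][0] * v[0] + A[1][1] * v[1]]

# With the eigenvalues of A inside the unit circle (here 0.6 and 0.3),
# the system converges to the fixed point (I - A)^{-1} c.
print([round(x, 3) for x in v])  # [2.5, 1.25]
```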

In the following, we will use this family of models to model and predict the behavior of the NFV/CNF system and detect anomalies.

**If you have missed the first part, here you can read the introduction**

**References ⤵**

- [1] Dickey, D. A., & Fuller, W. A. (1979). Distribution of the estimators for autoregressive time series with a unit root. Journal of the American statistical association, 74(366a), 427-431
- [2] Phillips, P. C. B., & Perron, P. (1988). Testing for a unit root in time series regression. Biometrika, 75(2), 335-346.
- [3] Nason, G. P. (2006). Stationary and non-stationary time series. Statistics in volcanology, 60.
- [4] Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: forecasting and control. John Wiley & Sons.
- [5] Watson, P. K., & Teelucksingh, S. S. (2002). A practical introduction to econometric methods: Classical and modern. University of West Indies Press
- [6] Myung, I. J. (2003). Tutorial on maximum likelihood estimation. Journal of mathematical Psychology, 47(1), 90-100.
- [7] Hartley, H. O., & Booker, A. (1965). Nonlinear least squares estimation. Annals of Mathematical Statistics, 36(2), 638-650.
- [8] Akaike, H. (1998). Information theory and an extension of the maximum likelihood principle. In Selected Papers of Hirotugu Akaike (pp. 199-213). Springer, New York, NY.
- [9] Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: principles and practice. OTexts.
- [10] Brockwell, P. J., & Davis, R. A. (2016). Introduction to time series and forecasting. Springer.
