ARIMA Models Tutorial

Introduction to ARIMA Models

ARIMA stands for AutoRegressive Integrated Moving Average. It is a class of models that explains a time series using its own past values (the AR terms), differencing of the series to remove trends and make it stationary (the I part), and past forecast errors (the MA terms). ARIMA models are widely used in time series forecasting.

Understanding the Components of ARIMA

An ARIMA model is characterized by three parameters: (p, d, q).

  • p: The number of lag observations included in the model (AR part).
  • d: The number of times that the raw observations are differenced (I part).
  • q: The number of lagged forecast errors included in the model (MA part).
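
To make the three parameters concrete, here is a minimal sketch that simulates an ARIMA(1, 1, 1) series by hand with NumPy. The helper name simulate_arima_111 and the coefficient values are illustrative assumptions, not part of statsmodels or of the dataset used later.

```python
import numpy as np

def simulate_arima_111(n, phi, theta, seed=0):
    """Simulate one ARIMA(1, 1, 1) path (illustrative helper, not a library API)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n)   # white-noise shocks (the forecast errors)
    w = np.zeros(n)            # stationary ARMA(1, 1) part
    w[0] = eps[0]
    for t in range(1, n):
        # p = 1: one lagged value; q = 1: one lagged shock
        w[t] = phi * w[t - 1] + eps[t] + theta * eps[t - 1]
    y = np.cumsum(w)           # d = 1: integrate the ARMA part once
    return y, w, eps

y, w, eps = simulate_arima_111(n=200, phi=0.5, theta=0.3)

# Differencing once undoes the integration and recovers the ARMA part.
print(np.allclose(np.diff(y), w[1:]))
```

Setting phi and theta to zero reduces the series to a plain random walk, which is the ARIMA(0, 1, 0) special case.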

Step-by-Step Guide to Building an ARIMA Model

Step 1: Importing Libraries

import pandas as pd
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme()  # apply seaborn's default plot styling

Step 2: Load and Visualize the Data

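For this tutorial, we'll use a sample time series stored in sample_time_series.csv. The file needs a Date column, which becomes the index, and a Values column holding the observations.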

# Load dataset
data = pd.read_csv('sample_time_series.csv', index_col='Date', parse_dates=True)

# Visualize the data
plt.figure(figsize=(10, 6))
plt.plot(data)
plt.title('Sample Time Series Data')
plt.xlabel('Date')
plt.ylabel('Values')
plt.show()

[Figure: Sample Time Series Data Plot]
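
If you do not have such a file, a stand-in with the same layout can be generated. This is a sketch under stated assumptions: make_sample_series is a hypothetical helper, and the random walk with drift it produces is made-up data, not the dataset behind the outputs shown in this tutorial.

```python
import numpy as np
import pandas as pd

def make_sample_series(n=100, start="2000-01-01", seed=42):
    """Build a synthetic daily series with a 'Date' index and a 'Values' column."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range(start=start, periods=n, freq="D", name="Date")
    # Random walk with a small upward drift, so the raw series is trending
    values = 50 + np.cumsum(rng.normal(loc=0.1, scale=1.0, size=n))
    return pd.DataFrame({"Values": values}, index=dates)

sample = make_sample_series()
print(sample.head())

# Uncomment to write the file that pd.read_csv loads above:
# sample.to_csv("sample_time_series.csv")
```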

Step 3: Stationarity Check

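Before fitting an ARIMA model, we need to check whether the time series is stationary, meaning its mean and variance do not change over time. The Augmented Dickey-Fuller (ADF) test is a common check. Its null hypothesis is that the series has a unit root (is non-stationary), so a p-value below a chosen threshold such as 0.05 is evidence that the series is stationary.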

from statsmodels.tsa.stattools import adfuller

result = adfuller(data['Values'])
print('ADF Statistic:', result[0])
print('p-value:', result[1])

Output:

ADF Statistic: -3.123456
p-value: 0.012345

Step 4: Differencing the Data

If the test does not indicate stationarity, we need to difference the data, replacing each value with its change from the previous observation. First-order differencing can be done as follows:

# Differencing the data
data_diff = data.diff().dropna()

# Visualize the differenced data
plt.figure(figsize=(10, 6))
plt.plot(data_diff)
plt.title('Differenced Time Series Data')
plt.xlabel('Date')
plt.ylabel('Differenced Values')
plt.show()

[Figure: Differenced Time Series Data Plot]
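
Differencing is reversible as long as the first original value is kept, which is how values on the differenced scale map back to the original scale. A minimal sketch on a made-up five-point series:

```python
import numpy as np
import pandas as pd

series = pd.Series([10.0, 12.0, 11.0, 15.0, 18.0])

# First difference: each value minus the previous one (first row becomes NaN)
diffed = series.diff().dropna()      # 2.0, -1.0, 4.0, 3.0

# Undo it: cumulative sum of the changes plus the starting level
restored = diffed.cumsum() + series.iloc[0]   # 12.0, 11.0, 15.0, 18.0

print(len(series), len(diffed))      # differencing drops one observation
print(np.allclose(restored.values, series.iloc[1:].values))
```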

Step 5: Fit the ARIMA Model

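Now, we can fit the ARIMA model. For this example, we'll use ARIMA(1,1,1). Because d = 1 tells the model to difference the series once internally, we pass the original data rather than the manually differenced series; passing data_diff with d = 1 would difference twice.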

# Fit the ARIMA model
model = ARIMA(data, order=(1, 1, 1))
model_fit = model.fit()

# Summary of the model
print(model_fit.summary())

Output:

                               SARIMAX Results                                
==============================================================================
Dep. Variable:                 Values   No. Observations:                  100
Model:                 ARIMA(1, 1, 1)   Log Likelihood                -120.000
Date:                Tue, 01 Jan 2023   AIC                            246.000
Time:                        00:00:00   BIC                            254.000
Sample:                    01-01-2000   HQIC                           249.000
                         - 12-31-2000                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.1234      0.056      2.200      0.028       0.013       0.234
ar.L1         -0.4567      0.123     -3.713      0.000      -0.698      -0.215
ma.L1          0.5678      0.098      5.791      0.000       0.375       0.760
sigma2         1.2345      0.234      5.278      0.000       0.776       1.693
==============================================================================

Step 6: Diagnostic Plots

We can use diagnostic plots to check that the residuals of the model are approximately normally distributed and uncorrelated. Structure left in the residuals suggests the model has not captured everything in the series.

# Diagnostic plots
model_fit.plot_diagnostics(figsize=(15, 12))
plt.show()

[Figure: Diagnostic Plots]

Step 7: Forecasting

Finally, we can use the fitted ARIMA model to make forecasts.

# Forecasting
forecast = model_fit.forecast(steps=10)
print(forecast)

# Plot the forecast
plt.figure(figsize=(10, 6))
plt.plot(data, label='Original')
plt.plot(forecast, label='Forecast')
plt.title('Forecasting using ARIMA Model')
plt.xlabel('Date')
plt.ylabel('Values')
plt.legend()
plt.show()

[Figure: Forecast Plot]