Commit 03ed4d9

Merge pull request #73 from DiogoRibeiro7/feat/reserve_branche
Feat/reserve branche
2 parents 15493b4 + 35342cd commit 03ed4d9

7 files changed

+897
-53
lines changed

.github/workflows/todo.yml

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
name: Create issues from TODOs

on:
  workflow_dispatch:
    inputs:
      importAll:
        default: false
        required: false
        type: boolean
        description: Enable, if you want to import all TODOs. Runs on checked out branch! Only use if you're sure what you are doing.
  push:
    branches:
      - main

permissions:
  issues: write
  repository-projects: read
  contents: read

jobs:
  todos:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Run Issue Bot
        uses: juulsn/todo-issue@main
        with:
          excludePattern: '^(node_modules/)'
        env:
          GITHUB_TOKEN: ${{ secrets.PERSONAL_GITHUB_TOKEN }} # Replace with your PAT secret

Gemfile.lock

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -122,9 +122,10 @@ GEM
122122
rouge (4.4.0)
123123
rubyzip (2.3.2)
124124
safe_yaml (1.0.5)
125-
sass-embedded (1.79.3)
125+
sass-embedded (1.79.3-arm64-darwin)
126+
google-protobuf (~> 4.27)
127+
sass-embedded (1.79.3-x86_64-linux-gnu)
126128
google-protobuf (~> 4.27)
127-
rake (>= 13)
128129
sawyer (0.9.2)
129130
addressable (>= 2.3.5)
130131
faraday (>= 0.17.3, < 3)
@@ -137,6 +138,7 @@ GEM
137138

138139
PLATFORMS
139140
arm64-darwin-22
141+
arm64-darwin-23
140142
x86_64-linux
141143

142144
DEPENDENCIES

_posts/-_ideas/2030-01-01-future_articles_time_series.md

Lines changed: 6 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -15,25 +15,18 @@ tags: []
1515

1616
Here are several article ideas that would complement the ARIMAX time series model article, expanding on related topics within time series analysis, forecasting, and statistical modeling:
1717

18-
### 1. **"A Comprehensive Guide to ARIMA Time Series Modeling"**
19-
- Discuss the fundamentals of the ARIMA model (AutoRegressive Integrated Moving Average).
20-
- Cover components (AR, I, MA), how to identify model parameters, model validation, and practical applications.
21-
- Comparison with other models like ARIMAX, SARIMA, and ARMA.
22-
23-
### 2. **"Time Series Forecasting with SARIMA: Seasonal ARIMA Explained"**
18+
19+
<!-- TODO: 2. **"Time Series Forecasting with SARIMA: Seasonal ARIMA Explained"**
2420
- Introduce SARIMA (Seasonal ARIMA) and its relevance to time series with strong seasonal patterns.
2521
- Explain how SARIMA extends ARIMA by modeling seasonality, including parameter selection and interpretation.
26-
- Provide real-world examples and R or Python code for implementation.
22+
- Provide real-world examples and R or Python code for implementation. -->
23+
2724

28-
### 3. **"Introduction to Exponential Smoothing Methods for Time Series Forecasting"**
29-
- Explore simple, double, and triple exponential smoothing techniques (ETS).
30-
- Discuss how these methods compare to ARIMA-based models in terms of complexity and applicability.
31-
- Provide examples of using exponential smoothing in retail, inventory management, and finance.
3225

33-
### 4. **"Machine Learning Approaches to Time Series Forecasting: A Comparative Analysis"**
26+
<!-- TODO: **"Machine Learning Approaches to Time Series Forecasting: A Comparative Analysis"**
3427
- Compare traditional statistical methods (ARIMA, ARIMAX) with machine learning models like LSTM (Long Short-Term Memory), Random Forest, and Prophet for time series forecasting.
3528
- Discuss the advantages and limitations of each approach.
36-
- Provide examples and code implementation in Python.
29+
- Provide examples and code implementation in Python. -->
3730

3831
### 5. **"Multivariate Time Series Forecasting: VAR and VECM Models Explained"**
3932
- Dive into the Vector AutoRegressive (VAR) model and Vector Error Correction Model (VECM) for multivariate time series data.
Lines changed: 259 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,259 @@
1+
---
2+
author_profile: false
3+
categories:
4+
- Time Series Analysis
5+
classes: wide
6+
date: '2024-10-04'
7+
excerpt: A detailed exploration of the ARIMA model for time series forecasting. Understand its components, parameter identification techniques, and comparison with ARIMAX, SARIMA, and ARMA.
8+
header:
9+
image: /assets/images/data_science_4.jpg
10+
og_image: /assets/images/data_science_4.jpg
11+
overlay_image: /assets/images/data_science_4.jpg
12+
show_overlay_excerpt: false
13+
teaser: /assets/images/data_science_4.jpg
14+
twitter_image: /assets/images/data_science_4.jpg
15+
keywords:
16+
- ARIMA
17+
- time series forecasting
18+
- SARIMA
19+
- ARIMAX
20+
- ARMA
21+
- python
22+
- r
23+
seo_description: Learn the fundamentals of ARIMA (AutoRegressive Integrated Moving Average) modeling, including components, parameter identification, validation, and practical applications.
24+
seo_title: ARIMA Time Series Modeling Explained
25+
seo_type: article
26+
summary: This guide delves into the AutoRegressive Integrated Moving Average (ARIMA) model, a powerful tool for time series forecasting. It covers the essential components, how to identify model parameters, validation techniques, and how ARIMA compares with other time series models like ARIMAX, SARIMA, and ARMA.
27+
tags:
28+
- ARIMA
29+
- Time Series Modeling
30+
- Forecasting
31+
- Data Science
32+
- python
33+
- r
34+
title: A Comprehensive Guide to ARIMA Time Series Modeling
35+
---
36+
37+
Time series analysis is a crucial tool in various industries such as finance, economics, and engineering, where forecasting future trends based on historical data is essential. One of the most widely used models in this domain is the **ARIMA (AutoRegressive Integrated Moving Average)** model. It is a powerful statistical technique that can model and predict future points in a series based on its own past values. In this article, we will delve into the fundamentals of ARIMA, explain its components, how to identify the appropriate model parameters, and compare it with other similar models like ARIMAX, SARIMA, and ARMA.
38+
39+
---
40+
41+
## 1. Understanding Time Series Data
42+
43+
Before delving into the details of ARIMA, it's essential to understand the basics of time series data. A **time series** is a sequence of data points, typically collected at regular intervals over time. These points can represent daily stock prices, monthly sales data, or even annual GDP growth. What distinguishes time series data from other data types is its inherent **temporal ordering**: observations arrive in a fixed sequence, and each one is often correlated with those that came before it.
44+
45+
Time series data often exhibit patterns such as trends, seasonal variations, and cycles:
46+
47+
- **Trend**: A long-term increase or decrease in the data.
48+
- **Seasonality**: Regular patterns that repeat at fixed intervals, such as hourly, daily, weekly, or yearly.
49+
- **Cyclicality**: Longer-term fluctuations that are not as regular as seasonal effects.
50+
51+
A fundamental task in time series analysis is **forecasting**, or predicting future values based on past observations. ARIMA is one of the most powerful and flexible models for such forecasting tasks.
52+
53+
---
54+
55+
## 2. Introduction to ARIMA Models
56+
57+
### What is ARIMA?
58+
59+
ARIMA stands for **AutoRegressive Integrated Moving Average**. It is a generalization of simpler time series models, like **AR (AutoRegressive)** and **MA (Moving Average)** models, and incorporates differencing (the **Integrated** component) to handle non-stationary data.
60+
61+
ARIMA models are typically denoted as **ARIMA(p, d, q)**, where:
62+
63+
- **p** is the number of autoregressive terms (AR),
64+
- **d** is the number of times the series is differenced to make it stationary (Integrated),
65+
- **q** is the number of lagged forecast errors in the prediction equation (MA).
66+
67+
The primary goal of ARIMA is to capture autocorrelations in the time series and use them to make accurate forecasts.
68+
69+
### Components of ARIMA (AR, I, MA)
70+
71+
1. **Autoregressive (AR) Component**:
72+
The AR component of the model is based on the idea that the current value of the time series can be explained by its previous values. Mathematically, it can be expressed as:
73+
74+
$$
75+
Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \epsilon_t
76+
$$
77+
78+
Here, $$Y_t$$ is the current value of the time series, $$Y_{t-1}, Y_{t-2}, \dots, Y_{t-p}$$ are the past values, $$\phi_1, \phi_2, \dots, \phi_p$$ are the AR coefficients, and $$\epsilon_t$$ is white noise.
79+
80+
2. **Integrated (I) Component**:
81+
The Integrated part of ARIMA is responsible for differencing the time series to make it stationary, i.e., to remove trends and stabilize the mean. If the time series is non-stationary, we can apply differencing:
82+
83+
$$
84+
Y'_t = Y_t - Y_{t-1}
85+
$$
86+
87+
This process can be repeated $$d$$ times until the series becomes stationary.
88+
89+
3. **Moving Average (MA) Component**:
90+
The MA component relies on the assumption that the current value of the series is a linear combination of past forecast errors. This can be expressed as:
91+
92+
$$
93+
Y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}
94+
$$
95+
96+
Where $$\epsilon_t$$ is the error term and $$\theta_1, \theta_2, \dots, \theta_q$$ are the MA coefficients.
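
To make the AR and MA equations above concrete, here is a minimal simulation sketch using only the Python standard library. The function names `simulate_ar1` and `simulate_ma1`, the coefficient values, and the standard normal noise are illustrative assumptions, not part of any library.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate Y_t = phi * Y_{t-1} + eps_t with standard normal noise."""
    rng = random.Random(seed)
    y = [0.0]  # arbitrary starting value
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def simulate_ma1(theta, n, seed=0):
    """Simulate Y_t = eps_t + theta * eps_{t-1} with standard normal noise."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [eps[t] + theta * eps[t - 1] for t in range(1, n + 1)]

ar_series = simulate_ar1(phi=0.7, n=200)   # each value leans on the previous value
ma_series = simulate_ma1(theta=0.5, n=200) # each value leans on the previous shock
print(ar_series[:5])
print(ma_series[:5])
```

In the AR series a shock persists through every later value with geometrically shrinking weight, while in the MA(1) series a shock affects only the current and the next observation.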
97+
98+
---
99+
100+
## 3. How ARIMA Works: A Step-by-Step Approach
101+
102+
### Stationarity and Differencing
103+
104+
A critical assumption in ARIMA modeling is that the series being modeled, after any differencing, is stationary. A **stationary time series** has a constant mean and variance over time, and the autocorrelation between two observations depends only on the lag between them, not on when they occur.
105+
106+
To check for stationarity, you can use the **Augmented Dickey-Fuller (ADF)** test. If the series is found to be non-stationary, differencing (the "I" in ARIMA) can be applied to transform the series into a stationary one. Differencing involves subtracting the previous observation from the current observation. In some cases, higher-order differencing may be required to achieve stationarity.
107+
108+
Mathematically, first-order differencing is:
109+
110+
$$
111+
Y'_t = Y_t - Y_{t-1}
112+
$$
113+
114+
If first-order differencing doesn’t result in stationarity, second-order differencing can be used:
115+
116+
$$
117+
Y''_t = Y'_t - Y'_{t-1}
118+
$$
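
A small worked example of the two formulas, as a sketch in plain Python (the helper `difference` is a hypothetical name): a series with a quadratic trend still trends after one difference but becomes constant after two.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times."""
    for _ in range(order):
        series = [curr - prev for prev, curr in zip(series, series[1:])]
    return series

quadratic = [t ** 2 for t in range(6)]    # 0, 1, 4, 9, 16, 25
first = difference(quadratic, order=1)    # still increasing
second = difference(quadratic, order=2)   # constant
print(first)   # [1, 3, 5, 7, 9]
print(second)  # [2, 2, 2, 2]
```

Each round of differencing shortens the series by one observation, which is one reason to avoid differencing more often than necessary.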
119+
120+
### Autocorrelation and Partial Autocorrelation
121+
122+
Once the series is stationary, the next step is to examine the **Autocorrelation Function (ACF)** and **Partial Autocorrelation Function (PACF)** plots, which help in determining the AR and MA components.
123+
124+
- **ACF**: Measures the correlation between the time series and its lagged values. It helps identify the MA term (q).
125+
- **PACF**: Measures the correlation between the time series and its lagged values, but after removing the effects of intermediate lags. It helps identify the AR term (p).
126+
127+
The ACF and PACF plots provide insight into the structure of the model and assist in selecting the appropriate values for $$p$$ and $$q$$.
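
The quantities behind those plots can be computed directly. The sketch below implements the sample ACF and derives the PACF from it with the Durbin-Levinson recursion, using only the standard library; the function names and the simulated AR(1) input are assumptions for illustration. In practice a library routine would normally be used instead.

```python
import random

def acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    result = [1.0]
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result

# Simulated AR(1) series with coefficient 0.7 (illustrative data)
rng = random.Random(1)
y = [0.0]
for _ in range(1999):
    y.append(0.7 * y[-1] + rng.gauss(0.0, 1.0))

print([round(v, 2) for v in acf(y, 4)])   # decays gradually
print([round(v, 2) for v in pacf(y, 4)])  # near zero after lag 1
```

For an AR(1) process the ACF tails off geometrically while the PACF drops to roughly zero after lag 1, which is the pattern the identification rules in the next section rely on.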
128+
129+
---
130+
131+
## 4. Model Identification: Choosing ARIMA Parameters (p, d, q)
132+
133+
### AutoRegressive (AR) Term - p
134+
135+
The **AR term (p)** represents the number of lagged values that are used in the model to predict the current value. In simple terms, it captures the extent to which the past values influence the current observation.
136+
137+
When identifying the AR term, one typically looks at the **PACF plot**. If the PACF cuts off after lag $$p$$, that indicates the presence of an AR(p) process. For example, if the PACF shows significant spikes up to lag 2 but no significant correlation after that, it suggests an AR(2) process.
138+
139+
### Integrated (I) Term - d
140+
141+
The **Integrated term (d)** represents the number of times the data has been differenced to achieve stationarity. The value of $$d$$ is determined based on whether the original series is stationary. If the data has a clear trend or is non-stationary, differencing is required.
142+
143+
A time series typically requires $$d = 1$$ if the data has a linear trend, and $$d = 2$$ if the trend is quadratic. It's rare to use $$d$$ values greater than 2 in practical scenarios.
144+
145+
### Moving Average (MA) Term - q
146+
147+
The **MA term (q)** refers to the number of lagged forecast errors that are included in the model. It captures the extent to which previous errors affect the current observation.
148+
149+
To identify the MA term, one looks at the **ACF plot**. If the ACF cuts off after lag $$q$$, that suggests an MA(q) process. For example, if the ACF shows significant spikes up to lag 1 but cuts off after that, it implies an MA(1) process.
150+
151+
---
152+
153+
## 5. Model Validation and Diagnostics
154+
155+
### Residual Analysis
156+
157+
After fitting an ARIMA model, it's crucial to perform diagnostics to ensure the model's adequacy. One of the key diagnostic steps is analyzing the **residuals** of the model. Ideally, the residuals should behave like white noise, meaning they should have zero mean, constant variance, and no autocorrelation.
158+
159+
You can examine the **ACF of the residuals** to check for any significant autocorrelations. If the residuals show no significant patterns, it suggests that the model has captured the underlying structure of the time series effectively.
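
One simple way to automate this check, sketched below, is to flag residual autocorrelations that fall outside the approximate 95% band of ±1.96/√n expected under white noise. The helper names and the simulated "residuals" are assumptions for illustration; with a real model you would pass in its residual series.

```python
import math
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def significant_lags(residuals, max_lag=10):
    """Lags whose autocorrelation exceeds the approximate 95% white-noise band."""
    bound = 1.96 / math.sqrt(len(residuals))
    values = sample_acf(residuals, max_lag)
    return [lag for lag, r in enumerate(values, start=1) if abs(r) > bound]

rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(500)]  # stand-in for model residuals
print(significant_lags(residuals))
```

Because the band is a 95% interval, about one lag in twenty can exceed it by chance even for true white noise, so an isolated flag at a high lag is not by itself a reason to reject the model.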
160+
161+
### AIC and BIC Criteria
162+
163+
Model selection can also be guided by the **Akaike Information Criterion (AIC)** and the **Bayesian Information Criterion (BIC)**. Both reward goodness of fit and penalize complexity (i.e., the number of estimated parameters), with BIC penalizing extra parameters more heavily for all but very small samples. Lower AIC and BIC values indicate a better balance between fit and parsimony. When comparing multiple ARIMA models, use these criteria only across models fitted to the same data with the same order of differencing.
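
Both criteria are simple functions of the maximized log-likelihood, with AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L for k estimated parameters and n observations. The sketch below applies them to two hypothetical fitted models; the log-likelihood values and parameter counts are invented for illustration.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical candidates: (label, log-likelihood, number of estimated parameters)
candidates = [("ARIMA(1,1,0)", -520.4, 2), ("ARIMA(2,1,2)", -518.9, 5)]
n_obs = 200

for label, ll, k in candidates:
    print(f"{label}: AIC = {aic(ll, k):.1f}, BIC = {bic(ll, k, n_obs):.1f}")
```

With these made-up numbers the larger model has the higher likelihood, but the smaller one has lower AIC and BIC, so both criteria prefer it.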
164+
165+
---
166+
167+
## 6. Practical Applications of ARIMA
168+
169+
### ARIMA in Finance
170+
171+
ARIMA models are extensively used in financial markets for forecasting stock prices, interest rates, and currency exchange rates. Financial time series data, such as stock prices, often exhibit non-stationarity due to trends and volatility. By applying differencing and capturing autocorrelations, ARIMA models can produce useful short-term forecasts that support traders' decisions.
172+
173+
For example, a financial analyst might use ARIMA to predict the future value of a stock based on its historical price data. Although ARIMA models don’t capture sudden market shifts or non-linear patterns, they are still valuable tools in combination with other techniques like volatility models (GARCH).
174+
175+
### ARIMA in Economics and Business
176+
177+
In economics, ARIMA models are used to forecast macroeconomic variables like GDP, inflation, and unemployment rates. Businesses also leverage ARIMA for demand forecasting, which helps in inventory management, supply chain optimization, and production planning.
178+
179+
For example, an e-commerce company may use ARIMA to forecast monthly sales based on historical sales data. This forecast can then be used to optimize inventory levels and reduce storage costs.
180+
181+
---
182+
183+
## 7. ARIMA Variants and Comparisons
184+
185+
### ARMA (AutoRegressive Moving Average)
186+
187+
The **ARMA model** is a simpler version of ARIMA, applicable when the data is already stationary and does not require differencing. It combines the **AR** and **MA** components without the need for the **I** (Integrated) part. ARMA models are denoted as ARMA(p, q), where $$p$$ and $$q$$ are the orders of the autoregressive and moving average terms, respectively.
188+
189+
### ARIMAX (ARIMA with Exogenous Variables)
190+
191+
The **ARIMAX model** is an extension of ARIMA that includes **exogenous variables**—external factors that may influence the time series. This model is particularly useful when external factors (e.g., interest rates, economic indicators) have a significant impact on the time series being modeled.
192+
193+
For example, in predicting consumer spending, an ARIMAX model could incorporate external variables such as employment rates or consumer sentiment indices.
194+
195+
### SARIMA (Seasonal ARIMA)
196+
197+
The **SARIMA model** extends ARIMA to handle **seasonal data**. It introduces additional parameters to capture seasonal effects, such as weekly or yearly patterns. SARIMA models are denoted as ARIMA(p, d, q)(P, D, Q)[s], where $$(P, D, Q)$$ are the seasonal counterparts of the ARIMA parameters and $$s$$ is the length of the seasonal cycle.
198+
199+
For instance, a retail company may use SARIMA to forecast sales, accounting for both overall trends and seasonal peaks (e.g., holiday seasons).
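
The seasonal differencing order $$D$$ works like $$d$$, except that each value is compared with the one a full season earlier, $$Y_t - Y_{t-s}$$. The sketch below illustrates only that step on invented monthly data with a linear trend and a repeating 12-month pattern; `seasonal_difference` is a hypothetical helper, not a library function.

```python
def seasonal_difference(series, period):
    """Subtract the value one full season earlier: Y_t - Y_{t-period}."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

# Hypothetical monthly data: linear trend plus a repeating 12-month pattern
pattern = [5, 3, 0, -2, -4, -5, -4, -2, 0, 3, 5, 8]
monthly = [2 * t + pattern[t % 12] for t in range(36)]

deseasonalized = seasonal_difference(monthly, period=12)
print(deseasonalized)  # every entry is 24: the pattern cancels, the trend adds 2 * 12
```

The seasonal pattern disappears after one seasonal difference, and the remaining constant reflects the trend; real data would of course leave noise behind as well.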
200+
201+
---
202+
203+
## 8. Challenges and Limitations of ARIMA
204+
205+
While ARIMA models are powerful tools for time series forecasting, they do come with certain limitations:
206+
207+
- **Non-linearity**: ARIMA assumes a linear relationship between past values and future forecasts. In cases where the data exhibits non-linear patterns, ARIMA may not perform well.
208+
- **Large datasets**: ARIMA models can become computationally intensive when applied to large datasets, especially when identifying the optimal parameters.
209+
- **Short-term forecasts**: ARIMA is generally more effective for short-term forecasting. Over longer time horizons, the forecasts may become less reliable due to the accumulation of forecast errors.
210+
- **Stationarity assumption**: One of the key assumptions of ARIMA is that the data must be stationary, which is not always the case in real-world scenarios. While differencing can address this, it may not always fully capture the underlying dynamics of the data.
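
The point about error accumulation can be made concrete with the simplest integrated model, a random walk, i.e. ARIMA(0, 1, 0). As a short derivation, assuming white-noise errors with variance $$\sigma^2$$, the value $$h$$ steps ahead and its forecast are:

$$
Y_{t+h} = Y_t + \sum_{i=1}^{h} \epsilon_{t+i}, \qquad \hat{Y}_{t+h \mid t} = Y_t
$$

so the variance of the forecast error is:

$$
\operatorname{Var}\left(Y_{t+h} - \hat{Y}_{t+h \mid t}\right) = h \sigma^2
$$

The forecast standard error therefore grows in proportion to $$\sqrt{h}$$, and the prediction intervals keep widening as the horizon increases.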
211+
212+
---
213+
214+
## 9. Tools and Libraries for ARIMA Modeling
215+
216+
### Python: `statsmodels`
217+
218+
In Python, the **`statsmodels`** library provides a robust implementation of ARIMA and its variants. The `ARIMA` class in `statsmodels` allows users to specify the order of the model, fit the model, and generate forecasts. Here's a basic example:
219+
```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Load your time series data (expects a 'Date' column and a 'value' column)
data = pd.read_csv('your_time_series.csv', index_col='Date', parse_dates=True)

# Fit ARIMA model; (1, 1, 1) is a placeholder order, choose yours via ACF/PACF or AIC
p, d, q = 1, 1, 1
model = ARIMA(data['value'], order=(p, d, q))
model_fit = model.fit()

# Forecast the next 10 periods
forecast = model_fit.forecast(steps=10)
print(forecast)
```
235+
236+
### R: `forecast` Package
237+
238+
In R, the **`forecast`** package offers a user-friendly implementation of ARIMA. The `auto.arima` function automatically selects the optimal parameters for the model, making it easier for users to get started with time series forecasting:
239+
```r
library(forecast)

# Load your time series data (your_data is a placeholder; frequency = 12 means monthly)
data <- ts(your_data, frequency = 12)

# Fit ARIMA model with automatically selected orders
fit <- auto.arima(data)

# Forecast the next 10 periods
forecast(fit, h = 10)
```
252+
253+
## Final Thoughts
254+
255+
The ARIMA model is one of the most versatile and widely used tools for time series forecasting. Its ability to model autoregressive and moving average processes, combined with differencing to handle non-stationarity, makes it a powerful technique across various domains. Whether forecasting stock prices, demand for products, or economic indicators, ARIMA provides a robust framework for analyzing time series data.
256+
257+
However, ARIMA has its limitations, especially when dealing with non-linear patterns, seasonal variations, or the need for long-term forecasts. In such cases, extensions like ARIMAX and SARIMA, or alternative models like neural networks and machine learning-based approaches, may offer better performance.
258+
259+
Understanding ARIMA and its variants is a vital skill for data scientists and analysts looking to make accurate predictions from historical data. With powerful tools and libraries available in Python and R, implementing ARIMA models has never been more accessible.
