
Commit d838554 (parent: ad01931)

feat: new article

4 files changed (+350, -10 lines)

_posts/-_ideas/2030-01-01-future_articles_time_series.md

Lines changed: 0 additions & 10 deletions
```diff
@@ -25,16 +25,6 @@ Here are several article ideas that would complement the ARIMAX time series mode
 - Provide examples and code implementation in Python. -->
 
 
-### 7. **"Prophet: A Modern Approach to Time Series Forecasting Developed by Facebook"**
-- Introduce the Prophet model developed by Facebook, which is designed to handle seasonality and holidays with ease.
-- Discuss how it differs from ARIMA, its ease of use for non-experts, and how it handles missing data and seasonality.
-- Provide code examples and case studies of its use in industry.
-
-### 8. **"Evaluating Time Series Forecasting Models: Metrics and Best Practices"**
-- Discuss how to evaluate the performance of time series models using metrics like RMSE, MAE, AIC, BIC, and MAPE.
-- Explain model validation techniques such as cross-validation and out-of-sample testing.
-- Provide a guide to interpreting results and improving model performance.
-
 ### 9. **"Causality and Granger Causality in Time Series Data"**
 - Explore the concept of Granger Causality and how it is used to determine if one time series can predict another.
 - Provide mathematical explanations, real-world examples (e.g., economic indicators), and implementation in R or Python.
```
Lines changed: 334 additions & 0 deletions
@@ -0,0 +1,334 @@
---
title: 'Evaluating Time Series Forecasting Models: Metrics and Best Practices'
categories:
  - Time Series
  - Model Evaluation
  - Forecasting
tags:
  - RMSE
  - MAE
  - MAPE
  - model validation
  - rolling-origin evaluation
  - forecasting metrics
author_profile: false
seo_title: Evaluating Time Series Forecasting Models
seo_description: >-
  A comprehensive guide to evaluating time series forecasts using metrics like
  RMSE, MAE, AIC, and BIC. Learn validation strategies, practical coding
  examples, and best practices.
excerpt: >-
  Effective model evaluation is essential for reliable time series forecasting.
  Learn the most important metrics, validation methods, and strategies for
  interpreting and improving forecasts.
summary: >-
  This article explores the metrics and methods used to evaluate time series
  forecasting models. Covering RMSE, MAE, AIC/BIC, cross-validation techniques,
  and residual analysis, it helps practitioners ensure their models are robust,
  accurate, and actionable.
keywords:
  - forecast accuracy
  - time series model evaluation
  - validation strategies
  - MAE and RMSE
  - forecasting best practices
classes: wide
date: '2025-08-03'
header:
  image: /assets/images/data_science_9.jpg
  og_image: /assets/images/data_science_9.jpg
  overlay_image: /assets/images/data_science_9.jpg
  show_overlay_excerpt: false
  teaser: /assets/images/data_science_9.jpg
  twitter_image: /assets/images/data_science_9.jpg
---

# Introduction

Time series forecasting is one of the most challenging yet essential tasks in data science, econometrics, engineering, and applied research. Accurate forecasts can drive business decisions, inform policy, optimize resources, and improve operational efficiency. However, building a forecasting model is only half the battle. The real challenge lies in **evaluating** the model's performance to ensure it is both reliable and actionable.

Evaluation of time series models requires a careful selection of performance metrics and validation techniques that account for the unique structure of time-dependent data. Unlike cross-sectional problems, time series data are ordered and often autocorrelated, meaning standard evaluation methods like random shuffling for cross-validation are not directly applicable.

This article provides a comprehensive overview of how to evaluate time series forecasting models. We discuss the most commonly used performance metrics such as RMSE, MAE, MAPE, AIC, and BIC, highlight best practices for model validation (including rolling-origin evaluation and time series cross-validation), and provide guidance on interpreting results for model improvement. Practical examples in Python and R are included for hands-on understanding.

# 1\. Importance of Evaluation in Time Series Forecasting

Evaluation ensures that forecasts are accurate, reliable, and robust under real-world conditions. Without proper evaluation:

- Models may appear accurate in-sample but fail out-of-sample.
- Forecasts may be biased, systematically over- or under-predicting.
- Models may overfit, capturing noise instead of true signals.

Therefore, evaluation is not just a technical step but a critical component of the forecasting process.

# 2\. Forecast Accuracy Metrics

Performance metrics quantify how well a model's predictions match the observed values. Below are the most widely used metrics for time series forecasting.

## 2.1 Mean Absolute Error (MAE)

The **MAE** measures the average absolute difference between forecasted and actual values:

$$ MAE = \frac{1}{n} \sum_{t=1}^n |y_t - \hat{y}_t| $$

- Easy to interpret (same units as the data).
- More robust to outliers than squared-error metrics.

**Python Example:**

```python
from sklearn.metrics import mean_absolute_error

# Average absolute error between observed and forecasted values
mae = mean_absolute_error(y_true, y_pred)
```

**R Example:**

```r
mae <- mean(abs(y_true - y_pred))
```

## 2.2 Root Mean Squared Error (RMSE)

The **RMSE** is the square root of the average squared differences:

$$ RMSE = \sqrt{\frac{1}{n} \sum_{t=1}^n (y_t - \hat{y}_t)^2} $$

- Penalizes large errors more heavily than MAE.
- Commonly used in competitions and benchmarks.

**Python Example:**

```python
import numpy as np
from sklearn.metrics import mean_squared_error

rmse = np.sqrt(mean_squared_error(y_true, y_pred))
```

**R Example:**

```r
rmse <- sqrt(mean((y_true - y_pred)^2))
```

## 2.3 Mean Absolute Percentage Error (MAPE)

The **MAPE** expresses errors as percentages:

$$ MAPE = \frac{100}{n} \sum_{t=1}^n \left| \frac{y_t - \hat{y}_t}{y_t} \right| $$

- Intuitive since it reports error in percentage terms.
- Unstable when actual values are close to zero, and undefined when any actual value equals zero.

**Python Example:**

```python
import numpy as np

mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100
```

**R Example:**

```r
mape <- mean(abs((y_true - y_pred) / y_true)) * 100
```

## 2.4 Symmetric Mean Absolute Percentage Error (sMAPE)

The **sMAPE** adjusts for asymmetry by dividing each error by the average magnitude of the forecast and the actual value:

$$ sMAPE = \frac{100}{n} \sum_{t=1}^n \frac{|y_t - \hat{y}_t|}{(|y_t| + |\hat{y}_t|)/2} $$

- Avoids extreme errors when values are small.
- Widely used in competitions like M3 and M4.

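For parity with the metrics above, a minimal NumPy sketch of sMAPE (there is no direct sklearn equivalent; `y_true` and `y_pred` are the same illustrative arrays used in the earlier examples):

```python
import numpy as np

def smape(y_true, y_pred):
    """Symmetric MAPE in percent; assumes y_true and y_pred are NumPy arrays."""
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2
    return np.mean(np.abs(y_true - y_pred) / denom) * 100
```
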
## 2.5 Mean Absolute Scaled Error (MASE)

Proposed by Hyndman & Koehler (2006), the **MASE** compares forecast errors against a naive benchmark:

$$ MASE = \frac{MAE}{MAE_{naive}} $$

- Scale-independent and comparable across series.
- Values greater than 1 indicate worse performance than the naive forecast; values below 1 indicate better performance.

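A minimal sketch of MASE for the non-seasonal case, assuming `train`, `test`, and `forecast` are NumPy arrays and the benchmark is the one-step naive (random walk) forecast evaluated on the training data:

```python
import numpy as np

def mase(train, test, forecast):
    """MASE with a one-step naive benchmark computed in-sample on the training series."""
    naive_mae = np.mean(np.abs(np.diff(train)))       # MAE of the naive forecast on training data
    return np.mean(np.abs(test - forecast)) / naive_mae
```
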
## 2.6 Information Criteria: AIC and BIC

While MAE and RMSE measure predictive accuracy, **AIC** (Akaike Information Criterion) and **BIC** (Bayesian Information Criterion) assess model quality by balancing goodness of fit and complexity:

$$ AIC = 2k - 2\ln(L) $$

$$ BIC = k\ln(n) - 2\ln(L) $$

where $k$ is the number of parameters, $L$ the likelihood, and $n$ the number of observations.

- Lower AIC/BIC values indicate better models.
- BIC penalizes complexity more strongly than AIC.

**Python Example:**

```python
import statsmodels.api as sm

model = sm.tsa.ARIMA(y, order=(1, 1, 1)).fit()
print(model.aic, model.bic)
```

**R Example:**

```r
model <- arima(y, order=c(1,1,1))
AIC(model); BIC(model)
```

# 3\. Model Validation Techniques

Metrics alone are insufficient; how we validate the model matters equally. Standard random cross-validation is inappropriate for time series due to temporal dependencies.

## 3.1 Train-Test Split

Divide the series into training (first portion) and testing (last portion). Fit the model on training and evaluate forecasts on testing.

- Simple and intuitive.
- Risk: results may depend heavily on the split point.

## 3.2 Rolling-Origin Evaluation (Walk-Forward Validation)

Iteratively expand the training set forward in time, testing on the next observation or small batch. Repeat until the end of the series.

- Mimics real-world forecasting where new data arrives sequentially.
- Provides multiple evaluation points.

**Python Example:**

```python
from sklearn.model_selection import TimeSeriesSplit

ts_cv = TimeSeriesSplit(n_splits=5)
for train_idx, test_idx in ts_cv.split(X):
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    # fit and evaluate the model here
```

**R Example:**

```r
library(caret)
ts_cv <- createTimeSlices(1:length(y), initialWindow=100, horizon=10)
```

## 3.3 Time Series Cross-Validation

Generalizes rolling-origin evaluation by systematically evaluating across multiple folds while preserving temporal order.

- Provides more robust performance estimates.
- Can be computationally expensive.

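Putting the pieces together, a minimal walk-forward sketch that refits an ARIMA(1,1,1) on each expanding training window and averages the fold-level MAE; the random-walk series `y` here is simulated purely for illustration:

```python
import numpy as np
import statsmodels.api as sm
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_absolute_error

np.random.seed(0)
y = np.cumsum(np.random.normal(0, 1, 200))   # illustrative random-walk series

fold_mae = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(y):
    fit = sm.tsa.ARIMA(y[train_idx], order=(1, 1, 1)).fit()   # refit on the expanding window
    forecast = fit.forecast(steps=len(test_idx))              # forecast the held-out fold
    fold_mae.append(mean_absolute_error(y[test_idx], forecast))

print(np.mean(fold_mae))   # average out-of-sample MAE across folds
```

Averaging across folds reduces the dependence on any single split point noted in Section 3.1.
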
## 3.4 Out-of-Sample Testing

Keep the last part of the data entirely untouched until the final evaluation. This prevents information from the holdout period from leaking into model selection and guards against overfitting to the test set.

# 4\. Interpreting Results and Improving Performance

## 4.1 Comparing Metrics

No single metric is universally best. Always evaluate multiple metrics:

- Use RMSE for penalizing large errors.
- Use MAE for a robust absolute error measure.
- Use MAPE/sMAPE for percentage errors (careful near zeros).
- Use MASE for scale-independent comparisons.
- Use AIC/BIC for model parsimony.

## 4.2 Bias and Residual Analysis

Inspect the residuals:

- Residuals should resemble white noise (uncorrelated, zero mean, constant variance).
- Autocorrelation in the residuals suggests underfitting.
- Non-constant variance suggests the need for variance-stabilizing transformations or GARCH-type models.

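As a quick diagnostic sketch (assuming the fitted results object `model` from the ARIMA example in Section 2.6), the Ljung-Box test checks for leftover autocorrelation and an ACF plot makes it visible:

```python
import matplotlib.pyplot as plt
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.graphics.tsaplots import plot_acf

resid = model.resid                       # residuals of the fitted ARIMA model
print(acorr_ljungbox(resid, lags=[10]))   # small p-value signals remaining autocorrelation
plot_acf(resid, lags=20)                  # spikes outside the bands indicate underfitting
plt.show()
```
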
## 4.3 Model Improvement Strategies

- **Hyperparameter tuning:** Explore ARIMA orders, neural network architectures, etc.
- **Feature engineering:** Add external regressors (ARIMAX, VARX).
- **Transformation:** Log or Box-Cox to stabilize variance.
- **Ensemble methods:** Combine forecasts for improved accuracy.
- **Hybrid approaches:** Blend statistical and machine learning models.

## 4.4 Avoiding Overfitting

- Keep models simple unless complexity is justified.
- Use out-of-sample validation rigorously.
- Monitor performance drift as new data arrives.

# 5\. Case Study: Forecasting and Evaluation in Practice

## 5.1 Data Simulation

```python
import numpy as np

np.random.seed(0)
t = np.arange(100)
# Linear trend + seasonality with period 12 + Gaussian noise
y = 0.5*t + 10*np.sin(2*np.pi*t/12) + np.random.normal(0, 3, 100)
```

## 5.2 Train-Test Split

```python
train, test = y[:80], y[80:]
```

## 5.3 Fit Model

```python
import statsmodels.api as sm

model = sm.tsa.ARIMA(train, order=(2, 1, 2)).fit()
forecast = model.forecast(steps=len(test))   # returns the point forecasts directly
```

## 5.4 Evaluate

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

mae = mean_absolute_error(test, forecast)
rmse = np.sqrt(mean_squared_error(test, forecast))
```

The resulting metrics quantify out-of-sample performance and guide further adjustments to the model.

# 6\. Advanced Considerations

## 6.1 Forecast Horizons

Errors often grow with horizon length. Evaluate short-term and long-term forecasts separately.

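As a minimal sketch of horizon-wise evaluation, the loop below rolls the forecast origin forward, records the absolute error at each horizon step, and reports a per-horizon MAE; the simulated series and ARIMA order are illustrative only:

```python
import numpy as np
import statsmodels.api as sm

np.random.seed(1)
y = np.cumsum(np.random.normal(0, 1, 150))   # illustrative series
horizon = 5
errors = {h: [] for h in range(1, horizon + 1)}

# Roll the forecast origin forward and record the error at each horizon step
for origin in range(100, len(y) - horizon, 10):
    fit = sm.tsa.ARIMA(y[:origin], order=(1, 1, 1)).fit()
    fc = fit.forecast(steps=horizon)
    for h in range(1, horizon + 1):
        errors[h].append(abs(y[origin + h - 1] - fc[h - 1]))

for h, errs in errors.items():
    print(f"h={h}: MAE={np.mean(errs):.2f}")   # MAE typically grows with h
```
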
## 6.2 Probabilistic Forecasts

Instead of point forecasts, models can generate prediction intervals or full predictive distributions. Evaluate with metrics like:

- **Coverage Probability:** Proportion of true values within prediction intervals.
- **CRPS (Continuous Ranked Probability Score):** Measures accuracy of full distributions.

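As a minimal sketch of empirical interval coverage, reusing the ARIMA fit and `train`/`test` split from Section 5 (statsmodels exposes interval bounds through `get_forecast` and `conf_int`):

```python
# Empirical coverage of a nominal 95% prediction interval
pred = model.get_forecast(steps=len(test))
ci = np.asarray(pred.conf_int(alpha=0.05))    # columns: lower bound, upper bound
lower, upper = ci[:, 0], ci[:, 1]
coverage = np.mean((test >= lower) & (test <= upper))
print(f"Empirical coverage: {coverage:.2%}")  # ideally close to 95%
```

Coverage well below the nominal level suggests intervals that are too narrow (overconfident); coverage well above suggests they are unnecessarily wide.
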
## 6.3 Multivariate and Hierarchical Forecasting

- **Multivariate:** Use metrics like multivariate RMSE or trace of error covariance.
- **Hierarchical:** Ensure coherence across aggregation levels (e.g., bottom-up vs top-down forecasts).

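For the multivariate case, a small sketch of per-series and pooled RMSE for forecasts stored as `(n_steps, n_series)` arrays; the array names and simulated values are illustrative only:

```python
import numpy as np

# Y_true and Y_pred: illustrative arrays of shape (n_steps, n_series)
rng = np.random.default_rng(0)
Y_true = rng.normal(size=(20, 3))
Y_pred = Y_true + rng.normal(scale=0.5, size=(20, 3))

per_series_rmse = np.sqrt(np.mean((Y_true - Y_pred) ** 2, axis=0))   # one RMSE per series
overall_rmse = np.sqrt(np.mean((Y_true - Y_pred) ** 2))              # pooled across series
print(per_series_rmse, overall_rmse)
```
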
## 6.4 Real-Time Constraints

In applied settings, evaluation must balance accuracy with computational efficiency and interpretability.

# 7\. Summary and Best Practices

- **Use multiple metrics.** RMSE, MAE, MAPE, MASE, and AIC/BIC each provide unique insights.
- **Validate properly.** Employ rolling-origin or time series cross-validation; avoid random shuffling.
- **Analyze residuals.** Residual diagnostics reveal systematic issues.
- **Prevent overfitting.** Simplicity often outperforms over-complex models.
- **Match evaluation to context.** Select metrics aligned with application needs (e.g., absolute errors vs percentage errors).

# Conclusion

Evaluating time series forecasting models is a nuanced process that requires both statistical rigor and practical judgment. By carefully choosing appropriate accuracy metrics, implementing robust validation strategies, and thoroughly analyzing residuals, practitioners can ensure that their forecasts are not only accurate on historical data but also reliable in predicting the future.

As the field evolves, with deep learning, probabilistic forecasts, and hybrid models gaining traction, the principles of evaluation remain central. Strong evaluation practices form the foundation for trustworthy and actionable time series forecasting.

_posts/2025-08-07-smarter_tree_splits.md

Lines changed: 8 additions & 0 deletions
```diff
@@ -27,6 +27,14 @@ keywords:
 - "XGBoost"
 - "scikit-learn"
 classes: wide
+date: '2025-08-07'
+header:
+  image: /assets/images/data_science_6.jpg
+  og_image: /assets/images/data_science_6.jpg
+  overlay_image: /assets/images/data_science_6.jpg
+  show_overlay_excerpt: false
+  teaser: /assets/images/data_science_6.jpg
+  twitter_image: /assets/images/data_science_6.jpg
 ---
 
 When building regression trees, whether in standalone models or ensembles like Random Forests and Gradient Boosted Trees, the key objective is to decide the best way to split nodes for optimal predictive performance. Traditionally, this has been done using **Mean Squared Error (MSE)** as a split criterion. However, many modern implementations — such as those in **LightGBM**, **XGBoost**, and **scikit-learn’s HistGradientBoostingRegressor** — use a mathematically equivalent but computationally superior alternative: **Friedman MSE**.
```

_posts/2025-08-13-preregistering_structural_equation_modeling .md

Lines changed: 8 additions & 0 deletions
```diff
@@ -32,6 +32,14 @@ keywords:
 - research reproducibility
 - confirmatory analysis
 classes: wide
+date: '2025-08-13'
+header:
+  image: /assets/images/data_science_10.jpg
+  og_image: /assets/images/data_science_10.jpg
+  overlay_image: /assets/images/data_science_10.jpg
+  show_overlay_excerpt: false
+  teaser: /assets/images/data_science_10.jpg
+  twitter_image: /assets/images/data_science_10.jpg
 ---
 
 Structural Equation Modeling (SEM) is a powerful analytical tool, capable of modeling complex latent structures and causal relationships between variables. From psychology to marketing, SEM is used in diverse fields to test theoretical models with observed data. Yet, the same flexibility that makes SEM attractive also opens the door to excessive researcher degrees of freedom. Without constraints, analysts can tweak specifications post hoc--knowingly or unknowingly--to produce more favorable results.
```
