Commit 0b6794b ("chore: new article")
1 parent: ce1f553
6 files changed: +197 −97 lines

_posts/-_ideas/2030-01-01-data_model_drift.md

Lines changed: 0 additions & 3 deletions

@@ -14,9 +14,6 @@ tags: []
 ## Article Ideas on Data Drift and Model Drift
 
 
-### 3. **How to Detect Data Drift in Machine Learning Models**
-- **Overview**: Provide a guide to detecting data drift using statistical techniques and machine learning-based approaches.
-- **Focus**: Methods like **Kullback-Leibler Divergence**, **Population Stability Index (PSI)**, **Chi-square tests**, and model monitoring tools such as **NannyML** and **Evidently AI**.
 
 ### 4. **Techniques for Monitoring and Managing Model Drift in Production**
 - **Overview**: Discuss best practices for monitoring model performance over time to detect and mitigate model drift.
Lines changed: 148 additions & 0 deletions

@@ -0,0 +1,148 @@
---
author_profile: false
categories:
- Machine Learning
- Model Monitoring
classes: wide
date: '2025-04-27'
excerpt: Model drift is inevitable in production ML systems. This guide explores monitoring
  strategies, alert systems, and retraining workflows to keep models accurate and
  robust over time.
header:
  image: /assets/images/data_science_8.jpg
  og_image: /assets/images/data_science_8.jpg
  overlay_image: /assets/images/data_science_8.jpg
  show_overlay_excerpt: false
  teaser: /assets/images/data_science_8.jpg
  twitter_image: /assets/images/data_science_8.jpg
keywords:
- Model drift
- Model monitoring
- MLflow
- Seldon
- TFX
- Retraining models
seo_description: Learn best practices and tools for monitoring model performance,
  detecting model drift, and retraining ML models in production using MLflow, Seldon,
  and TensorFlow Extended (TFX).
seo_title: Monitoring and Managing Model Drift in Production ML Systems
seo_type: article
summary: This article outlines practical techniques for managing model drift in machine
  learning production environments, including real-time monitoring, automated alerts,
  and retraining using popular tools like MLflow, Seldon, and TFX.
tags:
- Model drift
- Model monitoring
- MLOps
- MLflow
- TFX
- Seldon
title: Techniques for Monitoring and Managing Model Drift in Production
---

# Techniques for Monitoring and Managing Model Drift in Production

Deploying a machine learning model into production is a major milestone, but it is only the beginning of the model's lifecycle. As environments evolve, data changes, and user behavior shifts, even the most accurate model at deployment can degrade over time. This phenomenon, known as **model drift**, makes proactive monitoring and management essential for any production ML system.

This article explores practical strategies and tools for detecting, mitigating, and responding to model drift to ensure sustained performance in real-world deployments.

## Why Monitoring Matters in Production

Machine learning models don't operate in a vacuum. Once deployed, they interact with live, dynamic environments where data distributions may differ from the training set. Without proper monitoring, these changes can lead to:

- Reduced prediction accuracy
- Erosion of business value
- Missed anomalies or false positives
- Compliance and reliability issues

To address this, a robust monitoring and retraining pipeline is critical.
## Core Practices for Monitoring Model Drift

### 1. Real-Time Model Monitoring

Continuous tracking of predictions and input data is the foundation of drift detection. Real-time monitoring ensures that significant changes are identified as they occur, enabling prompt corrective action.

**Key metrics to monitor include:**

- Prediction distributions over time
- Input feature distributions
- Model confidence or uncertainty
- Accuracy and other performance metrics (when ground truth labels are available)
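The metrics above can be tracked with a lightweight sliding-window monitor. The sketch below is illustrative, assuming a scalar prediction score and an arbitrary window size of 500; real systems would persist these summaries and compare them against a training-time baseline.

```python
import numpy as np

class DistributionMonitor:
    """Hypothetical sketch: summarize recent prediction scores over a sliding window."""

    def __init__(self, window_size=1000):
        self.window_size = window_size
        self.predictions = []

    def log(self, prediction):
        self.predictions.append(prediction)
        # Keep only the most recent window of predictions
        self.predictions = self.predictions[-self.window_size:]

    def summary(self):
        preds = np.asarray(self.predictions)
        return {
            "mean": float(preds.mean()),
            "std": float(preds.std()),
            "p05": float(np.percentile(preds, 5)),
            "p95": float(np.percentile(preds, 95)),
        }

monitor = DistributionMonitor(window_size=500)
rng = np.random.default_rng(0)
for p in rng.normal(0.7, 0.1, size=2000):  # simulated model scores
    monitor.log(float(p))

stats = monitor.summary()
print(stats)
```

Comparing each window's summary against the baseline distribution is what turns raw logging into drift detection.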
### 2. Automated Drift Alerts

Setting up threshold-based alerts allows teams to automate detection of performance issues. For example:

- Alert if the Population Stability Index (PSI) for any feature exceeds 0.2
- Notify if prediction accuracy drops by more than 5% compared to a baseline
- Trigger retraining if statistical tests indicate concept drift

This automation ensures that changes are acted upon quickly, reducing downtime and poor decisions.
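A PSI-based alert like the first example can be sketched in a few lines. The 0.2 threshold follows the rule of thumb above; the bin count, epsilon, and simulated data are illustrative assumptions.

```python
import numpy as np

def population_stability_index(reference, current, bins=10):
    """Compare current feature values against a reference (training) sample."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_pct = np.histogram(current, bins=edges)[0] / len(current)
    # Clip to avoid division by zero and log(0) in empty bins
    ref_pct = np.clip(ref_pct, 1e-6, None)
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

rng = np.random.default_rng(42)
train_feature = rng.normal(0.0, 1.0, 10_000)
live_stable = rng.normal(0.0, 1.0, 5_000)
live_shifted = rng.normal(0.8, 1.0, 5_000)  # simulated drifted feature

for name, live in [("stable", live_stable), ("shifted", live_shifted)]:
    psi = population_stability_index(train_feature, live)
    if psi > 0.2:  # threshold from the alert rule above
        print(f"ALERT: feature drift detected ({name}), PSI={psi:.3f}")
    else:
        print(f"OK: {name}, PSI={psi:.3f}")
```

In practice the same check would run per feature on a schedule, feeding an alerting system rather than printing.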
### 3. Retraining and Redeployment Workflows

Once drift is detected, models need to be updated to reflect new patterns in the data. There are three primary retraining strategies:

- **Scheduled Retraining**: Retrain models at fixed intervals (e.g., weekly or monthly), regardless of detected drift.
- **Trigger-Based Retraining**: Retrain only when specific drift or performance thresholds are crossed.
- **Online Learning**: Continuously update models with new data in small batches, suitable for streaming or rapidly changing data environments.

Retraining must be paired with validation, version control, and safe deployment practices to prevent degradation due to faulty updates.
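Trigger-based retraining can be combined with incremental updates. The sketch below uses scikit-learn's `SGDClassifier` and `partial_fit` purely as one possible illustration (the article does not prescribe a library); the accuracy threshold, synthetic data, and drift schedule are all assumptions.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_batch(n, shift=0.0):
    # Synthetic 2-feature data whose decision boundary moves with `shift`,
    # simulating concept drift over time.
    X = rng.normal(shift, 1.0, size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

model = SGDClassifier(random_state=0)
X0, y0 = make_batch(2000)
model.partial_fit(X0, y0, classes=np.array([0, 1]))

ACCURACY_THRESHOLD = 0.85  # illustrative trigger level
for step in range(5):
    X, y = make_batch(500, shift=0.5 * step)  # gradually drifting batches
    acc = model.score(X, y)
    if acc < ACCURACY_THRESHOLD:
        model.partial_fit(X, y)  # triggered update on fresh labeled data
        print(f"step {step}: acc={acc:.2f} -> retrained")
    else:
        print(f"step {step}: acc={acc:.2f} ok")
```

A production version would route the triggered update through validation and a staged rollout rather than updating the serving model in place.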
## Tools for Managing Model Drift

### MLflow

**MLflow** is an open-source platform for managing the ML lifecycle. It supports experiment tracking, model versioning, and reproducible pipelines, making it useful for implementing retraining workflows.

**Key Features:**

- Log and compare training runs
- Track model performance over time
- Serve and deploy models with integrated REST APIs
- Integrate with custom monitoring scripts and dashboards

MLflow excels at experiment management and reproducible retraining processes.
### Seldon

**Seldon** is a Kubernetes-native deployment platform for machine learning models. It enables advanced inference monitoring, traffic control, and A/B testing.

**Key Features:**

- Real-time model monitoring, including input/output logging
- Outlier and drift detection via custom components
- Canary and shadow deployments for safe rollouts
- Seamless scaling in containerized environments

Seldon is ideal for teams deploying models at scale with tight control over performance and safety.
### TensorFlow Extended (TFX)

**TensorFlow Extended (TFX)** is Google's end-to-end platform for production ML pipelines. It is tightly integrated with TensorFlow but extensible to other frameworks.

**Key Features:**

- Automatic data validation and schema drift detection
- Integrated model analysis via TensorFlow Model Analysis (TFMA)
- Pipeline orchestration via Apache Airflow or Kubeflow
- Scalable training, evaluation, and serving workflows

TFX is especially powerful in data-heavy environments where standardized workflows and governance are critical.
## Best Practices for Managing Drift

- **Version Everything**: Track data, models, metrics, and configurations for reproducibility.
- **Monitor Frequently**: Real-time or batch monitoring should be baked into the pipeline.
- **Visualize Trends**: Use dashboards to make drift visible and understandable for both technical and business teams.
- **Automate Intelligently**: Alerts and retraining should be driven by clear metrics and thresholds.
- **Include Humans in the Loop**: Domain experts should validate retraining decisions, especially in high-stakes settings.
## Final Thoughts

Model drift is not a matter of *if*, but *when*. The difference between a robust machine learning system and a brittle one often lies in the strength of its monitoring and maintenance strategy.

By combining real-time metrics, automated alerts, and structured retraining workflows, ML teams can ensure that their models stay reliable, interpretable, and impactful long after deployment.

In today's production ML landscape, **operational excellence is just as important as model accuracy**. Managing drift effectively is what transforms machine learning from experimental research into dependable infrastructure.

_posts/2025-05-25-Understanding_Statistical_Models.md

Lines changed: 8 additions & 8 deletions

@@ -16,12 +16,12 @@ header:
   teaser: /assets/images/data_science_16.jpg
   twitter_image: /assets/images/data_science_16.jpg
 keywords:
-- statistical model
-- data modeling
-- probability
-- prediction
-- inference
-- simulation
+- Statistical model
+- Data modeling
+- Probability
+- Prediction
+- Inference
+- Simulation
 seo_description: 'A comprehensive exploration of statistical models: what they are,
   how they work, and why they''re fundamental to data analysis, prediction, and decision-making
   across disciplines.'
@@ -31,10 +31,10 @@ summary: This article explores the essence of statistical models, including thei
   structure, function, and real-world applications, with a focus on their role in
   inference, uncertainty quantification, and decision support.
 tags:
-- Statistical Models
+- Statistical models
 - Inference
 - Simulation
-- Predictive Analytics
+- Predictive analytics
 - Probability
 title: 'Understanding Statistical Models: Foundations, Functions, and Applications'
 ---

_posts/2025-05-26-detect_data_drift_machine_learning_models.md

Lines changed: 10 additions & 10 deletions

@@ -16,12 +16,12 @@ header:
   teaser: /assets/images/data_science_2.jpg
   twitter_image: /assets/images/data_science_2.jpg
 keywords:
-- data drift detection
-- Kullback-Leibler divergence
-- Population Stability Index
+- Data drift detection
+- Kullback-leibler divergence
+- Population stability index
 - Chi-square test
-- Evidently AI
-- NannyML
+- Evidently ai
+- Nannyml
 seo_description: Learn how to detect data drift in machine learning using statistical
   techniques like KL Divergence and PSI, and tools like NannyML and Evidently AI to
   maintain model accuracy in production.
@@ -31,11 +31,11 @@ summary: Explore how to detect data drift in machine learning systems, including
   techniques like KL Divergence, PSI, and Chi-square tests, as well as practical tools
   like NannyML and Evidently AI.
 tags:
-- Data Drift
-- Drift Detection
-- Model Monitoring
-- Statistical Tests
-- ML Ops
+- Data drift
+- Drift detection
+- Model monitoring
+- Statistical tests
+- Ml ops
 title: How to Detect Data Drift in Machine Learning Models
 ---

_posts/2025-06-05-Least_Angle_Regression.md

Lines changed: 7 additions & 7 deletions

@@ -16,10 +16,10 @@ header:
   teaser: /assets/images/data_science_18.jpg
   twitter_image: /assets/images/data_science_18.jpg
 keywords:
-- Least Angle Regression
-- LARS
-- Feature Selection
-- Linear Regression
+- Least angle regression
+- Lars
+- Feature selection
+- Linear regression
 - Lasso
 seo_description: Explore Least Angle Regression (LARS), a regression algorithm that
   combines efficiency with feature selection. Learn how it works, its advantages,
@@ -31,9 +31,9 @@ summary: This article explores Least Angle Regression (LARS), explaining its cor
   most effectively applied.
 tags:
 - Regression
-- LARS
-- Linear Models
-- Feature Selection
+- Lars
+- Linear models
+- Feature selection
 title: 'Least Angle Regression: A Gentle Dive into LARS'
 ---
