diff --git a/_posts/-_ideas/2030-01-01-ideas_statistical_tests.md b/_posts/-_ideas/2030-01-01-ideas_statistical_tests.md index 323ee28c..cbf89ded 100644 --- a/_posts/-_ideas/2030-01-01-ideas_statistical_tests.md +++ b/_posts/-_ideas/2030-01-01-ideas_statistical_tests.md @@ -36,11 +36,6 @@ title: 15 Article Ideas on Statistical Tests Here are some interesting article ideas centered around statistical tests, designed to help you explore the differences, applications, and nuances of various statistical methods: -### 1. **"T-Test vs. Z-Test: When and Why to Use Each"** - - Explain the differences between the t-test and z-test. - - Discuss when each test is appropriate based on sample size, variance, and distribution. - - Provide real-world applications for each test. - - Explore one-sample, two-sample, and paired t-tests. ### 2. **"Chi-Square Test: Applications in Categorical Data Analysis"** - Overview of the chi-square test for independence and goodness of fit. diff --git a/_posts/-_ideas/2030-01-01-new_articles_topics.md b/_posts/-_ideas/2030-01-01-new_articles_topics.md index 958f6bea..3d32d8e5 100644 --- a/_posts/-_ideas/2030-01-01-new_articles_topics.md +++ b/_posts/-_ideas/2030-01-01-new_articles_topics.md @@ -25,13 +25,8 @@ There are several interesting article topics you can explore under the umbrella - **Overview**: An in-depth guide on how machine learning models are applied in PdM, covering supervised, unsupervised, and reinforcement learning techniques. - **Focus**: How models like decision trees, random forests, support vector machines (SVM), and neural networks are used to predict equipment failures. -### 5. A Comparison of Predictive Maintenance Algorithms: Classical vs. Machine Learning Approaches - - **Overview**: Compare traditional predictive maintenance models (e.g., statistical models like ARIMA) with machine learning algorithms. - - **Focus**: Performance, accuracy, and scalability differences between classical approaches and machine learning models in real-world applications. -### 6. IoT and Sensor Data: The Backbone of Predictive Maintenance - - **Overview**: Explain how IoT-enabled devices and sensors gather data that drives predictive maintenance strategies. - - **Focus**: Types of sensors (vibration, temperature, pressure, etc.), the importance of real-time monitoring, and how this data is utilized for predictive maintenance. + ### 7. Deep Learning for Predictive Maintenance: Unlocking Hidden Patterns in Data - **Overview**: Explore how deep learning models such as convolutional neural networks (CNN) and recurrent neural networks (RNN) are used for complex PdM scenarios. diff --git a/_posts/2020-01-06-role_data_science_predictive_maintenance.md b/_posts/2020-01-06-role_data_science_predictive_maintenance.md index 8a0e9b07..c1781fca 100644 --- a/_posts/2020-01-06-role_data_science_predictive_maintenance.md +++ b/_posts/2020-01-06-role_data_science_predictive_maintenance.md @@ -32,22 +32,6 @@ tags: title: Leveraging Data Science Techniques for Predictive Maintenance --- -## Table of Contents - -1. Introduction to Predictive Maintenance (PdM) -2. The Importance of Data Science in PdM -3. Key Data Science Techniques in Predictive Maintenance - 1. Regression Analysis - 2. Anomaly Detection - 3. Clustering Algorithms -4. Data Requirements and Challenges in PdM -5. Role of Machine Learning in Predictive Maintenance -6. Applications of PdM Across Industries -7. Future of Data Science in Predictive Maintenance -8. Conclusion - ---- - ## 1. 
Introduction to Predictive Maintenance (PdM) Predictive maintenance (PdM) refers to the practice of using data-driven techniques to predict when equipment will fail, allowing for timely and efficient maintenance. This proactive approach aims to reduce downtime, optimize equipment lifespan, and minimize maintenance costs. Unlike traditional maintenance strategies, such as reactive (fixing after failure) or preventive (servicing at regular intervals), PdM leverages real-time data, statistical analysis, and predictive models to forecast equipment degradation and identify the optimal time for intervention. diff --git a/_posts/2021-05-10-estimating_uncertainty_neural_networks_using_monte_carlo_dropout.md b/_posts/2021-05-10-estimating_uncertainty_neural_networks_using_monte_carlo_dropout.md index 6f503d75..0bafb56b 100644 --- a/_posts/2021-05-10-estimating_uncertainty_neural_networks_using_monte_carlo_dropout.md +++ b/_posts/2021-05-10-estimating_uncertainty_neural_networks_using_monte_carlo_dropout.md @@ -54,7 +54,7 @@ Dropout is a regularization technique commonly used to prevent overfitting in ne Monte Carlo dropout, introduced by Yarin Gal and his colleagues, builds on this technique by keeping dropout enabled during inference. This seemingly simple modification allows the model to behave like a Bayesian approximation, enabling it to produce a distribution of outputs for a given input. By running the neural network multiple times on the same input (with different dropout masks applied each time), we can approximate the posterior predictive distribution of the model’s outputs. -Mathematically, if $f(y|x)$ denotes the output of the neural network for class $y$ on input $x$, then the Monte Carlo dropout approach involves drawing multiple samples from $f(y|x)$ by running the model several times with dropout enabled. These samples can be used to compute the mean and variance of the model's predictions, which serve as estimates of the predictive mean $\mathbb{E}[f(y|x)]$ and predictive variance $\text{Var}[f(y|x)]$. +Mathematically, if $$f(y|x)$$ denotes the output of the neural network for class $$y$$ on input $$x$$, then the Monte Carlo dropout approach involves drawing multiple samples from $$f(y|x)$$ by running the model several times with dropout enabled. These samples can be used to compute the mean and variance of the model's predictions, which serve as estimates of the predictive mean $$\mathbb{E}[f(y|x)]$$ and predictive variance $$\text{Var}[f(y|x)]$$. This technique provides a straightforward way to quantify the uncertainty of a model's predictions. In practice, Monte Carlo dropout is used to estimate uncertainty in both classification and regression tasks, although our focus here will be on multi-class classification. @@ -72,13 +72,13 @@ Monte Carlo dropout works by approximating the posterior distribution of a model ### Formalizing the Process -Let $f(y|x)$ be the softmax output of the neural network for class $y$ given input $x$. Monte Carlo dropout involves generating $T$ samples $\{ f_t(y|x) \}_{t=1}^{T}$ by running the network $T$ times with different dropout masks. From these samples, we can compute: +Let $$f(y|x)$$ be the softmax output of the neural network for class $$y$$ given input $$x$$. Monte Carlo dropout involves generating $$T$$ samples $$\{ f_t(y|x) \}_{t=1}^{T}$$ by running the network $$T$$ times with different dropout masks. 
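In code, this sampling loop is only a few lines. Below is a minimal PyTorch-style sketch; the classifier `model`, the input batch `x`, and the choice of $$T = 50$$ are illustrative assumptions rather than part of the method itself.

```python
import torch

def mc_dropout_samples(model, x, T=50):
    """Collect T stochastic softmax outputs with dropout left enabled."""
    model.train()  # keeps dropout active at inference time; note that any
                   # batch-norm layers would need to be set to eval() separately
    with torch.no_grad():
        samples = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(T)]
        )  # shape: (T, batch_size, num_classes)
    return samples

# samples.mean(dim=0) and samples.var(dim=0) then estimate the
# predictive mean and predictive variance defined next in the text.
```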
From these samples, we can compute: - **Predictive mean**: $$ \mathbb{E}[f(y|x)] = \frac{1}{T} \sum_{t=1}^{T} f_t(y|x) $$ - This gives the average probability assigned to class $y$ across the $T$ stochastic forward passes. + This gives the average probability assigned to class $$y$$ across the $$T$$ stochastic forward passes. - **Predictive variance**: $$ @@ -100,7 +100,7 @@ $$ \text{Uncertainty Score} = 1 - \max_y \mathbb{E}[f(y|x)] $$ -This score measures the model's confidence in its most likely prediction. A high value for $\max_y \mathbb{E}[f(y|x)]$ indicates high confidence in the predicted class, while a lower value suggests greater uncertainty. +This score measures the model's confidence in its most likely prediction. A high value for $$\max_y \mathbb{E}[f(y|x)]$$ indicates high confidence in the predicted class, while a lower value suggests greater uncertainty. This method is simple and easy to implement, but it has some limitations. For example, it only takes into account the predicted class's probability and ignores the spread of probabilities across other classes. In cases where the model assigns similar probabilities to multiple classes, this method might underestimate uncertainty. @@ -118,7 +118,7 @@ This method captures uncertainty more comprehensively than the maximum class pro ### 3. Variance-Based Uncertainty Estimation -Another method is to use the variance of the predicted probabilities as a measure of uncertainty. The variance for each class $y$ is computed as: +Another method is to use the variance of the predicted probabilities as a measure of uncertainty. The variance for each class $$y$$ is computed as: $$ \text{Var}[f(y|x)] = \frac{1}{T} \sum_{t=1}^{T} (f_t(y|x) - \mathbb{E}[f(y|x)])^2 @@ -136,19 +136,19 @@ Variance-based methods are particularly useful when the goal is to detect out-of ### 4. Error Function and Normal Approximation -In some cases, particularly when dealing with binary or reduced two-class problems, it may be useful to approximate the predictive distribution using a normal distribution. Specifically, we can model the output probabilities for class $y$ as a Gaussian distribution: +In some cases, particularly when dealing with binary or reduced two-class problems, it may be useful to approximate the predictive distribution using a normal distribution. Specifically, we can model the output probabilities for class $$y$$ as a Gaussian distribution: $$ p(y|x) \sim \mathcal{N}(\mu_y, \sigma_y^2) $$ -where $\mu_y = \mathbb{E}[f(y|x)]$ is the predictive mean and $\sigma_y^2 = \text{Var}[f(y|x)]$ is the predictive variance. +where $$\mu_y = \mathbb{E}[f(y|x)]$$ is the predictive mean and $$\sigma_y^2 = \text{Var}[f(y|x)]$$ is the predictive variance. -For a two-class classifier, let $y$ be the predicted class (i.e., $y = \arg\max_y \mathbb{E}[f(y|x)]$) and $\neg y$ be the other class. The probability that a future evaluation of the classifier will also output $y$ is given by: +For a two-class classifier, let $$y$$ be the predicted class (i.e., $$y = \arg\max_y \mathbb{E}[f(y|x)]$$) and $$\neg y$$ be the other class. The probability that a future evaluation of the classifier will also output $$y$$ is given by: $$ u = \Pr[X \geq 0] $$ -where $X \sim \mathcal{N}(\mu_y - \mu_{\neg y}, \sigma_y^2 + \sigma_{\neg y}^2)$. +where $$X \sim \mathcal{N}(\mu_y - \mu_{\neg y}, \sigma_y^2 + \sigma_{\neg y}^2)$$. 
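Given the per-class means and variances from the Monte Carlo samples, this probability is a one-liner with SciPy's normal CDF. The sketch below is mathematically equivalent to the error-function formulation that follows; the function and variable names are placeholders.

```python
from scipy.stats import norm

def two_class_agreement(mu_y, var_y, mu_other, var_other):
    """Pr[X >= 0] for X ~ N(mu_y - mu_other, var_y + var_other)."""
    return norm.sf(0.0, loc=mu_y - mu_other,
                   scale=(var_y + var_other) ** 0.5)  # sf(x) = 1 - cdf(x)
```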
This probability can be estimated using the error function: diff --git a/_posts/2021-05-11-predictive_maintenance_algorithms_classical_vs_machine_learning_approaches.md b/_posts/2021-05-11-predictive_maintenance_algorithms_classical_vs_machine_learning_approaches.md new file mode 100644 index 00000000..dd45339d --- /dev/null +++ b/_posts/2021-05-11-predictive_maintenance_algorithms_classical_vs_machine_learning_approaches.md @@ -0,0 +1,230 @@ +--- +author_profile: false +categories: +- Machine Learning +classes: wide +date: '2021-05-11' +excerpt: Explore the differences between classical statistical models and machine learning algorithms in predictive maintenance, including their performance, accuracy, and scalability in industrial settings. +header: + image: /assets/images/data_science_20.jpg + og_image: /assets/images/data_science_20.jpg + overlay_image: /assets/images/data_science_20.jpg + show_overlay_excerpt: false + teaser: /assets/images/data_science_20.jpg + twitter_image: /assets/images/data_science_20.jpg +keywords: +- Predictive Maintenance +- ARIMA +- Machine Learning +- Statistical Models +- Predictive Analytics +- Industrial Analytics +- Predictive Algorithms +seo_description: This article compares traditional statistical models like ARIMA with modern machine learning approaches for predictive maintenance, focusing on performance, accuracy, and scalability in real-world applications. +seo_title: Classical vs. Machine Learning Algorithms in Predictive Maintenance +seo_type: article +summary: A deep dive into how classical predictive maintenance algorithms, such as ARIMA, compare with machine learning models, examining their strengths and weaknesses in terms of performance, accuracy, and scalability. +tags: +- Predictive Maintenance +- Statistical Models +- Machine Learning +- Predictive Algorithms +- ARIMA +- Industrial Analytics +title: 'A Comparison of Predictive Maintenance Algorithms: Classical vs. Machine Learning Approaches' +--- + +## 1. Introduction to Predictive Maintenance Algorithms + +Predictive maintenance (PdM) is an essential strategy in industries reliant on machinery and equipment. It aims to predict equipment failures before they occur by analyzing historical data and current conditions, allowing for maintenance to be scheduled proactively rather than reactively. At the core of this approach are various predictive algorithms, ranging from classical statistical models to modern machine learning techniques. + +Traditionally, industries have relied on time series analysis and regression-based models for failure prediction. However, the rise of machine learning and artificial intelligence has introduced new algorithms capable of learning from complex, high-dimensional data and uncovering patterns that classical methods might miss. This has led to a debate on the relative merits of classical predictive maintenance algorithms versus machine learning approaches. + +This article explores the strengths and limitations of both approaches by comparing their performance, accuracy, and scalability in real-world applications. + +## 2. Classical Predictive Maintenance Algorithms + +Classical predictive maintenance algorithms are based on statistical methods, where the underlying assumption is that future equipment behavior can be predicted based on its past behavior. These methods typically rely on historical time-series data and have been used extensively in industries such as manufacturing, energy, and transportation. 
+ +### 2.1 ARIMA (AutoRegressive Integrated Moving Average) + +ARIMA (AutoRegressive Integrated Moving Average) is one of the most widely used classical algorithms for time series forecasting. The model combines three elements: + +- **AutoRegressive (AR) component**: Predicts the future value based on past values in the time series. +- **Integrated (I) component**: Accounts for the differencing of observations to make the data stationary. +- **Moving Average (MA) component**: Models the error terms as a linear combination of past forecast errors. + +ARIMA models are effective in scenarios where equipment degradation or failure follows a consistent, predictable trend. The model works well for univariate time-series data, where only a single variable (e.g., vibration levels, temperature) is being used to predict equipment failure. It is often employed in industries where machine performance tends to degrade in a linear, time-dependent fashion. + +**Advantages**: + +- Well-suited for linear, univariate time series forecasting. +- Mature and well-understood method, widely adopted across industries. +- Provides a transparent, interpretable model with clear mathematical foundations. + +**Limitations**: + +- ARIMA requires stationary data, which may not always be available. +- Poor performance on nonlinear systems or when dealing with high-dimensional data. +- Limited ability to handle multivariate data (i.e., data with multiple variables affecting equipment health). + +### 2.2 Regression-Based Models + +Regression models, such as linear and polynomial regression, are commonly used in predictive maintenance to model the relationship between equipment health and various operational parameters. The goal of regression is to fit a curve to the data that best explains the relationship between independent variables (e.g., temperature, pressure, vibration) and the dependent variable (e.g., time to failure). + +- **Linear Regression** assumes a straight-line relationship between the variables, making it ideal for simple, linear degradation patterns. +- **Polynomial Regression** extends linear regression by fitting a curve to the data, which is more useful for equipment that degrades in a nonlinear fashion. + +These models are effective when there is a clear and quantifiable relationship between the predictors and the outcome. + +**Advantages**: + +- Simple and interpretable models, easy to implement and understand. +- Effective for equipment with linear or well-defined nonlinear failure patterns. +- Can incorporate multiple variables, improving predictive accuracy. + +**Limitations**: + +- Assumes a specific form of the relationship between variables (linear, quadratic, etc.), which may not capture complex degradation patterns. +- Sensitive to outliers, which can skew predictions. +- Often requires manual feature selection and engineering, which can be time-consuming. + +### 2.3 Exponential Smoothing Methods + +Exponential smoothing models, such as **Simple Exponential Smoothing (SES)** and **Holt-Winters**, are used for time series forecasting when the data shows trends or seasonality. These models apply weighted averages to past observations, with more recent observations receiving higher weights. + +- **Simple Exponential Smoothing (SES)**: Assumes no trend or seasonality, focusing on smoothing past observations to predict future values. +- **Holt’s Linear Trend Model**: Extends SES to handle data with trends by smoothing both the level and the trend. 
+- **Holt-Winters Seasonal Model**: Further extends Holt’s method to account for seasonality in the data. + +**Advantages**: + +- Works well for time series data with trends or seasonality. +- Flexible, with various versions to handle different types of data. +- Easy to implement and requires minimal computational resources. + +**Limitations**: + +- Assumes that future values are primarily influenced by past values, which may not always hold true. +- Cannot handle complex, multivariate data or nonlinear relationships. +- Limited ability to generalize to new or unseen data patterns. + +## 3. Machine Learning Approaches for Predictive Maintenance + +Machine learning models offer a more flexible and powerful alternative to classical methods. These models can automatically learn complex patterns from data, making them suitable for handling high-dimensional, nonlinear, and multivariate datasets that are common in predictive maintenance scenarios. Unlike classical models, which require manual feature selection and domain expertise, machine learning models can extract relevant features from data autonomously. + +### 3.1 Decision Trees and Random Forests + +**Decision Trees** are supervised learning algorithms that split data into branches based on feature values, creating a tree-like structure where each leaf node represents a predicted outcome. Decision trees are easy to interpret and can handle both numerical and categorical data. + +**Random Forests**, an ensemble learning method, improve on decision trees by combining multiple trees to reduce overfitting and improve prediction accuracy. Random forests are well-suited for predictive maintenance because they can capture nonlinear relationships between variables and are robust to noisy data. + +**Advantages**: + +- Handle nonlinear and multivariate data well. +- Random forests reduce overfitting, making them more robust than individual decision trees. +- Automatically handle feature interactions and can provide feature importance rankings. + +**Limitations**: + +- Less interpretable than linear models or decision trees. +- Require more computational resources than simpler models. +- Performance may degrade if not properly tuned. + +### 3.2 Support Vector Machines (SVM) + +Support Vector Machines (SVM) are a powerful class of supervised learning algorithms used for classification and regression tasks. In predictive maintenance, SVMs are often used to classify whether equipment is likely to fail within a certain timeframe, based on historical sensor data. + +SVMs work by finding the hyperplane that best separates data points of different classes (e.g., “normal” vs. “about to fail”) in high-dimensional space. They are particularly effective when the data is not linearly separable and can be transformed into a higher-dimensional space using kernel functions. + +**Advantages**: + +- Effective for binary classification problems in PdM. +- Can model nonlinear relationships between features using kernel functions. +- Robust to outliers and can handle high-dimensional data. + +**Limitations**: + +- Less interpretable than simpler models. +- Computationally expensive, especially with large datasets. +- Difficult to scale to real-time applications. + +### 3.3 Neural Networks and Deep Learning Models + +Neural networks, and in particular deep learning models, have gained popularity in recent years due to their ability to model highly complex, nonlinear relationships in data. 
In PdM, neural networks can learn from large amounts of sensor data, maintenance logs, and operational records to predict equipment failures with high accuracy. + +- **Feedforward Neural Networks (FNN)**: Basic neural networks used for predictive tasks by learning patterns in historical data. +- **Recurrent Neural Networks (RNN)** and **Long Short-Term Memory (LSTM)**: Designed for time-series forecasting, these models can capture temporal dependencies in equipment sensor data. +- **Convolutional Neural Networks (CNN)**: Often used in image-based PdM applications, such as analyzing thermal or vibration images to detect defects. + +**Advantages**: + +- Can handle complex, high-dimensional, and multivariate data. +- Particularly effective at learning from large, labeled datasets. +- Capable of discovering subtle patterns and interactions between variables that classical models might miss. + +**Limitations**: + +- Require large amounts of labeled data for training. +- High computational cost, requiring specialized hardware (e.g., GPUs). +- Difficult to interpret and explain results. + +## 4. Comparison Criteria: Performance, Accuracy, and Scalability + +To compare classical predictive maintenance models with machine learning algorithms, it’s important to consider key criteria such as predictive performance, accuracy, scalability, and interpretability. Each approach has its strengths and weaknesses depending on the specific application. + +### 4.1 Predictive Performance and Accuracy + +Machine learning models, especially deep learning techniques like neural networks, tend to outperform classical models in terms of predictive accuracy, particularly when dealing with complex, nonlinear systems. While ARIMA and regression-based models work well for simple, linear relationships, they often struggle with the intricate patterns that emerge in multivariate or nonlinear systems. + +For example, a recurrent neural network (RNN) may capture the temporal dependencies in time-series data more effectively than ARIMA when the system exhibits complex, nonlinear behaviors. Similarly, random forests can model interactions between multiple variables more accurately than traditional regression techniques. + +However, the performance of machine learning models depends heavily on the quality and quantity of training data. Classical models, by contrast, often perform well with smaller datasets and when the underlying relationships in the data are relatively simple. + +### 4.2 Scalability for Big Data and Real-Time Applications + +Scalability is another crucial factor when comparing classical and machine learning models. In modern industrial environments, vast amounts of data are generated from IoT sensors, machinery, and operational systems. Machine learning algorithms, particularly deep learning models, are designed to handle large datasets and can scale to meet the needs of big data applications. + +Classical models like ARIMA, on the other hand, often struggle to scale effectively. They are computationally less expensive but may lack the flexibility to process large-scale data or handle real-time predictions. + +Machine learning models, such as random forests and neural networks, are more suited for big data environments, as they can process vast amounts of historical and real-time data simultaneously. Additionally, the rise of edge computing and distributed systems has enabled machine learning algorithms to be deployed in real-time predictive maintenance systems, further enhancing their scalability. 
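To make the accuracy comparison in this section concrete, here is a small, self-contained sketch that pits an ARIMA model against a random forest on a synthetic degradation signal. Everything about it is an illustrative assumption: the generated series, the ARIMA order, the lag count, and the one-step-ahead framing for the forest (which, unlike ARIMA's recursive forecast, sees true lagged values).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
t = np.arange(500)
# synthetic "degradation" signal: slow drift + periodic load + noise
series = 0.02 * t + np.sin(t / 20.0) + rng.normal(0.0, 0.2, t.size)
train, test = series[:400], series[400:]

# classical approach: ARIMA, forecasting the whole test horizon recursively
arima_fc = ARIMA(train, order=(2, 1, 2)).fit().forecast(steps=test.size)

# ML approach: random forest on lagged windows (one-step-ahead predictions)
lags = 10
X = np.lib.stride_tricks.sliding_window_view(series[:-1], lags)
y = series[lags:]
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X[:400 - lags], y[:400 - lags])
rf_fc = rf.predict(X[400 - lags:])

print("ARIMA MAE:", np.mean(np.abs(arima_fc - test)))
print("RF MAE:   ", np.mean(np.abs(rf_fc - test)))
```

Neither number should be read as a verdict: changing the signal's nonlinearity or the forecast horizon can easily flip the ranking, which is precisely the trade-off discussed above.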
+ +### 4.3 Interpretability and Transparency + +While machine learning models often excel in predictive performance, they tend to lack the interpretability of classical models. Techniques such as ARIMA and linear regression offer clear, mathematically interpretable results, which can be important in industries where regulatory compliance or safety is a concern. + +In contrast, deep learning models, especially neural networks, operate as "black boxes," making it difficult for engineers to understand how they arrived at a particular prediction. This can limit their adoption in certain industries where transparency and explainability are crucial. + +However, recent advancements in explainable AI (XAI) are addressing this challenge by providing tools and techniques that allow users to interpret machine learning models' outputs more effectively. + +## 5. Real-World Applications and Case Studies + +Both classical and machine learning approaches have been successfully applied to predictive maintenance across various industries. The choice of algorithm depends on the specific requirements of the application, including the complexity of the equipment, the availability of data, and the need for interpretability. + +### Case Study 1: ARIMA in Manufacturing + +In a manufacturing plant, ARIMA models were used to predict the failure of CNC machines based on time-series data of vibration and temperature. The simplicity and interpretability of ARIMA made it a suitable choice, as the plant's equipment followed a clear, linear degradation pattern. The model successfully predicted when maintenance was needed, reducing unexpected downtime by 20%. + +### Case Study 2: Neural Networks in Energy + +A major energy company implemented deep learning models, including LSTM networks, to predict failures in wind turbines. The turbines generated massive amounts of sensor data, including wind speed, temperature, and rotational speed. By training the LSTM models on this data, the company was able to predict failures with 90% accuracy, leading to a 30% reduction in maintenance costs and improved turbine uptime. + +## 6. Future Directions in Predictive Maintenance Algorithms + +As technology continues to evolve, the future of predictive maintenance algorithms will likely involve a hybrid approach, combining the strengths of classical and machine learning techniques. Some key trends to watch include: + +- **Explainable AI (XAI)**: As machine learning models become more widespread, the need for transparency and interpretability will drive the development of XAI techniques, allowing engineers to better understand how models make predictions. + +- **Transfer Learning**: Transfer learning allows models to apply knowledge gained from one system to another, reducing the need for large datasets. This is especially useful in predictive maintenance, where labeled failure data is often scarce. + +- **Edge Computing**: Edge computing enables machine learning models to process data locally, improving real-time decision-making capabilities and reducing the need for centralized processing. + +- **Hybrid Models**: Future predictive maintenance systems may combine classical models like ARIMA with machine learning algorithms, using the strengths of each to optimize performance, accuracy, and scalability. + +## 7. Conclusion + +Predictive maintenance algorithms play a crucial role in reducing downtime, extending equipment lifespan, and optimizing operational efficiency. 
Classical models like ARIMA, regression, and exponential smoothing offer simplicity and interpretability, making them suitable for straightforward, linear systems. On the other hand, machine learning algorithms such as random forests, SVMs, and neural networks excel in handling complex, nonlinear, and multivariate data, providing greater predictive accuracy in more challenging environments. + +The choice between classical and machine learning approaches depends on various factors, including the complexity of the data, the availability of computational resources, and the need for model interpretability. As industries continue to adopt predictive maintenance strategies, the combination of these two approaches will likely provide the most robust and scalable solutions. + +--- diff --git a/_posts/2022-10-30-iot_sensor_data_backbone_predictive_maintenance.md b/_posts/2022-10-30-iot_sensor_data_backbone_predictive_maintenance.md new file mode 100644 index 00000000..a9c4a045 --- /dev/null +++ b/_posts/2022-10-30-iot_sensor_data_backbone_predictive_maintenance.md @@ -0,0 +1,182 @@ +--- +author_profile: false +categories: +- IoT +classes: wide +date: '2022-10-30' +excerpt: Learn how IoT-enabled sensors like vibration, temperature, and pressure sensors gather crucial data for predictive maintenance, allowing for real-time monitoring and more effective maintenance strategies. +header: + image: /assets/images/data_science_19.jpg + og_image: /assets/images/data_science_19.jpg + overlay_image: /assets/images/data_science_19.jpg + show_overlay_excerpt: false + teaser: /assets/images/data_science_19.jpg + twitter_image: /assets/images/data_science_19.jpg +keywords: +- IoT +- Sensor Data +- Predictive Maintenance +- Real-Time Monitoring +- Industrial IoT +seo_description: Explore how IoT-enabled devices and sensors provide the real-time data that drives predictive maintenance strategies, and how various types of sensors contribute to equipment health monitoring. +seo_title: How IoT and Sensor Data Power Predictive Maintenance +seo_type: article +summary: This article delves into the critical role IoT and sensor data play in predictive maintenance, covering different types of sensors and their applications, the importance of real-time monitoring, and how the data is processed to optimize maintenance strategies. +tags: +- IoT +- Sensor Data +- Predictive Maintenance +- Real-Time Monitoring +- Industrial IoT +title: 'IoT and Sensor Data: The Backbone of Predictive Maintenance' +--- + +## 1. Introduction to IoT in Predictive Maintenance + +The Internet of Things (IoT) has revolutionized predictive maintenance (PdM) by enabling continuous, real-time monitoring of industrial equipment. IoT devices, particularly sensors, gather vast amounts of data on equipment performance, environmental conditions, and operational parameters. This data is the foundation of predictive maintenance, allowing companies to anticipate equipment failures, optimize maintenance schedules, and reduce operational downtime. + +In traditional maintenance strategies, inspections and servicing were scheduled at fixed intervals, regardless of the actual condition of the equipment. With IoT-enabled sensors, predictive maintenance shifts the paradigm by relying on real-time data that reflects the actual health of the equipment. By analyzing this data, companies can predict when maintenance is truly needed, minimizing both over-maintenance and unexpected failures. 
+ +Sensors deployed on machines can monitor a range of critical parameters—such as vibration, temperature, pressure, and humidity—that influence equipment health. As IoT technology advances, the ability to collect, analyze, and act upon sensor data has become more sophisticated, allowing predictive maintenance to be implemented across various industries, including manufacturing, energy, healthcare, and transportation. + +## 2. Types of Sensors Used in Predictive Maintenance + +Different types of sensors are used in predictive maintenance to monitor various operational aspects of machinery and equipment. Each type of sensor provides specific data that helps in assessing the condition of the equipment and predicting potential failures. Below are some of the most commonly used sensors in PdM. + +### 2.1 Vibration Sensors + +Vibration sensors are among the most critical tools in predictive maintenance, especially for rotating equipment such as motors, pumps, and turbines. These sensors detect abnormal vibration patterns, which are often early indicators of mechanical issues like imbalances, misalignments, or bearing failures. + +- **Piezoelectric Sensors**: These vibration sensors convert mechanical stress into an electrical signal. They are highly sensitive and are used to detect small changes in vibration that could indicate wear or damage. + +- **Accelerometers**: Another type of vibration sensor, accelerometers measure the rate of change in velocity over time. These sensors are often used to monitor the health of rotating machinery. + +**Key Applications**: + +- Monitoring the condition of motors, pumps, and compressors. +- Detecting early signs of mechanical faults, reducing the risk of catastrophic failure. +- Providing data for predictive models that forecast time to failure. + +### 2.2 Temperature Sensors + +Temperature sensors are essential for monitoring heat levels in machinery. Abnormal temperature changes can be a sign of equipment malfunction, friction, or overheating, all of which can lead to failure if left unchecked. + +- **Thermocouples**: These sensors measure temperature differences between two points and are widely used for their accuracy and wide operating range. + +- **Resistance Temperature Detectors (RTDs)**: RTDs are used for precise temperature measurements, particularly in applications that require consistent and stable readings over time. + +**Key Applications**: + +- Detecting overheating in motors, transformers, and electrical systems. +- Monitoring thermal conditions in industrial furnaces, boilers, and heat exchangers. +- Identifying inefficiencies in cooling systems, which may lead to equipment degradation. + +### 2.3 Pressure Sensors + +Pressure sensors monitor the force exerted by liquids, gases, or solids within a machine. Pressure fluctuations can indicate leaks, blockages, or wear in hydraulic and pneumatic systems, leading to operational inefficiencies or equipment failure. + +- **Strain Gauges**: These sensors measure the strain on a material by detecting changes in its electrical resistance, making them ideal for pressure measurements. + +- **Capacitive Pressure Sensors**: Capacitive sensors detect pressure changes by measuring variations in capacitance due to the deformation of a diaphragm. + +**Key Applications**: + +- Monitoring hydraulic systems in industrial machinery. +- Detecting leaks in pipes, tanks, and pressure vessels. +- Measuring pressure in pneumatic systems to prevent air leaks and optimize performance. 
+ +### 2.4 Acoustic Sensors + +Acoustic sensors detect sound waves produced by equipment in operation. By analyzing these sound waves, acoustic sensors can identify abnormalities, such as increased friction, cavitation, or leaks, which are indicative of mechanical issues. + +- **Ultrasonic Sensors**: These sensors detect high-frequency sound waves that are often produced by leaks or friction in machinery. Ultrasonic sensors are useful for early fault detection because they can identify issues before they become audible to the human ear. + +**Key Applications**: + +- Identifying air and gas leaks in pipelines. +- Detecting cavitation in pumps and valves. +- Monitoring bearings and other mechanical components for signs of wear or damage. + +### 2.5 Humidity Sensors + +Humidity sensors measure moisture levels in the air or within a machine’s environment. Excessive humidity can lead to corrosion, electrical malfunctions, and reduced performance in many types of equipment. + +- **Capacitive Humidity Sensors**: These sensors detect changes in humidity by measuring variations in the dielectric constant of a polymer film. + +- **Resistive Humidity Sensors**: These sensors measure changes in electrical resistance due to moisture absorption in a substrate material. + +**Key Applications**: + +- Monitoring humidity levels in electrical cabinets to prevent short circuits and corrosion. +- Protecting sensitive electronic equipment from moisture damage. +- Ensuring the optimal environment in HVAC systems, cleanrooms, and industrial environments. + +## 3. The Importance of Real-Time Monitoring in PdM + +Real-time monitoring is a fundamental aspect of IoT-driven predictive maintenance. Traditional maintenance strategies relied on scheduled inspections, which often failed to capture the actual condition of equipment between service intervals. By contrast, real-time monitoring provides continuous visibility into the health of machinery, allowing companies to detect potential issues as soon as they arise. + +### Benefits of Real-Time Monitoring: + +- **Immediate Issue Detection**: Continuous data collection allows maintenance teams to detect deviations from normal operating conditions immediately, triggering alerts that prompt swift corrective action. + +- **Reduced Downtime**: Early detection of equipment degradation enables timely maintenance, preventing unexpected breakdowns that could lead to costly downtime. + +- **Improved Equipment Lifespan**: Monitoring equipment in real-time helps prevent minor issues from escalating into major failures, extending the lifespan of machines and reducing the need for replacements. + +Real-time monitoring is made possible by the use of IoT sensors that continuously collect and transmit data to a central system for analysis. This data is then processed by predictive maintenance algorithms, which identify patterns or anomalies that indicate potential failures. + +## 4. How IoT Data is Processed for Predictive Maintenance + +The effectiveness of predictive maintenance depends not only on collecting data but also on how that data is processed and analyzed. The typical IoT data processing pipeline for PdM involves several stages: data collection, transmission, aggregation, storage, and analysis. + +### 4.1 Data Collection and Transmission + +IoT sensors deployed on equipment continuously collect data related to various operational parameters, such as temperature, pressure, vibration, and sound. 
This data is transmitted over a network, either to local edge devices or to a cloud-based platform for further analysis. + +- **Edge Devices**: In some cases, data is processed locally at the edge of the network (closer to the equipment) to reduce latency and bandwidth usage. Edge computing allows for faster decision-making, as data does not need to be sent to a central server for analysis. + +- **Cloud Computing**: In larger-scale implementations, data is transmitted to cloud platforms where it can be aggregated, stored, and analyzed. Cloud platforms offer scalable storage and powerful processing capabilities, making them ideal for handling large volumes of data from multiple IoT devices. + +### 4.2 Data Aggregation and Storage + +Once collected, the data is aggregated and stored in a centralized database or cloud infrastructure. This step is crucial for managing the vast amounts of data generated by IoT sensors. Data aggregation also allows for the correlation of different sensor readings, providing a more comprehensive view of equipment health. + +- **Data Lakes**: In predictive maintenance, data lakes are often used to store large volumes of raw sensor data. These data lakes provide a flexible, scalable solution for handling unstructured and semi-structured data from diverse sources. + +- **Data Warehouses**: Structured data is often stored in data warehouses, where it can be queried and analyzed more efficiently. This is particularly useful for historical trend analysis and the development of predictive models. + +### 4.3 Data Analytics and Predictive Models + +Once the data is stored, advanced analytics are applied to identify patterns, trends, and anomalies that indicate potential equipment failure. Machine learning algorithms, such as neural networks, decision trees, and regression models, are used to analyze historical and real-time data to predict when a machine is likely to fail. + +- **Descriptive Analytics**: Descriptive analytics provide insights into the current state of equipment by summarizing historical data and identifying deviations from normal behavior. + +- **Predictive Analytics**: Predictive models forecast future equipment failures based on historical patterns and current sensor data. These models use machine learning algorithms to detect early warning signs of potential failures. + +- **Prescriptive Analytics**: Prescriptive analytics go a step further by recommending specific maintenance actions based on predictive insights, helping companies optimize their maintenance schedules and minimize downtime. + +## 5. Challenges in IoT Data for Predictive Maintenance + +While IoT and sensor data offer immense potential for predictive maintenance, there are several challenges associated with managing and analyzing this data: + +- **Data Quality**: Sensor data can be noisy, incomplete, or inaccurate due to sensor malfunction or environmental interference. Data cleaning and preprocessing are critical to ensure reliable predictions. + +- **Data Integration**: IoT data often comes from diverse sources and in different formats. Integrating this data into a unified system for analysis can be complex, requiring robust data integration frameworks. + +- **Scalability**: As more sensors are deployed and the volume of data grows, maintaining scalable storage and processing infrastructure becomes a challenge. Cloud computing offers scalability, but it comes with concerns about latency, bandwidth, and data security. + +## 6. 
The Future of IoT and Sensor Technology in Predictive Maintenance + +The future of predictive maintenance will be shaped by advancements in IoT and sensor technology. As sensors become more sophisticated and affordable, they will become ubiquitous across industries, enabling even more precise and reliable data collection. Some key trends to watch include: + +- **5G Connectivity**: The rollout of 5G networks will enable faster and more reliable data transmission, reducing latency and allowing real-time monitoring at an even larger scale. + +- **Self-Powered Sensors**: Advancements in energy harvesting technology will allow sensors to be self-powered, reducing the need for frequent battery replacements and making IoT deployments more sustainable. + +- **AI-Enhanced Sensors**: Sensors embedded with AI capabilities will be able to process data at the edge, reducing the need for cloud-based analytics and enabling faster, real-time decision-making. + +## 7. Conclusion + +IoT-enabled sensors are the backbone of predictive maintenance, providing the real-time data needed to monitor equipment health and predict potential failures. By collecting data on critical parameters like vibration, temperature, and pressure, sensors allow organizations to detect early signs of equipment degradation and take proactive maintenance actions. As IoT technology continues to evolve, the role of sensors in predictive maintenance will become even more integral, driving further improvements in operational efficiency and equipment reliability. + +--- diff --git a/_posts/2024-10-08-implementing_time_series.md b/_posts/2024-10-08-implementing_time_series.md index 95ead6d0..a2557765 100644 --- a/_posts/2024-10-08-implementing_time_series.md +++ b/_posts/2024-10-08-implementing_time_series.md @@ -7,12 +7,12 @@ classes: wide date: '2024-10-08' excerpt: Explore time-series classification in Python with step-by-step examples using simple models, the catch22 feature set, and UEA/UCR repository benchmarking with statistical tests. header: - image: /assets/images/data_science_2.jpg - og_image: /assets/images/data_science_2.jpg - overlay_image: /assets/images/data_science_2.jpg + image: /assets/images/data_science_3.jpg + og_image: /assets/images/data_science_3.jpg + overlay_image: /assets/images/data_science_3.jpg show_overlay_excerpt: false - teaser: /assets/images/data_science_2.jpg - twitter_image: /assets/images/data_science_2.jpg + teaser: /assets/images/data_science_3.jpg + twitter_image: /assets/images/data_science_3.jpg keywords: - Time-series classification - Catch22 diff --git a/_posts/2024-10-15-t-test_vs_z-test_when_why_use_each.md b/_posts/2024-10-15-t-test_vs_z-test_when_why_use_each.md new file mode 100644 index 00000000..b042ac06 --- /dev/null +++ b/_posts/2024-10-15-t-test_vs_z-test_when_why_use_each.md @@ -0,0 +1,239 @@ +--- +author_profile: false +categories: +- Data Science +classes: wide +date: '2024-10-15' +excerpt: This article provides an in-depth comparison between the t-test and z-test, highlighting their differences, appropriate usage, and real-world applications, with examples of one-sample, two-sample, and paired t-tests. 
+header: + image: /assets/images/data_science_5.jpg + og_image: /assets/images/data_science_5.jpg + overlay_image: /assets/images/data_science_5.jpg + show_overlay_excerpt: false + teaser: /assets/images/data_science_5.jpg + twitter_image: /assets/images/data_science_5.jpg +keywords: +- T-Test +- Z-Test +- Hypothesis Testing +- Statistical Analysis +- Sample Size +seo_description: Learn about the key differences between the t-test and z-test, when to use each test based on sample size, variance, and distribution, and explore real-world applications for both tests. +seo_title: 'Understanding T-Test vs. Z-Test: Differences and Applications' +seo_type: article +summary: A comprehensive guide to understanding the differences between t-tests and z-tests, covering when to use each test, their assumptions, and examples of one-sample, two-sample, and paired t-tests. +tags: +- T-Test +- Z-Test +- Hypothesis Testing +- Statistical Analysis +title: 'T-Test vs. Z-Test: When and Why to Use Each' +--- + +## 1. Introduction to Hypothesis Testing + +Hypothesis testing is a critical aspect of statistical analysis that allows researchers to make inferences about a population based on sample data. Whether you are testing a new drug’s effectiveness or comparing customer satisfaction across two brands, hypothesis testing provides a framework for making data-driven decisions. Two of the most commonly used statistical tests in hypothesis testing are the **t-test** and **z-test**. + +Both t-tests and z-tests are used to determine if there is a statistically significant difference between means or proportions. They are applied in scenarios where researchers want to compare observed data with expected data, or compare two groups of data to determine if there is a meaningful difference. However, these tests are not interchangeable, and their usage depends on factors like sample size, population variance, and whether the data follows a normal distribution. + +In this article, we will explore the differences between the t-test and z-test, understand when to use each, and provide real-world applications for both. We will also cover the types of t-tests, including one-sample, two-sample, and paired t-tests. + +## 2. Understanding the T-Test and Z-Test: An Overview + +### T-Test + +The t-test is a parametric test used to compare the means of two groups when the sample size is small (usually $$ n < 30 $$) or when the population variance is unknown. The test was developed by William Sealy Gosset under the pseudonym "Student," and it is commonly referred to as **Student's t-test**. + +The t-test uses the **t-distribution**, which is similar to the normal distribution but with thicker tails, allowing for greater variability when working with smaller samples. There are three main types of t-tests: + +1. **One-sample t-test**: Used to compare the mean of a single group to a known value or population mean. +2. **Two-sample t-test** (independent t-test): Used to compare the means of two independent groups. +3. **Paired t-test**: Used to compare means from the same group at different times (e.g., before and after treatment) or matched pairs of samples. + +### Z-Test + +The z-test is also a parametric test used to compare means or proportions when the sample size is large (usually $$ n \geq 30 $$) and when the population variance is known. The z-test is based on the **standard normal distribution** (also known as the z-distribution), which has a mean of 0 and a standard deviation of 1. 
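The practical consequence of the t-distribution's thicker tails is easy to see numerically. In this short SciPy sketch (the sample sizes are arbitrary illustrations), the two-sided 95% critical value of the t-distribution shrinks toward the fixed z value of about 1.96 as $$n$$ grows:

```python
from scipy.stats import norm, t

z_crit = norm.ppf(0.975)               # ~1.96, independent of sample size
for n in (5, 10, 30, 100):
    t_crit = t.ppf(0.975, df=n - 1)    # widens noticeably for small n
    print(f"n={n:>3}: t critical = {t_crit:.3f}   z critical = {z_crit:.3f}")
```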
The z-test is commonly used for the following scenarios: + +1. **One-sample z-test**: Used to compare the mean of a single group to a known population mean, with a known population variance. +2. **Two-sample z-test**: Used to compare the means of two independent groups, assuming the population variance is known. + +In summary, while both tests aim to assess whether there is a statistically significant difference between groups, the choice between a t-test and a z-test depends on the sample size, the availability of population variance information, and whether the sample data follows a normal distribution. + +## 3. Key Differences Between the T-Test and Z-Test + +While t-tests and z-tests share the same goal—comparing means and determining statistical significance—several important differences dictate when each test should be used. + +### 3.1 Sample Size Considerations + +One of the main differences between the t-test and z-test is the sample size. + +- **T-Test**: Typically used when the sample size is small (less than 30). In small samples, the variability is higher, and the t-distribution accounts for this extra uncertainty by having fatter tails than the normal distribution. + +- **Z-Test**: Applied when the sample size is larger (greater than or equal to 30). In large samples, the sample mean is likely to be normally distributed due to the **Central Limit Theorem**, which states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases. + +### 3.2 Variance Assumptions + +Another key distinction between the two tests is related to the availability of population variance data. + +- **T-Test**: Used when the population variance is unknown. In this case, the sample variance is used as an estimate for the population variance, which introduces more uncertainty and requires the use of the t-distribution. + +- **Z-Test**: Requires that the population variance is known. Because the variance does not have to be estimated from the sample, the test statistic carries no extra estimation uncertainty and follows the standard normal (z) distribution. + +### 3.3 Normality of the Data + +The assumptions about the distribution of the underlying data are another differentiating factor. + +- **T-Test**: Does not require the data to be perfectly normally distributed, especially in larger samples. The t-distribution approaches the normal distribution as the sample size increases, making the t-test robust even for moderately non-normal data when $$ n \geq 30 $$. + +- **Z-Test**: Assumes that the data is normally distributed. If the sample size is large enough, the z-test can still be applied to non-normally distributed data due to the Central Limit Theorem, but for smaller samples, this assumption is critical. + +In practice, the t-test is more flexible and widely applicable because it does not require prior knowledge of population variance and can handle smaller sample sizes. + +## 4. When to Use the T-Test + +### 4.1 One-Sample T-Test + +The one-sample t-test is used to determine if the mean of a single sample differs significantly from a known population mean. For example, a company might want to test whether the average time taken to resolve customer complaints differs from the industry standard of 30 minutes. + +#### Hypothesis: + +- **Null Hypothesis ($$H_0$$)**: The sample mean is equal to the population mean ($$ \mu = \mu_0 $$). +- **Alternative Hypothesis ($$H_a$$)**: The sample mean is different from the population mean ($$ \mu \neq \mu_0 $$). 
+ +The test statistic for the one-sample t-test is calculated as: + +$$ +t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} +$$ + +Where: + +- $$ \bar{x} $$ = sample mean +- $$ \mu_0 $$ = population mean +- $$ s $$ = sample standard deviation +- $$ n $$ = sample size + +**Application**: This test is commonly used in quality control, where a product's measured characteristics (e.g., weight or size) are compared to a set standard. + +### 4.2 Two-Sample T-Test (Independent T-Test) + +The two-sample t-test is used to compare the means of two independent groups to see if there is a significant difference between them. For example, a researcher may want to test whether two different teaching methods result in different average test scores. + +#### Hypothesis: + +- **Null Hypothesis ($$H_0$$)**: The means of the two groups are equal ($$ \mu_1 = \mu_2 $$). +- **Alternative Hypothesis ($$H_a$$)**: The means of the two groups are not equal ($$ \mu_1 \neq \mu_2 $$). + +The test statistic is given by: + +$$ +t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} +$$ + +Where: + +- $$ \bar{x}_1, \bar{x}_2 $$ = means of the two samples +- $$ s_1^2, s_2^2 $$ = variances of the two samples +- $$ n_1, n_2 $$ = sizes of the two samples + +**Application**: Two-sample t-tests are widely used in A/B testing in marketing and product development to compare the effectiveness of two strategies or designs. + +### 4.3 Paired T-Test + +The paired t-test is used when there are two measurements from the same group, such as before and after treatment. This test evaluates whether there is a significant difference between these two related samples. + +#### Hypothesis: + +- **Null Hypothesis ($$H_0$$)**: The mean difference between the paired observations is zero. +- **Alternative Hypothesis ($$H_a$$)**: The mean difference between the paired observations is not zero. + +The test statistic is calculated as: + +$$ +t = \frac{\bar{d}}{s_d/\sqrt{n}} +$$ + +Where: + +- $$ \bar{d} $$ = mean of the differences between the paired observations +- $$ s_d $$ = standard deviation of the differences +- $$ n $$ = number of pairs + +**Application**: The paired t-test is frequently used in clinical trials to compare pre-treatment and post-treatment results for the same subjects. + +## 5. When to Use the Z-Test + +### 5.1 One-Sample Z-Test + +The one-sample z-test is used to compare the mean of a single sample to a known population mean when the population variance is known and the sample size is large. For example, a bank might want to test whether the average monthly spending of their credit card users differs from the national average. + +#### Hypothesis: + +- **Null Hypothesis ($$H_0$$)**: The sample mean is equal to the population mean ($$ \mu = \mu_0 $$). +- **Alternative Hypothesis ($$H_a$$)**: The sample mean is different from the population mean ($$ \mu \neq \mu_0 $$). + +The test statistic is: + +$$ +z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} +$$ + +Where: + +- $$ \sigma $$ = population standard deviation + +**Application**: One-sample z-tests are commonly used in quality assurance to compare sample data to a known standard. + +### 5.2 Two-Sample Z-Test + +The two-sample z-test is used to determine whether the means of two independent groups differ significantly when the population variance is known and the sample sizes are large. For example, a telecommunications company may want to compare the average call duration between two regions to determine if there is a significant difference. 
+ +#### Hypothesis: + +- **Null Hypothesis ($$H_0$$)**: The means of the two groups are equal ($$ \mu_1 = \mu_2 $$). +- **Alternative Hypothesis ($$H_a$$)**: The means of the two groups are not equal ($$ \mu_1 \neq \mu_2 $$). + +The test statistic is: + +$$ +z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} +$$ + +Where: + +- $$ \sigma_1^2, \sigma_2^2 $$ = population variances of the two groups + +**Application**: Two-sample z-tests are often used in economics and business to compare performance metrics across different groups or regions. + +## 6. Real-World Applications of T-Tests and Z-Tests + +Both t-tests and z-tests are essential tools for making data-driven decisions in various fields. Here are some real-world applications: + +### 6.1 T-Test Applications + +- **Healthcare**: T-tests are widely used in clinical trials to compare the effectiveness of new treatments with existing ones, especially when sample sizes are small or variances are unknown. + +- **Education**: T-tests are employed to compare student performance across different teaching methods or curricula. + +- **Marketing**: A/B testing, which compares two versions of a webpage or marketing campaign, often relies on t-tests to assess which version performs better. + +### 6.2 Z-Test Applications + +- **Finance**: Z-tests are used to compare the mean returns of different investment portfolios or to assess whether a sample of financial returns differs from a known population average. + +- **Manufacturing**: Z-tests are commonly applied in quality control when comparing the mean characteristics of products (e.g., weight or dimensions) against a standard, especially when large sample sizes and known variances are involved. + +- **Public Health**: Z-tests are used to compare the prevalence of diseases or health outcomes across different populations, where large datasets and known population parameters are available. + +## 7. Conclusion + +T-tests and z-tests are fundamental statistical tools used in hypothesis testing to determine whether differences between groups are statistically significant. While both tests serve similar purposes, their appropriate use depends on factors like sample size, the availability of population variance, and the normality of the data. + +- The **t-test** is more flexible and widely used when dealing with small sample sizes or unknown variances. +- The **z-test** is ideal for larger samples where the population variance is known, and the data is normally distributed. + +By understanding the differences between these tests and knowing when to use each, researchers and analysts can make more accurate and informed decisions in a wide range of applications, from clinical trials and marketing experiments to finance and public health studies. 
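As a closing illustration, the sketch below runs the one-sample version of each test on synthetic data, echoing the complaint-resolution example from Section 4.1. The sample sizes and distribution parameters are invented, and note one practical caveat: `statsmodels`' `ztest` helper estimates the standard deviation from the sample rather than using a known population value.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.weightstats import ztest

rng = np.random.default_rng(42)

# small sample, variance unknown -> one-sample t-test
small = rng.normal(loc=32.0, scale=6.0, size=12)   # resolution times (minutes)
t_stat, t_p = stats.ttest_1samp(small, popmean=30.0)

# large sample -> z-test
large = rng.normal(loc=31.0, scale=6.0, size=500)
z_stat, z_p = ztest(large, value=30.0)

print(f"t-test (n=12):  t = {t_stat:.2f}, p = {t_p:.3f}")
print(f"z-test (n=500): z = {z_stat:.2f}, p = {z_p:.3f}")
```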
+ +--- diff --git a/assets/images/data_science_15.jpg b/assets/images/data_science_15.jpg new file mode 100644 index 00000000..06bbc12f Binary files /dev/null and b/assets/images/data_science_15.jpg differ diff --git a/assets/images/data_science_16.jpg b/assets/images/data_science_16.jpg new file mode 100644 index 00000000..4b7fac64 Binary files /dev/null and b/assets/images/data_science_16.jpg differ diff --git a/assets/images/data_science_17.jpg b/assets/images/data_science_17.jpg new file mode 100644 index 00000000..dd5dc3d3 Binary files /dev/null and b/assets/images/data_science_17.jpg differ diff --git a/assets/images/data_science_18.jpg b/assets/images/data_science_18.jpg new file mode 100644 index 00000000..383c13a0 Binary files /dev/null and b/assets/images/data_science_18.jpg differ diff --git a/assets/images/data_science_19.jpg b/assets/images/data_science_19.jpg new file mode 100644 index 00000000..0de6ed29 Binary files /dev/null and b/assets/images/data_science_19.jpg differ diff --git a/assets/images/data_science_20.jpg b/assets/images/data_science_20.jpg new file mode 100644 index 00000000..8e80e76b Binary files /dev/null and b/assets/images/data_science_20.jpg differ diff --git a/fix_frontmatter.py b/fix_frontmatter.py index 411a6aae..eca14cf1 100644 --- a/fix_frontmatter.py +++ b/fix_frontmatter.py @@ -3,7 +3,7 @@ import frontmatter import random -TOTAL_FILES = 14 +TOTAL_FILES = 20 def extract_date_from_filename(filename): # Assuming the filename format is 'YYYY-MM-DD-some-title.md'