_posts/2025-02-02-time_series_forecasting_sarima_seasonal_arima_explained.md (11 additions, 11 deletions)
@@ -80,9 +80,9 @@ $$
 
 Where:
 
-- \( p \): Number of autoregressive terms
-- \( d \): Number of differencing operations
-- \( q \): Number of moving average terms
+- $$ p $$: Number of autoregressive terms
+- $$ d $$: Number of differencing operations
+- $$ q $$: Number of moving average terms
 
 While ARIMA works well for many datasets, it does not explicitly model **seasonal structure**. For example, monthly sales data may show a 12-month cycle, which ARIMA cannot capture directly.
@@ -96,9 +96,9 @@ $$
 
 Where:
 
-- \( p, d, q \): Non-seasonal ARIMA parameters
-- \( P, D, Q \): Seasonal AR, differencing, and MA orders
-- \( s \): Seasonality period (e.g., 12 for monthly data with yearly seasonality)
+- $$ p, d, q $$: Non-seasonal ARIMA parameters
+- $$ P, D, Q $$: Seasonal AR, differencing, and MA orders
+- $$ s $$: Seasonality period (e.g., 12 for monthly data with yearly seasonality)
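
As an editor-added illustration of these orders (not part of the patch above), a minimal SARIMA fit with statsmodels' `SARIMAX` might look like the sketch below; the toy series and the chosen `(p, d, q)(P, D, Q, s)` values are placeholders, not recommendations.

```python
# Minimal SARIMA sketch: fit a seasonal ARIMA to a toy monthly series.
# The data and the orders are placeholders chosen only for illustration.
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Hypothetical monthly series with a yearly cycle (s = 12).
y = pd.Series(
    [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118] * 4,
    index=pd.date_range("2020-01-01", periods=48, freq="MS"),
)

# order=(p, d, q) are the non-seasonal terms; seasonal_order=(P, D, Q, s).
model = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
result = model.fit(disp=False)

print(result.summary())
print(result.forecast(steps=12))  # 12-month-ahead forecast
```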
_posts/2025-05-01-agentbased_models_abm_macroeconomics_mathematical_perspective.md (8 additions, 8 deletions)
@@ -62,29 +62,29 @@ In macroeconomics, ABMs can simulate the evolution of the economy through the in
 
 Although agent-based models are primarily computational, they rest on well-defined mathematical components. A typical ABM can be formalized as a discrete-time dynamical system:
 
-Let the system state at time \( t \) be denoted as:
+Let the system state at time $$ t $$ be denoted as:
 
 $$
 S_t = \{a_{1,t}, a_{2,t}, ..., a_{N,t}\}
 $$
 
-where \( a_{i,t} \) represents the state of agent \( i \) at time \( t \), and \( N \) is the total number of agents.
+where $$ a_{i,t} $$ represents the state of agent $$ i $$ at time $$ t $$, and $$ N $$ is the total number of agents.
 
 ### 1. **Agent State and Behavior Functions**
 
 Each agent has:
 
-- A **state vector** \( a_{i,t} \in \mathbb{R}^k \) representing variables such as wealth, consumption, productivity, etc.
-- A **decision function** \( f_i: S_t \rightarrow \mathbb{R}^k \) that determines how the agent updates its state:
+- A **state vector** $$ a_{i,t} \in \mathbb{R}^k $$ representing variables such as wealth, consumption, productivity, etc.
+- A **decision function** $$ f_i: S_t \rightarrow \mathbb{R}^k $$ that determines how the agent updates its state:
-- \( \mathcal{E}_t \) is the macro environment (e.g., interest rates, inflation)
-- \( \mathcal{I}_{i,t} \) is local information accessible to the agent
+- $$ \mathcal{E}_t $$ is the macro environment (e.g., interest rates, inflation)
+- $$ \mathcal{I}_{i,t} $$ is local information accessible to the agent
 
 ### 2. **Interaction Structure**
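
To make the formal ingredients above concrete, here is a small, editor-added Python sketch of an agent with a state vector and a decision function; the fields, the macro-environment keys, and the update rule are illustrative assumptions, not the post's model.

```python
# Editor-added sketch (not the post's model): one possible encoding of an agent
# state vector a_{i,t} and decision function f_i. Fields, environment keys, and
# the update rule are illustrative assumptions.
from dataclasses import dataclass

import numpy as np


@dataclass
class Agent:
    wealth: float
    consumption: float
    productivity: float

    def state(self) -> np.ndarray:
        """State vector a_{i,t} in R^3."""
        return np.array([self.wealth, self.consumption, self.productivity])

    def decide(self, env: dict, info: dict) -> "Agent":
        """Decision function f_i: return the agent's state at t+1."""
        income = self.productivity * env["wage"]
        consumption = 0.8 * income + 0.02 * self.wealth          # toy consumption rule
        wealth = (1 + env["interest_rate"]) * self.wealth + income - consumption
        productivity = self.productivity * (1 + info.get("local_shock", 0.0))
        return Agent(wealth, consumption, productivity)


# Example: one update for a single agent under a hypothetical macro environment.
a_next = Agent(10.0, 1.0, 1.05).decide(
    env={"wage": 1.0, "interest_rate": 0.05},
    info={"local_shock": 0.01},
)
print(a_next.state())
```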
@@ -94,7 +94,7 @@ Agents may interact through a **network topology**, such as:
 - Small-world or scale-free networks
 - Spatial lattices
 
-These interactions define information flow and market exchanges. Let \( G = (V, E) \) be a graph with nodes \( V \) representing agents and edges \( E \) representing communication or trade links.
+These interactions define information flow and market exchanges. Let $$ G = (V, E) $$ be a graph with nodes $$ V $$ representing agents and edges $$ E $$ representing communication or trade links.
 
 ### 3. **Environment and Aggregation**
@@ -104,7 +104,7 @@ $$
 \mathcal{E}_{t+1} = g(S_t)
 $$
 
-Where \( g \) is a function that computes macro variables (e.g., GDP, inflation, aggregate demand) from the microstate \( S_t \). This allows for **micro-to-macro feedback loops**.
+Where $$ g $$ is a function that computes macro variables (e.g., GDP, inflation, aggregate demand) from the microstate $$ S_t $$. This allows for **micro-to-macro feedback loops**.
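
Continuing the illustration, a self-contained toy simulation of the micro-to-macro loop $$ \mathcal{E}_{t+1} = g(S_t) $$ might look like the following; every numeric rule here is an assumption made for the sketch, not something taken from the post.

```python
# Toy ABM loop: S is the microstate (one row per agent: wealth, consumption,
# productivity); g(S) aggregates it into the macro environment. All behavioral
# and aggregation rules below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N, T = 200, 50

S = np.column_stack([
    np.full(N, 10.0),                 # wealth
    np.full(N, 1.0),                  # consumption
    rng.uniform(0.8, 1.2, size=N),    # productivity
])
env = {"wage": 1.0, "interest_rate": 0.05}            # E_0


def g(state: np.ndarray) -> dict:
    """Aggregation E_{t+1} = g(S_t): macro variables from the microstate."""
    return {
        "wage": 1.0 + 0.1 * state[:, 1].mean(),
        "interest_rate": max(0.0, 0.05 - 1e-4 * state[:, 0].mean()),
        "gdp": state[:, 1].sum(),
    }


for t in range(T):
    env_next = g(S)                                    # E_{t+1} from the current S_t
    wealth, consumption, productivity = S.T
    shock = rng.normal(0.0, 0.01, size=N)              # local information I_{i,t}
    income = productivity * env["wage"]                # decisions use E_t
    consumption = 0.8 * income + 0.02 * wealth
    wealth = (1 + env["interest_rate"]) * wealth + income - consumption
    productivity = productivity * (1 + shock)
    S = np.column_stack([wealth, consumption, productivity])
    env = env_next

print("final GDP (toy units):", round(g(S)["gdp"], 2))
```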
_posts/2025-06-07-why_math_statistics_foundations_data_science.md (51 additions, 13 deletions)
@@ -34,28 +34,66 @@ tags:
 title: Why Data Scientists Need Math and Statistics
 ---
 
-A common misconception is that data science is mostly about applying libraries and frameworks. While tools are helpful, they cannot replace a solid understanding of **mathematics** and **statistics**. These disciplines provide the language and theory that power every algorithm behind the scenes.
+It’s tempting to think that mastering a handful of libraries—pandas, Scikit-Learn, TensorFlow—is the fast track to data science success. Yet tools are abstractions built atop deep mathematical and statistical theory. Without understanding **why** an algorithm works—its assumptions, convergence guarantees, or failure modes—practitioners risk producing brittle models and misinterpreting outputs. Libraries accelerate development, but the true power of data science lies in the ability to reason about algorithms at a theoretical level.
 
-## The Role of Mathematics
+## 2. Mathematical Foundations: Linear Algebra and Calculus
 
-At the core of many machine learning algorithms are mathematical concepts such as **linear algebra** and **calculus**. Linear algebra explains how models handle vectors and matrices, enabling operations like matrix decomposition and gradient calculations. Calculus is vital for understanding optimization techniques that drive model training. Without these foundations, it is difficult to grasp how algorithms converge or why they sometimes fail to do so.
+At the heart of many predictive models are operations on vectors and matrices. Consider a data matrix $\mathbf{X}\in\mathbb{R}^{n\times p}$: understanding its **singular value decomposition**
+$$
+\mathbf{X} = U\,\Sigma\,V^\top
+$$
+reveals principal directions of variance, which underpin techniques like Principal Component Analysis. Eigenvalues and eigenvectors provide insight into covariance structure, guiding feature extraction and dimensionality reduction.
 
-## Why Statistics Matters
+Calculus provides the language of change, enabling optimization of complex loss functions. Gradient-based methods update parameters $\theta$ via
+$$
+\theta \leftarrow \theta - \eta\,\nabla_\theta L(\theta),
+$$
+where $\eta$ is the learning rate and $\nabla_\theta L$ the gradient of the loss. Delving into second-order information—the Hessian matrix $H = \nabla^2_\theta L$—explains curvature and motivates algorithms like Newton’s method or quasi-Newton schemes (e.g., BFGS). These concepts illuminate why some problems converge slowly, why learning rates must be tuned, and how saddle points impede optimization.
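
As an editor-added aside (not part of the patch), the two ideas in this added section—principal directions via SVD and the gradient-descent update—can be exercised in a few lines of NumPy; the data matrix and the quadratic loss are made-up examples.

```python
# Minimal sketch: (1) SVD of a centered data matrix to get principal directions,
# (2) gradient descent on a simple quadratic loss. Data and loss are toy examples.
import numpy as np

rng = np.random.default_rng(42)

# --- principal directions via SVD of the centered data matrix ---
X = rng.normal(size=(100, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [0.0, 1.0, 0.0],
                                          [0.0, 0.0, 0.1]])
Xc = X - X.mean(axis=0)                       # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
print("singular values:", np.round(S, 2))     # variance concentrates in the first axes
print("first principal direction:", np.round(Vt[0], 3))

# --- gradient descent: theta <- theta - eta * grad L(theta) on a quadratic loss ---
A = np.array([[3.0, 1.0], [1.0, 2.0]])        # symmetric positive definite
b = np.array([1.0, -1.0])

def grad(theta):
    return A @ theta - b                      # gradient of 0.5*theta^T A theta - b^T theta

theta, eta = np.zeros(2), 0.1
for _ in range(200):
    theta = theta - eta * grad(theta)

print("gradient descent:", np.round(theta, 6))
print("closed form     :", np.round(np.linalg.solve(A, b), 6))
```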
-Statistics helps data scientists quantify uncertainty, draw reliable conclusions, and validate models. Techniques like **hypothesis testing**, **confidence intervals**, and **probability distributions** reveal whether observed patterns are significant or simply random noise. Lacking statistical insight can lead to overfitting or underestimating model errors.
+## 3. Statistical Principles: Inference, Uncertainty, and Validation
 
-## Understanding Algorithms Beyond Code
+Data science inevitably grapples with uncertainty. Statistics offers the framework to quantify and manage it. A common task is estimating the mean of a population from a sample of size $n$. The **confidence interval** for a normally distributed estimator $\hat\mu$ with known variance $\sigma^2$ is
+$$
+\hat\mu \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},
+$$
+where $z_{\alpha/2}$ corresponds to the desired coverage probability (e.g., $1.96$ for 95%). Hypothesis testing formalizes decision-making: by computing a $p$-value, one assesses the probability of observing data at least as extreme as the sample under a null hypothesis.
 
-Popular algorithms—such as decision trees, regression models, and neural networks—are built on mathematical principles. Knowing the theory behind them clarifies their assumptions and limitations. Blindly applying a model without understanding its mechanics can produce misleading results, especially when the data violates those assumptions.
+Probability distributions—Bernoulli, Poisson, Gaussian—model data generation processes and inform likelihood-based methods. Maximum likelihood estimation (MLE) chooses parameters $\theta$ to maximize
+$$
+L(\theta) = \prod_{i=1}^{n} p(x_i \mid \theta),
+$$
+and its logarithm simplifies optimization to summing log-likelihoods. Statistical rigor guards against overfitting, data dredging, and false discoveries, ensuring that observed patterns reflect genuine signals rather than random noise.
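
Again as an illustrative, editor-added aside, the confidence interval and the maximum-likelihood idea above can be checked numerically; the sample is simulated and SciPy is assumed to be available.

```python
# Illustrative sketch: a 95% confidence interval for a mean (known sigma) and a
# maximum-likelihood fit of a Gaussian mean via the log-likelihood. Data are simulated.
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(0)
sigma, n = 2.0, 100
x = rng.normal(loc=5.0, scale=sigma, size=n)       # sample of size n

# --- confidence interval: mu_hat +/- z_{alpha/2} * sigma / sqrt(n) ---
mu_hat = x.mean()
z = stats.norm.ppf(0.975)                          # ~1.96 for 95% coverage
half_width = z * sigma / np.sqrt(n)
print(f"95% CI: [{mu_hat - half_width:.3f}, {mu_hat + half_width:.3f}]")

# --- MLE: maximize the log-likelihood (minimize its negative) ---
def neg_log_likelihood(mu):
    return -np.sum(stats.norm.logpdf(x, loc=mu, scale=sigma))

result = optimize.minimize_scalar(neg_log_likelihood, bounds=(0, 10), method="bounded")
print("MLE of mu:", result.x)                      # matches the sample mean
```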
-## The Pitfalls of Ignoring Theory
+## 4. Theory in Action: Demystifying Algorithms
 
-When the underlying mathematics is ignored, it becomes challenging to debug models, tune hyperparameters, or interpret outcomes. Relying solely on automated tools may produce working code, but it often masks fundamental issues like data leakage, improper scaling, or incorrect loss functions. These mistakes can have severe consequences in real-world applications.
+Every algorithm embodies mathematical and statistical choices. A **linear regression** model
+$$
+\hat y = X\beta + \varepsilon
+$$
+assumes that residuals $\varepsilon$ are independent, zero-mean, and homoscedastic. Violations—such as autocorrelation or heteroscedasticity—invalidate inference unless addressed. **Decision trees** rely on information-theoretic splits, measuring impurity via entropy
+$$
+H(S) = -\sum_{k} p_k \log p_k,
+$$
+and choosing splits that maximize information gain. **Neural networks** approximate arbitrary functions by composing affine transformations and nonlinear activations, with backpropagation systematically computing gradients via the chain rule.
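
A small editor-added sketch of the entropy criterion above, computing $H(S)$ and the information gain of one candidate split on made-up labels:

```python
# Illustrative sketch: entropy H(S) = -sum_k p_k log2 p_k and the information
# gain of a candidate split. Class labels and the split are toy examples.
import numpy as np

def entropy(labels: np.ndarray) -> float:
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])         # parent node labels
left, right = y[:4], y[4:]                            # a candidate split

parent_h = entropy(y)
child_h = (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
print("parent entropy   :", round(parent_h, 3))
print("information gain :", round(parent_h - child_h, 3))
```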
-## Building a Strong Foundation
+Understanding these mechanics clarifies why certain models excel on specific data types and fail on others. It empowers practitioners to select or adapt algorithms—pruning trees to prevent overfitting, regularizing regression with an $L_1$ penalty to induce sparsity, or choosing appropriate activation functions to avoid vanishing gradients.
 
-Learning the basics of calculus, linear algebra, and statistics does not require becoming a mathematician. However, dedicating time to these topics builds intuition about how models work. This deeper knowledge empowers data scientists to select appropriate algorithms, customize them for specific problems, and communicate results effectively.
+## 5. Common Errors from Theoretical Gaps
 
-## Conclusion
+Ignoring foundational theory leads to familiar pitfalls. Failing to standardize features in gradient-based models can cause one dimension to dominate updates, slowing convergence. Overlooking multicollinearity in regression inflates variance of coefficient estimates, making interpretation meaningless. Misapplying hypothesis tests without correcting for multiple comparisons increases false positive rates. Blind reliance on automated pipelines may conceal data leakage—where test information inadvertently influences training—resulting in overly optimistic performance estimates.
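
One of the pitfalls above—leakage through preprocessing—can be illustrated with a scikit-learn pipeline that fits the scaler on the training split only; this is an editor-added sketch on synthetic data, and the model choice is arbitrary.

```python
# Illustrative sketch: avoid data leakage by fitting preprocessing on the
# training data only (here via a Pipeline). The dataset is synthetic.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The scaler is fit only on X_train inside the pipeline, so no statistics from
# the test set influence training; scaling also keeps gradient-based updates
# from being dominated by one feature's range.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```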
-Data science thrives on a solid grounding in mathematics and statistics. Understanding the theory behind algorithms not only improves model performance but also safeguards against hidden errors. Investing in these fundamentals is essential for anyone aspiring to be a competent data scientist.
+Building fluency in mathematics and statistics need not be daunting. Effective approaches include:
+
+- **Structured Coursework**: Enroll in linear algebra and real analysis to master vector spaces, eigenvalues, and limits.
+- **Applied Exercises**: Derive gradient descent updates by hand for simple models, then verify them in code.
+- **Textbook Deep Dives**: Study “Linear Algebra and Its Applications” (Strang) and “Statistical Inference” (Casella & Berger) for rigorous yet accessible treatments.
+- **Algorithm Implementations**: Recreate k-means clustering, logistic regression, or principal component analysis from first principles to internalize assumptions.
+- **Peer Discussions**: Teach core concepts—Bayes’ theorem, eigen decomposition—to colleagues or study groups, reinforcing understanding through explanation.
+
+These practices foster the intuition that transforms abstract symbols into actionable insights.
+
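
The “Algorithm Implementations” suggestion above can be taken literally; as one editor-added example, here is principal component analysis rebuilt from the covariance eigendecomposition and cross-checked against the SVD, on synthetic data.

```python
# Illustrative from-first-principles PCA: eigendecomposition of the sample
# covariance matrix, cross-checked against the SVD route. Data are synthetic.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))   # correlated features

Xc = X - X.mean(axis=0)                       # center
cov = Xc.T @ Xc / (len(Xc) - 1)               # sample covariance
eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]             # sort descending
components = eigvecs[:, order].T              # principal axes (one per row)

explained = eigvals[order] / eigvals.sum()
print("explained variance ratio:", np.round(explained, 3))

# Cross-check: right singular vectors of the centered data span the same axes.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
print("axes agree up to sign:",
      np.allclose(np.abs(components), np.abs(Vt), atol=1e-6))
```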
+## 7. Embracing Theory for Sustainable Data Science
+
+A robust grounding in mathematics and statistics elevates data science from a toolkit of shortcuts to a discipline of informed reasoning. When practitioners grasp the language of vectors, gradients, probabilities, and tests, they become adept at diagnosing model behavior, innovating new methods, and communicating results with credibility. Investing time in these core disciplines yields dividends: faster debugging, more reliable models, and the ability to adapt as algorithms and data evolve. In the evolving landscape of data science, theory remains the constant that empowers us to turn data into dependable knowledge.