
Commit 507089a

fix: fix latex formulas
1 parent 756d93a commit 507089a

10 files changed: +296 -141 lines changed

_posts/2025-01-31-nonlinear_growth_models_macroeconomics.md

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ $$
 \dot{A} = \phi A^\beta L_A
 $$
 
-Where \( \beta > 1 \) leads to accelerating technological growth, while \( \beta < 1 \) introduces convergence or stagnation risks.
+Where $$ \beta > 1 $$ leads to accelerating technological growth, while $$ \beta < 1 $$ introduces convergence or stagnation risks.
 
 ---
 author_profile: false
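
The hunk above only restates the knowledge-production equation. As a quick check of the claim about $$ \beta $$, here is a minimal sketch (not part of the commit; all parameter values are illustrative) that integrates $$ \dot{A} = \phi A^\beta L_A $$ with an Euler step:

```python
# Illustrative only: Euler integration of Adot = phi * A**beta * L_A,
# contrasting beta < 1 (decelerating growth) with beta > 1 (accelerating growth).
import numpy as np

def simulate(beta, phi=0.02, L_A=1.0, A0=1.0, dt=0.1, steps=800):
    A = np.empty(steps)
    A[0] = A0
    for t in range(1, steps):
        A[t] = A[t - 1] + dt * phi * A[t - 1] ** beta * L_A  # explicit Euler step
    return A

for beta in (0.5, 1.0, 1.5):
    print(f"beta = {beta}: A at the final step = {simulate(beta)[-1]:.3f}")
```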

_posts/2025-02-02-time_series_forecasting_sarima_seasonal_arima_explained.md

Lines changed: 11 additions & 11 deletions
@@ -80,9 +80,9 @@ $$
 
 Where:
 
-- \( p \): Number of autoregressive terms
-- \( d \): Number of differencing operations
-- \( q \): Number of moving average terms
+- $$ p $$: Number of autoregressive terms
+- $$ d $$: Number of differencing operations
+- $$ q $$: Number of moving average terms
 
 While ARIMA works well for many datasets, it does not explicitly model **seasonal structure**. For example, monthly sales data may show a 12-month cycle, which ARIMA cannot capture directly.
 
@@ -96,9 +96,9 @@ $$
 
 Where:
 
-- \( p, d, q \): Non-seasonal ARIMA parameters
-- \( P, D, Q \): Seasonal AR, differencing, and MA orders
-- \( s \): Seasonality period (e.g., 12 for monthly data with yearly seasonality)
+- $$ p, d, q $$: Non-seasonal ARIMA parameters
+- $$ P, D, Q $$: Seasonal AR, differencing, and MA orders
+- $$ s $$: Seasonality period (e.g., 12 for monthly data with yearly seasonality)
 
 For example:
 
@@ -128,24 +128,24 @@ $$
 \Phi(B^s) \phi(B) (1 - B)^d (1 - B^s)^D y_t = \Theta(B^s) \theta(B) \varepsilon_t
 $$
 
-Where \( \varepsilon_t \) is white noise.
+Where $$ \varepsilon_t $$ is white noise.
 
 ## 5. Parameter Selection: Seasonal and Non-Seasonal
 
-### Step 1: Seasonal Period \( s \)
+### Step 1: Seasonal Period $$ s $$
 
 Choose based on frequency (e.g., 12 for monthly).
 
-### Step 2: Differencing \( d \), \( D \)
+### Step 2: Differencing $$ d $$, $$ D $$
 
 Use plots and ADF tests to determine.
 
 ### Step 3: AR/MA Orders
 
 Use ACF and PACF plots to estimate:
 
-- \( p, q \) for non-seasonal
-- \( P, Q \) for seasonal
+- $$ p, q $$ for non-seasonal
+- $$ P, Q $$ for seasonal
 
 ### Step 4: Use Auto ARIMA (Python)
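
For readers of this diff, the orders described in these hunks map one-to-one onto a SARIMA fit. A minimal sketch (not part of the commit; the file name and the specific orders are illustrative assumptions, and the post's Step 4 presumably uses an auto-ARIMA search rather than fixed orders):

```python
# Illustrative only: fitting a SARIMA(p, d, q)(P, D, Q)_s model with statsmodels.
# The CSV path and the orders (1, 1, 1)(1, 1, 1, 12) are assumptions, not taken
# from the commit; s = 12 matches the monthly example in the post.
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

y = pd.read_csv("monthly_sales.csv", index_col=0, parse_dates=True).squeeze()

model = SARIMAX(
    y,
    order=(1, 1, 1),               # non-seasonal (p, d, q)
    seasonal_order=(1, 1, 1, 12),  # seasonal (P, D, Q, s)
)
result = model.fit(disp=False)
print(result.summary())
```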

_posts/2025-05-01-agentbased_models_abm_macroeconomics_mathematical_perspective.md

Lines changed: 8 additions & 8 deletions
@@ -62,29 +62,29 @@ In macroeconomics, ABMs can simulate the evolution of the economy through the in
 
 Although agent-based models are primarily computational, they rest on well-defined mathematical components. A typical ABM can be formalized as a discrete-time dynamical system:
 
-Let the system state at time \( t \) be denoted as:
+Let the system state at time $$ t $$ be denoted as:
 
 $$
 S_t = \{a_{1,t}, a_{2,t}, ..., a_{N,t}\}
 $$
 
-where \( a_{i,t} \) represents the state of agent \( i \) at time \( t \), and \( N \) is the total number of agents.
+where $$ a_{i,t} $$ represents the state of agent $$ i $$ at time $$ t $$, and $$ N $$ is the total number of agents.
 
 ### 1. **Agent State and Behavior Functions**
 
 Each agent has:
 
-- A **state vector** \( a_{i,t} \in \mathbb{R}^k \) representing variables such as wealth, consumption, productivity, etc.
-- A **decision function** \( f_i: S_t \rightarrow \mathbb{R}^k \) that determines how the agent updates its state:
+- A **state vector** $$ a_{i,t} \in \mathbb{R}^k $$ representing variables such as wealth, consumption, productivity, etc.
+- A **decision function** $$ f_i: S_t \rightarrow \mathbb{R}^k $$ that determines how the agent updates its state:
 
 $$
 a_{i,t+1} = f_i(a_{i,t}, \mathcal{E}_t, \mathcal{I}_{i,t})
 $$
 
 Where:
 
-- \( \mathcal{E}_t \) is the macro environment (e.g., interest rates, inflation)
-- \( \mathcal{I}_{i,t} \) is local information accessible to the agent
+- $$ \mathcal{E}_t $$ is the macro environment (e.g., interest rates, inflation)
+- $$ \mathcal{I}_{i,t} $$ is local information accessible to the agent
 
 ### 2. **Interaction Structure**
 
@@ -94,7 +94,7 @@ Agents may interact through a **network topology**, such as:
 - Small-world or scale-free networks
 - Spatial lattices
 
-These interactions define information flow and market exchanges. Let \( G = (V, E) \) be a graph with nodes \( V \) representing agents and edges \( E \) representing communication or trade links.
+These interactions define information flow and market exchanges. Let $$ G = (V, E) $$ be a graph with nodes $$ V $$ representing agents and edges $$ E $$ representing communication or trade links.
 
 ### 3. **Environment and Aggregation**
 
@@ -104,7 +104,7 @@ $$
 \mathcal{E}_{t+1} = g(S_t)
 $$
 
-Where \( g \) is a function that computes macro variables (e.g., GDP, inflation, aggregate demand) from the microstate \( S_t \). This allows for **micro-to-macro feedback loops**.
+Where $$ g $$ is a function that computes macro variables (e.g., GDP, inflation, aggregate demand) from the microstate $$ S_t $$. This allows for **micro-to-macro feedback loops**.
 
 ## Key Features of ABMs in Macroeconomics
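
The formalism in these hunks translates almost line for line into a simulation loop. A minimal sketch (not part of the commit; the decision rule, shock process, and aggregator below are illustrative stand-ins for $$ f_i $$, $$ \mathcal{I}_{i,t} $$, and $$ g $$):

```python
# Illustrative only: a discrete-time ABM loop with microstate S_t, per-agent
# decision function f, and macro feedback E_{t+1} = g(S_t), as formalized above.
import numpy as np

rng = np.random.default_rng(0)
N, T = 100, 50                           # number of agents, number of periods
S = rng.uniform(0.5, 1.5, size=(N, 2))   # a_{i,0}: columns are (wealth, consumption)

def g(S):
    """Aggregate the microstate into a macro variable (here simply mean wealth)."""
    return S[:, 0].mean()

def f(a_i, E, info):
    """Toy decision rule: consume a fixed share of wealth, then earn a base income
    plus a share of the macro environment scaled by local information."""
    wealth = a_i[0]
    consumption = 0.7 * wealth
    income = 0.1 + 0.5 * E * info
    return np.array([wealth - consumption + income, consumption])

for t in range(T):
    E = g(S)                              # macro environment computed from S_t
    info = rng.normal(1.0, 0.1, size=N)   # local information / idiosyncratic shocks
    S = np.array([f(S[i], E, info[i]) for i in range(N)])

print("final mean wealth:", round(g(S), 4))
```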

_posts/2025-06-07-why_math_statistics_foundations_data_science.md

Lines changed: 51 additions & 13 deletions
@@ -34,28 +34,66 @@ tags:
 title: Why Data Scientists Need Math and Statistics
 ---
 
-A common misconception is that data science is mostly about applying libraries and frameworks. While tools are helpful, they cannot replace a solid understanding of **mathematics** and **statistics**. These disciplines provide the language and theory that power every algorithm behind the scenes.
+It’s tempting to think that mastering a handful of libraries—pandas, Scikit-Learn, TensorFlow—is the fast track to data science success. Yet tools are abstractions built atop deep mathematical and statistical theory. Without understanding **why** an algorithm works—its assumptions, convergence guarantees, or failure modes—practitioners risk producing brittle models and misinterpreting outputs. Libraries accelerate development, but the true power of data science lies in the ability to reason about algorithms at a theoretical level.
 
-## The Role of Mathematics
+## 2. Mathematical Foundations: Linear Algebra and Calculus
 
-At the core of many machine learning algorithms are mathematical concepts such as **linear algebra** and **calculus**. Linear algebra explains how models handle vectors and matrices, enabling operations like matrix decomposition and gradient calculations. Calculus is vital for understanding optimization techniques that drive model training. Without these foundations, it is difficult to grasp how algorithms converge or why they sometimes fail to do so.
+At the heart of many predictive models are operations on vectors and matrices. Consider a data matrix $\mathbf{X}\in\mathbb{R}^{n\times p}$: understanding its **singular value decomposition**
+$$
+\mathbf{X} = U\,\Sigma\,V^\top
+$$
+reveals principal directions of variance, which underpin techniques like Principal Component Analysis. Eigenvalues and eigenvectors provide insight into covariance structure, guiding feature extraction and dimensionality reduction.
 
-## Why Statistics Matters
+Calculus provides the language of change, enabling optimization of complex loss functions. Gradient-based methods update parameters $\theta$ via
+$$
+\theta_{t+1} = \theta_t - \eta\,\nabla_\theta L(\theta_t),
+$$
+where $\eta$ is the learning rate and $\nabla_\theta L$ the gradient of the loss. Delving into second-order information—the Hessian matrix $H = \nabla^2_\theta L$—explains curvature and motivates algorithms like Newton’s method or quasi-Newton schemes (e.g., BFGS). These concepts illuminate why some problems converge slowly, why learning rates must be tuned, and how saddle points impede optimization.
 
-Statistics helps data scientists quantify uncertainty, draw reliable conclusions, and validate models. Techniques like **hypothesis testing**, **confidence intervals**, and **probability distributions** reveal whether observed patterns are significant or simply random noise. Lacking statistical insight can lead to overfitting or underestimating model errors.
+## 3. Statistical Principles: Inference, Uncertainty, and Validation
 
-## Understanding Algorithms Beyond Code
+Data science inevitably grapples with uncertainty. Statistics offers the framework to quantify and manage it. A common task is estimating the mean of a population from a sample of size $n$. The **confidence interval** for a normally distributed estimator $\hat\mu$ with known variance $\sigma^2$ is
+$$
+\hat\mu \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},
+$$
+where $z_{\alpha/2}$ corresponds to the desired coverage probability (e.g., $1.96$ for 95%). Hypothesis testing formalizes decision-making: by computing a $p$-value, one assesses the probability of observing data at least as extreme as the sample under a null hypothesis.
 
-Popular algorithms—such as decision trees, regression models, and neural networks—are built on mathematical principles. Knowing the theory behind them clarifies their assumptions and limitations. Blindly applying a model without understanding its mechanics can produce misleading results, especially when the data violates those assumptions.
+Probability distributions—Bernoulli, Poisson, Gaussian—model data generation processes and inform likelihood-based methods. Maximum likelihood estimation (MLE) chooses parameters $\theta$ to maximize
+$$
+\mathcal{L}(\theta) = \prod_{i=1}^n p(x_i \mid \theta),
+$$
+and its logarithm simplifies optimization to summing log-likelihoods. Statistical rigor guards against overfitting, data dredging, and false discoveries, ensuring that observed patterns reflect genuine signals rather than random noise.
 
-## The Pitfalls of Ignoring Theory
+## 4. Theory in Action: Demystifying Algorithms
 
-When the underlying mathematics is ignored, it becomes challenging to debug models, tune hyperparameters, or interpret outcomes. Relying solely on automated tools may produce working code, but it often masks fundamental issues like data leakage, improper scaling, or incorrect loss functions. These mistakes can have severe consequences in real-world applications.
+Every algorithm embodies mathematical and statistical choices. A **linear regression** model
+$$
+\hat y = X\beta + \varepsilon
+$$
+assumes that residuals $\varepsilon$ are independent, zero-mean, and homoscedastic. Violations—such as autocorrelation or heteroscedasticity—invalidate inference unless addressed. **Decision trees** rely on information‐theoretic splits, measuring impurity via entropy
+$$
+H(S) = -\sum_{k} p_k \log p_k,
+$$
+and choosing splits that maximize information gain. **Neural networks** approximate arbitrary functions by composing affine transformations and nonlinear activations, with backpropagation systematically computing gradients via the chain rule.
 
-## Building a Strong Foundation
+Understanding these mechanics clarifies why certain models excel on specific data types and fail on others. It empowers practitioners to select or adapt algorithms—pruning trees to prevent overfitting, regularizing regression with an $L_1$ penalty to induce sparsity, or choosing appropriate activation functions to avoid vanishing gradients.
 
-Learning the basics of calculus, linear algebra, and statistics does not require becoming a mathematician. However, dedicating time to these topics builds intuition about how models work. This deeper knowledge empowers data scientists to select appropriate algorithms, customize them for specific problems, and communicate results effectively.
+## 5. Common Errors from Theoretical Gaps
 
-## Conclusion
+Ignoring foundational theory leads to familiar pitfalls. Failing to standardize features in gradient‐based models can cause one dimension to dominate updates, slowing convergence. Overlooking multicollinearity in regression inflates variance of coefficient estimates, making interpretation meaningless. Misapplying hypothesis tests without correcting for multiple comparisons increases false positive rates. Blind reliance on automated pipelines may conceal data leakage—where test information inadvertently influences training—resulting in overly optimistic performance estimates.
 
-Data science thrives on a solid grounding in mathematics and statistics. Understanding the theory behind algorithms not only improves model performance but also safeguards against hidden errors. Investing in these fundamentals is essential for anyone aspiring to be a competent data scientist.
+## 6. Cultivating Analytical Intuition: Learning Strategies
+
+Building fluency in mathematics and statistics need not be daunting. Effective approaches include:
+
+- **Structured Coursework**: Enroll in linear algebra and real analysis to master vector spaces, eigenvalues, and limits.
+- **Applied Exercises**: Derive gradient descent updates by hand for simple models, then verify them in code.
+- **Textbook Deep Dives**: Study “Linear Algebra and Its Applications” (Strang) and “Statistical Inference” (Casella & Berger) for rigorous yet accessible treatments.
+- **Algorithm Implementations**: Recreate k-means clustering, logistic regression, or principal component analysis from first principles to internalize assumptions.
+- **Peer Discussions**: Teach core concepts—Bayes’ theorem, eigen decomposition—to colleagues or study groups, reinforcing understanding through explanation.
+
+These practices foster the intuition that transforms abstract symbols into actionable insights.
+
+## 7. Embracing Theory for Sustainable Data Science
+
+A robust grounding in mathematics and statistics elevates data science from a toolkit of shortcuts to a discipline of informed reasoning. When practitioners grasp the language of vectors, gradients, probabilities, and tests, they become adept at diagnosing model behavior, innovating new methods, and communicating results with credibility. Investing time in these core disciplines yields dividends: faster debugging, more reliable models, and the ability to adapt as algorithms and data evolve. In the evolving landscape of data science, theory remains the constant that empowers us to turn data into dependable knowledge.
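
As a companion to the optimization passage added in this hunk, a minimal sketch (not part of the commit; the synthetic data and learning rate are illustrative) that applies the quoted update $$ \theta_{t+1} = \theta_t - \eta\,\nabla_\theta L(\theta_t) $$ to least-squares regression:

```python
# Illustrative only: plain gradient descent on the mean-squared-error loss of a
# linear model, theta <- theta - eta * grad L(theta).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.5, -2.0, 0.5])
y = X @ true_theta + rng.normal(scale=0.1, size=200)

theta = np.zeros(3)
eta = 0.1                                           # learning rate
for _ in range(500):
    grad = (2.0 / len(y)) * X.T @ (X @ theta - y)   # gradient of the MSE loss
    theta -= eta * grad

print("estimated theta:", np.round(theta, 3))       # close to true_theta
```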
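
The entropy impurity $$ H(S) = -\sum_{k} p_k \log p_k $$ from the decision-tree passage can be checked just as briefly (not part of the commit; the labels and the split are made up for illustration):

```python
# Illustrative only: entropy impurity and the information gain of one binary split.
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

parent = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
left, right = parent[:4], parent[4:]   # a hypothetical split
gain = (entropy(parent)
        - len(left) / len(parent) * entropy(left)
        - len(right) / len(parent) * entropy(right))
print(f"H(parent) = {entropy(parent):.3f}, information gain = {gain:.3f}")
```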

_posts/2025-06-08-data_visualization_tools.md

Lines changed: 0 additions & 49 deletions
This file was deleted.
