<p class="card-img-top"><img src="posts/clt-intuitive-derivation/images/normal_distribution.webp" alt="A diagram of the Normal Distribution bell curve. The central peak is at the mean mu. The curve shows that 68% of data falls within 1 standard deviation, 95% within 2, and 99.7% within 3." style="height: 150px;" class="thumbnail-image card-img"/></p>
+
<p class="card-img-top"><img src="posts/clt-intuitive-derivation/images/normal_distribution.png" alt="A diagram of the Normal Distribution bell curve. The central peak is at the mean mu. The curve shows that 68% of data falls within 1 standard deviation, 95% within 2, and 99.7% within 3." style="height: 150px;" class="thumbnail-image card-img"/></p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
The Bell Curve rising: An Intuitive CLT Derivation
<p class="card-img-top"><img src="posts/clt-intuitive-derivation/images/normal_distribution.webp" alt="A diagram of the Normal Distribution bell curve. The central peak is at the mean mu. The curve shows that 68% of data falls within 1 standard deviation, 95% within 2, and 99.7% within 3." style="height: 150px;" class="thumbnail-image card-img"/></p>
+
<p class="card-img-top"><img src="posts/clt-intuitive-derivation/images/normal_distribution.png" alt="A diagram of the Normal Distribution bell curve. The central peak is at the mean mu. The curve shows that 68% of data falls within 1 standard deviation, 95% within 2, and 99.7% within 3." style="height: 150px;" class="thumbnail-image card-img"/></p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
The Bell Curve rising: An Intuitive CLT Derivation
<p>The <strong>Central Limit Theorem (CLT)</strong> answers an important question: Why does the <strong>bell curve</strong> (or Normal Distribution) show up everywhere in the real world?</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
-
<p><img src="images/normal_distribution.webp" class="img-fluid figure-img" alt="A diagram of the Normal Distribution bell curve. The central peak is at the mean mu. The curve shows that 68% of data falls within 1 standard deviation, 95% within 2, and 99.7% within 3."></p>
+
<p><img src="images/normal_distribution.png" class="img-fluid figure-img" alt="A diagram of the Normal Distribution bell curve. The central peak is at the mean mu. The curve shows that 68% of data falls within 1 standard deviation, 95% within 2, and 99.7% within 3."></p>
<figcaption>The Normal Distribution, showing the 68-95-99.7 rule.</figcaption>
<p>The rigorous justification for this choice lies in analyzing the magnitude of our integrand. The integral we must solve is <span class="math inline">\(\frac{1}{2\pi} \int \varphi_X(\theta)^m e^{-in\theta} d\theta\)</span>. The dominant term that dictates the behavior of the integral for large <span class="math inline">\(m\)</span> is <span class="math inline">\(\varphi_X(\theta)^m\)</span>. To analyze its magnitude, we use a key property of complex numbers: the magnitude of a power equals the power of the magnitude, i.e., <span class="math inline">\(|z^n|=|z|^n\)</span>. Therefore, the magnitude of our term is simply <span class="math inline">\(|\varphi_X(\theta)|^m\)</span>.</p>
<p>Our first step is to prove that the magnitude of the characteristic function, <span class="math inline">\(|\varphi_X(\theta)|\)</span>, is at most 1. The intuition for this comes from visualizing the sum of complex numbers as vector addition: since complex numbers add component-wise, just like vectors, the process can be seen as joining vectors tip-to-tail. The <strong>triangle inequality</strong> (<span class="math inline">\(|\sum z_i| \le \sum |z_i|\)</span>) simply states that the length of the final resulting vector can never be greater than the sum of the lengths of all the individual vectors.</p>
\]</span> Since <span class="math inline">\(|e^{ik\theta}|=1\)</span> for any real <span class="math inline">\(k\)</span> and <span class="math inline">\(\theta\)</span>, this simplifies to: <span class="math display">\[
-
||\varphi_X(\theta)| \le \sum_k p_k = 1
+
|\varphi_X(\theta)| \le \sum_k p_k = 1
\]</span> The equality <span class="math inline">\(|\varphi_X(\theta)|=1\)</span> holds if and only if all the complex numbers <span class="math inline">\(e^{ik\theta}\)</span> (for which <span class="math inline">\(p_k>0\)</span>) point in the same direction. At <span class="math inline">\(\theta=0\)</span>, every term <span class="math inline">\(e^{ik\cdot 0}\)</span> is just 1, so the terms align perfectly. For a non-trivial distribution (with at least two different outcomes) whose outcomes are not all spaced by multiples of some integer greater than 1, this alignment happens nowhere else within one period, <span class="math inline">\(\theta \in (-\pi, \pi]\)</span>: for any <span class="math inline">\(\theta \neq 0\)</span> in that interval, at least two values of <span class="math inline">\(k\)</span> give terms with different phases, so the terms are no longer perfectly aligned and the magnitude of their sum is strictly less than 1. This means that the function <span class="math inline">\(|\varphi_X(\theta)|\)</span> has a unique global maximum value of 1 on that interval, at precisely <span class="math inline">\(\theta=0\)</span>.</p>
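<p>The bound and the location of the peak can be checked numerically. The following is a minimal sketch, not part of the original derivation: the fair six-sided die, the helper name <code>char_fn</code>, and the grid size are all assumptions chosen for illustration.</p>

```python
import cmath
import math


def char_fn(pmf, theta):
    """phi_X(theta) = sum_k p_k * exp(i*k*theta) for an integer-valued X."""
    return sum(p * cmath.exp(1j * k * theta) for k, p in pmf.items())


# Hypothetical example distribution: a fair six-sided die.
die = {k: 1 / 6 for k in range(1, 7)}

# Sample |phi_X(theta)| on an evenly spaced grid over one period, (-pi, pi].
n = 2000
thetas = [-math.pi + 2 * math.pi * (j + 1) / n for j in range(n)]
mags = [abs(char_fn(die, t)) for t in thetas]

# Locate the grid point where the magnitude is largest.
peak_theta = thetas[mags.index(max(mags))]
print(f"largest |phi_X| on the grid: {max(mags):.6f} at theta = {peak_theta:.6f}")
```

<p>On this grid the magnitude never exceeds 1, and the largest value sits at the grid point closest to <span class="math inline">\(\theta=0\)</span>, which is what the triangle-inequality argument predicts.</p>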
<p>Now, consider what happens when we raise this to a large power, <span class="math inline">\(m\)</span>. The peak at <span class="math inline">\(\theta=0\)</span>, where the value is <span class="math inline">\(1^m = 1\)</span>, remains. However, for any other value of <span class="math inline">\(\theta\)</span> where <span class="math inline">\(|\varphi_X(\theta)| < 1\)</span>, the value of <span class="math inline">\(|\varphi_X(\theta)|^m\)</span> plummets towards zero exponentially fast as <span class="math inline">\(m\)</span> grows. For instance, if at some point the magnitude is 0.99, for <span class="math inline">\(m=1000\)</span> it becomes <span class="math inline">\(0.99^{1000} \approx 4 \times 10^{-5}\)</span>.</p>
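<p>The arithmetic in this example is easy to reproduce. The short sketch below tabulates <span class="math inline">\(0.99^m\)</span> for a few values of <span class="math inline">\(m\)</span>; the value 0.99 comes from the text, while the particular list of powers is an assumption chosen for illustration.</p>

```python
# Example magnitude |phi_X(theta)| at some theta != 0, taken from the text.
base = 0.99

# Powers to try; this particular list is an illustrative choice.
powers = (10, 100, 1000, 10000)

# Raise the magnitude to each power and record the result.
decay = {m: base ** m for m in powers}

for m, value in decay.items():
    print(f"m = {m:>5}: {base}^m = {value:.3e}")
```

<p>Even a magnitude only 1% below the peak is driven to a negligible size once <span class="math inline">\(m\)</span> reaches the thousands.</p>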
<p>This creates an extremely sharp “spike” in the integrand’s magnitude, centered at <span class="math inline">\(\theta=0\)</span>. The result is that the only part of the integral that contributes significantly to the final value comes from a small neighborhood around <span class="math inline">\(\theta=0\)</span>, whose width shrinks as <span class="math inline">\(m\)</span> grows (on the order of <span class="math inline">\(1/\sqrt{m}\)</span>). The contributions from all other regions are exponentially suppressed and become negligible. Therefore, to accurately approximate the integral for large <span class="math inline">\(m\)</span>, we <em>must</em> approximate the function <span class="math inline">\(g(\theta)\)</span> in this tiny, all-important region around the origin. The Taylor series is the fundamental mathematical tool for this. Expanding around <span class="math inline">\(\theta=0\)</span> is not a mere convenience; it is a mathematically necessary step.</p>
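<p>To see how concentrated the spike is, one can integrate <span class="math inline">\(|\varphi_X(\theta)|^m\)</span> numerically and measure how much of the total comes from a narrow window around the origin. This is a rough sketch under stated assumptions: the fair die, <span class="math inline">\(m=50\)</span>, the window half-width of <span class="math inline">\(5/(\sigma\sqrt{m})\)</span>, and the midpoint rule are all illustrative choices, not part of the original argument.</p>

```python
import cmath
import math


def char_fn(pmf, theta):
    """phi_X(theta) = sum_k p_k * exp(i*k*theta) for an integer-valued X."""
    return sum(p * cmath.exp(1j * k * theta) for k, p in pmf.items())


die = {k: 1 / 6 for k in range(1, 7)}  # hypothetical example distribution
m = 50  # number of summed variables, chosen for illustration

# Standard deviation of a single roll, used to scale the window.
mean = sum(k * p for k, p in die.items())
sigma = math.sqrt(sum((k - mean) ** 2 * p for k, p in die.items()))
window = 5 / (sigma * math.sqrt(m))  # assumed half-width around theta = 0

# Midpoint rule over (-pi, pi) for the integral of |phi_X(theta)|^m.
n = 20000
h = 2 * math.pi / n
total = 0.0
near_origin = 0.0
for j in range(n):
    theta = -math.pi + (j + 0.5) * h
    value = abs(char_fn(die, theta)) ** m * h
    total += value
    if abs(theta) <= window:
        near_origin += value

fraction = near_origin / total
print(f"window half-width: {window:.3f} rad")
print(f"share of the integral inside the window: {fraction:.6f}")
```

<p>For this example, a window covering only a small part of the full interval accounts for almost the entire integral, which is why the local expansion around <span class="math inline">\(\theta=0\)</span> captures the behavior that matters.</p>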