
Commit 40a89fe

Update cross lecture links to use the new website (#43)
1 parent 44ed771 commit 40a89fe

11 files changed: +27 -29 lines changed

lectures/B01 Machine Learning Overview.jl

Lines changed: 1 addition & 1 deletion
@@ -213,7 +213,7 @@ Given the state of the world (obtained from sensory data), the agent must *learn
 In contrast to supervised and unsupervised learning, an agent is able to affect its data set by making actions, e.g., a robot can change its input video data stream by turning the head of its camera.
-In this course, we focus on the active inference approach to trial design, see the [Intelligent Agent lesson](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Intelligent-Agents-and-Active-Inference.ipynb) for details.
+In this course, we focus on the active inference approach to trial design, see the [Intelligent Agent lesson](https://biaslab.github.io/BMLIP-colorized/lectures/B12%20Intelligent%20Agents%20and%20Active%20Inference.html) for details.
 """

lectures/B02 Probability Theory Review.jl

Lines changed: 1 addition & 1 deletion
@@ -1028,7 +1028,7 @@ For proof, see [https://en.wikipedia.org/wiki/Product_distribution](https://en.w
 md"""
 Generally, this integral does not lead to an analytical expression for ``p_z(z)``.
-For example, [the product of two independent variables that are both Gaussian-distributed does not lead to a Gaussian distribution](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/The-Gaussian-Distribution.ipynb#product-of-gaussians).
+For example, [the product of two independent variables that are both Gaussian-distributed does not lead to a Gaussian distribution](https://biaslab.github.io/BMLIP-colorized/lectures/B05%20The%20Gaussian%20Distribution.html#product-of-gaussians).
 * Exception: the distribution of the product of two variables that both have [log-normal distributions](https://en.wikipedia.org/wiki/Log-normal_distribution) is again a lognormal distribution. (If ``X`` has a normal distribution, then ``Y=\exp(X)`` has a log-normal distribution.)
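
Aside (not part of the commit): the claim in this hunk — that the product of two Gaussian-distributed variables is not Gaussian, while the product of two log-normal variables is again log-normal — is easy to check numerically. A minimal Julia sketch, assuming the Distributions and StatsBase packages are installed:

```julia
# Sketch (not part of the commit): Monte Carlo check of the product-distribution claim.
using Distributions, StatsBase, Random

Random.seed!(1)
N = 100_000

# Log-normal case: if X and Y are independent log-normals, log(XY) = log X + log Y
# is a sum of Gaussians, so XY is again log-normal.
x = rand(LogNormal(0.0, 0.5), N)
y = rand(LogNormal(1.0, 0.3), N)
logprod = log.(x .* y)
println("log(XY): skewness ≈ ", round(skewness(logprod), digits=3),
        ", excess kurtosis ≈ ", round(kurtosis(logprod), digits=3))   # both ≈ 0, i.e. Gaussian

# Gaussian case: the product of two independent standard normals is heavy-tailed
# (excess kurtosis ≈ 6), so it is clearly not Gaussian.
u = rand(Normal(0.0, 1.0), N)
v = rand(Normal(0.0, 1.0), N)
println("UV: excess kurtosis ≈ ", round(kurtosis(u .* v), digits=3))
```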

lectures/B03 Bayesian Machine Learning.jl

Lines changed: 2 additions & 4 deletions
@@ -465,7 +465,7 @@ This latter point accentuates that the common practice in machine learning to di
 md"""
 ## Bayesian Machine Learning and the Scientific Method Revisited
-The Bayesian design process provides a unified framework for the Scientific Inquiry method. We can now add equations to the design loop. (Trial design to be discussed in [Intelligent Agent lesson](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Intelligent-Agents-and-Active-Inference.ipynb).)
+The Bayesian design process provides a unified framework for the Scientific Inquiry method. We can now add equations to the design loop. (Trial design to be discussed in [Intelligent Agent lesson](https://biaslab.github.io/BMLIP-colorized/lectures/B12%20Intelligent%20Agents%20and%20Active%20Inference.html).)
 ![](https://github.com/bertdv/BMLIP/blob/2024_pdfs/lessons/notebooks/./figures/scientific-inquiry-loop-w-BML-eqs.png?raw=true)

@@ -891,7 +891,7 @@ end
 # ╔═╡ 6a2b1f5a-d294-11ef-25d0-e996c07958b9
 md"""
-(If the GIF animation is not rendered, you can try to [view it here](https://github.com/bertdv/BMLIP/blob/master/lessons/notebooks/Bayesian-Machine-Learning.ipynb)).
+(If the GIF animation is not rendered, you can try to [view it here](https://biaslab.github.io/BMLIP-colorized/lectures/B03%20Bayesian%20Machine%20Learning.html)).
 """

@@ -927,8 +927,6 @@ end
 # ╔═╡ 6a2b9676-d294-11ef-241a-89ff7aa676f9
 md"""
-(If the GIF animation is not rendered, you can try to [view it here](https://github.com/bertdv/BMLIP/blob/master/lessons/notebooks/Bayesian-Machine-Learning.ipynb)).
-
 Over time, the relative evidence of model ``m_1`` converges to 0. Can you explain this behavior?
 """

lectures/B04 Factor Graphs.jl

Lines changed: 1 addition & 1 deletion
@@ -630,7 +630,7 @@ Visually, the modularity of conditional independencies in the model are displaye
 Computationally, message passing-based inference uses the Distributive Law to avoid any unnecessary computations.
-What is the relevance of this lesson? RxInfer is not yet a finished project. Still, my prediction is that in 5-10 years, this lesson on Factor Graphs will be the final lecture of part-A of this class, aimed at engineers who need to develop machine learning applications. In principle you have all the tools now to work out the 4-step machine learning recipe (1. model specification, 2. parameter learning, 3. model evaluation, 4. application) that was proposed in the [Bayesian machine learning lesson](https://nbviewer.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Bayesian-Machine-Learning.ipynb#Bayesian-design). You can propose any model and execute the (learning, evaluation, and application) stages by executing the corresponding inference task automatically in RxInfer.
+What is the relevance of this lesson? RxInfer is not yet a finished project. Still, my prediction is that in 5-10 years, this lesson on Factor Graphs will be the final lecture of part-A of this class, aimed at engineers who need to develop machine learning applications. In principle you have all the tools now to work out the 4-step machine learning recipe (1. model specification, 2. parameter learning, 3. model evaluation, 4. application) that was proposed in the [Bayesian machine learning lesson](https://biaslab.github.io/BMLIP-colorized/lectures/B03%20Bayesian%20Machine%20Learning.html#Bayesian-design). You can propose any model and execute the (learning, evaluation, and application) stages by executing the corresponding inference task automatically in RxInfer.
 Part-B of this class would be about on advanced methods on how to improve automated inference by RxInfer or a similar probabilistic programming package. The Bayesian approach fully supports separating model specification from the inference task.
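
Aside (not part of the commit): the Distributive Law mentioned in this hunk is what makes message passing cheap — pushing sums inside products replaces one large joint sum with small local sums. A toy Julia sketch with hypothetical factors f and g:

```julia
# Sketch (not part of the commit): the Distributive Law behind message passing.
# Σ_x Σ_y f(x)·g(y) costs |X|·|Y| multiplications when done naively, but only
# |X| + |Y| additions and one multiplication when factored as (Σ_x f(x))·(Σ_y g(y)).
f(x) = exp(-0.5 * x^2)        # hypothetical local factor
g(y) = 1 / (1 + y^2)          # hypothetical local factor

X, Y = 1:1000, 1:1000

naive    = sum(f(x) * g(y) for x in X, y in Y)   # ~10^6 terms
factored = sum(f, X) * sum(g, Y)                 # ~2·10^3 terms

println(naive ≈ factored)     # true — same result, far less work
```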

lectures/B05 The Gaussian Distribution.jl

Lines changed: 2 additions & 2 deletions
@@ -97,7 +97,7 @@ p(x | \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2 }} \,\exp\left\{-\frac{(x-\m
 # ╔═╡ b9a50d0c-d294-11ef-0e60-2386cf289478
 md"""
-Alternatively, the $(HTML("<span id='natural-parameterization'>*canonical* (a.k.a. *natural* or *information* ) parameterization</span>")) of the Gaussian distribution is given by
+Alternatively, the $(HTML("<span id='natural-parameterization'></span>"))*canonical* (a.k.a. *natural* or *information* ) parameterization of the Gaussian distribution is given by
 ```math
 \begin{equation*}

@@ -165,7 +165,7 @@ A **linear transformation** ``z=Ax+b`` of a Gaussian variable ``x \sim \mathcal{
 p(z) = \mathcal{N} \left(z \,|\, A\mu_x+b, A\Sigma_x A^T \right) \tag{SRG-4a}
 ```
-In fact, after a linear transformation ``z=Ax+b``, no matter how ``x`` is distributed, the mean and variance of ``z`` are always given by ``\mu_z = A\mu_x + b`` and ``\Sigma_z = A\Sigma_x A^T``, respectively (see [probability theory review lesson](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Probability-Theory-Review.ipynb#linear-transformation)). In case ``x`` is not Gaussian, higher order moments may be needed to specify the distribution for ``z``.
+In fact, after a linear transformation ``z=Ax+b``, no matter how ``x`` is distributed, the mean and variance of ``z`` are always given by ``\mu_z = A\mu_x + b`` and ``\Sigma_z = A\Sigma_x A^T``, respectively (see [probability theory review lesson](https://biaslab.github.io/BMLIP-colorized/lectures/B02%20Probability%20Theory%20Review.html#linear-transformation)). In case ``x`` is not Gaussian, higher order moments may be needed to specify the distribution for ``z``.
 """
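
Aside (not part of the commit): the mean-and-covariance rule quoted in the second hunk (μ_z = Aμ_x + b, Σ_z = AΣ_x Aᵀ) can be verified empirically. A minimal Julia sketch, assuming the Distributions package is installed; all concrete values are illustrative:

```julia
# Sketch (not part of the commit): check that z = A*x + b has mean A*μx + b
# and covariance A*Σx*A' when x ~ N(μx, Σx).
using Distributions, LinearAlgebra, Statistics, Random

Random.seed!(2)
μx = [1.0, -2.0]
Σx = [2.0 0.5; 0.5 1.0]
A  = [1.0 2.0; 0.0 1.0]
b  = [0.5, -1.0]

X = rand(MvNormal(μx, Σx), 200_000)   # 2 × N matrix of samples (one sample per column)
Z = A * X .+ b

println("empirical mean:   ", round.(mean(Z, dims=2), digits=2))
println("theoretical mean: ", A * μx + b)
println("empirical cov:    ", round.(cov(Z, dims=2), digits=2))
println("theoretical cov:  ", A * Σx * A')
```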

lectures/B06 The Multinomial Distribution.jl

Lines changed: 1 addition & 1 deletion
@@ -126,7 +126,7 @@ This distribution depends on the observations **only** through the quantities ``
 # ╔═╡ d8439866-d294-11ef-230b-dfde21aedfbf
 md"""
-We need a prior for the parameters ``\mu = (\mu_1,\mu_2,\ldots,\mu_K)``. In the [binary coin toss example](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Bayesian-Machine-Learning.ipynb#beta-prior),
+We need a prior for the parameters ``\mu = (\mu_1,\mu_2,\ldots,\mu_K)``. In the [binary coin toss example](https://biaslab.github.io/BMLIP-colorized/lectures/B03%20Bayesian%20Machine%20Learning.html#beta-prior),
 we used a [beta distribution](https://en.wikipedia.org/wiki/Beta_distribution) that was conjugate with the binomial and forced us to choose prior pseudo-counts.
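
Aside (not part of the commit): the prior pseudo-counts mentioned in this hunk amount to a one-line conjugate update in the coin-toss case. A minimal Julia sketch, assuming the Distributions package; the hyperparameter choice is arbitrary:

```julia
# Sketch (not part of the commit): conjugate Beta–Binomial update for a coin.
# The hyperparameters α and β play the role of prior pseudo-counts; observing
# n heads in N tosses gives the posterior Beta(α + n, β + N - n).
using Distributions

α, β = 2.0, 2.0              # assumed prior pseudo-counts
N, n = 10, 7                 # observed: 7 heads in 10 tosses
posterior = Beta(α + n, β + N - n)

println("posterior mean of μ: ", mean(posterior))   # (α+n)/(α+β+N) ≈ 0.643
println("posterior mode of μ: ", mode(posterior))   # (α+n-1)/(α+β+N-2) ≈ 0.667
```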

lectures/B07 Regression.jl

Lines changed: 1 addition & 1 deletion
@@ -152,7 +152,7 @@ p(w|D) &\propto p(D|w)\cdot p(w) \\
 \end{align*}
 ```
-with natural parameters (see the [natural parameterization of Gaussian](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/The-Gaussian-Distribution.ipynb#natural-parameterization)):
+with natural parameters (see the [natural parameterization of Gaussian](https://biaslab.github.io/BMLIP-colorized/lectures/B05%20The%20Gaussian%20Distribution.html#natural-parameterization)):
 ```math
 \begin{align*}

lectures/B08 Generative Classification.jl

Lines changed: 4 additions & 4 deletions
@@ -167,7 +167,7 @@ Hence, using the one-hot coding formulation for ``y_{nk}``, the generative model
 md"""
 We will refer to this model as the **Gaussian-Categorical Model** ($(HTML("<span id='GCM'>GCM</span>"))).
-* N.B. In the literature, this model (with possibly unequal ``\Sigma_k`` across classes) is often called the Gaussian Discriminant Analysis model and the special case with equal covariance matrices ``\Sigma_k=\Sigma`` is also called Linear Discriminant Analysis. We think these names are a bit unfortunate as it may lead to confusion with the [discriminative method for classification](https://nbviewer.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Discriminative-Classification.ipynb).
+* N.B. In the literature, this model (with possibly unequal ``\Sigma_k`` across classes) is often called the Gaussian Discriminant Analysis model and the special case with equal covariance matrices ``\Sigma_k=\Sigma`` is also called Linear Discriminant Analysis. We think these names are a bit unfortunate as it may lead to confusion with the [discriminative method for classification](https://biaslab.github.io/BMLIP-colorized/lectures/B09%20Discriminative%20Classification.html).
 """

@@ -219,8 +219,8 @@ Recall (from the previous slide) the log-likelihood (LLH)
 md"""
 Maximization of the LLH for the GDA model breaks down into
-* **Gaussian density estimation** for parameters ``\mu_k, \Sigma``, since the first term contains exactly the log-likelihood for MVG density estimation. We've already done this, see the [Gaussian distribution lesson](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/The-Gaussian-Distribution.ipynb#ML-for-Gaussian).
-* **Multinomial density estimation** for class priors ``\pi_k``, since the second term holds exactly the log-likelihood for multinomial density estimation, see the [Multinomial distribution lesson](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/The-Multinomial-Distribution.ipynb#ML-for-multinomial).
+* **Gaussian density estimation** for parameters ``\mu_k, \Sigma``, since the first term contains exactly the log-likelihood for MVG density estimation. We've already done this, see the [Gaussian distribution lesson](https://biaslab.github.io/BMLIP-colorized/lectures/B05%20The%20Gaussian%20Distribution.html#ML-for-Gaussian).
+* **Multinomial density estimation** for class priors ``\pi_k``, since the second term holds exactly the log-likelihood for multinomial density estimation, see the [Multinomial distribution lesson](https://biaslab.github.io/BMLIP-colorized/lectures/B06%20The%20Multinomial%20Distribution.html#ML-for-multinomial).
 """

@@ -452,7 +452,7 @@ The following answer was provided:
 > 1. Bayesian evidence for model performance assessment. This means you can use the whole data set for training without an ad-hoc split into testing and training data sets.
-> 2. Uncertainty about parameters in the model is a measure that allows you to do *active learning*, ie, choose data that is most informative (see also the [lesson on intelligent agents](https://nbviewer.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Intelligent-Agents-and-Active-Inference.ipynb)). This will allow you to train on small data sets, whereas the deterministic DNNs generally require much larger data sets.
+> 2. Uncertainty about parameters in the model is a measure that allows you to do *active learning*, ie, choose data that is most informative (see also the [lesson on intelligent agents](https://biaslab.github.io/BMLIP-colorized/lectures/B12%20Intelligent%20Agents%20and%20Active%20Inference.html)). This will allow you to train on small data sets, whereas the deterministic DNNs generally require much larger data sets.
 > 3. Prediction with uncertainty/confidence bounds.
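
Aside (not part of the commit): the decomposition in the second hunk yields closed-form ML estimates — class means and a pooled covariance from Gaussian density estimation, and relative frequencies for the class priors. A minimal Julia sketch of these standard formulas (the function name fit_gcm and the toy data are illustrative):

```julia
# Sketch (not part of the commit): closed-form ML estimation for a
# Gaussian-Categorical model with a shared covariance matrix Σ.
# Standard results: π̂_k = N_k/N, μ̂_k = class-k sample mean, Σ̂ = pooled covariance.
using LinearAlgebra, Statistics

function fit_gcm(X::Matrix{Float64}, y::Vector{Int}, K::Int)
    D, N = size(X)                                         # features in rows, samples in columns
    π̂ = [count(==(k), y) / N for k in 1:K]                 # class priors
    μ̂ = [vec(mean(X[:, y .== k], dims=2)) for k in 1:K]    # class means
    Σ̂ = zeros(D, D)                                        # pooled (shared) covariance
    for n in 1:N
        d = X[:, n] - μ̂[y[n]]
        Σ̂ += d * d'
    end
    return π̂, μ̂, Σ̂ / N
end

# toy usage with two 2-D classes
X = hcat(randn(2, 50) .+ [0.0, 0.0], randn(2, 50) .+ [3.0, 1.0])
y = vcat(fill(1, 50), fill(2, 50))
π̂, μ̂, Σ̂ = fit_gcm(X, y, 2)
println("class priors: ", π̂)
```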

lectures/B09 Discriminative Classification.jl

Lines changed: 4 additions & 4 deletions
@@ -118,7 +118,7 @@ What model should we use for the posterior distribution ``p(y_n \in \mathcal{C}_
 md"""
 #### Likelihood
-We will take inspiration from the [generative classification](https://nbviewer.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Generative-Classification.ipynb#softmax) approach, where we derived the class posterior
+We will take inspiration from the [generative classification](https://biaslab.github.io/BMLIP-colorized/lectures/B08%20Generative%20Classification.html#softmax) approach, where we derived the class posterior
 ```math
 p(y_{nk} = 1\,|\,x_n,\beta_k,\gamma_k) = \sigma(\beta_k^T x_n + \gamma_k)

@@ -544,7 +544,7 @@ Computing the gradient ``\nabla_{\theta_k} \mathrm{L}(\theta)`` leads to (for [p
 # ╔═╡ 25f386e4-d294-11ef-2cec-f56f4a6feb19
 md"""
-Compare this to the [gradient for *linear* regression](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Regression.ipynb#regression-gradient):
+Compare this to the [gradient for *linear* regression](https://biaslab.github.io/BMLIP-colorized/lectures/B07%20Regression.html#regression-gradient):
 ```math
 \nabla_\theta \mathrm{L}(\theta) = \sum_n \left(y_n - \theta^T x_n \right) x_n

@@ -570,7 +570,7 @@ The parameter vector ``\theta`` for logistic regression can be estimated through
 \hat{\theta}^{(i+1)} = \hat{\theta}^{(i)} + \eta \cdot \left. \nabla_\theta \mathrm{L}(\theta) \right|_{\theta = \hat{\theta}^{(i)}}
 ```
-Note that, while in the Bayesian approach we get to update ``\theta`` with [**Kalman-gain-weighted** prediction errors](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/The-Gaussian-Distribution.ipynb#precision-weighted-update) (which is optimal), in the maximum likelihood approach, we weigh the prediction errors with **input** values (which is less precise).
+Note that, while in the Bayesian approach we get to update ``\theta`` with [**Kalman-gain-weighted** prediction errors](https://biaslab.github.io/BMLIP-colorized/lectures/B05%20The%20Gaussian%20Distribution.html#precision-weighted-update) (which is optimal), in the maximum likelihood approach, we weigh the prediction errors with **input** values (which is less precise).
 """

@@ -580,7 +580,7 @@ md"""
 Let us perform ML estimation of ``w`` on the data set from the introduction. To allow an offset in the discrimination boundary, we add a constant 1 to the feature vector ``x``. We only have to specify the (negative) log-likelihood and the gradient w.r.t. ``w``. Then, we use an off-the-shelf optimisation library to minimize the negative log-likelihood.
-We plot the resulting maximum likelihood discrimination boundary. For comparison we also plot the ML discrimination boundary obtained from the [code example in the generative Gaussian classifier lesson](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Generative-Classification.ipynb#code-generative-classification-example).
+We plot the resulting maximum likelihood discrimination boundary. For comparison we also plot the ML discrimination boundary obtained from the [code example in the generative Gaussian classifier lesson](https://biaslab.github.io/BMLIP-colorized/lectures/B08%20Generative%20Classification.html#code-generative-classification-example).
 """
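
Aside (not part of the commit): the gradient-ascent update quoted in the third hunk can be run directly, using the standard binary logistic-regression gradient ∇L(θ) = Σ_n (y_n − σ(θᵀxₙ)) xₙ. A minimal Julia sketch; all names and the toy data are illustrative, not the lecture's own code:

```julia
# Sketch (not part of the commit): ML estimation of θ for binary logistic
# regression by gradient ascent, θ ← θ + η·∇L(θ), with the standard gradient
# ∇L(θ) = Σ_n (y_n − σ(θ'x_n)) x_n.  Uses only the standard library.
using Random, LinearAlgebra

σ(a) = 1 / (1 + exp(-a))

function logistic_ml(X::Matrix{Float64}, y::Vector{Float64}; η=0.01, iters=5_000)
    D, N = size(X)
    θ = zeros(D)
    for _ in 1:iters
        ∇L = sum((y[n] - σ(dot(θ, X[:, n]))) * X[:, n] for n in 1:N)
        θ += η * ∇L
    end
    return θ
end

# toy 2-class data; a constant 1 is appended to each x to allow an offset
Random.seed!(3)
X1 = randn(2, 50) .+ [0.0, 0.0];  X2 = randn(2, 50) .+ [2.5, 2.5]
X  = vcat(hcat(X1, X2), ones(1, 100))      # 3 × 100, last row is the bias term
y  = vcat(zeros(50), ones(50))
θ̂ = logistic_ml(X, y)
println("estimated θ (last entry is the offset): ", round.(θ̂, digits=2))
```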
