Mathematical examination on generalization bounds for two-layer neural networks with ReLU activation.
This project derives and analyzes generalization bounds for two-layer neural networks:
- Naive bound using empirical Rademacher complexity (explains why standard bounds fail for overparameterized models)
- Symmetrization inequality tailored to ReLU networks
- Scale-invariant complexity measure exploiting ReLU's positive homogeneity
- Tighter, width-independent bound for faithful capacity assessment
- Course: Learning Theory: Exam - Two-Layer Neural Networks
- Institution: Université Paris-Dauphine -- PSL, Department of MIDO
- Supervisor: Katia Meziani
- Author: Arthur Danjou
- Program: Master 2 ISF (Initial Track)
- Academic Year: 2025/2026
The main file exercise.tex is a LaTeX-based exam solution that covers three main parts:
- Derives generalization bound using empirical Rademacher complexity
- Shows $\mathcal{R}_{S_n}(\mathcal{H}) \le 2 B_w B_u C \sqrt{m/n}$
- Explains why standard bounds fail for overparameterized networks ($m \gg n$)
- Establishes the inequality $\mathbb{E}[\sup |Z|] \le 2\,\mathbb{E}[\sup \phi(Z)]$
- Exploits the ReLU identity $|z| = \phi(z) + \phi(-z)$
- Uses the distributional equality of $\sigma$ and $-\sigma$
- Leverages positive homogeneity of ReLU activation
- Introduces scale-invariant parameterization
- Yields width-independent generalization bound
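
To make the failure of the naive bound concrete, the right-hand side $2 B_w B_u C \sqrt{m/n}$ can be evaluated for a few widths. This is only an illustrative sketch: the function name `naive_bound` and the numerical values of $B_w$, $B_u$, $C$, $m$ and $n$ are assumptions chosen for the example, not values taken from the exam.

```python
import math


def naive_bound(B_w, B_u, C, m, n):
    """Right-hand side 2 * B_w * B_u * C * sqrt(m / n) of the naive bound."""
    return 2.0 * B_w * B_u * C * math.sqrt(m / n)


n = 1000  # sample size (illustrative value)
for m in (10, 1000, 100000):  # hidden widths (illustrative values)
    # With all constants set to 1, the bound scales as 2 * sqrt(m / n).
    print(m, naive_bound(1.0, 1.0, 1.0, m, n))
```

With the constants fixed, the bound grows like $\sqrt{m}$, so it stops being informative once $m \gg n$.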
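
The symmetrization inequality listed above can be sketched in three lines. The notation is an assumption made for this sketch: $Z_h(\sigma)$ denotes a quantity indexed by $h \in \mathcal{H}$ that is odd in the Rademacher vector $\sigma$ (for instance $Z_h(\sigma) = \frac{1}{n}\sum_{i} \sigma_i h(x_i)$), and $\phi(z) = \max(0, z)$; the exam itself may use different symbols.

```latex
\begin{align*}
|Z_h(\sigma)| &= \phi\bigl(Z_h(\sigma)\bigr) + \phi\bigl(-Z_h(\sigma)\bigr), \\
\mathbb{E}_\sigma\Bigl[\sup_{h \in \mathcal{H}} |Z_h(\sigma)|\Bigr]
  &\le \mathbb{E}_\sigma\Bigl[\sup_{h \in \mathcal{H}} \phi\bigl(Z_h(\sigma)\bigr)\Bigr]
     + \mathbb{E}_\sigma\Bigl[\sup_{h \in \mathcal{H}} \phi\bigl(Z_h(-\sigma)\bigr)\Bigr] \\
  &= 2\,\mathbb{E}_\sigma\Bigl[\sup_{h \in \mathcal{H}} \phi\bigl(Z_h(\sigma)\bigr)\Bigr].
\end{align*}
```

The second line uses the subadditivity of the supremum together with $-Z_h(\sigma) = Z_h(-\sigma)$; the last line uses the fact that $-\sigma$ has the same distribution as $\sigma$.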
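
Positive homogeneity can also be checked numerically. The sketch below assumes a network of the form $f(x) = \sum_j u_j \phi(w_j \cdot x)$ and uses $\sum_j |u_j| \, \|w_j\|_2$ as a stand-in for a scale-invariant measure; that particular quantity, like the helper names `two_layer`, `path_norm` and `rescale`, is an assumption for illustration and may differ from the measure defined in the exam.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def two_layer(X, W, u):
    """f(x) = sum_j u_j * relu(w_j . x); X is (n, d), W is (m, d), u is (m,)."""
    return relu(X @ W.T) @ u


def path_norm(W, u):
    """Hypothetical scale-invariant measure: sum_j |u_j| * ||w_j||_2."""
    return float(np.sum(np.abs(u) * np.linalg.norm(W, axis=1)))


def rescale(W, u, c):
    """Per-neuron rescaling (w_j, u_j) -> (c_j * w_j, u_j / c_j), with c_j > 0."""
    return W * c[:, None], u / c


rng = np.random.default_rng(0)
n, d, m = 5, 3, 8  # illustrative sizes
X = rng.normal(size=(n, d))
W = rng.normal(size=(m, d))
u = rng.normal(size=m)
c = rng.uniform(0.1, 10.0, size=m)  # strictly positive scaling factors

W2, u2 = rescale(W, u, c)

# The network output and the path-norm-style measure are unchanged by rescaling.
same_output = np.allclose(two_layer(X, W, u), two_layer(X, W2, u2))
same_measure = np.isclose(path_norm(W, u), path_norm(W2, u2))

# The product of Frobenius/Euclidean norms generally does change.
norm_product = np.linalg.norm(W) * np.linalg.norm(u)
norm_product_rescaled = np.linalg.norm(W2) * np.linalg.norm(u2)

print(same_output, same_measure)
print(norm_product, norm_product_rescaled)
```

The rescaling leaves the function computed by the network untouched while changing the individual norms of the two layers, which is why a bound stated in terms of a rescaling-invariant quantity can be more faithful than one stated in terms of separate norm constraints.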
| File | Description |
|---|---|
| `exercise.tex` | LaTeX source (main file) |
| `exercise.pdf` | Compiled PDF version |
| `logo dauphine.jpg` | University logo |
Compile the LaTeX source:

    pdflatex exercise.tex

Or use your preferred LaTeX editor.