Something I completely skipped over is how to actually solve this optimization problem, partly because I thought it was outside the scope and partly because I don't know it well enough myself to explain it simply. It would be nice to say something about whether the solution is iterative, whether it uses some form of gradient descent, or whether there is a closed-form solution.