feat(stochtrace): Add LOO-based estimators of the diagonal #280
sethaxen wants to merge 21 commits into
|
The thesis does not discuss resphering in the context of the diagonal estimators, but it can be applied. For XDiag, it would require additional matvecs, so I don't think this makes sense to offer. I'm going to quickly look into whether it makes sense for XNysDiag. |
|
I spent some time looking into resphering for the diagonal estimators. The key observation is that […], so when using a Hutchinson-style estimator for the diagonal, what we're really doing is simultaneously estimating a bunch of traces for the operators […]. When resphering, I think it's a mistake to try to project each test vector onto an orthoprojector […] (Yeah, this seems to just be another QR downdate formula). What's tricky is that for each residual/test vector pair, we need to resphere the test vector differently for each diagonal entry. It's quite possible this can be efficiently evaluated using just quantities we already have, at least for XNysDiag, but since this goes well beyond the source material, I'm going to shelve this idea for another time. |
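A minimal NumPy sketch of one standard way to read the "bunch of traces" remark. It assumes (this is not spelled out in the comment) that the operators in question are $e_i e_i^\top A$, so that $\operatorname{diag}(A)_i = \operatorname{tr}(e_i e_i^\top A)$. The check below confirms that a Hutchinson-style diagonal estimate coincides, entry by entry, with Hutchinson trace estimates of those operators built from the same test vectors. All names and data are illustrative and not part of the PR.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 50  # matrix size and number of test vectors (arbitrary choices)

# Symmetric example matrix (hypothetical data, not from the PR)
B = rng.standard_normal((n, n))
A = B + B.T

# Rademacher test vectors, one per column
Omega = rng.choice([-1.0, 1.0], size=(n, m))
Y = A @ Omega

# Hutchinson-style diagonal estimate: average of omega * (A omega)
diag_est = np.mean(Omega * Y, axis=1)

# The same numbers, computed as n separate Hutchinson trace estimates
trace_ests = np.empty(n)
for i in range(n):
    E_i = np.zeros((n, n))
    E_i[i, i] = 1.0
    op = E_i @ A  # assumed operator e_i e_i^T A, whose trace is A[i, i]
    # mean over test vectors of omega^T (op @ omega)
    trace_ests[i] = np.mean(np.einsum("jk,jk->k", Omega, op @ Omega))
```

Because every diagonal entry corresponds to a different operator, a variance-reduction step designed for a single trace would have to be adapted per entry, which is consistent with the difficulty described in the comment.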
|
Thanks for checking!
I did get a similar feeling reading through the derivation/sketch. It's nice that it seems possible, but I agree with you, it should probably be handled separately (if at all). |
|
I refactored the new diagonal tests using […]. I've opted to keep the xdiag and xnysdiag tests in their own files. Some of the tests, but not all of them, could in principle be merged if we lumped them into the same test file. What are your thoughts? |
|
Yes, I think it is nice and clear now. No need to merge, I am okay with xnysdiag and xdiag in separate files. |
|
I implemented the requested changes to the test.
Would you like this here or in a separate PR? |
| def hermitian_matrix_eigvals_decaying(n, /, key, *, dtype=None): | ||
| """Hermitian matrix whose eigenvalues decay geometrically (0.7^k).""" | ||
| eigvals = 0.7 ** np.arange(n) |
How about including 0.7 in the arguments? We can also do this in the future as soon as we need to change it, but while we're at it...
| def hermitian_matrix_eigvals_decaying(n, /, key, *, dtype=None): | |
| """Hermitian matrix whose eigenvalues decay geometrically (0.7^k).""" | |
| eigvals = 0.7 ** np.arange(n) | |
| def hermitian_matrix_eigvals_decaying(n, /, key, *, base=0.7, dtype=None): | |
| """Hermitian matrix whose eigenvalues decay geometrically (base^k).""" | |
| eigvals = base ** np.arange(n) |
|
Thanks!
Both are fine. I'd say if you're doing another (few) commits for this PR, we can handle them in the same sweep. Otherwise, happy to review a separate one as well :) |
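The `base` suggestion from the review can be sketched as a standalone NumPy function. This is an illustrative stand-in, not the test helper from the PR: the name `hermitian_decaying_sketch` and the integer `seed` argument (in place of the `key` argument) are assumptions, and the eigenvectors are simply taken from a QR factorisation of a random matrix.

```python
import numpy as np


def hermitian_decaying_sketch(n, /, seed, *, base=0.7, dtype=None):
    """Symmetric matrix whose eigenvalues decay geometrically (base^k)."""
    rng = np.random.default_rng(seed)
    eigvals = base ** np.arange(n)
    # Random orthogonal eigenvectors from a QR factorisation
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    A = (Q * eigvals) @ Q.T  # Q @ diag(eigvals) @ Q.T
    A = (A + A.T) / 2  # remove round-off asymmetry
    return A.astype(dtype) if dtype is not None else A


A = hermitian_decaying_sketch(5, seed=0, base=0.5)
# eigvalsh returns ascending eigenvalues; reverse to compare with base^k
spectrum = np.linalg.eigvalsh(A)[::-1]
```

Exposing `base` as a keyword with the old value as its default keeps existing call sites unchanged while letting individual tests pick a faster or slower decay.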
Following after #263, this PR adds

- `leave_one_out_xdiag`
- `leave_one_out_xnysdiag`

The implementations are pretty similar to those of `leave_one_out_xtrace` and `leave_one_out_xnystrace`, with a notable exception being that while the trace is always a scalar, the diagonal can be a pytree. To make this work, this PR also updates `estimator_leave_one_out` and `estimator_leave_one_out_mean_and_sem` to support pytrees.

Closes #275
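To illustrate what pytree support means for a mean-and-standard-error reduction, here is a rough NumPy sketch. It is not the code from this PR: `tree_map` is a hand-rolled helper standing in for JAX's `jax.tree_util.tree_map`, `mean_and_sem` is a hypothetical name, and the use of `ddof=1` for the standard error is an assumption. Each leaf is assumed to stack the per-sample estimates along its leading axis.

```python
import numpy as np


def tree_map(fn, tree):
    """Apply fn to every leaf of a nested dict/list/tuple of arrays."""
    if isinstance(tree, dict):
        return {k: tree_map(fn, v) for k, v in tree.items()}
    if isinstance(tree, (list, tuple)):
        return type(tree)(tree_map(fn, v) for v in tree)
    return fn(tree)


def mean_and_sem(samples):
    """Leafwise mean and standard error over the leading (sample) axis."""

    def sem(x):
        return np.std(x, axis=0, ddof=1) / np.sqrt(x.shape[0])

    mean = tree_map(lambda x: np.mean(x, axis=0), samples)
    return mean, tree_map(sem, samples)


rng = np.random.default_rng(0)
# Eight per-sample diagonal estimates for a two-leaf pytree (made-up shapes)
samples = {
    "weights": rng.standard_normal((8, 3, 2)),
    "bias": rng.standard_normal((8, 3)),
}
mean, sem = mean_and_sem(samples)

# The scalar (trace-like) case is the special case of a single leaf
scalar_mean, scalar_sem = mean_and_sem(np.arange(4.0))
```

Applying the reduction leaf by leaf means the output mirrors the structure of the input, so a scalar trace and a structured diagonal can share one code path.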
Empirical comparison with thesis experiments
XDiag is introduced in the same paper as XTrace, while XNysDiag is introduced in Chapter 16 of Ethan Epperly's thesis. Figure 16.1 (below) shows experimental results, which I was able to closely replicate with the implementations in this PR:

EDIT: Like we saw with XNysTrace in #263 (comment), for the "exp" experiment, XNysDiag using Nystrom with eigh plateaus at a slightly higher abs relative error than using Nystrom with shifted Cholesky.