Skip to content

feat(stochtrace): Add LOO-based estimators of the diagonal#280

Open
sethaxen wants to merge 21 commits into
pnkraemer:mainfrom
sethaxen:xdiag_xnysdiag
Open

feat(stochtrace): Add LOO-based estimators of the diagonal#280
sethaxen wants to merge 21 commits into
pnkraemer:mainfrom
sethaxen:xdiag_xnysdiag

Conversation

@sethaxen

@sethaxen sethaxen commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Following after #263, this PR adds

  • leave_one_out_xdiag
  • leave_one_out_xnysdiag

The implementations are pretty similar to those of leave_one_out_xtrace and leave_one_out_xnystrace, with a notable exception being that while the trace is always a scalar, the diagonal can be a pytree. To make this work, this PR also updates estimator_leave_one_out and estimator_leave_one_out_mean_and_sem to support pytrees.

Closes #275

Empirical comparison with thesis experiments

XDiag is introduced in the same paper as XTrace, while XNysDiag is introduced in Chapter 16 of Ethan Epperley's thesis. Figure 16.1 (below) shows experimental results, which I was able to closely replicate with the implementations in this PR:
Screenshot_2026-06-09_16-06-40 xdiag_experiment

EDIT: Like we saw with XNysTrace in #263 (comment), for the "exp" experiment, XNysDiag using Nystrom with eigh plateaus at a slightly higher abs relative error than using Nystrom with shifted Cholesky.

@sethaxen

sethaxen commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

The thesis does not discuss resphering in the context of the diagonal estimators, but it can be applied. For XDiag, it would require additional matvecs, so I don't think this makes sense to offer. I'm going to quickly look into whether it makes sense for XNysDiag.

Comment thread tests/test_stochtrace/test_leave_one_out/test_estimator_loo.py
Comment thread tests/test_stochtrace/test_leave_one_out/test_xnysdiag.py Outdated
Comment thread tests/test_stochtrace/test_leave_one_out/test_xnysdiag.py Outdated
@pnkraemer pnkraemer added the enhancement New feature or request label Jun 10, 2026
@sethaxen

Copy link
Copy Markdown
Contributor Author

I spent some time looking into resphering for the diagonal estimators. The key observation is that

$$\begin{aligned} (B \omega) \circ \omega &= \mathrm{diag}(B \omega \omega^\dagger)\\\ \mathrm{diag}(B \omega \omega^\dagger)_j &= e_j^\dagger B \omega \omega^\dagger e_j = \omega^\dagger (e_j e_j^\dagger B) \omega \end{aligned}$$

so when using a Hutchinson-style estimator for the diagonal, what we're really doing is simultaneously estimating a bunch of traces for the operators $C_j=E_j B$, where $E_j = e_j e_j^\top$. We can use this result to work out the variances of the estimators for the various test vectors (and it's a lot easier than what I did to get #264 (comment)).

When resphering, I think it's a mistake to try to project each test vector onto an orthoprojector $P$ such that $BP=B$ like we do for trace estimates. Rather, I suspect that for each diagonal entry, we want an orthoprojector such that $ P C_j = C_j P = C_j$. If we have this, then Corollary 13.4 of the thesis would follow, and resphering would reduce the variance of the estimator. If we have an orthonormal matrix such that $BQ=0$, then it seems we could construct such a projector as

$$P = I - Q (I - f f^\dagger) Q^\dagger, \qquad f = \frac{Q^\dagger e_j}{\lVert Q^\dagger e_j \rVert}$$

(Yeah this seems to just be another QR downdate formula).

What's tricky is that for each residual/test vector pair, we need to resphere the test vector differently for each diagonal entry. It's quite possible this can be efficiently evaluated using just quantities we already have, at least for XNysDiag, but since this is going well beyond the source material, I'm going to shelf this idea for another time.

@pnkraemer

Copy link
Copy Markdown
Owner

Thanks for checking!

but since this is going well beyond the source material, I'm going to shelf this idea for another time

I did get a similar feeling reading through the derivation/sketch. It's nice that it seems possible, but I agree with you, it should probably be handled separately (if at all).

@sethaxen

Copy link
Copy Markdown
Contributor Author

I refactored the new diagonal tests using parametrize_with_cases. Does the overall approach look like what you had in mind? If so, would you like me to also refactor the xtrace and xnystrace tests here?

I've opted to keep xdiag and xnysdiag tests in their own files, but some of the tests could in principle be merged if we lumped them into the same test file, but not all of them. What are your thoughts?

Comment thread tests/test_stochtrace/test_leave_one_out/conftest.py Outdated
Comment thread tests/test_stochtrace/test_leave_one_out/conftest.py Outdated
Comment thread tests/test_stochtrace/test_leave_one_out/test_xnysdiag.py Outdated
@pnkraemer

Copy link
Copy Markdown
Owner

Yes, I think it is nice and clear now. No need to merge, I am okay with xnysdiag and xdiag in separate files.

@sethaxen

Copy link
Copy Markdown
Contributor Author

I implemented the requested changes to the test.

would you like me to also refactor the xtrace and xnystrace tests here?

Would you like this here or in a separate PR?

Comment thread matfree/test_util.py
Comment on lines +19 to +21
def hermitian_matrix_eigvals_decaying(n, /, key, *, dtype=None):
"""Hermitian matrix whose eigenvalues decay geometrically (0.7^k)."""
eigvals = 0.7 ** np.arange(n)

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about including 0.7 in the arguments? We can also do this in the future as soon as we need to change it, but while we're at it...

Suggested change
def hermitian_matrix_eigvals_decaying(n, /, key, *, dtype=None):
"""Hermitian matrix whose eigenvalues decay geometrically (0.7^k)."""
eigvals = 0.7 ** np.arange(n)
def hermitian_matrix_eigvals_decaying(n, /, key, *, base=0.7, dtype=None):
"""Hermitian matrix whose eigenvalues decay geometrically (x^k)."""
eigvals = x ** np.arange(n)

@pnkraemer

Copy link
Copy Markdown
Owner

Thanks!

Would you like this here or in a separate PR?

Both are fine. I'd say if you're doing another (few) commits for this PR, we can handle them in the same sweep. Otherwise, happy to review a separate one as well :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement the XDiag and XNysDiag estimators

2 participants