
17 numerical stability issues in gradient for log1mexp #18

Merged
jjcmoon merged 3 commits into main from 17-numerical-stability-issues-in-gradient-for-log1mexp on Nov 19, 2025

Conversation

@rmanhaeve
Contributor

Added a failing test and a fix for the gradient instability in log1mexp. The new implementation does not seem to be sensitive to the value of eps.
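
The PR's diff isn't included here, but a minimal sketch of such a test, assuming the standard Mächler (2012) two-branch formulation of log1mexp with masked indexing (function name and split point are illustrative, not necessarily the merged code):

```python
import math
import torch

def log1mexp(x: torch.Tensor) -> torch.Tensor:
    """Sketch of a stable log(1 - exp(x)) for x < 0, following Maechler (2012)."""
    # Each element is routed to the branch that is accurate for its range,
    # and only that branch ever sees it, so no out-of-domain values arise.
    mask = x > -math.log(2)
    out = torch.empty_like(x)
    out[mask] = torch.log(-torch.expm1(x[mask]))    # accurate for x near 0
    out[~mask] = torch.log1p(-torch.exp(x[~mask]))  # accurate for very negative x
    return out

# Gradient test: gradients must stay finite even very close to 0,
# where a naive log(1 - exp(x)) breaks down.
x = torch.tensor([-1e-10, -0.5, -20.0], dtype=torch.float64, requires_grad=True)
log1mexp(x).sum().backward()
assert torch.isfinite(x.grad).all(), x.grad
```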

@jjcmoon
Member

jjcmoon commented Nov 19, 2025

Clamping sets the gradient to zero, even when it might be very large. So although clamping avoids NaNs, it's not a real solution. I did some debugging, and it seems the culprit was actually the use of torch.where, which has some funky behaviour in PyTorch. I'm no longer using torch.where, which resolves the issue. Let me know if this works for you.
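
For context: torch.where computes both branches for every element and only selects afterwards, so the backward pass still differentiates the unselected branch, and a zero gradient flowing into an expression that is non-finite there produces 0/0 = NaN. The exact change in commit e420172 isn't shown here, but the snippet below reproduces the pitfall with a hypothetical where-based log1mexp:

```python
import math
import torch

def log1mexp_where(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical where-based variant: both branches are evaluated for
    # every element, even where they are numerically invalid.
    return torch.where(
        x > -math.log(2),
        torch.log(-torch.expm1(x)),   # selected for x near 0
        torch.log1p(-torch.exp(x)),   # selected for very negative x
    )

# For x ~ 0, exp(x) rounds to exactly 1, so the UNSELECTED branch is
# log1p(-1) = -inf. Its backward divides the (zero) incoming gradient
# by 1 + (-1) = 0, and 0/0 = nan, which poisons x.grad.
x = torch.tensor([-1e-20], dtype=torch.float64, requires_grad=True)
log1mexp_where(x).sum().backward()
print(x.grad)  # tensor([nan])
```

Routing elements to a branch via masked indexing, as in the sketch above, avoids this: the unsafe branch never receives out-of-range inputs, so it never enters the autograd graph for those elements.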

@jjcmoon jjcmoon merged commit e420172 into main Nov 19, 2025
10 checks passed


Development

Successfully merging this pull request may close these issues.

Numerical stability issues in gradient for log1mexp

2 participants