add ForwardDiff@1 #378
Conversation
Test env still doesn't resolve to FD=1, since that would apparently downgrade a bunch of other packages. I'm trying to slowly track them down, starting with NNlib here FluxML/NNlib.jl#637 (comment), but it might be a while until everything's updated. In the meantime, I removed ForwardDiff 0.10 from the test compat, to force the tests to run on FD=1.
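For context, a hedged sketch of what narrowing the compat entry looks like from the Pkg side (assuming a `test/Project.toml` environment; the PR itself just edits the compat entries, and `Pkg.compat` needs Julia 1.8 or newer):

```julia
using Pkg

# Activate the test environment (assumed here to live in test/Project.toml).
Pkg.activate("test")

# Drop 0.10 from the ForwardDiff compat bound so only 1.x can be resolved.
Pkg.compat("ForwardDiff", "1")

# Re-resolve; with the tighter bound the resolver may have to downgrade
# other packages, as noted above.
Pkg.resolve()
```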
ForwardDiff test fails at:

```julia
using Distributions, Bijectors, ForwardDiff, LinearAlgebra, Test

dist = Dirichlet([1000 * one(Float64), eps(Float64)])
b = Bijectors.SimplexBijector()
r = rand(dist)
x = if any(r .> 0.9999)
    [0.0, 1.0][sortperm(r)]
else
    r
end
y = b(x)
ForwardDiff.jacobian(inverse(b), y)[1:(end - 1), :]
@test logabsdet(ForwardDiff.jacobian(inverse(b), y)[1:(end - 1), :])[1] ≈
    logabsdetjac(inverse(b), y)
```

I bisected the failure to JuliaDiff/ForwardDiff.jl#695 – it seems that the now (more correct) inequality comparisons are messing with the clamping code in src/bijectors/simplex.jl, lines 87 to 93 in 8a525f1.
Here a and b are reals, whereas x is a ForwardDiff.Dual. Specifically, in this instance, we have

```julia
x = Dual(1.0, eps(Float64))
a = 0
b = 1
```

and before that PR, the comparison did not send x into the clamped branch, so its derivative passed through. On the other hand, after that PR, x gets clamped to b = 1 and its derivative is set to 0, because x's value is greater than or equal to the upper bound. It seems to me, mathematically, that the new behaviour is the more correct one. Resolved by changing the parameters of the Dirichlet distribution so that it doesn't generate degenerate samples like x = [1.0, 0.0].
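To illustrate the boundary effect described above, here is a minimal Julia sketch (the two functions below are hypothetical stand-ins, not Bijectors' actual `_clamp`; the real change is in how ForwardDiff's Dual comparisons behave, not in Bijectors' source):

```julia
using ForwardDiff

# Clamp to 1 only for values strictly above the bound: at the boundary the
# input (and its derivative) passes through unchanged.
f_strict(x) = x > 1 ? one(x) : x

# Clamp to 1 for values at or above the bound: at the boundary a constant is
# returned, so the derivative is zero.
f_inclusive(x) = x >= 1 ? one(x) : x

ForwardDiff.derivative(f_strict, 1.0)     # 1.0
ForwardDiff.derivative(f_inclusive, 1.0)  # 0.0
```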
Failing tests in …
Copilot reviewed 2 out of 5 changed files in this pull request and generated 1 comment.
Files not reviewed (3)
- src/Bijectors.jl: Language not supported
- test/interface.jl: Language not supported
- test/transform.jl: Language not supported
On buildkite: a short while ago, I asked the JuliaGPU folks to add this (GPU tests need to be run on a server with a GPU, which standard GitHub CI doesn't provide). We don't currently have GPU testing set up, and I think this is why the buildkite job is failing. (It doesn't look great to have a failing test though; I'll fix it soon.)
Looks great!
```diff
@@ -145,12 +145,10 @@ end
 @testset "Multivariate" begin
     vector_dists = [
         Dirichlet(2, 3),
-        Dirichlet([1000 * one(Float64), eps(Float64)]),
-        Dirichlet([eps(Float64), 1000 * one(Float64)]),
+        Dirichlet([10.0, 0.1]),
```
I am not sure these new tests can replace the current ones. Tests like

```julia
Dirichlet([1000 * one(Float64), eps(Float64)]),
Dirichlet([eps(Float64), 1000 * one(Float64)]),
```

are aimed at the numerical stability of very extreme examples of Dirichlet distributions, i.e. where one axis has a very tiny probability mass on average.
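As a quick, hedged illustration of how extreme these parameters are (not part of the test suite):

```julia
using Distributions

# Essentially all of the probability mass sits on the first axis, so both the
# mean and typical samples are numerically one-hot.
mean(Dirichlet([1000 * one(Float64), eps(Float64)]))  # ≈ [1.0, 2.2e-19]
```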
It's actually the sample that's the problem. For the sample x = [1.0, 0.0], the transformed variable is y = [36.0436], which is outside the range for which Float64 is numerically stable.

The issue comes from these lines (Bijectors.jl/src/bijectors/simplex.jl, lines 89 to 90 in d8d781b):

```julia
@inbounds z = LogExpFunctions.logistic(y[1] - log(T(K - 1)))
@inbounds x[1] = _clamp((z - ϵ) / (one(T) - 2ϵ), 0, 1)
```

As y[1] tends to +Inf, z tends to 1, and the expression (z - ϵ) / (one(T) - 2ϵ) tends towards 1.0000000000000002. If that expression is greater than 1, then it gets _clamped to 1, and the derivative is set to 0.

The difference between FD 0.10 and FD 1.0 is that the new version sets the derivative to 0 if (z - ϵ) / (one(T) - 2ϵ) is greater than, or equal to, 1. That in turn means that there is a larger range of y[1] for which the derivative gets clamped. Unfortunately, the Float64 value 36.0436 falls into that category (35.8 would have been fine, or alternatively, BigFloat is ok up until around 175).
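To see the saturation numerically, here is a rough sketch, assuming K = 2 and taking ϵ = eps(Float64) as a stand-in for the tolerance used inside the bijector:

```julia
using LogExpFunctions

ϵ = eps(Float64)
K = 2

for y1 in (35.8, 36.0436, 40.0)
    z = LogExpFunctions.logistic(y1 - log(K - 1))  # log(K - 1) == 0 for K = 2
    ratio = (z - ϵ) / (1 - 2ϵ)
    # As y1 grows, z approaches 1 and the ratio reaches 1 and then slightly
    # exceeds it – exactly the region where the clamp zeroes the derivative.
    println("y1 = $y1  =>  z = $z, ratio = $ratio")
end
```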
As far as I can tell, the fact that it used to work with FD 0.10 might have been a happy accident – I wrote more about this in a comment above, but (to me) it makes sense for FD to set the derivative to 0 at the point (z - ϵ) / (one(T) - 2ϵ) == 1.
I am still not fully sure how to resolve this though, which is why I haven't really come back to this PR. Obviously, changing the sample fixes the tests (and the easiest way to change the sample was to change the distribution from which it was drawn), but I can't tell if there's a workaround in the code that makes it work again for (z - ϵ) / (one(T) - 2ϵ) == 1.0, or more generally for large y.
Pinging @devmotion for your thoughts too :)
Hmm, I'm not sure. I always had the feeling that this stick-breaking transform (explained in e.g. the Stan docs) can be numerically problematic. I've also always thought that these eps workarounds are unsatisfying. But I'm not sure what exactly would break if they were removed; maybe it would be interesting to see.
@penelopeysm, please feel free to merge.
Hooray, all tests passing (except known Enzyme failures!)
Closes #377
Closes #376