Fix differentiation of nonlocal projectors at p=0 by switching to solid harmonics#1318
Fix differentiation of nonlocal projectors at p=0 by switching to solid harmonics#1318Technici4n wants to merge 4 commits into
Conversation
|
Hm, feels wrong to change the interface just to catch a numerical stability bug. Looking at the original function I think just writing iszero(r) && return 0; r = normalize(r)) at the beginning of the function should fix any issue? |
|
To be clear, the code on master is bad: the potential numerical issue has nothing to do with eps(), it's really about under and overflow, and normalize() fixes this. It's potentially slower though (needs a sqrt), so maybe we want to do the normalization only in case of potential under/overflow (something like max(abs(r))^l > typemax(T) or something)? Or we find a library that actually has thought about such things and steal it. |
|
Why don't we just depend on https://github.com/jishnub/SphericalHarmonics.jl? |
|
Well the spherical harmonic is fundamentally not differentiable at 0. Only the product of the spherical harmonic and the hankel transform is. The problem is precisely that for l=1 the derivative should not be zero! |
antoine-levitt
left a comment
There was a problem hiding this comment.
Hm, I think I buy your argument but I'm not happy about it... Apart from minor stuff, the important bit is that we can't have psp_fourier be one thing and local_fourier be another.
| l == 2 && return 4T(π) * quadrature((i, ri) -> r2_f[i] * ri^2, r) / 15 | ||
| l == 3 && return 4T(π) * quadrature((i, ri) -> r2_f[i] * ri^3, r) / 105 | ||
| l == 4 && return 4T(π) * quadrature((i, ri) -> r2_f[i] * ri^4, r) / 945 | ||
| l == 5 && return 4T(π) * quadrature((i, ri) -> r2_f[i] * ri^5, r) / 10395 |
There was a problem hiding this comment.
just have a normalization tuple and return quadrature() / normalization[l+1]
There was a problem hiding this comment.
I did it like this to keep a literal pow for speed:tm: :D
There was a problem hiding this comment.
if it's a tuple both are equivalent for the compiler so it should be the same speed?
There was a problem hiding this comment.
Think so too. Only thing is to take care of proper type propagation to make sure that things are inlined (e.g. one may need an @inbounds to suppress the bounds checking code)
There was a problem hiding this comment.
I would like to keep the explicit ^ <literal> here.
| function hankel(quadrature, r::AbstractVector, r2_f::AbstractVector, l::Integer, p::T)::T where {T<:Real} | ||
| @assert length(r) == length(r2_f) | ||
| 4T(π) * quadrature(r) do i, ri | ||
| if abs(p) <= 10 * eps(T) |
There was a problem hiding this comment.
The eps here is wrong; this has nothing to do with eps, as fp are always relative and manipulating very small numbers and dividing with other very small numbers is completely fine. The issue here is underflow and p==0, so something with floatmin (with very generous margins, probably) is more appropriate.
There was a problem hiding this comment.
Actually we can bypass this altogether with something like if abs(p) < minfloat p = minfloat and it should be fine?
There was a problem hiding this comment.
That's just taken from ylm_real
There was a problem hiding this comment.
I know, it was wrong there as well :-p
There was a problem hiding this comment.
That quick fix was my bad. Avoiding underflow is better of course.
There was a problem hiding this comment.
sphericalbesselj_fast is quite numerically unstable for small
There was a problem hiding this comment.
Oh yeah sorry, I was looking at ylm_real which is stable, but this sphericalbesselj_fast is very unstable, eg (sin(x) - cos(x) * x) is bad
There was a problem hiding this comment.
Ideally the bound should be something like \epsilon^(1/l) or maybe \epsilon^(1/l+1)? But I am hesitant to write it like this because it might not be very GPU friendly?
| \hat f( q) = 4 \pi R_{l}^{m}(q) (-i)^{l} | ||
| \int_{{\mathbb R}^+} r^2 R(r) \ j_{l}(|q| r) / |q|^l dr, | ||
| ``` | ||
| which is better behaved for algorithmic differentiation at ``q=0``. |
There was a problem hiding this comment.
don't necessarily mention AD explicitly here? Say we choose in DFTK to work with solid harmonics because they're nicer (polynomials) and this helps in particular with AD. Also introduce the modified hankel transform here?
There was a problem hiding this comment.
I agree. AD is just one thing where this helps.
| end | ||
|
|
||
| r = norm(rvec) | ||
| # Catch cases of numerically very small r |
There was a problem hiding this comment.
same remark as above, nothing to do with eps
There was a problem hiding this comment.
I think that epsilon is reasonable since the norm computation is itself approximate. Perhaps sqrt(epsilon) would be more correct.
There was a problem hiding this comment.
Nah that's not an issue, norm(lambda x) = lambda norm(x) holds true (in a relative error sense) for a very wide range of lambdas, up to under/overflow, the magnitude just goes into the exponent
There was a problem hiding this comment.
Indeed you are right, for most issubnormal is exclusive to IEEE float types, and there are a few subtractions in the solid harmonics, so maybe an eps-based check is still reasonable as a conservative bound?
|
|
||
| # [HGH98] (7-15) except they do it with plane waves normalized by 1/sqrt(Ω). | ||
| # [HGH98] (7-15) except they do it with plane waves normalized by 1/sqrt(Ω) | ||
| # and we regularize by 1/p^l. |
There was a problem hiding this comment.
This is very annoying: you're changing the contract of eval_psp_projector_fourier (which doesn't return a fourier transform anymore), but you don't do this for the local part. Is this really really necessary? If so we need to rename things
There was a problem hiding this comment.
The local part has l=0 so the regularized vs unregularized hankel transforms are equivalent there. Same for the NLCC as well
There was a problem hiding this comment.
OK, that's fair, it never actually returned a fourier transform. I think we need to clarify what "radial part" means for such a function, since the comment
Evaluate the radial part of the `i`-th projector for angular momentum `l`
in real-space at the vector with modulus `r`.
is imprecise.
There was a problem hiding this comment.
As a suggestion, to make this clearer we could rename the _fourier functions to _reciprocal where they are not actually fourier transforms. This may help, but it may also add confusion; I'm divided.
| end | ||
|
|
||
| function hankel(r::AbstractVector, r2_f::AbstractVector, l::Integer, p::TT) where {TT <: ForwardDiff.Dual} | ||
| function hankel(quadrature, r::AbstractVector, r2_f::AbstractVector, l::Integer, p::TT) where {TT <: ForwardDiff.Dual} |
There was a problem hiding this comment.
maybe this is not necessary anymore?
| Returns the ``(l,m)`` real solid harmonic, defined as ``R_l^m(r) = r^l Y_l^m(r)``. | ||
| The solid harmonics are homogeneous polynomials of degree l, | ||
| which makes them cleanly defined and differentiable at the origin, | ||
| unlike the spherical harmonics. |
There was a problem hiding this comment.
Also refer to wikipedia in case we need to extend this later ?
There was a problem hiding this comment.
I don't find a clean table on Wikipedia. Better use the real spherical harmonic table.
There was a problem hiding this comment.
sure, but what I meant is refer to the spherical harmonics table and indicate "what one has to do" to get to these expressions.
|
|
||
| # [HGH98] (7-15) except they do it with plane waves normalized by 1/sqrt(Ω). | ||
| # [HGH98] (7-15) except they do it with plane waves normalized by 1/sqrt(Ω) | ||
| # and we regularize by 1/p^l. |
There was a problem hiding this comment.
As a suggestion, to make this clearer we could rename the _fourier functions to _reciprocal where they are not actually fourier transforms. This may help, but it may also add confusion; I'm divided.
| end | ||
|
|
||
| function hankel(r::AbstractVector, r2_f::AbstractVector, l::Integer, p::TT) where {TT <: ForwardDiff.Dual} | ||
| function hankel(quadrature, r::AbstractVector, r2_f::AbstractVector, l::Integer, p::TT) where {TT <: ForwardDiff.Dual} |
Spherical coordinates at p=0 are awkward, and this causes problems for AD wrt. k at G+k=0 for projectors that have angular momentum l=1: the derivative should not be zero, but our guards against small p make it 0. Additionally, only the product of the Hankel transform and the spherical harmonics is differentiable, but not each of them on their own.
The best fix I could find find (thanks Claude, although all the code is mine) was to move the$H(0) \cdot \partial_\alpha R_{lm}(0)$ which is always well-defined! And since $R_{lm}$ is a homogeneous polynomial of degree l this cleanly explains why only projectors at l=1 contribute to the derivative.
1/p^lfrom the spherical harmonic to the Hankel transform. This gives a "regularized" Hankel transform (whose derivative is always 0 at p=0) times a solid harmonic. With this trick the derivative is