Skip to content

Fix differentiation of nonlocal projectors at p=0 by switching to solid harmonics#1318

Open
Technici4n wants to merge 4 commits into
masterfrom
nonlocal-deriv
Open

Fix differentiation of nonlocal projectors at p=0 by switching to solid harmonics#1318
Technici4n wants to merge 4 commits into
masterfrom
nonlocal-deriv

Conversation

@Technici4n

Copy link
Copy Markdown
Collaborator

Spherical coordinates at p=0 are awkward, and this causes problems for AD wrt. k at G+k=0 for projectors that have angular momentum l=1: the derivative should not be zero, but our guards against small p make it 0. Additionally, only the product of the Hankel transform and the spherical harmonics is differentiable, but not each of them on their own.

The best fix I could find find (thanks Claude, although all the code is mine) was to move the 1/p^l from the spherical harmonic to the Hankel transform. This gives a "regularized" Hankel transform (whose derivative is always 0 at p=0) times a solid harmonic. With this trick the derivative is $H(0) \cdot \partial_\alpha R_{lm}(0)$ which is always well-defined! And since $R_{lm}$ is a homogeneous polynomial of degree l this cleanly explains why only projectors at l=1 contribute to the derivative.

@antoine-levitt

Copy link
Copy Markdown
Member

Hm, feels wrong to change the interface just to catch a numerical stability bug. Looking at the original function I think just writing iszero(r) && return 0; r = normalize(r)) at the beginning of the function should fix any issue?

@antoine-levitt

Copy link
Copy Markdown
Member

To be clear, the code on master is bad: the potential numerical issue has nothing to do with eps(), it's really about under and overflow, and normalize() fixes this. It's potentially slower though (needs a sqrt), so maybe we want to do the normalization only in case of potential under/overflow (something like max(abs(r))^l > typemax(T) or something)? Or we find a library that actually has thought about such things and steal it.

@antoine-levitt

Copy link
Copy Markdown
Member

Why don't we just depend on https://github.com/jishnub/SphericalHarmonics.jl?

@Technici4n

Copy link
Copy Markdown
Collaborator Author

Well the spherical harmonic is fundamentally not differentiable at 0. Only the product of the spherical harmonic and the hankel transform is. The problem is precisely that for l=1 the derivative should not be zero!

@antoine-levitt antoine-levitt left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think I buy your argument but I'm not happy about it... Apart from minor stuff, the important bit is that we can't have psp_fourier be one thing and local_fourier be another.

Comment thread src/common/hankel.jl Outdated
Comment thread src/common/hankel.jl
l == 2 && return 4T(π) * quadrature((i, ri) -> r2_f[i] * ri^2, r) / 15
l == 3 && return 4T(π) * quadrature((i, ri) -> r2_f[i] * ri^3, r) / 105
l == 4 && return 4T(π) * quadrature((i, ri) -> r2_f[i] * ri^4, r) / 945
l == 5 && return 4T(π) * quadrature((i, ri) -> r2_f[i] * ri^5, r) / 10395

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just have a normalization tuple and return quadrature() / normalization[l+1]

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it like this to keep a literal pow for speed:tm: :D

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's a tuple both are equivalent for the compiler so it should be the same speed?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think so too. Only thing is to take care of proper type propagation to make sure that things are inlined (e.g. one may need an @inbounds to suppress the bounds checking code)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to keep the explicit ^ <literal> here.

Comment thread src/common/hankel.jl
function hankel(quadrature, r::AbstractVector, r2_f::AbstractVector, l::Integer, p::T)::T where {T<:Real}
@assert length(r) == length(r2_f)
4T(π) * quadrature(r) do i, ri
if abs(p) <= 10 * eps(T)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The eps here is wrong; this has nothing to do with eps, as fp are always relative and manipulating very small numbers and dividing with other very small numbers is completely fine. The issue here is underflow and p==0, so something with floatmin (with very generous margins, probably) is more appropriate.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we can bypass this altogether with something like if abs(p) < minfloat p = minfloat and it should be fine?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's just taken from ylm_real

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, it was wrong there as well :-p

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That quick fix was my bad. Avoiding underflow is better of course.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sphericalbesselj_fast is quite numerically unstable for small $p$ anyway so better keep it like this than use minfloat.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah sorry, I was looking at ylm_real which is stable, but this sphericalbesselj_fast is very unstable, eg (sin(x) - cos(x) * x) is bad

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally the bound should be something like \epsilon^(1/l) or maybe \epsilon^(1/l+1)? But I am hesitant to write it like this because it might not be very GPU friendly?

Comment thread docs/src/developer/useful_formulas.md Outdated
Comment thread docs/src/developer/useful_formulas.md Outdated
\hat f( q) = 4 \pi R_{l}^{m}(q) (-i)^{l}
\int_{{\mathbb R}^+} r^2 R(r) \ j_{l}(|q| r) / |q|^l dr,
```
which is better behaved for algorithmic differentiation at ``q=0``.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't necessarily mention AD explicitly here? Say we choose in DFTK to work with solid harmonics because they're nicer (polynomials) and this helps in particular with AD. Also introduce the modified hankel transform here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. AD is just one thing where this helps.

end

r = norm(rvec)
# Catch cases of numerically very small r

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same remark as above, nothing to do with eps

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that epsilon is reasonable since the norm computation is itself approximate. Perhaps sqrt(epsilon) would be more correct.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah that's not an issue, norm(lambda x) = lambda norm(x) holds true (in a relative error sense) for a very wide range of lambdas, up to under/overflow, the magnitude just goes into the exponent

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed you are right, for most $\mathbf r$ values the spherical harmonics are very stable (until you get to subnormal numbers, more or less). But issubnormal is exclusive to IEEE float types, and there are a few subtractions in the solid harmonics, so maybe an eps-based check is still reasonable as a conservative bound?

Comment thread src/common/spherical_harmonics.jl Outdated
Comment thread src/pseudo/NormConservingPsp.jl Outdated
Comment thread src/pseudo/PspHgh.jl Outdated

# [HGH98] (7-15) except they do it with plane waves normalized by 1/sqrt(Ω).
# [HGH98] (7-15) except they do it with plane waves normalized by 1/sqrt(Ω)
# and we regularize by 1/p^l.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very annoying: you're changing the contract of eval_psp_projector_fourier (which doesn't return a fourier transform anymore), but you don't do this for the local part. Is this really really necessary? If so we need to rename things

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local part has l=0 so the regularized vs unregularized hankel transforms are equivalent there. Same for the NLCC as well

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that's fair, it never actually returned a fourier transform. I think we need to clarify what "radial part" means for such a function, since the comment

Evaluate the radial part of the `i`-th projector for angular momentum `l`
in real-space at the vector with modulus `r`.

is imprecise.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a suggestion, to make this clearer we could rename the _fourier functions to _reciprocal where they are not actually fourier transforms. This may help, but it may also add confusion; I'm divided.

end

function hankel(r::AbstractVector, r2_f::AbstractVector, l::Integer, p::TT) where {TT <: ForwardDiff.Dual}
function hankel(quadrature, r::AbstractVector, r2_f::AbstractVector, l::Integer, p::TT) where {TT <: ForwardDiff.Dual}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this is not necessary anymore?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should that be ?

Returns the ``(l,m)`` real solid harmonic, defined as ``R_l^m(r) = r^l Y_l^m(r)``.
The solid harmonics are homogeneous polynomials of degree l,
which makes them cleanly defined and differentiable at the origin,
unlike the spherical harmonics.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also refer to wikipedia in case we need to extend this later ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find a clean table on Wikipedia. Better use the real spherical harmonic table.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, but what I meant is refer to the spherical harmonics table and indicate "what one has to do" to get to these expressions.

Comment thread src/pseudo/PspHgh.jl Outdated

# [HGH98] (7-15) except they do it with plane waves normalized by 1/sqrt(Ω).
# [HGH98] (7-15) except they do it with plane waves normalized by 1/sqrt(Ω)
# and we regularize by 1/p^l.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a suggestion, to make this clearer we could rename the _fourier functions to _reciprocal where they are not actually fourier transforms. This may help, but it may also add confusion; I'm divided.

end

function hankel(r::AbstractVector, r2_f::AbstractVector, l::Integer, p::TT) where {TT <: ForwardDiff.Dual}
function hankel(quadrature, r::AbstractVector, r2_f::AbstractVector, l::Integer, p::TT) where {TT <: ForwardDiff.Dual}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should that be ?

Comment thread test/PspHgh.jl
Comment thread test/PspUpf.jl
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants