[WIP] Additional exx kernels: Voxel averaged and Wigner-Seitz truncation#1282
[WIP] Additional exx kernels: Voxel averaged and Wigner-Seitz truncation#1282toschaefer wants to merge 11 commits into
Conversation
|
Nice! Particularly interested in the Wigner-Seitz truncation - is there any reason to do anything else? It seems to me this is the superior approach? Do we really want to support the voxel averaged? Also maybe some opportunities to share code with ewald.jl, but I haven't looked too closely. It'd be nice to have a generalized API for all these functions that return centered functions (we have a couple of those: densities guesses, local pseudos, projectors, coulomb kernels, etc.). I think long-term we want the ability to get their real-space and partial Fourier transforms (wrt 1 or 2 dimensions), so we can do eg 2D materials. But maybe it's not worth unifying since they have quite different requirements (numerical psp vs analytic but singular coulomb really are quite different computationally) CC @denizunel |
I think so, but I'd keep that for later, same as the generalized API. I'm currently looking at this on the side to streamline the code a bit, will hopefully push a review in the next week. |
For EXX of gapped systems Wigner-Seitz truncation is probably the best approach to reach the thermodynamic llimit.
We need it for surface science (strongly anisotropic cells) with coupled cluster theory. Since our plan is to make DFTK a standard workflow, voxel averaged would be essential. |
|
rebased to make merge easier |
|
Is this still WIP or ready for review? |
|
from my end, ready for review |
| - [`SphericallyTruncatedCoulomb`](@ref): θ(R-r)/r | ||
| - [`WignerSeitzTruncatedCoulomb`](@ref): χ(r)/r where χ(r)=1 inside Wigner-Seitz cell, otherwise 0. | ||
|
|
||
| If an interaction model features a singularity, that requires some special treatment, |
There was a problem hiding this comment.
no comma after singularity (germanism)
There was a problem hiding this comment.
features a singularity -> is long-range
| # where Γ is BZ volume) is used. | ||
| # (where Γ is BZ volume) is used. | ||
| # _compute_kernel_fourier(::InteractionKernel, basis, qpt, q) | ||
| # The single q-point version of compute_kernel_fourier |
There was a problem hiding this comment.
there's compute_ and eval_, should all these be merged?
There was a problem hiding this comment.
(we really need to have a unified API for localized functions that can be evaluated in real or reciprocal space)
There was a problem hiding this comment.
also should we have a eval_kernel_real? This is often useful for debug purposes (can be left as TODO if you don't feel like doing it)
There was a problem hiding this comment.
compute_ sets up and returns the full vector with all the complexity of how the sampling points (including the singularity) are defined for each G+q while eval_ is more the mathematical evaluation of the kernel at a given point G+q.
I added a TODO for eval_kernel_real.
There was a problem hiding this comment.
I don't get this. Both compute the same thing and should be consistent. If the point is that _compute_kernel_fourier(::InteractionKernel, basis, qpt, q) takes multiple kpoints then that should be a separate method of the same function. To be clear, I'd suggest
# computes khat(p). Not guaranteed to be fast
eval_kernel_fourier(::Coulomb, p) = 1/p^2
# eval_kernel_fourier computes khat(G+pshift) for all G in basis using a possibly faster algorithm than just looping over all G+pshift
# default implementation
function eval_kernel_fourier(k::InteractionKernel, basis, pshift)
map(G_vectors(basis)) do G
eval_kernel_fourier(G+pshift)
end
end
function eval_kernel_fourier(k::WignerSeitz, p)
r = eval_kernel_fourier(ShortRange(k.ω))
# add to r the long-range by "slow Fourier transform", just looping over all r_vectors
end
function eval_kernel_fourier(k::WignerSeitz, pshift)
# FFT based algorithm
end
# add a test that this overload is correct: for all kernels, the result eval_kernel_fourier(k,basis,pshift)
There was a problem hiding this comment.
I see. I think there was a misleading comment. I updated the doc string for InteractionKernel to explain why we have eval_kernel_fourier and compute_kernel_fourier for the splitting: (i) formula and (ii) discretization/singularity handling.
@doc raw"""
Abstract type for different interaction models.
### Architecture
Computing interaction kernels is split into two parts: the mathematical formula (e.g. 4\pi/G^2) and the grid discretization. This split is primarily driven by the need to handle singularities in long-range kernels.
1. **InteractionKernel:** Defines the pure mathematical formula (via `eval_kernel_fourier`).
2. **regularization:** Necessary for long-range kernels (like `Coulomb` and `LongRangeCoulomb`) diverge as ``G+q \to 0``. Evaluating them on a periodic grid requires a specific strategy to handle this divergence.
Because of this divergence, long-range `InteractionKernel`s contain a `regularization` field to dictate how the ``G
+q=0`` component is built via `_compute_kernel_fourier`. Short-range kernels have a finite limit at `G+q \to 0``` a
nd don't need a regularizatin.
### Available models:
- [`Coulomb`](@ref): 1/r
- [`ShortRangeCoulomb`](@ref): erfc(μr)/r
- [`LongRangeCoulomb`](@ref): erf(μr)/r
- [`SphericallyTruncatedCoulomb`](@ref): θ(R-r)/r
- [`WignerSeitzTruncatedCoulomb`](@ref): χ(r)/r (1 inside Wigner-Seitz cell, 0 otherwise)
### Available singularity corrections (regularizations):
- [`ProbeCharge`](@ref): Gygi-Baldereschi probe charge method
- [`ReplaceSingularity`](@ref): Set the G+q=0 component to a specific value
- [`VoxelAveraged`](@ref): Average the continuous kernel over the Brillouin zone voxel
See also: [`compute_kernel_fourier`](@ref)
"""
abstract type InteractionKernel end
Base.Broadcast.broadcastable(k::InteractionKernel) = Ref(k)
# TODO: should we have a eval_kernel_real?
# TODO: rename "k" in _compute_kernel_fourier(k...
# TODO: change notation: p instead of G, G+q, ...
|
|
||
| See also: [`compute_kernel_fourier`](@ref) | ||
| """ | ||
| abstract type InteractionKernel end |
There was a problem hiding this comment.
AbstractCoulombKernel? Interaction seems a bit generic (these are all Coulomb-like)
There was a problem hiding this comment.
After some back-and-forth we called it InteractionKernel in #1223 (comment)
| @@ -18,4 +21,4 @@ | |||
There was a problem hiding this comment.
I think we use relatively consistently (or at least we wanted to) p for the Fourier-space variable in \R^3 (so usually things like p=G+k, and q=k-k'). Why mention G at all in this file? Can't you just use p?
There was a problem hiding this comment.
Principally, I am happy to follow your stylistic preferences. I don't think we've mentioned this yet during the EXX development, and if I shall change it, I would prefer to postpone it for now, since I'm already working on the next PR (which also includes some changes in src/coulomb.jl).
There was a problem hiding this comment.
OK, maybe leave a TODO? I think generally this file should be about just providing interaction(p), regardless of where p is coming from (G+q, G...)
| @@ -26,7 +29,7 @@ | |||
There was a problem hiding this comment.
give the equation defining this. Gsq -> p? Or ps?)
There was a problem hiding this comment.
Gsq refers to G squared so psq? But I am not sure if I understand correctly...
| d_min = min(d_min, d) | ||
| end | ||
| if d_min > sqrt(eps(T)) | ||
| V_lr_real[idx] = erf(ω * d_min) / d_min |
There was a problem hiding this comment.
as above, define a erf(x)/x function?
There was a problem hiding this comment.
Hm I think we should have a general way of doing f(x)/x stably and AD-friendly, this is coming up pretty frequently (also with @Technici4n in the spherical harmonics stuff)
There was a problem hiding this comment.
Ah but of course it's my old friend the divided difference (high order in the case of f(x)/x^l). We have a general solution in https://github.com/xuequan818/MatrixFuns.jl but it might be a bit overkill...?
There was a problem hiding this comment.
(the reason I want to get this right is otherwise it will come up to bite us in the ass with AD)
There was a problem hiding this comment.
I want to do things like
function expm1_over_x(x)
if abs(x) < 1000floatmin(Float64)
x = 1000floatmin(Float64)
end
expm1(x) / x
end
but of course this screws up AD at 0. We can have a cutoff point and switch to a finite order taylor expansion. This is ugly and getting good precision of higher derivatives is annoying, but I'm not sure what else to do.
There was a problem hiding this comment.
Ok, I will extend the TODO comments by a general statement, that we need a clever and AD-friendly solution for a possible f(x)/x machinery.
There was a problem hiding this comment.
yes, below abstract type InteractionKernel end.
| V_lr_real = zeros(Complex{T}, basis.fft_size...) | ||
| for idx in CartesianIndices(V_lr_real) | ||
| r_frac = r_vectors[idx] | ||
| r_centered = r_frac .- round.(r_frac) # MIC |
There was a problem hiding this comment.
write it out explicitly (I had to think to get what it was referring to)
| r_centered = r_frac .- round.(r_frac) # MIC | ||
| r_cart = model.lattice * r_centered | ||
| d_min = norm(r_cart) | ||
| for dx in -nx:nx, dy in -ny:ny, dz in -nz:nz # Check neighbors for non-orthorhombic cells |
There was a problem hiding this comment.
I have to say I don't understand what's going on here, why isn't this just erf(r)/r for r in r_vectors?, maybe factor out the geometry stuff to another function?
There was a problem hiding this comment.
Took me a while to understand what I did here. Added better comments to make it understandable.
| Gnorm2 = sum(abs2, G_cart) | ||
| found_singularity = (iG==1 && iszero(q)) | ||
| Rcut = cbrt(basis.model.unit_cell_volume*3/4/π) | ||
| if !found_singularity |
There was a problem hiding this comment.
why don't you just write the condition here? Also why not iszero(G_cart)?
There was a problem hiding this comment.
(also same comment as above, hopefully we can avoid this special casing)
| end | ||
| kernel_fourier | ||
| end | ||
|
|
There was a problem hiding this comment.
Running out of time today, note to self to resume the review from here.
|
Thanks for the review! |
|
I'll try to get to it soon but it's pretty hectic at the moment, sorry... Since you're very active on this I don't want to be a blocker, but at the same time I think one of the key differentiator of DFTK is that we have relatively clean code (not really in the sense that we follow best practices, but rather in the sense that the code is relatively minimal and understandable), and code review really helps with that I find (in the sense that the code we let in has been understood at least once by at least two people). I think the best way to do that is that you fork DFTK and make your changes there. Then, when we merge a PR on the main DFTK repo (or even along the way when you make commits to the PR), you merge it on to your fork. Since the git history is preserved I think the whole process should be relatively painless (in the sense that you shouldn't have to deal with much conflicts) and you don't have to wait on us to merge PRs to progress. (also AI made everything git much smoother, I find) In exchange I promise I'll get to it eventually in a reasonable timeframe. @mfherbst might have a better idea? |
|
I totally agree with all of that. That’s why DFTK is my choice ;) Regarding the fork strategy, normally I do it exactly like that. In this case, I shot myself in the foot because I refactored the coulomb.jl files in both branches in a somewhat non-orthogonal way. This is why I put the k-points branch on hold until this one merged. That’s the only reason I brought it up, but please don't stress. I appreciate you taking the time to review it properly when you find time. |
antoine-levitt
left a comment
There was a problem hiding this comment.
OK, I'm fine with merging the current code with lots of todo, to be adressed (by you, me or Claude :-p) at some point in the future. To me the API is the most pressing concern - it'd be nice to merge eval_kernel_fourier and _compute_kernel_fourier and clarify the relationship between the two.
|
|
||
| See also: [`compute_kernel_fourier`](@ref) | ||
| """ | ||
| abstract type InteractionKernel end |
There was a problem hiding this comment.
OK, maybe leave a TODO? I think generally this file should be about just providing interaction(p), regardless of where p is coming from (G+q, G...)
| # where Γ is BZ volume) is used. | ||
| # (where Γ is BZ volume) is used. | ||
| # _compute_kernel_fourier(::InteractionKernel, basis, qpt, q) | ||
| # The single q-point version of compute_kernel_fourier |
There was a problem hiding this comment.
I don't get this. Both compute the same thing and should be consistent. If the point is that _compute_kernel_fourier(::InteractionKernel, basis, qpt, q) takes multiple kpoints then that should be a separate method of the same function. To be clear, I'd suggest
# computes khat(p). Not guaranteed to be fast
eval_kernel_fourier(::Coulomb, p) = 1/p^2
# eval_kernel_fourier computes khat(G+pshift) for all G in basis using a possibly faster algorithm than just looping over all G+pshift
# default implementation
function eval_kernel_fourier(k::InteractionKernel, basis, pshift)
map(G_vectors(basis)) do G
eval_kernel_fourier(G+pshift)
end
end
function eval_kernel_fourier(k::WignerSeitz, p)
r = eval_kernel_fourier(ShortRange(k.ω))
# add to r the long-range by "slow Fourier transform", just looping over all r_vectors
end
function eval_kernel_fourier(k::WignerSeitz, pshift)
# FFT based algorithm
end
# add a test that this overload is correct: for all kernels, the result eval_kernel_fourier(k,basis,pshift)
| end | ||
| ShortRangeCoulomb(; μ=0.2/u"Å") = ShortRangeCoulomb(austrip(μ)) | ||
| ShortRangeCoulomb(μ::Quantity) = ShortRangeCoulomb(austrip(μ)) | ||
| function eval_kernel_fourier(k::ShortRangeCoulomb, Gsq::T) where {T} |
There was a problem hiding this comment.
sorry I really should have reviewed the previous PR...
| end | ||
| function LongRangeCoulomb(; μ=0.2/u"Å", regularization=ProbeCharge()) | ||
| LongRangeCoulomb(austrip(μ), regularization) | ||
| end |
There was a problem hiding this comment.
@mfherbst I think we should keep the unit stuff away from the code as much as possible
mfherbst
left a comment
There was a problem hiding this comment.
Some minor comments. Main point is still the missing test. I've added some ideas what one could test. Feel free to object and defer if you don't have the time now, but I think it will help in the long run to have such consistency tests.
|
|
||
| ### Architecture | ||
|
|
||
| Computing interaction kernels is split into two parts: the mathematical formula (e.g. 4\pi/G^2) and the grid discretization. This split is primarily driven by the need to handle singularities in long-range kernels. |
There was a problem hiding this comment.
Refer to the new issue here to make it clear this is not yet finalised.
| # Each InteractionKernel should support the following functions: | ||
| # eval_kernel_fourier(::InteractionKernel, Gsq) | ||
| # eval_probe_charge_integral(::InteractionKernel, α) | ||
| # Should return ∫_{BZ} kernel(q) * e^(-α * q^2) dq | ||
| # This is needed for the ProbeCharge regularisation. Note, that no factor 1/Γ | ||
| # where Γ is BZ volume) is used. | ||
| # _compute_kernel_fourier(::InteractionKernel, basis, qpt, q) | ||
| # The single q-point version of compute_kernel_fourier |
There was a problem hiding this comment.
I would not remove this, it's a pretty concise and compact overview of what needs to be done.
| # TODO: introduce a clever and AD-friendly way to deal with f(x)/x for x->0. E.g. intoduce phi(x) = iszero(x) ? one(x) : expm1(x) / x | ||
| # Also very useful for other InteractionKernels. |
There was a problem hiding this comment.
This feels a bit out of place here.
There was a problem hiding this comment.
moved it to the other TODOs below InteractionKernel
| # TODO: This is a bit hackish as the parameter needs to be re-computed every kernel | ||
| # evaluation. Cleaner would be to move this further up in the call hierarchy, | ||
| # such that compute_kernel_fourier is never called without Rcut being set to | ||
| # not nothing |
There was a problem hiding this comment.
I agree with the first sentence, but I don't agree with the second (in the sense that this would be the solution. Maybe just have the first sentence and let us consider how to solve it when it actually becomes an issue. It does not cost that much compute after all ...
| d_min = min(d_min, d) | ||
| end | ||
| if d_min > sqrt(eps(T)) | ||
| V_lr_real[idx] = erf(ω * d_min) / d_min |
|
|
||
| # Use ReplaceSingularity regularisation to explicitly set as the G==0 | ||
| # component the exact limit of the kernel for G->0 | ||
| _compute_kernel_fourier(kRcut, ReplaceSingularity(2π*Rcut^2), basis, qpt, q) |
| k_wstrunc = compute_kernel_fourier(WignerSeitzTruncatedCoulomb(), basis) | ||
| E_wstrunc = exx_energy_only(basis, kpt, k_wstrunc, ψk_real, occk) | ||
| E_ref = -2.3456813523805415 | ||
| @test abs(E_ref - E_wstrunc) < 1e-6 |
There was a problem hiding this comment.
Is it possible to add a test that this is somewhat related to SphericallyTruncatedCoulomb, e.g. test that we get similar entries in k_wstrunc versus k_strunc or that some behaviour between the two agrees ? One may need to set Rcut for the SphericallyTruncatedCoulomb to a special value for this (e.g. the actual size of the cell etc.), but that sounds fair to me.
A good way to test this could be to consider evaluating the kernel on cells of the form a * Diagonal([1, 1, 1]) with larger and larger values of a, I would expect the two implementations to agree more and more as we make a larger.
There was a problem hiding this comment.
I'll also add a few tests on non-cubic cells, just that we have something where Wigner-Seitz should make a real difference.
There was a problem hiding this comment.
Thanks for the non-cubic test.
I am not sure if the sampling points of the kernel of SphericallyTruncatedCoulomb and WignerSeitzTruncatedCoulomb will get more and more similar when you increase the volume of a cubic cell. When I look at the spherically truncated case, the cos function will oscillate wildly with increasing R: v(G) = 4π/G² * (1 - cos(GR)).
Wigner-Seitz will also oscillate like crazy but probably not in the same way.
In the limit, the physics will be the same, but the individual sampling points probably not.
There was a problem hiding this comment.
In the limit, the physics will be the same, but the individual sampling points probably not. But does this not imply that at least the energy should get similar ?
Well if it does not really work, than let's not get too hung up about it.
There was a problem hiding this comment.
Yes, the energy should get similar, but in the limit this should hold for all kernels. Of course, converging at different rates...
So what you mean is to define a large a and then calculate exx_energy_only. The energies will agree to some extend and we simply define a test like @test E_wstrunc ≈ E_strunc atol=1e-3 with some apropriate atol?
There was a problem hiding this comment.
Yes, essentially this captures we get the right limit and that things converge at the right order (if we take two different as ... at least in theory
Two additional useful strategies for the Coulomb kernel for electron repulsion integrals (e.g. for exx).
Wigner-Seitz truncation
R. Sundararaman, T. A. Arias. Phys. Rev. B 87, 165122 (2013)
Voxel averaged (mean potential)
T. Schaefer et al, J. Chem. Phys. 160, 051101 (2024)
This requires Gauss-Legendre quadrature points, hence I added the package FastGaussQuadrature to the dependencies.