[Optimization] Photoionization ODE performance#1946
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the photochemistry solver by removing the radiation flux from the ODE integration and instead attenuating it algebraically based on the change in photon density. It also introduces a configuration option to skip radius plotting in the DTypeFront test. The review feedback correctly points out that the algebraic flux attenuation logic currently assumes a single chemical band and should be updated to store and compute attenuation factors per-band to support multi-band networks.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
…ka-astro/quokka into BenWibking/photochemistry-flux-elim
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
Hmm, not sure what's causing the GPU failures here. It works on my local NVIDIA H200 node and on my laptop. I'll have to look at this next week. |
|
Unfortunately I think we should not merge this because, while it speeds up this particular case, it is not a generalizable solution that we can use generically — which is why the code was written the way it was in the first place. The problem is that this approach works only if photons can only be absorbed, not if they can also be created. Consider a situation in which we have an external radiation field entering a cell that supplies a non-zero photon number density and photon flux, which is being absorbed in the cell, but we also have some process in the cell that creates photons in the same band. In this case we could have a situation where essentially all the incoming photons get absorbed, but at the same time the emission process creates an equal number of photons. The emitted photons will be isotropic (neglecting the small effects of relativistic beaming), so in this situation the right answer is that the flux goes down to ~0, but the photon number density remains unchanged. In this situation you clearly can’t deduce the change in flux just from the change in photon density. The situation I’m describing it’s just an idle thought experiment — there are plenty of real astrophysical systems where this sort of effect is important. An example is photoevaporation of protoplanetary disks, where the disk may be shadowed from direct stellar radiation by a puffed-up inner rim, but the disk surface nonetheless receives a significant ionizing photon flux because recombinations to the ground state in the disk corona create ionizing photons that are isotropic in their direction and are produced at positions above the disk that aren’t shadowed. |
Sure, I agree that in general this will not work. But I think we do want to optimize the photoionization network for Gen2 if possible. 10% is a lot of node-hours. |
Ok, but how are you proposing this be handled in the future? The current network is a toy example; it’s not the one we’re planning on using for the Gen 2 runs. For production runs we will be using a more realistic cooling treatment that will have to be generated with JAFF, so the optimizations you’re making here won’t have any effect on the production runs. Moreover, for the production runs I would advocate that we should use a network that properly does photon redistribution by having a case A recombination coefficient and keeping the recombination to the ground state terms, rather than just adopting the on-the-spot approximation as in this toy example. We’re doing so here only because the similarity solution to such we’re comparing also uses the on-the-spot approximation. If we drop that approximation for the production runs, your trick won’t work for them. |
Right, this is just a performance prototype. I propose we have an option in JAFF to turn on this optimization. It can be off by default, since that's more general. I would advocate for using case B for the production simulations. I am skeptical that M1 can accurately compute the radiation anisotropy for case A, so I don't see how that would give us a better answer in practice. |
I don’t agree on case B, but we don’t need to resolve that now. I’m fine with going ahead with this PR as long as it is understood that what we’re doing is just scoping out a feature to add to JAFF. |
|
Just to clarify: the goal of this PR is an existence proof for how fast this ODE system can be solved and a real-world proof of the LLM-driven performance optimization strategy. Once we have this version optimized, we can try to ensure that JAFF reaches (or at least gets close) to this performance baseline for the final version of the network that we use in production. |
|
@markkrumholz Ok, eliminating the radiation flux actually makes this 10 times faster. But this is actually an artifact of how extremely small the current absolute radiation flux tolerance is. |
|
@BenWibking If we really want to remove flux for some networks / problems and have them for others, I can add a preprocessor symbol that keeps or removes the flux variable from Microphysics and photochemistry.hpp. |
I think this is the sort of thing that would be better to put into JAFF, rather than putting it in Quokka. JAFF already has an option for whether, when emitting a photochemistry network, if should emit equations for the photon number densities only or also for the fluxes. This is just a slight variation on that. |
…ka-astro/quokka into BenWibking/photochemistry-flux-elim
|
You would need to change array sizes. Right now, it's just |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
For concreteness, this table shows the latest performance results with a column for the relative cost of one radiation substep compared to one hydro-only step: |
|
One warning to fix: |
Right. This can be understood by the fact that we do not include the same isotropic, recombination term in flux as in photon number density. Removing the convergence check on radiation flux ( #1956 ) resolves this concern and gives the benefits of both worlds. |
Yes, I agree that is a better solution. It's both faster and more accurate. I'll put the flux equation back into this PR, but keep the other optimiztions. Those optimizations might still make it up to ~30% faster. |
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
…ka-astro/quokka into BenWibking/photochemistry-flux-elim
|
The remaining microoptimizations don't appear to have a significant effect now that the tolerances are much looser. ROS2S does make a significant (+20%) walltime improvment on DTypeFront, but it can already be used on the development branch. Closing. |
Description
More performance optimizations (+measured improvement for
DTypeFrontproblem):developmentand on this branch.)sqrtandpowin cooling terms.(Note: Rate terms are NOT modified.
expis NEVER modified anywhere.)ROS2S is the fastest integrator (measured on 1 H200 GPU), both before and after the other optimizations.
However, the other changes in this PR now appear to make it slower. This is under investigation.
Measured 23 Jun 2026 on
development:Measured 23 Jun 2026 on this PR:
Each batch of performance improvements is generated with variations on this prompt:
Either a specific area of the code to focus on or an optimization strategy for the agent to follow can be appended to this generic prompt.
Related issues
N/A
Checklist
Before this pull request can be reviewed, all of these tasks should be completed. Denote completed tasks with an
xinside the square brackets[ ]in the Markdown source below:/azp run.