@tamaraevst tamaraevst commented Aug 22, 2025

Hello,

This is my preliminary attempt to replace GRChombo's AMRInterpolator with AMReX particles, in a class called ParticleInterpolators (we can rename it to whatever you like later on). The folder that will eventually replace the AMRInterpolator is Source/ParticleInterpolators. I borrowed InterpolationQuery.hpp from the original AMRInterpolator but renamed it to InterpolationQueryParticle.hpp to avoid conflicts, as the AMRInterpolator still remains in the repo.

The key features:

  1. Lagrange interpolation is implemented as a separate class.
  2. A test in Tests/ using a polynomial is implemented to test the Lagrange interpolation.
  3. Particles are boundary aware: if symmetric BCs are employed and the query point lies on the symmetric side, outside the AMReX grid, the particle is pushed back into the computational domain. The particle stores the value of the interpolated field with the required parity sign applied.
  4. CustomExtraction is ported and for now supports interpolation of one field for a given query of points.
  5. The BinaryBH example uses this CustomExtraction to interpolate $\chi$ along a line. Some arguments here are hard-coded to arbitrary values and are not user-specifiable, but this can be fixed very easily and is most likely the least of our problems.

TO-DO for me on a fresh head:

  • Modify the check_domain() function, which checks whether we are inside the domain, to be parity aware; otherwise it throws errors when we ask to interpolate inside the domain but on the side where symmetries are applied.

  • So far the code compiles on GPUs, but I need to run an example (I am having some technical difficulties). On CPUs everything is fine.
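
A parity-aware domain check could look roughly like the sketch below. Everything in it is an assumption for illustration: the `ReflectedPoint` struct, the `reflect_into_domain` function and its arguments are hypothetical and are not the PR's `check_domain()`. For brevity it handles reflective boundaries on the low side only and a single variable.

```cpp
#include <array>

// Hypothetical result type, for illustration only
struct ReflectedPoint
{
    std::array<double, 3> coords; // position mapped into the grid
    int sign;                     // +1 or -1, multiplies the interpolated value
    bool valid;                   // false if the point cannot be mapped inside
};

// Maps a query point into [lo, hi]. A point below a reflective low boundary
// is mirrored across it; the sign flips if the variable is odd in that
// direction. Any point still outside afterwards is marked invalid.
inline ReflectedPoint
reflect_into_domain(const std::array<double, 3> &point,
                    const std::array<double, 3> &lo,
                    const std::array<double, 3> &hi,
                    const std::array<bool, 3> &lo_is_reflective,
                    const std::array<bool, 3> &var_is_odd)
{
    ReflectedPoint out{point, 1, true};
    for (int dir = 0; dir < 3; ++dir)
    {
        if (out.coords[dir] < lo[dir] && lo_is_reflective[dir])
        {
            out.coords[dir] = 2.0 * lo[dir] - out.coords[dir];
            if (var_is_odd[dir])
            {
                out.sign = -out.sign;
            }
        }
        if (out.coords[dir] < lo[dir] || out.coords[dir] > hi[dir])
        {
            out.valid = false;
        }
    }
    return out;
}
```

The point of the design is that a query on the symmetric side is accepted and mapped rather than rejected, while a point outside a non-reflective boundary is still reported as invalid.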

Please let me know your thoughts.

@github-actions

This PR modifies the following files which are ignored by .lint-ignore:

Source/ParticleInterpolators/ParticleInterpolators.impl.hpp
Tests/LagrangeTest/PolynomialTest.impl.hpp

Please consider removing the corresponding patterns from .lint-ignore so that these files can be linted.

@tamaraevst tamaraevst added the enhancement label Aug 22, 2025
@tamaraevst tamaraevst linked an issue Aug 22, 2025 that may be closed by this pull request

github-actions bot commented Aug 22, 2025

Cpp-Linter Report ⚠️

Some files did not pass the configured checks!

clang-tidy (v19.1.1) reports: 20 concern(s)
  • Source/BlackHoles/BHAMR.hpp:65:48: warning: [readability-identifier-length]

    parameter name 'q' is too short, expected at least 3 characters

       65 |     void set_query(InterpolationQueryParticle &q) override
          |                                                ^
  • Source/BlackHoles/BHAMR.hpp:90:16: warning: [readability-implicit-bool-conversion]

    implicit conversion 'InterpolationQueryParticle *' -> 'bool'

       90 |         return m_query ? &*m_query : nullptr;
          |                ^
          |                (       != nullptr)
  • Source/GRTeclynCore/AMReXParameters.hpp:229:57: warning: [readability-redundant-string-cstr]

    redundant call to 'c_str'

      229 |             if (!((N_full > 0 || N > 0) && !pp.contains(name.c_str()) &&
          |                                                         ^~~~~~~~~~~~
          |                                                         name
  • Source/GRTeclynCore/AMReXParameters.hpp:230:32: warning: [readability-redundant-string-cstr]

    redundant call to 'c_str'

      230 |                   !pp.contains(name_full.c_str())) &&
          |                                ^~~~~~~~~~~~~~~~~
          |                                name_full
  • Source/GRTeclynCore/AMReXParameters.hpp:231:58: warning: [readability-redundant-string-cstr]

    redundant call to 'c_str'

      231 |                 !((N_full < 0 && N < 0) && !(pp.contains(name.c_str()) &&
          |                                                          ^~~~~~~~~~~~
          |                                                          name
  • Source/GRTeclynCore/AMReXParameters.hpp:232:58: warning: [readability-redundant-string-cstr]

    redundant call to 'c_str'

      232 |                                              pp.contains(name_full.c_str()))))
          |                                                          ^~~~~~~~~~~~~~~~~
          |                                                          name_full
  • Source/GRTeclynCore/AMReXParameters.hpp:241:33: warning: [readability-redundant-string-cstr]

    redundant call to 'c_str'

      241 |                 if (pp.contains(name_full.c_str()))
          |                                 ^~~~~~~~~~~~~~~~~
          |                                 name_full
  • Source/GRTeclynCore/GRAMR.cpp:48:13: warning: [readability-convert-member-functions-to-static]

    method 'convert_derived_multifabs' can be made static

       48 | void GRAMR::convert_derived_multifabs(
          |             ^
  • Tests/Common/SimulationParameters.hpp:17:5: warning: [cppcoreguidelines-pro-type-member-init]

    constructor does not initialize these fields: num_points

       17 |     SimulationParameters(GRParmParse &pp) : SimulationParametersBase(pp)
          |     ^
  • Tests/LagrangeTest/InterpolatorTestLevel.hpp:30:10: warning: [cppcoreguidelines-explicit-virtual-functions]

    annotate this function with 'override' or (rarely) 'final'

       30 |     void initData()
          |          ^         
          |                     override
  • Tests/LagrangeTest/InterpolatorTestLevel.hpp:35:10: warning: [cppcoreguidelines-explicit-virtual-functions]

    annotate this function with 'override' or (rarely) 'final'

       35 |     void specificEvalRHS(amrex::MultiFab &a_soln, amrex::MultiFab &a_rhs,
          |          ^
       36 |                          const double a_time)
          |                                              
          |                                               override
  • Tests/LagrangeTest/InterpolatorTestLevel.hpp:35:26: warning: [bugprone-easily-swappable-parameters]

    2 adjacent parameters of 'specificEvalRHS' of similar type ('amrex::MultiFab &') are easily swapped by mistake

       35 |     void specificEvalRHS(amrex::MultiFab &a_soln, amrex::MultiFab &a_rhs,
          |                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    /home/runner/work/GRTeclyn/GRTeclyn/GRTeclyn/Tests/LagrangeTest/InterpolatorTestLevel.hpp:35:43: note: the first parameter in the range is 'a_soln'
       35 |     void specificEvalRHS(amrex::MultiFab &a_soln, amrex::MultiFab &a_rhs,
          |                                           ^~~~~~
    /home/runner/work/GRTeclyn/GRTeclyn/GRTeclyn/Tests/LagrangeTest/InterpolatorTestLevel.hpp:35:68: note: the last parameter in the range is 'a_rhs'
       35 |     void specificEvalRHS(amrex::MultiFab &a_soln, amrex::MultiFab &a_rhs,
          |                                                                    ^~~~~
  • Tests/LagrangeTest/LagrangeUnitTest.cpp:85:19: warning: [readability-identifier-length]

    variable name 'L' is too short, expected at least 3 characters

       85 |             auto &L        = gr_amr.getLevel(lev);       // level
          |                   ^
  • Tests/LagrangeTest/LagrangeUnitTest.cpp:87:25: warning: [readability-identifier-length]

    variable name 'ba' is too short, expected at least 3 characters

       87 |             const auto &ba = state.boxArray();           // box array
          |                         ^
  • Tests/LagrangeTest/LagrangeUnitTest.cpp:105:29: warning: [readability-identifier-length]

    variable name 'A' is too short, expected at least 3 characters

      105 |         std::vector<double> A(num_points);
          |                             ^
  • Tests/LagrangeTest/PolynomialTest.hpp:27:70: warning: [readability-identifier-length]

    parameter name 'c' is too short, expected at least 3 characters

       27 |     static void set_center(const std::array<double, AMREX_SPACEDIM> &c)
          |                                                                      ^
  • Tests/LagrangeTest/PolynomialTest.hpp:29:18: warning: [readability-identifier-length]

    loop variable name 'd' is too short, expected at least 2 characters

       29 |         for (int d = 0; d < AMREX_SPACEDIM; ++d)
          |                  ^
  • Tests/LagrangeTest/PolynomialTest.hpp:29:49: warning: [readability-braces-around-statements]

    statement should be inside braces

       29 |         for (int d = 0; d < AMREX_SPACEDIM; ++d)
          |                                                 ^
          |                                                  {
       30 |             my_center[d] = c[d];
          |                                 
  • Tests/LagrangeTest/PolynomialTest.hpp:46:34: warning: [readability-identifier-length]

    parameter name 'b' is too short, expected at least 3 characters

       46 |             [](const amrex::Box &b) { return amrex::grow(b, 2); },
          |                                  ^
  • Tests/LagrangeTest/PolynomialTest.hpp:93:13: warning: [bugprone-easily-swappable-parameters]

    2 adjacent parameters of 'compute' of similar type ('const amrex::GpuArray<amrex::Real, 3> &') are easily swapped by mistake

       93 |             amrex::GpuArray<amrex::Real, AMREX_SPACEDIM> const &dx,
          |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       94 |             amrex::GpuArray<amrex::Real, AMREX_SPACEDIM> const &center) const
          |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    /home/runner/work/GRTeclyn/GRTeclyn/GRTeclyn/Tests/LagrangeTest/PolynomialTest.hpp:93:65: note: the first parameter in the range is 'dx'
       93 |             amrex::GpuArray<amrex::Real, AMREX_SPACEDIM> const &dx,
          |                                                                 ^~
    /home/runner/work/GRTeclyn/GRTeclyn/GRTeclyn/Tests/LagrangeTest/PolynomialTest.hpp:94:65: note: the last parameter in the range is 'center'
       94 |             amrex::GpuArray<amrex::Real, AMREX_SPACEDIM> const &center) const
          |                                                                 ^~~~~~

@mirenradia
Member

mirenradia commented Sep 3, 2025

Thanks for opening this PR, @tamaraevst. It looks like a good start. 🙂

When I do a proper review, I will have more detailed comments on the code.

For now, I have some broader points to raise:

Persistent particles

Since we usually want to interpolate the same variables at the same points repeatedly during a simulation, we don't want to be adding and redistributing particles each time we want to interpolate. Can you add a method to add a "persistent" interpolation query which can be made once with the particles added just the first time and re-used subsequently? We will need to remember to "redistribute" after a regrid but don't need to if a regrid hasn't happened since the last interpolation call. I believe redistribution is expensive so we should try to minimise it when we can.

Maybe the ParticleInterpolators class can store a std::unordered_map<std::string, InterpolationQueryParticle> object. The first template argument can be a std::string "key" for the interpolation query which subsequent callers of the interpolator can use to re-interpolate the same variables at the same point? This is just a very quick thought so there may be a better way.
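
A rough sketch of this idea follows, combining the keyed map with a "redistribute only after a regrid" flag. All names here (`QueryStub`, `PersistentQueries` and its methods) are hypothetical; `QueryStub` merely stands in for InterpolationQueryParticle, and no AMReX calls are made.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for InterpolationQueryParticle: just the points
struct QueryStub
{
    std::vector<double> coords;
};

// Hypothetical registry; none of these names come from the PR
class PersistentQueries
{
  public:
    // Registers a query under a key. Returns false and keeps the existing
    // query if the key is already present, so particles are seeded once.
    bool add(const std::string &key, QueryStub query)
    {
        const bool inserted = m_queries.emplace(key, std::move(query)).second;
        if (inserted)
        {
            m_needs_redistribute = true; // new particles must be placed
        }
        return inserted;
    }

    // Returns the stored query, or nullptr if the key is unknown
    const QueryStub *find(const std::string &key) const
    {
        const auto iter = m_queries.find(key);
        return iter == m_queries.end() ? nullptr : &iter->second;
    }

    // To be called by the AMR driver whenever a regrid has happened
    void notify_regrid() { m_needs_redistribute = true; }

    // Returns true if the caller should redistribute before interpolating,
    // and clears the flag so later calls skip the redistribution.
    bool consume_redistribute_flag()
    {
        const bool needed = m_needs_redistribute;
        m_needs_redistribute = false;
        return needed;
    }

  private:
    std::unordered_map<std::string, QueryStub> m_queries;
    bool m_needs_redistribute = false;
};
```

A caller would register a query once under a name such as "chi_line" (an arbitrary example key) and, on each later interpolation, redistribute only when `consume_redistribute_flag()` returns true.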

Derivatives

The AMRInterpolator in GRChombo supported interpolating spatial derivatives in each direction of the variables (and this is still present in InterpolationQueryParticle). We used this feature in GRChombo in the apparent horizon finder so it is probably something we will eventually want.

We might want to defer this to a future PR and just have an error for the moment if derivatives are requested.

Derived Quantities

A key feature we will want is being able to interpolate derived quantities e.g. $\Psi_4$. Unfortunately, unlike in GRChombo where we had a persistent GRLevelData (MultiFabs in AMReX) for each level for the "diagnostic" quantities, now in GRTeclyn MultiFabs for derived quantities are created on the fly as they are needed. We will probably need to add a function to GRAMR to allow it to create a vector of MultiFabs (1 per level) and also compute a derived quantity on each level's MultiFab. We can probably defer this to a separate PR.

For now, in this PR perhaps it's worth adding the ability to interpolate data from an arbitrary amrex::Vector or std::vector of MultiFabs, assuming they have exactly the same geometry, box layout and distribution mapping as the state data?

Unit test

Can you make the unit test run on a full "mini" example (i.e. with GRAMR and multiple levels) as for the old GRChombo AMRInterpolator test?

Development practices

  • Would you be able to follow our conventions for commit messages?
  • Your commits have lots of different names and email addresses. Would you be able to make them all consistent? You can set this up on each system in your gitconfig.
  • Would you be able to set up pre-commit to ensure all of your commits are correctly formatted?

- Leave an error message for derivative queries
- Track Redistribute() flag and allow the user to override it
Other relevant modifications to the interpolation class:
- Simplify copying from device to host memory in populate_from_query()
- Make check_domain() parity compatible

Rewrite a unit test for interpolation to include use of GRAMR and some parameter
inputs.

More details:
- Derived vars interpolation handled via Multifabs (for each level)
- We do not have parity supported for derived vars (throw errors in
  these cases for now)
- Interpolation test handles a polynomial as a derived variable and
  checks for errors using analytical expression
@github-actions

This PR modifies the following files which are ignored by .lint-ignore:

Source/ParticleInterpolators/ParticleInterpolators.impl.hpp

Please consider removing the corresponding patterns from .lint-ignore so that these files can be linted.

@github-actions

This PR modifies the following files which are ignored by .lint-ignore:

Source/AMRInterpolator/DerivativeSetup.hpp
Source/ParticleInterpolators/ParticleInterpolators.impl.hpp

Please consider removing the corresponding patterns from .lint-ignore so that these files can be linted.

@mirenradia
Member

  • Persistent particles: I was not sure about the necessity of implementing an unordered_map, unless we ask for multiple queries at the same time and each query has a key or name. As an easier workaround, I have implemented a flag, m_need_redistribute, that tracks whether the particles need to be redistributed. The redistribution is automatically triggered at the initial stage when particles are seeded. The flag is then set to false for subsequent interpolation calls. There is also a function that lets the user override this flag (m_need_redistribute) with whatever value they wish. We can discuss this further if some more complex infrastructure is desired.

OK, I think my previous suggestion about persistent particles didn't really make much sense given I have now realised that the way you are currently using the ParticleInterpolators object is instantiating a new one each time it's used. I think it only makes sense to have some notion of persistent particles if we keep this object around from the first time it is used to the last time it is used/end of the simulation and therefore the same particles are used throughout the simulation and not removed and re-added. I also therefore think your m_need_redistribute flag is also not that useful for the same reason unless you are interpolating the same variables at the same particle locations multiple times in the same specificPostTimeStep() call and I can't see why you'd want to do this.

Basically, we need to think of a way of keeping this object throughout the simulation. This way, the particles can be set once before the first interpolation (i.e. ParticleInterpolators::populate_from_query is only called once in the simulation). Perhaps it makes sense to use a separate ParticleInterpolators object for each different set of variables/interpolation points (in this case we should definitely drop the plural and rename it to ParticleInterpolator, which was going to be one of my suggestions anyway). We can then store the whole collection of all of them in GRAMR (where we stored the AMRInterpolator object in GRChombo). We could even use a std::unordered_map<std::string, ParticleInterpolator> so that each ParticleInterpolator object is accessed with a name (of std::string type). If we go through with this separate ParticleInterpolator object for each different type of interpolation, I think it might make sense to template ParticleInterpolator over the [max] number of components it can interpolate (which would be passed to the NArrayReal template parameter of amrex::ParticleContainer).

  • Derived quantities: I added an additional function, interpolate_to_particle_from_derived_fields, that takes MultiFab input and provides this functionality for derived variables. This should address #144 (Implement particle interpolation for derived variables), although it would need proper testing with common diagnostics. It also does not settle the best way to store the parity of derived variables. In the current implementation, if reflective BCs are set and we ask to interpolate a derived variable, the code throws an error.

We should add a proper derived quantity to test this but I guess first we need to add the ability to get a vector of MultiFabs with a specified derived quantity calculated on each of them. For the parity of these variables, we could just add an extra argument to your new function which is just a std::vector of BCParitys (with an AMREX_ASSERT to check the length of this vector is the same as the number of components of each of the MultiFabs). Alternatively we could store a vector of BCParitys as a member variable in the ParticleInterpolator object. This would then be set either from something the user passes in (doesn't have to be in the constructor) or calculated using the state variable parities.
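
The first option could be sketched as below. This is only an illustration: `ComponentParity` is a hypothetical stand-in for a per-component parity type, `apply_parities` is an invented name, and a plain `assert` is used where the real code would use AMREX_ASSERT.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-component parity: odd[d] is true if the component
// changes sign under a reflection in direction d.
struct ComponentParity
{
    std::array<bool, 3> odd;
};

// Applies the parity signs to values interpolated at a mirrored point.
// reflected[d] is true if the query point was mirrored in direction d.
inline void apply_parities(std::vector<double> &values,
                           const std::vector<ComponentParity> &parities,
                           const std::array<bool, 3> &reflected)
{
    // One parity entry is required per interpolated component
    assert(values.size() == parities.size());
    for (std::size_t comp = 0; comp < values.size(); ++comp)
    {
        for (int dir = 0; dir < 3; ++dir)
        {
            if (reflected[dir] && parities[comp].odd[dir])
            {
                values[comp] = -values[comp];
            }
        }
    }
}
```

Passing the parities as an argument keeps the interpolator stateless with respect to derived variables; storing them as a member instead would suit the one-interpolator-per-variable-set design discussed above.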

  • Unit test: there is an updated unit test with a GRAMR level that treats a polynomial, previously used in GRChombo, as a derived variable. Because of the parity-related issues for derived quantities (see above), this test uses Sommerfeld BCs throughout.

Once parity support has been added for the vector of MultiFabs, we should use some reflective BCs in the test.

Maybe it's easiest to defer this to a future PR? This one is already quite large!

My other general concern: I stumbled upon a lot of issues when certain variables lived in device or host memory (and had to be copied from one to the other), or when things ran fine on CPUs but did not work on GPUs. There were also instances when I was not sure whether it is optimal for some functions to execute purely on the host. I am sure my implementation is not perfect or well optimised, so I would really welcome any suggestions. As it stands, some functions run solely or mostly on CPUs because, for example, they use other functions that are not GPU compatible. I think we should have a discussion on whether there are any parts of the code you would like to follow a different logic, or to be more GPU friendly. I would anticipate that for some of these functions, we would need to rewrite other parts of the code that are not interpolation related.

I certainly ran into lots of problems with GPUs when I was working on the puncture tracking in #89. I have yet to go through all of your code in detail but I will keep this in mind when I do.

One thing I have just noticed is I think you are implicitly assuming that the query is the same on every rank (or at least only rank 0's query is used). In GRChombo, we had the ability for each process to request different variables and points. This is used e.g. in the apparent horizon finder. Would you be able to add this capability? Note that this will necessarily complicate the communication hence the use of MPI_Alltoallv (and actually the non-blocking MPI_Ialltoallv assuming MPI 3.0 or greater) in GRChombo's AMRInterpolator. It is then up to the user/creator of ParticleInterpolator (e.g. CustomExtraction) to make sure that data is only requested from the minimal number of processes (e.g. only rank 0 if you're just going to write to a small data file).
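
To make the communication pattern concrete, the sketch below builds the per-rank counts and displacements that an MPI_Alltoallv call takes as its `sendcounts` and `sdispls` arguments. It makes no MPI calls itself, and the names `SendLayout` and `build_send_layout` are hypothetical; it assumes each query point has already been assigned the rank that owns it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical packing layout for one rank's outgoing query points
struct SendLayout
{
    std::vector<int> counts;        // number of points sent to each rank
    std::vector<int> displs;        // offset of each rank's block
    std::vector<std::size_t> order; // order[k] = index of the k-th packed point
};

// owner_rank[i] is the rank owning point i; every entry is assumed to lie
// in [0, num_ranks).
inline SendLayout build_send_layout(const std::vector<int> &owner_rank,
                                    int num_ranks)
{
    SendLayout layout;
    layout.counts.assign(num_ranks, 0);
    for (const int rank : owner_rank)
    {
        ++layout.counts[rank];
    }

    // Exclusive prefix sum of the counts gives the block offsets
    layout.displs.assign(num_ranks, 0);
    for (int rank = 1; rank < num_ranks; ++rank)
    {
        layout.displs[rank] = layout.displs[rank - 1] + layout.counts[rank - 1];
    }

    // Fill each rank's block in the original point order
    std::vector<int> cursor = layout.displs;
    layout.order.resize(owner_rank.size());
    for (std::size_t i = 0; i < owner_rank.size(); ++i)
    {
        layout.order[cursor[owner_rank[i]]++] = i;
    }
    return layout;
}
```

Each rank would pack its coordinates in `order`, exchange the counts, and then exchange the points; the interpolated values travel back with the roles of the send and receive arrays swapped.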

Member

Sorry I only just noticed you modified this file. Would you be able to remove the changes here and open a separate PR if you want to change this?

Author

Ah, I may have run autoupdate on it, so I will revert to the older one in my next commit.

@mirenradia mirenradia marked this pull request as draft October 2, 2025 12:54
The following logic is enforced in this commit:
- Particle interpolator is stored in BHAMR (as well as particles are
  populated once)
- Particle interpolator is also now templated over the number of
  components
- We ask for Redistribute() of particles after regrid
- GRAMR contains a very short function to post-process the MultiFabs of derived
  variables into the correct format for interpolation
- GRAMR contains some virtual function to be overridden by BHAMR in order
  to set up query and populate the particles only once
- Handling for parity of derived vars is implemented. Parity handling
  for both state and derived is to be tested!
- Weyl4 is computed in the post-processing step for now; to be addressed later but
  is needed for WeylExtraction testing
This commit reverts to state var extraction in the main example.
This example works well.
Diagnostic extraction does not work with symmetric BCs for now.
Previous implementation did not allow Redistribute() after the regrid
@tamaraevst
Author

As we discussed previously in the meeting, I have run a couple of additional checks on $\chi$ extraction along a line (with parity off and on), and I finally get the same result for the BinaryBH example. This was necessary, as I moved lots of things around.

In particular, the user now needs to create an interpolator object in BHAMR for each variable or set of variables they want to extract (e.g. there would need to be 2 interpolator objects if one wants to interpolate Psi4 and $\chi$ at the same time). I added some virtual functions to GRAMR, to be provided in BHAMR, so that changes in the query can be tracked and the particles are populated only once. Redistribute() is now called after each regrid and I checked that if we do not regrid often, the number of calls to Redistribute() reduces. In practice, this needs more work, as it is initiated on every level; see my comments here.

Here is an example of $\chi$ extracted at one of the points x=256, y=256, z=255 (or -1 if using symmetry along z). Punctures are plotted over to give an idea of when the two BHs merge.

[Image: chi_test]

Onto Weyl4 next.

@mirenradia
Member

I have done/am doing some basic profiling which I will add in a separate comment.

This PR is massive so is going to be quite difficult to review. In an effort to make it easier, when you are ready for this to be reviewed, please could you move all of the *Extraction classes into a separate pull request.

Again, when you are happy for this to be reviewed, could you also remove the debugging printing? I guess some of it can be kept, cleaned up and conditionally printed depending on some kind of verbosity parameter.

Once my AMReX PR (AMReX-Codes/amrex#4780) is merged that will add the ability for Amr to compute a derived quantity on all levels, we should add an interface to ParticleInterpolator such that only the name of the derived quantity is needed e.g. Weyl4. ParticleInterpolator can then go and call Amr::derive to compute the derived quantity and do the interpolation without the user needing to call it themselves and pass the Vector<MultiFab*> themselves.

@mirenradia
Member

mirenradia commented Nov 14, 2025

I have done some basic profiling of the code on the CSD3 A100s using AMReX's Tiny Profiler. I made a few modifications to the code on the test/particle_interp branch. In particular, I added some extra BL_PROFILE statements in the ParticleInterpolator and CustomExtraction routines and also I significantly increased the frequency of interpolation (moving it from level 3 to level 5) and the number of interpolated points (15 to 4096) as this is similar (at least in terms of order of magnitude) to the number of points we would interpolate when doing GW extraction on multiple extraction spheres.

I used the following parameter file which is similar to params_profile.txt but with a few modifications.

# Based on the q1-d12 configuration of https://inspirehep.net/literature/1994195

# Many of the commented out parameters are relics from GRChombo and may not be
# implemented here.

#################################################
# Filesystem parameters

verbosity = 1

output_path = .
amr.check_file = BinaryBHChk_
amr.plot_file = BinaryBHPlt_
# amr.restart = BinaryBHChk_00100
# amr.file_name_digits = 5

checkpoint_interval = -1
plot_interval = -1

amr.plot_vars = chi Theta

#################################################
# Initial Data parameters

# provide 'offset' or 'center'

massA = 0.48847892320123
massB = 0.48847892320123

offsetA = 0.0 6.10679 0.0
offsetB =  0.0 -6.10679 0.0
# centerA = 512  518.10679 512
# centerB = 512 -505.89321 512

momentumA = -0.0841746 -0.000510846 0.0
momentumB = 0.0841746  0.000510846 0.0

#################################################
# Grid parameters

N_full = 256
L_full = 32

max_level = 5 # There are (max_level+1) grids, so min is zero

# -1 disables regridding
regrid_interval = 1 -1 -1 -1 -1 -1 -1 -1 -1 -1
regrid_threshold = 0.02

# Max and min box sizes
max_grid_size = 32
block_factor = 32

# Tag buffer size
amr.n_error_buf = 4 4 4
# num_ghosts = 3
# center = 8 8 8 # defaults to center of the grid

#################################################
# Boundary Conditions parameters

# Periodic directions - 0 = false, 1 = true
isPeriodic = 0 0 0
# if not periodic, then specify the boundary type
# 0 = static, 1 = sommerfeld, 2 = reflective
# (see BoundaryConditions.hpp for details)
hi_boundary = 1 1 1
lo_boundary = 1 1 2

# if reflective boundaries selected, must set
# parity of all vars (in order given by UserVariables.hpp)
# 0 = even
# 1,2,3 = odd x, y, z
# 7     = odd xyz
vars_parity = 0 0 4 6 0 5 0  0 0 4 6 0 5 0  0 1 2 3          0 1 2 3 1 2 3
              #chi and hij   K and Aij      Theta and Gamma  Gauge

# if sommerfeld boundaries selected, must select
# non zero asymptotic values
num_nonzero_asymptotic_vars = 5
nonzero_asymptotic_vars = chi h11 h22 h33 lapse
nonzero_asymptotic_values = 1.0 1.0 1.0 1.0 1.0

# if you are using extrapolating BC:
# extrapolation_order = 1
# num_extrapolating_vars = -1
# extrapolating_vars =

#################################################
# Evolution parameters

# dt will be dx*dt_multiplier on each grid level
dt_multiplier = 0.25
# stop_time = 2200.0
max_steps = 1

# Spatial derivative order (only affects CCZ4 RHS)
max_spatial_derivative_order = 4 # only 4 currently implemented

nan_check = 1

# Lapse evolution
lapse_advec_coeff = 1.0
lapse_coeff = 2.0
lapse_power = 1.0

# Shift evolution
shift_advec_coeff = 1.0
shift_Gamma_coeff = 0.75
eta = 1.0 # eta of gamma driver

# CCZ4 parameters
formulation = 0 # 1 for BSSN, 0 for CCZ4
kappa1 = 0.1
kappa2 = 0.
kappa3 = 1.
covariantZ4 = 1 # 0: keep kappa1; 1 [default]: replace kappa1 -> kappa1/lapse

# coefficient for KO numerical dissipation
sigma = 0.5

# min_chi = 1.e-4
# min_lapse = 1.e-4

#################################################
# Extraction parameters

# We don't have any extraction yet but the extraction spheres are used for
# tagging cells
activate_extraction = 0
num_extraction_radii = 2
extraction_radii = 110.0 150.0
extraction_levels = 3 2
num_points_phi = 32
num_points_theta = 48
write_extraction = 0
num_modes = 3
modes = 2 0  2 1  2 2

###########################
# GPU parameters

amrex.use_gpu_aware_mpi = 1
amrex.abort_on_out_of_gpu_memory = 1

###########################
# Tiny Profiler parameters

tiny_profiler.print_threshold = 0.001

Single GPU results

TinyProfiler total time across processes [min...avg...max]: 20.18 ... 20.18 ... 20.18

-------------------------------------------------------------------------------------------------
Name                                              NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
-------------------------------------------------------------------------------------------------
BinaryBHLevel::specificEvalRHS()                     252      15.23      15.23      15.23  75.44%
RungeKutta4                                           63      1.081      1.081      1.081   5.36%
ParticleInterpolators::interp()                       32     0.7487     0.7487     0.7487   3.71%
CustomExtraction::execute_query()                     32       0.65       0.65       0.65   3.22%
FillBoundary_nowait()                                316     0.6229     0.6229     0.6229   3.09%
FillPatchInterp(Fab)                                 291     0.3755     0.3755     0.3755   1.86%
FabArray::ParallelCopy_nowait()                      581     0.2924     0.2924     0.2924   1.45%
StateData::FillBoundary(geom)                      21188     0.2748     0.2748     0.2748   1.36%
GRAMRLevel::post_timestep()                           63     0.1744     0.1744     0.1744   0.86%
StateDataPhysBCFunct::()                             589     0.1525     0.1525     0.1525   0.76%
AmrMesh::MakeNewGrids()                                7     0.1375     0.1375     0.1375   0.68%
MultiFab::contains_nan()                              63    0.09937    0.09937    0.09937   0.49%
FillPatcher::fillRK()                                248    0.07735    0.07735    0.07735   0.38%
BinaryBHLevel::initialData                            16    0.07037    0.07037    0.07037   0.35%
amrex::Copy()                                         46    0.06866    0.06866    0.06866   0.34%
CellQuartic::interp()                               3120    0.04485    0.04485    0.04485   0.22%
FabArrayBase::CPC::define()                          176    0.02257    0.02257    0.02257   0.11%
FabArray::setDomainBndry()                            89   0.009485   0.009485   0.009485   0.05%
TagBoxArray::mapPRD                                   15   0.009161   0.009161   0.009161   0.05%
FabArrayBase::FB::FB()                                 8   0.006565   0.006565   0.006565   0.03%
OwnerMask()                                           15    0.00641    0.00641    0.00641   0.03%
ParticleInterpolators::interpolate_to_particle()      32   0.005586   0.005586   0.005586   0.03%
TagBoxArray::collate()                                15    0.00486    0.00486    0.00486   0.02%
BinaryBHLevel::tag_cells()                            15   0.002816   0.002816   0.002816   0.01%
FillPatchIterator::Initialize                         46   0.002563   0.002563   0.002563   0.01%
FillPatchTwoLevels                                    43   0.001874   0.001874   0.001874   0.01%
FabArray::setVal()                                    30   0.001859   0.001859   0.001859   0.01%
AmrLevel::FillPatch()                                 46   0.001746   0.001746   0.001746   0.01%
runGRTeclyn()                                          1   0.001554   0.001554   0.001554   0.01%
AmrLevel::storeRKCoarseData()                         31   0.001206   0.001206   0.001206   0.01%
Amr::timeStep()                                       63   0.001026   0.001026   0.001026   0.01%
GRAMRLevel::advance()                                 63   0.000733   0.000733   0.000733   0.00%
FillPatcher::storeRKCoarseData()                      31  0.0007274  0.0007274  0.0007274   0.00%
StateData::define()                                   16  0.0005546  0.0005546  0.0005546   0.00%
Amr::regrid()                                          2  0.0004452  0.0004452  0.0004452   0.00%
amrex::unpackBuffer                                    1  0.0004155  0.0004155  0.0004155   0.00%
FabArrayBase::getCPC()                               581  0.0003826  0.0003826  0.0003826   0.00%
DenseBins<T>::buildGPU                                 6  0.0003648  0.0003648  0.0003648   0.00%
FabArray::ParallelCopy()                             581  0.0003474  0.0003474  0.0003474   0.00%
ParticleContainer::RedistributeGPU()                   1  0.0003415  0.0003415  0.0003415   0.00%
UtilRenameDirectoryToOld()                             1  0.0002687  0.0002687  0.0002687   0.00%
Amr::defBaseLevel()                                    1   0.000239   0.000239   0.000239   0.00%
FabArrayBase::TheFPinfo()                            322  0.0002211  0.0002211  0.0002211   0.00%
Redistribute_partition                                 1   0.000193   0.000193   0.000193   0.00%
AmrLevel::FillRKPatch()                              252  0.0001834  0.0001834  0.0001834   0.00%
FabArray::FillBoundary()                             316  0.0001775  0.0001775  0.0001775   0.00%
ParticleCopyPlan::build                                1    0.00017    0.00017    0.00017   0.00%
FPinfo::FPinfo()                                      18  0.0001695  0.0001695  0.0001695   0.00%
ParticleInterpolators::ensure_redistributed()         32  0.0001625  0.0001625  0.0001625   0.00%
ParticleInterpolators::populate_from_query()           1  0.0001552  0.0001552  0.0001552   0.00%
DistributionMapping::SFCProcessorMapDoIt()            11  0.0001281  0.0001281  0.0001281   0.00%
Amr::grid_places()                                     7  0.0001272  0.0001272  0.0001272   0.00%
FabArray::ParallelCopy_finish()                      581  0.0001224  0.0001224  0.0001224   0.00%
DistributionMapping::LeastUsedCPUs()                  29  0.0001152  0.0001152  0.0001152   0.00%
ClusterList::intersect()                              15  0.0001146  0.0001146  0.0001146   0.00%
FabArrayBase::getFB()                                316  0.0001143  0.0001143  0.0001143   0.00%
Amr::InitAmr()                                         1  0.0001073  0.0001073  0.0001073   0.00%
BoxList::complementIn                                 34  9.874e-05  9.874e-05  9.874e-05   0.00%
BinaryBHLevel::variableSetUp()                         1   9.31e-05   9.31e-05   9.31e-05   0.00%
FillPatchSingleLevel                                  89  9.147e-05  9.147e-05  9.147e-05   0.00%
BinaryBHLevel::specificPostTimeStep()                 63  8.621e-05  8.621e-05  8.621e-05   0.00%
AmrLevel::AmrLevel(dm)                                16   7.61e-05   7.61e-05   7.61e-05   0.00%
Amr::coarseTimeStep()                                  1  7.298e-05  7.298e-05  7.298e-05   0.00%
Amr::bldFineLevels()                                   1  7.296e-05  7.296e-05  7.296e-05   0.00%
FillBoundary_finish()                                316  6.889e-05  6.889e-05  6.889e-05   0.00%
Amr::FinalizeInit()                                    1  6.821e-05  6.821e-05  6.821e-05   0.00%
ParticleBufferMap::define                              1  6.328e-05  6.328e-05  6.328e-05   0.00%
GRAMRLevel::errorEst()                                15  5.825e-05  5.825e-05  5.825e-05   0.00%
AmrMesh-cluster                                       15  5.765e-05  5.765e-05  5.765e-05   0.00%
FillPatchIterator::FillFromTwoLevels()                43  4.347e-05  4.347e-05  4.347e-05   0.00%
FillPatchIterator::FillFromLevel0()                    3  3.411e-05  3.411e-05  3.411e-05   0.00%
AmrLevel::RK()                                        63  2.171e-05  2.171e-05  2.171e-05   0.00%
Amr::initialInit()                                     1  1.729e-05  1.729e-05  1.729e-05   0.00%
Amr::InitializeInit()                                  1  1.729e-05  1.729e-05  1.729e-05   0.00%
BoxList::parallelComplementIn()                       22  1.531e-05  1.531e-05  1.531e-05   0.00%
ParticleContainer::defineBufferMap                     1  1.071e-05  1.071e-05  1.071e-05   0.00%
Amr::init()                                            1   7.71e-07   7.71e-07   7.71e-07   0.00%
Other                                                 44  0.0001964  0.0001964  0.0001964   0.00%
-------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------
Name                                              NCalls  Incl. Min  Incl. Avg  Incl. Max   Max %
-------------------------------------------------------------------------------------------------
runGRTeclyn()                                          1      20.18      20.18      20.18 100.00%
Amr::coarseTimeStep()                                  1      19.89      19.89      19.89  98.57%
Amr::timeStep()                                       63      19.89      19.89      19.89  98.57%
GRAMRLevel::advance()                                 63      18.02      18.02      18.02  89.28%
AmrLevel::RK()                                        63      17.92      17.92      17.92  88.78%
RungeKutta4                                           63      17.92      17.92      17.92  88.78%
BinaryBHLevel::specificEvalRHS()                     252      15.23      15.23      15.23  75.44%
GRAMRLevel::post_timestep()                           63      1.874      1.874      1.874   9.28%
AmrLevel::FillRKPatch()                              252      1.587      1.587      1.587   7.86%
FillPatcher::fillRK()                                248      1.543      1.543      1.543   7.65%
BinaryBHLevel::specificPostTimeStep()                 63      1.417      1.417      1.417   7.02%
CustomExtraction::execute_query()                     32      1.417      1.417      1.417   7.02%
ParticleInterpolators::interp()                       32     0.7487     0.7487     0.7487   3.71%
FabArray::FillBoundary()                             316     0.6298     0.6298     0.6298   3.12%
FillBoundary_nowait()                                316     0.6296     0.6296     0.6296   3.12%
StateDataPhysBCFunct::()                             589     0.4273     0.4273     0.4273   2.12%
FillPatchInterp(Fab)                                 291     0.4204     0.4204     0.4204   2.08%
FabArray::ParallelCopy()                             581     0.3158     0.3158     0.3158   1.56%
FabArray::ParallelCopy_nowait()                      581     0.3153     0.3153     0.3153   1.56%
AmrLevel::FillPatch()                                 46     0.3137     0.3137     0.3137   1.55%
Amr::init()                                            1     0.2871     0.2871     0.2871   1.42%
Amr::initialInit()                                     1     0.2871     0.2871     0.2871   1.42%
Amr::FinalizeInit()                                    1     0.2764     0.2764     0.2764   1.37%
Amr::bldFineLevels()                                   1     0.2763     0.2763     0.2763   1.37%
StateData::FillBoundary(geom)                      21188     0.2748     0.2748     0.2748   1.36%
FillPatchIterator::Initialize                         46     0.2433     0.2433     0.2433   1.21%
FillPatchIterator::FillFromTwoLevels()                43     0.2168     0.2168     0.2168   1.07%
FillPatchTwoLevels                                    43     0.2168     0.2168     0.2168   1.07%
Amr::grid_places()                                     7     0.2151     0.2151     0.2151   1.07%
AmrMesh::MakeNewGrids()                                7      0.215      0.215      0.215   1.07%
Amr::regrid()                                          2     0.1951     0.1951     0.1951   0.97%
FillPatchSingleLevel                                  89     0.1721     0.1721     0.1721   0.85%
MultiFab::contains_nan()                              63    0.09937    0.09937    0.09937   0.49%
BinaryBHLevel::initialData                            16    0.07037    0.07037    0.07037   0.35%
amrex::Copy()                                         46    0.06866    0.06866    0.06866   0.34%
GRAMRLevel::errorEst()                                15    0.04964    0.04964    0.04964   0.25%
CellQuartic::interp()                               3120    0.04485    0.04485    0.04485   0.22%
AmrLevel::storeRKCoarseData()                         31    0.02572    0.02572    0.02572   0.13%
FillPatcher::storeRKCoarseData()                      31    0.02451    0.02451    0.02451   0.12%
FabArrayBase::getCPC()                               581    0.02296    0.02296    0.02296   0.11%
FabArrayBase::CPC::define()                          176    0.02257    0.02257    0.02257   0.11%
TagBoxArray::mapPRD                                   15    0.02112    0.02112    0.02112   0.10%
ParticleInterpolators::interpolate_to_particle()      32    0.01741    0.01741    0.01741   0.09%
FillPatchIterator::FillFromLevel0()                    3    0.01588    0.01588    0.01588   0.08%
Amr::InitializeInit()                                  1    0.01072    0.01072    0.01072   0.05%
Amr::defBaseLevel()                                    1     0.0107     0.0107     0.0107   0.05%
FabArray::setDomainBndry()                            89   0.009485   0.009485   0.009485   0.05%
FabArrayBase::getFB()                                316   0.006679   0.006679   0.006679   0.03%
FabArrayBase::FB::FB()                                 8   0.006565   0.006565   0.006565   0.03%
OwnerMask()                                           15    0.00641    0.00641    0.00641   0.03%
TagBoxArray::collate()                                15    0.00486    0.00486    0.00486   0.02%
BinaryBHLevel::tag_cells()                            15   0.002816   0.002816   0.002816   0.01%
FabArray::setVal()                                    30   0.001859   0.001859   0.001859   0.01%
ParticleInterpolators::ensure_redistributed()         32     0.0018     0.0018     0.0018   0.01%
ParticleContainer::RedistributeGPU()                   1   0.001638   0.001638   0.001638   0.01%
AmrLevel::AmrLevel(dm)                                16  0.0006307  0.0006307  0.0006307   0.00%
StateData::define()                                   16  0.0005546  0.0005546  0.0005546   0.00%
FabArrayBase::TheFPinfo()                            322  0.0004498  0.0004498  0.0004498   0.00%
amrex::unpackBuffer                                    1  0.0004155  0.0004155  0.0004155   0.00%
DenseBins<T>::buildGPU                                 6  0.0003648  0.0003648  0.0003648   0.00%
UtilRenameDirectoryToOld()                             1  0.0002687  0.0002687  0.0002687   0.00%
AmrMesh-cluster                                       15  0.0002315  0.0002315  0.0002315   0.00%
FPinfo::FPinfo()                                      18  0.0002287  0.0002287  0.0002287   0.00%
Amr::InitAmr()                                         1   0.000202   0.000202   0.000202   0.00%
DistributionMapping::SFCProcessorMapDoIt()            11  0.0001939  0.0001939  0.0001939   0.00%
Redistribute_partition                                 1   0.000193   0.000193   0.000193   0.00%
ParticleCopyPlan::build                                1  0.0001807  0.0001807  0.0001807   0.00%
ParticleInterpolators::populate_from_query()           1  0.0001552  0.0001552  0.0001552   0.00%
BoxList::parallelComplementIn()                       22  0.0001316  0.0001316  0.0001316   0.00%
FabArray::ParallelCopy_finish()                      581  0.0001224  0.0001224  0.0001224   0.00%
DistributionMapping::LeastUsedCPUs()                  29  0.0001152  0.0001152  0.0001152   0.00%
ClusterList::intersect()                              15  0.0001146  0.0001146  0.0001146   0.00%
BoxList::complementIn                                 34  9.874e-05  9.874e-05  9.874e-05   0.00%
BinaryBHLevel::variableSetUp()                         1   9.31e-05   9.31e-05   9.31e-05   0.00%
ParticleContainer::defineBufferMap                     1  7.399e-05  7.399e-05  7.399e-05   0.00%
FillBoundary_finish()                                316  6.889e-05  6.889e-05  6.889e-05   0.00%
ParticleBufferMap::define                              1  6.328e-05  6.328e-05  6.328e-05   0.00%
Other                                                 44  0.0001964  0.0001964  0.0001964   0.00%
-------------------------------------------------------------------------------------------------

Unused ParmParse Variables:
  [TOP]::amr.plot_vars(nvals = 2)  :: [chi, Theta]
  [TOP]::extraction_levels(nvals = 2)  :: [3, 2]
  [TOP]::extraction_radii(nvals = 2)  :: [110.0, 150.0]
  [TOP]::modes(nvals = 6)  :: [2, 0, 2, 1, 2, 2]
  [TOP]::num_extraction_radii(nvals = 1)  :: [2]
  [TOP]::num_modes(nvals = 1)  :: [3]
  [TOP]::num_nonzero_asymptotic_vars(nvals = 1)  :: [5]
  [TOP]::num_points_phi(nvals = 1)  :: [32]
  [TOP]::num_points_theta(nvals = 1)  :: [48]
  [TOP]::vars_parity(nvals = 25)  :: [0, 0, 4, 6, 0, 5, 0, 0, 0, 4, 6, 0, 5, 0, 0, 1, 2, 3, 0, 1, 2, 3, 1, 2, 3]
  [TOP]::write_extraction(nvals = 1)  :: [0]

Device Memory Usage:
-----------------------------------------------------------------------------------
Name                                              Nalloc  Nfree    AvgMem    MaxMem
-----------------------------------------------------------------------------------
The_Arena::Initialize()                                1      1    11 MiB    59 GiB
StateData::define()                                 1788   1788  9128 MiB    11 GiB
GRAMRLevel::advance()                                876    876  8770 MiB  9168 MiB
RungeKutta4                                        44428  44428  4301 MiB  6400 MiB
FillPatchIterator::Initialize                       7004   7004    18 MiB  1619 MiB
FillPatcher::storeRKCoarseData()                    1525   1525  1170 MiB  1237 MiB
GRAMRLevel::post_timestep()                        16272  16272  4266 KiB   915 MiB
FillPatcher::fillRK()                                827    827   642 MiB   680 MiB
CellQuartic::interp()                               6240   6240  2087 KiB   214 MiB
FillPatchTwoLevels                                  1360   1360   261 KiB   102 MiB
ResizeRandomSeed                                       1      1    40 MiB    40 MiB
AmrMesh::MakeNewGrids()                             3172   3172    92 KiB    13 MiB
FillBoundary_nowait()                                 15     15  5488 KiB  5889 KiB
ParticleContainer::RedistributeGPU()                  13     13    24 KiB   999 KiB
FabArray::ParallelCopy_nowait()                      581    581  2790   B   850 KiB
ParticleInterpolators::populate_from_query()           5      5    13   B   301 KiB
OwnerMask()                                         1595   1595    23   B   287 KiB
ParticleInterpolators::interpolate_to_particle()     192    192    38   B   216 KiB
amrex::unpackBuffer                                   10     10   191 KiB   209 KiB
amrex::packBuffer                                      1      1     4   B   193 KiB
BinaryBHLevel::initialData                            27     27   170 KiB   171 KiB
TagBoxArray::mapPRD                                 3160   3160    79   B   128 KiB
BinaryBHLevel::specificEvalRHS()                     253    253    66 KiB   128 KiB
Redistribute_partition                                 6      6     4   B   112 KiB
amrex::Copy()                                         60     60    75 KiB   111 KiB
FabArray::setVal()                                    39     39    46 KiB   110 KiB
StateData::FillBoundary(geom)                      21692  21692   139   B    55 KiB
BinaryBHLevel::tag_cells()                             9      9    46 KiB    46 KiB
ParticleInterpolators::interp()                      160    160     7   B    43 KiB
DenseBins<T>::buildGPU                                36     36    37 KiB    42 KiB
MultiFab::contains_nan()                              63     63   135   B    27 KiB
ParticleCopyPlan::build                                7      7     0   B    25 KiB
ParticleBufferMap::define                              3      3  7072   B  7744   B
TagBoxArray::collate()                                45     45     0   B  4512   B
-----------------------------------------------------------------------------------

Managed Memory Usage:
----------------------------------------------------------------
Name                             Nalloc  Nfree  AvgMem    MaxMem
----------------------------------------------------------------
The_Managed_Arena::Initialize()       1      1  24   B  8192 KiB
----------------------------------------------------------------

Pinned Memory Usage:
-----------------------------------------------------------------------------------
Name                                              Nalloc  Nfree    AvgMem    MaxMem
-----------------------------------------------------------------------------------
The_Pinned_Arena::Initialize()                         1      1  1858   B  8192 KiB
FillBoundary_nowait()                                 15     15  5488 KiB  5889 KiB
FabArray::ParallelCopy_nowait()                      581    581  2790   B   850 KiB
OwnerMask()                                           15     15     2   B   223 KiB
BinaryBHLevel::initialData                            27     27   170 KiB   171 KiB
BinaryBHLevel::specificEvalRHS()                     253    253    66 KiB   128 KiB
amrex::Copy()                                         60     60    75 KiB   111 KiB
FabArray::setVal()                                    39     39    46 KiB   110 KiB
RungeKutta4                                            6      6   104 KiB   109 KiB
StateData::FillBoundary(geom)                      21692  21692   140   B    55 KiB
ParticleInterpolators::interp()                      320    320     5   B    52 KiB
BinaryBHLevel::tag_cells()                             9      9    46 KiB    46 KiB
FillPatcher::fillRK()                                217    217    34 KiB    36 KiB
ParticleContainer::RedistributeGPU()                  78     78    23 KiB    27 KiB
AmrMesh::MakeNewGrids()                               12     12     0   B    18 KiB
TagBoxArray::collate()                                60     60     0   B  5824   B
ParticleCopyPlan::build                                1      1     0   B  3856   B
MultiFab::contains_nan()                              63     63     0   B    16   B
ParticleInterpolators::interpolate_to_particle()     192    192     0   B    16   B
Redistribute_partition                                 1      1     0   B    16   B
-----------------------------------------------------------------------------------

Comms Memory Usage:
--------------------------------------------------------------
Name                           Nalloc  Nfree  AvgMem    MaxMem
--------------------------------------------------------------
The_Comms_Arena::Initialize()       1      1  61   B  8192 KiB
--------------------------------------------------------------
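
For comparing timing tables like the ones above across runs, it can help to parse the rows rather than read them by eye. The following is a minimal sketch, not part of this PR: the names `ROW`, `parse_rows`, `top_regions` and `SAMPLE` are hypothetical, and it assumes every timing row has the layout shown above (region name, NCalls, min, avg, max, and a percentage ending in `%`). Header lines, rules and the memory tables do not match that layout and are skipped.

```python
import re

# One row of a TinyProfiler timing table (layout assumed from the output above):
#   <name> <NCalls> <min> <avg> <max> <percent>%
ROW = re.compile(
    r"^(?P<name>\S.*?)\s+(?P<ncalls>\d+)"
    r"\s+(?P<tmin>[0-9.eE+-]+)\s+(?P<tavg>[0-9.eE+-]+)\s+(?P<tmax>[0-9.eE+-]+)"
    r"\s+(?P<pct>[0-9.]+)%\s*$"
)


def parse_rows(text):
    """Return one dict per timing row; non-matching lines are ignored."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue
        rows.append({
            "name": m["name"],
            "ncalls": int(m["ncalls"]),
            "tmin": float(m["tmin"]),
            "tavg": float(m["tavg"]),
            "tmax": float(m["tmax"]),
            "pct": float(m["pct"]),
        })
    return rows


def top_regions(rows, n=3):
    """Return the n rows with the largest max time, largest first."""
    return sorted(rows, key=lambda r: r["tmax"], reverse=True)[:n]


# A few rows copied from the single-GPU exclusive table, plus a header and a rule.
SAMPLE = """
Name                                              NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
-------------------------------------------------------------------------------------------------
BinaryBHLevel::specificEvalRHS()                     252      15.23      15.23      15.23  75.44%
RungeKutta4                                           63      1.081      1.081      1.081   5.36%
ParticleInterpolators::interp()                       32     0.7487     0.7487     0.7487   3.71%
BoxList::complementIn                                 34  9.874e-05  9.874e-05  9.874e-05   0.00%
"""

if __name__ == "__main__":
    for row in top_regions(parse_rows(SAMPLE)):
        print(f'{row["name"]:45s} {row["tmax"]:10.4f} s  {row["pct"]:6.2f}%')
```

Feeding the exclusive table of each run through `parse_rows` and joining on `name` would give a per-region comparison between runs, with the caveat that the NCalls column can differ between runs for some regions.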

8 GPUs, 2 nodes (4 GPUs per node) results

TinyProfiler total time across processes [min...avg...max]: 5.924 ... 5.924 ... 5.925

-------------------------------------------------------------------------------------------------
Name                                              NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
-------------------------------------------------------------------------------------------------
FabArray::ParallelCopy_finish()                      581     0.7379      1.856      2.229  37.63%
BinaryBHLevel::specificEvalRHS()                     252      1.916      1.928      1.942  32.77%
ParticleInterpolators::interp()                       32   0.008824      0.175      1.338  22.58%
FillBoundary_finish()                                316     0.2537     0.4617     0.9387  15.84%
CustomExtraction::execute_query()                     32   0.002244    0.08436     0.6514  10.99%
DistributionMapping::LeastUsedCPUs()                  29   0.005119     0.4932     0.5644   9.53%
ParticleInterpolators::interpolate_to_particle()      32   0.005954     0.2603     0.3458   5.84%
amrex::communicateParticlesFinish                      1   3.41e-07    0.08477     0.3391   5.72%
RungeKutta4                                           63     0.1445     0.1456      0.147   2.48%
FillPatchInterp(Fab)                                 291    0.05306    0.07494      0.116   1.96%
FabArray::ParallelCopy_nowait()                      581    0.06408    0.07805    0.08902   1.50%
StateDataPhysBCFunct::()                             589    0.05371    0.06501     0.0819   1.38%
StateData::FillBoundary(geom)                       2648    0.02667    0.03603    0.04837   0.82%
FillBoundary_nowait()                                316    0.03138    0.03476     0.0371   0.63%
GRAMRLevel::post_timestep()                           63    0.02551    0.02662    0.02946   0.50%
AmrMesh::MakeNewGrids()                                7    0.02296    0.02366    0.02451   0.41%
FillPatcher::fillRK()                                248    0.02128    0.02251    0.02339   0.39%
MultiFab::contains_nan()                              63    0.01503    0.01512    0.01519   0.26%
amrex::Copy()                                         46   0.009439   0.009557    0.00962   0.16%
BinaryBHLevel::initialData                            16   0.009333   0.009372   0.009412   0.16%
CellQuartic::interp()                                391    0.00559   0.006972   0.009323   0.16%
ParticleInterpolators::ensure_redistributed()         32   0.002057   0.005037   0.007867   0.13%
FabArray::setVal()                                    30  0.0008149   0.001375   0.005071   0.09%
FabArrayBase::CPC::define()                          176   0.004399   0.004602   0.004824   0.08%
FabArray::setDomainBndry()                            89   0.001956    0.00242   0.002879   0.05%
TagBoxArray::collate()                                15   0.001654   0.002309   0.002839   0.05%
amrex::communicateParticlesStart                       1  1.001e-05  0.0003037   0.002282   0.04%
TagBoxArray::mapPRD                                   15   0.001826   0.001942   0.002089   0.04%
OwnerMask()                                           15   0.001234   0.001296   0.001386   0.02%
runGRTeclyn()                                          1   0.001122   0.001198    0.00136   0.02%
FabArrayBase::FB::FB()                                 8   0.001227    0.00128   0.001307   0.02%
FillPatchTwoLevels                                    43    0.00105   0.001151   0.001274   0.02%
Amr::timeStep()                                       63  0.0005267  0.0006346   0.001242   0.02%
AmrLevel::storeRKCoarseData()                         31  0.0008077  0.0008451  0.0008938   0.02%
Amr::coarseTimeStep()                                  1  3.778e-05  0.0001753  0.0007375   0.01%
BinaryBHLevel::tag_cells()                            15  0.0005558  0.0005873  0.0007328   0.01%
AmrLevel::FillPatch()                                 46  0.0006471  0.0006853  0.0007267   0.01%
FillPatchIterator::Initialize                         46  0.0005442  0.0006051   0.000656   0.01%
GRAMRLevel::advance()                                 63  0.0003856  0.0004201  0.0005529   0.01%
Amr::regrid()                                          2    0.00025   0.000311  0.0005189   0.01%
ParticleInterpolators::populate_from_query()           1  1.796e-05  0.0001405  0.0004155   0.01%
FabArray::ParallelCopy()                             581   0.000333  0.0003681  0.0003975   0.01%
FabArrayBase::getCPC()                               581  0.0003487  0.0003631  0.0003941   0.01%
ParticleContainer::RedistributeGPU()                   1  0.0002948  0.0003516  0.0003931   0.01%
FillPatcher::storeRKCoarseData()                      31  0.0002959  0.0003296  0.0003922   0.01%
DenseBins<T>::buildGPU                                 6  0.0003028  0.0003301  0.0003554   0.01%
UtilCreateDirectoryDestructive()                       1  4.192e-05  0.0002296  0.0003258   0.01%
Amr::defBaseLevel()                                    1  0.0002153  0.0002686  0.0003174   0.01%
Amr::grid_places()                                     7  9.375e-05  0.0001499  0.0002742   0.00%
UtilRenameDirectoryToOld()                             0          0  3.032e-05  0.0002426   0.00%
StateData::define()                                   16  0.0001849  0.0001994  0.0002242   0.00%
FabArray::FillBoundary()                             316  0.0001839  0.0001995  0.0002188   0.00%
ParticleCopyPlan::doHandShake                          1  2.727e-05  0.0001557  0.0002065   0.00%
FabArrayBase::TheFPinfo()                            322  0.0001849  0.0001974  0.0002055   0.00%
Redistribute_partition                                 1  6.983e-06  4.689e-05  0.0001996   0.00%
amrex::unpackBuffer                                    1  9.493e-05  0.0001133  0.0001884   0.00%
FPinfo::FPinfo()                                      18  0.0001433  0.0001597  0.0001853   0.00%
DistributionMapping::SFCProcessorMapDoIt()            11  0.0001154  0.0001355   0.000167   0.00%
AmrLevel::FillRKPatch()                              252  0.0001353   0.000146  0.0001534   0.00%
BoxList::pci                                           4  7.494e-05  0.0001155  0.0001509   0.00%
ParticleCopyPlan::build                                1  0.0001092    0.00012  0.0001319   0.00%
ParticleBufferMap::define                              1  9.836e-05  0.0001057  0.0001194   0.00%
Amr::FinalizeInit()                                    1  1.243e-05  3.351e-05  0.0001112   0.00%
knapsack()                                            15  6.769e-05  9.174e-05  0.0001098   0.00%
FabArrayBase::getFB()                                316  9.546e-05  0.0001043  0.0001091   0.00%
BoxList::complementIn                                 34  7.494e-05   9.11e-05  0.0001082   0.00%
BinaryBHLevel::variableSetUp()                         1  5.568e-05  8.314e-05   0.000107   0.00%
amrex::unpackRemotes                                   1  1.824e-06  2.049e-05  0.0001059   0.00%
FillPatchSingleLevel                                  89  6.885e-05  8.624e-05  0.0001049   0.00%
ClusterList::intersect()                               1          0  1.308e-05  0.0001046   0.00%
Amr::bldFineLevels()                                   1  6.024e-05  7.713e-05  9.504e-05   0.00%
Amr::InitAmr()                                         1  5.606e-05  7.766e-05  9.196e-05   0.00%
AmrLevel::AmrLevel(dm)                                16  5.594e-05  6.481e-05  7.997e-05   0.00%
BinaryBHLevel::specificPostTimeStep()                 63  4.865e-05  5.677e-05  7.634e-05   0.00%
GRAMRLevel::errorEst()                                15  2.081e-05  3.411e-05   5.85e-05   0.00%
FillPatchIterator::FillFromTwoLevels()                43   4.25e-05  4.574e-05  4.765e-05   0.00%
ClusterList::chop()                                    1          0  5.911e-06  4.729e-05   0.00%
DistributionMapping::KnapSackDoIt()                   15  2.423e-05   3.06e-05  4.694e-05   0.00%
ParticleCopyPlan::buildMPIStart                        1  1.402e-05  2.342e-05  4.278e-05   0.00%
AmrMesh-cluster                                        1          0  4.769e-06  3.816e-05   0.00%
FillPatchIterator::FillFromLevel0()                    3  9.138e-06   2.33e-05  3.194e-05   0.00%
Amr::initialInit()                                     1  3.688e-06  1.204e-05  3.149e-05   0.00%
amrex::packBuffer                                      1  2.625e-06  2.098e-05  2.948e-05   0.00%
BoxList::parallelComplementIn()                       22  8.649e-06  1.554e-05  2.824e-05   0.00%
AmrLevel::RK()                                        63   1.69e-05  1.837e-05  2.049e-05   0.00%
Amr::InitializeInit()                                  1  9.568e-06  1.447e-05  1.952e-05   0.00%
ParticleContainer::defineBufferMap                     1  2.564e-06  5.697e-06  1.204e-05   0.00%
Amr::init()                                            1    4.9e-07  6.989e-07   9.03e-07   0.00%
Other                                                 19  2.239e-05   2.46e-05  3.044e-05   0.00%
-------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------
Name                                              NCalls  Incl. Min  Incl. Avg  Incl. Max   Max %
-------------------------------------------------------------------------------------------------
runGRTeclyn()                                          1      5.924      5.924      5.924 100.00%
Amr::coarseTimeStep()                                  1      5.812      5.812      5.812  98.11%
Amr::timeStep()                                       63      5.812      5.812      5.812  98.11%
GRAMRLevel::advance()                                 63        3.3      3.924      4.024  67.92%
AmrLevel::RK()                                        63      3.284      3.908      4.008  67.65%
RungeKutta4                                           63      3.284      3.908      4.008  67.65%
GRAMRLevel::post_timestep()                           63      1.787      1.888      2.511  42.38%
BinaryBHLevel::specificPostTimeStep()                 63     0.3662     0.6174      2.348  39.63%
CustomExtraction::execute_query()                     32     0.3661     0.6173      2.348  39.62%
FabArray::ParallelCopy()                             581     0.8277      1.939      2.311  39.00%
FabArray::ParallelCopy_finish()                      581     0.7379      1.856      2.229  37.63%
BinaryBHLevel::specificEvalRHS()                     252      1.916      1.928      1.942  32.77%
AmrLevel::FillRKPatch()                              252      1.057       1.68      1.791  30.23%
FillPatcher::fillRK()                                248      1.029      1.653      1.764  29.78%
AmrLevel::FillPatch()                                 46     0.1738      1.251      1.415  23.89%
FillPatchIterator::Initialize                         46     0.1635      1.241      1.405  23.72%
FillPatchIterator::FillFromTwoLevels()                43     0.1161      1.192      1.354  22.85%
FillPatchTwoLevels                                    43      0.116      1.192      1.354  22.85%
ParticleInterpolators::interp()                       32   0.008824      0.175      1.338  22.58%
FabArray::FillBoundary()                             316     0.2914      0.498     0.9769  16.49%
FillBoundary_finish()                                316     0.2537     0.4617     0.9387  15.84%
FillPatchSingleLevel                                  89    0.09942     0.4522     0.7987  13.48%
FabArrayBase::TheFPinfo()                            322   0.005469     0.4933     0.5646   9.53%
DistributionMapping::LeastUsedCPUs()                  29   0.005119     0.4932     0.5644   9.53%
FPinfo::FPinfo()                                      18   0.005273     0.4931     0.5644   9.53%
DistributionMapping::KnapSackDoIt()                   15   0.004663      0.492     0.5625   9.49%
ParticleInterpolators::interpolate_to_particle()      32     0.3549     0.3578     0.3606   6.09%
ParticleInterpolators::ensure_redistributed()         32    0.00337     0.0914     0.3484   5.88%
ParticleContainer::RedistributeGPU()                   1   0.001231    0.08637     0.3405   5.75%
amrex::communicateParticlesFinish                      1   3.41e-07    0.08477     0.3391   5.72%
AmrLevel::storeRKCoarseData()                         31     0.1397     0.1543     0.1648   2.78%
FillPatcher::storeRKCoarseData()                      31     0.1389     0.1535     0.1639   2.77%
FillPatchInterp(Fab)                                 291    0.05865    0.08191     0.1253   2.11%
StateDataPhysBCFunct::()                             589    0.08038      0.101     0.1203   2.03%
Amr::init()                                            1     0.1104     0.1105     0.1105   1.87%
Amr::initialInit()                                     1     0.1104     0.1105     0.1105   1.87%
Amr::FinalizeInit()                                    1     0.1084     0.1084     0.1086   1.83%
Amr::bldFineLevels()                                   1     0.1083     0.1084     0.1085   1.83%
Amr::grid_places()                                     7    0.09951    0.09962    0.09968   1.68%
AmrMesh::MakeNewGrids()                                7    0.09924    0.09947    0.09959   1.68%
FabArray::ParallelCopy_nowait()                      581    0.06906    0.08302    0.09398   1.59%
GRAMRLevel::errorEst()                                15    0.05999    0.06299    0.06548   1.11%
FillPatchIterator::FillFromLevel0()                    3    0.04393    0.04668    0.04923   0.83%
StateData::FillBoundary(geom)                       2648    0.02667    0.03603    0.04837   0.82%
Amr::regrid()                                          2    0.03875    0.03879    0.03886   0.66%
FillBoundary_nowait()                                316    0.03278    0.03614    0.03848   0.65%
MultiFab::contains_nan()                              63    0.01503    0.01512    0.01519   0.26%
TagBoxArray::mapPRD                                   15   0.007116   0.009241    0.01146   0.19%
amrex::Copy()                                         46   0.009439   0.009557    0.00962   0.16%
BinaryBHLevel::initialData                            16   0.009333   0.009372   0.009412   0.16%
CellQuartic::interp()                                391    0.00559   0.006972   0.009323   0.16%
FabArrayBase::getCPC()                               581   0.004771   0.004965   0.005187   0.09%
FabArray::setVal()                                    30  0.0008149   0.001375   0.005071   0.09%
FabArrayBase::CPC::define()                          176   0.004399   0.004602   0.004824   0.08%
FabArray::setDomainBndry()                            89   0.001956    0.00242   0.002879   0.05%
TagBoxArray::collate()                                15   0.001654   0.002309   0.002839   0.05%
amrex::communicateParticlesStart                       1  1.001e-05  0.0003037   0.002282   0.04%
Amr::InitializeInit()                                  1   0.001859   0.002017   0.002077   0.04%
Amr::defBaseLevel()                                    1   0.001841   0.002003   0.002061   0.03%
FabArrayBase::getFB()                                316   0.001331   0.001384   0.001413   0.02%
OwnerMask()                                           15   0.001234   0.001296   0.001386   0.02%
FabArrayBase::FB::FB()                                 8   0.001227    0.00128   0.001307   0.02%
BinaryBHLevel::tag_cells()                            15  0.0005558  0.0005873  0.0007328   0.01%
DistributionMapping::SFCProcessorMapDoIt()            11  0.0002806  0.0005028  0.0006047   0.01%
ParticleInterpolators::populate_from_query()           1  1.796e-05  0.0001405  0.0004155   0.01%
DenseBins<T>::buildGPU                                 6  0.0003028  0.0003301  0.0003554   0.01%
ParticleCopyPlan::build                                1  0.0001847  0.0002991  0.0003507   0.01%
UtilCreateDirectoryDestructive()                       1  4.192e-05  0.0002296  0.0003258   0.01%
AmrLevel::AmrLevel(dm)                                16  0.0002439  0.0002642  0.0003019   0.01%
BoxList::parallelComplementIn()                       22  0.0001814  0.0002159  0.0002506   0.00%
UtilRenameDirectoryToOld()                             0          0  3.032e-05  0.0002426   0.00%
ParticleCopyPlan::buildMPIStart                        1  5.286e-05  0.0001791  0.0002326   0.00%
StateData::define()                                   16  0.0001849  0.0001994  0.0002242   0.00%
ParticleCopyPlan::doHandShake                          1  2.727e-05  0.0001557  0.0002065   0.00%
Redistribute_partition                                 1  6.983e-06  4.689e-05  0.0001996   0.00%
AmrMesh-cluster                                        1          0  2.376e-05  0.0001901   0.00%
amrex::unpackBuffer                                    1  9.493e-05  0.0001133  0.0001884   0.00%
Amr::InitAmr()                                         1  0.0001362  0.0001628   0.000184   0.00%
BoxList::pci                                           4  7.494e-05  0.0001155  0.0001509   0.00%
ParticleContainer::defineBufferMap                     1  0.0001018  0.0001114   0.000123   0.00%
ParticleBufferMap::define                              1  9.836e-05  0.0001057  0.0001194   0.00%
knapsack()                                            15  6.769e-05  9.174e-05  0.0001098   0.00%
BoxList::complementIn                                 34  7.494e-05   9.11e-05  0.0001082   0.00%
BinaryBHLevel::variableSetUp()                         1  5.568e-05  8.314e-05   0.000107   0.00%
amrex::unpackRemotes                                   1  1.824e-06  2.049e-05  0.0001059   0.00%
ClusterList::intersect()                               1          0  1.308e-05  0.0001046   0.00%
ClusterList::chop()                                    1          0  5.911e-06  4.729e-05   0.00%
amrex::packBuffer                                      1  2.625e-06  2.098e-05  2.948e-05   0.00%
Other                                                 19  2.239e-05   2.46e-05  3.044e-05   0.00%
-------------------------------------------------------------------------------------------------

Unused ParmParse Variables:
  [TOP]::amr.plot_vars(nvals = 2)  :: [chi, Theta]
  [TOP]::extraction_levels(nvals = 2)  :: [3, 2]
  [TOP]::extraction_radii(nvals = 2)  :: [110.0, 150.0]
  [TOP]::modes(nvals = 6)  :: [2, 0, 2, 1, 2, 2]
  [TOP]::num_extraction_radii(nvals = 1)  :: [2]
  [TOP]::num_modes(nvals = 1)  :: [3]
  [TOP]::num_nonzero_asymptotic_vars(nvals = 1)  :: [5]
  [TOP]::num_points_phi(nvals = 1)  :: [32]
  [TOP]::num_points_theta(nvals = 1)  :: [48]
  [TOP]::vars_parity(nvals = 25)  :: [0, 0, 4, 6, 0, 5, 0, 0, 0, 4, 6, 0, 5, 0, 0, 1, 2, 3, 0, 1, 2, 3, 1, 2, 3]
  [TOP]::write_extraction(nvals = 1)  :: [0]

Device Memory Usage:
---------------------------------------------------------------------------------------------------------------------------------------
Name                                              Nalloc  Nfree  AvgMem min  AvgMem avg  AvgMem max  MaxMem min  MaxMem avg  MaxMem max
---------------------------------------------------------------------------------------------------------------------------------------
The_Arena::Initialize()                                8      8      70 MiB      74 MiB      77 MiB      59 GiB      59 GiB      59 GiB
StateData::define()                                 1788   1788    1128 MiB    1133 MiB    1138 MiB    1423 MiB    1428 MiB    1433 MiB
GRAMRLevel::advance()                                876    876    1094 MiB    1099 MiB    1104 MiB    1140 MiB    1146 MiB    1151 MiB
RungeKutta4                                        44512  44512     325 MiB     398 MiB     412 MiB     800 MiB     800 MiB     800 MiB
FillPatchIterator::Initialize                       7004   7004    3406 KiB      40 MiB      45 MiB     202 MiB     202 MiB     202 MiB
FillPatcher::storeRKCoarseData()                    1530   1530     121 MiB     147 MiB     168 MiB     127 MiB     154 MiB     177 MiB
GRAMRLevel::post_timestep()                        16272  16272     329 KiB     452 KiB     518 KiB     114 MiB     124 MiB     135 MiB
FillPatcher::fillRK()                               2148   2148      66 MiB      80 MiB      91 MiB      70 MiB      85 MiB      97 MiB
ResizeRandomSeed                                       8      8      40 MiB      40 MiB      40 MiB      40 MiB      40 MiB      40 MiB
CellQuartic::interp()                               6260   6260     142 KiB     278 KiB     582 KiB      17 MiB      28 MiB      35 MiB
FillPatchTwoLevels                                  1364   1364      74 KiB    1111 KiB    2134 KiB    8817 KiB      12 MiB      17 MiB
AmrMesh::MakeNewGrids()                             3248   3248      21 KiB      21 KiB      21 KiB    1716 KiB    1716 KiB    1716 KiB
ParticleContainer::RedistributeGPU()                 104    104      25 KiB      25 KiB      25 KiB     999 KiB     999 KiB     999 KiB
FillBoundary_nowait()                                240    240     564 KiB     582 KiB     593 KiB     608 KiB     629 KiB     641 KiB
ParticleInterpolators::populate_from_query()           5      5       0   B       5   B      47   B       0   B      37 KiB     301 KiB
FabArray::ParallelCopy_finish()                     2944   2944      82 KiB     109 KiB     132 KiB     145 KiB     204 KiB     246 KiB
FabArray::ParallelCopy_nowait()                     4855   4855      93 KiB     104 KiB     120 KiB     197 KiB     221 KiB     246 KiB
ParticleInterpolators::interpolate_to_particle()    1536   1536      86   B      91   B     100   B     216 KiB     216 KiB     216 KiB
amrex::packBuffer                                      1      1       0   B    1414   B      11 KiB       0   B      24 KiB     193 KiB
FillBoundary_finish()                                120    120     140 KiB     148 KiB     153 KiB     152 KiB     160 KiB     166 KiB
Redistribute_partition                                 6      6       0   B     736   B    5889   B       0   B      14 KiB     112 KiB
amrex::unpackBuffer                                    4      4       0   B      12 KiB      97 KiB       0   B      13 KiB     105 KiB
amrex::unpackRemotes                                   6      6       0   B      11 KiB      91 KiB       0   B      13 KiB     104 KiB
amrex::communicateParticlesStart                      10     10       0   B     709   B    5674   B      16   B      12 KiB      96 KiB
ParticleInterpolators::interp()                      160    160       0   B       3   B      14   B       0   B      10 KiB      43 KiB
OwnerMask()                                         1700   1700       1   B       1   B       2   B      28 KiB      35 KiB      43 KiB
DenseBins<T>::buildGPU                               288    288      38 KiB      38 KiB      38 KiB      42 KiB      42 KiB      42 KiB
MultiFab::contains_nan()                             504    504      69   B      69   B      70   B      27 KiB      27 KiB      27 KiB
ParticleCopyPlan::build                               49     49       0   B     234   B    1428   B    8480   B      10 KiB      25 KiB
BinaryBHLevel::initialData                           216    216      21 KiB      21 KiB      21 KiB      21 KiB      21 KiB      21 KiB
TagBoxArray::mapPRD                                 3160   3160      11   B      16   B      23   B      16 KiB      16 KiB      16 KiB
BinaryBHLevel::specificEvalRHS()                    2024   2024    5535   B    5571   B    5617   B      16 KiB      16 KiB      16 KiB
amrex::Copy()                                        480    480    9109   B    9177   B    9245   B      13 KiB      13 KiB      14 KiB
FabArray::setVal()                                   312    312    5874   B    5914   B    5955   B      13 KiB      13 KiB      13 KiB
StateData::FillBoundary(geom)                      21692  21692      14   B      20   B      29   B    5120   B    8528   B      10 KiB
ParticleBufferMap::define                             24     24    7192   B    7192   B    7193   B    7744   B    7744   B    7744   B
BinaryBHLevel::tag_cells()                            72     72    5824   B    5863   B    5903   B    5904   B    5944   B    5984   B
TagBoxArray::collate()                               314    314       0   B       0   B       0   B     496   B     716   B    1088   B
---------------------------------------------------------------------------------------------------------------------------------------

Managed Memory Usage:
----------------------------------------------------------------------------------------------------------------------
Name                             Nalloc  Nfree  AvgMem min  AvgMem avg  AvgMem max  MaxMem min  MaxMem avg  MaxMem max
----------------------------------------------------------------------------------------------------------------------
The_Managed_Arena::Initialize()       8      8      96   B     145   B     221   B    8192 KiB    8192 KiB    8192 KiB
----------------------------------------------------------------------------------------------------------------------

Pinned Memory Usage:
---------------------------------------------------------------------------------------------------------------------------------------
Name                                              Nalloc  Nfree  AvgMem min  AvgMem avg  AvgMem max  MaxMem min  MaxMem avg  MaxMem max
---------------------------------------------------------------------------------------------------------------------------------------
The_Pinned_Arena::Initialize()                         8      8    6195   B    6793   B    7123   B    8192 KiB    8192 KiB    8192 KiB
FillBoundary_nowait()                                240    240     564 KiB     582 KiB     593 KiB     608 KiB     629 KiB     641 KiB
FabArray::ParallelCopy_finish()                     2944   2944      82 KiB     109 KiB     132 KiB     145 KiB     204 KiB     246 KiB
FabArray::ParallelCopy_nowait()                     4855   4855      93 KiB     104 KiB     120 KiB     197 KiB     221 KiB     246 KiB
FillBoundary_finish()                                120    120     140 KiB     148 KiB     153 KiB     152 KiB     160 KiB     166 KiB
ParticleInterpolators::interp()                      320    320       0   B       2   B      10   B       0   B      13 KiB      52 KiB
OwnerMask()                                          120    120       0   B       0   B       0   B      20 KiB      27 KiB      35 KiB
ParticleContainer::RedistributeGPU()                 624    624      24 KiB      26 KiB      29 KiB      27 KiB      29 KiB      32 KiB
BinaryBHLevel::initialData                           216    216      21 KiB      21 KiB      21 KiB      21 KiB      21 KiB      21 KiB
BinaryBHLevel::specificEvalRHS()                    2024   2024    5535   B    5571   B    5618   B      16 KiB      16 KiB      16 KiB
amrex::Copy()                                        480    480    9109   B    9177   B    9245   B      13 KiB      13 KiB      14 KiB
FabArray::setVal()                                   312    312    5874   B    5914   B    5955   B      13 KiB      13 KiB      13 KiB
RungeKutta4                                           48     48      13 KiB      13 KiB      13 KiB      13 KiB      13 KiB      13 KiB
StateData::FillBoundary(geom)                      21692  21692      14   B      21   B      29   B    5184   B    8620   B      10 KiB
BinaryBHLevel::tag_cells()                            72     72    5824   B    5863   B    5903   B    5904   B    5944   B    5984   B
FillPatcher::fillRK()                               1536   1536    2907   B    4327   B    5357   B    3072   B    4560   B    5664   B
ParticleCopyPlan::build                                8      8       0   B      55   B     220   B    3856   B    3856   B    3856   B
TagBoxArray::collate()                               457    457       0   B       0   B       0   B     496   B    1062   B    3808   B
AmrMesh::MakeNewGrids()                               88     88       0   B       0   B       0   B    2016   B    2412   B    2816   B
ParticleCopyPlan::buildMPIStart                        2      2       0   B       0   B       1   B       0   B       6   B      32   B
MultiFab::contains_nan()                             504    504       0   B       0   B       0   B      16   B      16   B      16   B
ParticleInterpolators::interpolate_to_particle()    1536   1536       0   B       0   B       0   B      16   B      16   B      16   B
Redistribute_partition                                 1      1       0   B       0   B       0   B       0   B       2   B      16   B
---------------------------------------------------------------------------------------------------------------------------------------

Comms Memory Usage:
----------------------------------------------------------------------------------------------------------------------
Name                             Nalloc  Nfree  AvgMem min  AvgMem avg  AvgMem max  MaxMem min  MaxMem avg  MaxMem max
----------------------------------------------------------------------------------------------------------------------
FabArray::ParallelCopy_nowait()    9134   9134    2623 KiB    7201 KiB      11 MiB      35 MiB      45 MiB      57 MiB
FillBoundary_nowait()              5056   5056    1476 KiB    2988 KiB    6378 KiB      49 MiB      49 MiB      49 MiB
The_Comms_Arena::Initialize()         8      8     232   B     859   B    1580   B    8192 KiB    8192 KiB    8192 KiB
----------------------------------------------------------------------------------------------------------------------

Comment

In the 8 GPU case, CustomExtraction::execute_query() takes ~40% of the runtime. I don't think that's too bad: the frequency of interpolation on level 5 is probably a bit higher than we would typically use for GW extraction, and the configuration was small enough to fit on a single GPU, so it is definitely not close to fully occupying 8 GPUs. There is probably still some scope for optimisation, and I will keep it in mind when reviewing, but I don't think this is a bad start and I would be fine with this as is.
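
For reference, the ~40% figure can be cross-checked against the inclusive table above. This is a minimal sketch, assuming the Max % column is the inclusive maximum divided by the total runGRTeclyn() time (an assumption about the output, not something the profiler states here). percent_of_runtime and print_shares are hypothetical helper names.

```cpp
#include <cstdio>

// Hypothetical helper: share of the total runtime as a percentage.
// Assumes "Max %" = (inclusive max time) / (total runtime) * 100.
double percent_of_runtime(double incl_max, double total_runtime)
{
    return 100.0 * incl_max / total_runtime;
}

// Values copied from the inclusive table above (seconds, as printed).
constexpr double total_runtime = 5.924;     // runGRTeclyn()
constexpr double execute_query_max = 2.348; // CustomExtraction::execute_query()
constexpr double interp_max = 1.338;        // ParticleInterpolators::interp()

// Print the share of the runtime spent in extraction and interpolation.
void print_shares()
{
    std::printf("execute_query: %.1f%%\n",
                percent_of_runtime(execute_query_max, total_runtime));
    std::printf("interp:        %.1f%%\n",
                percent_of_runtime(interp_max, total_runtime));
}
```

Calling print_shares() gives roughly 39.6% and 22.6%, which agrees with the table to within the rounding of the printed times.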

@tamaraevst
Author

> I have done/am doing some basic profiling which I will add in a separate comment.
>
> This PR is massive so is going to be quite difficult to review. In an effort to make it easier, when you are ready for this to be reviewed, please could you move all of the *Extraction classes into a separate pull request.
>
> Again, when you are happy for this to be reviewed, could you also remove the debugging printing? I guess some of it can be kept, cleaned up and conditionally printed depending on some kind of verbosity parameter.
>
> Once my AMReX PR (AMReX-Codes/amrex#4780), which adds the ability for Amr to compute a derived quantity on all levels, is merged, we should add an interface to ParticleInterpolator such that only the name of the derived quantity is needed, e.g. Weyl4. ParticleInterpolator can then call Amr::derive to compute the derived quantity and do the interpolation, without the user needing to call it and pass the Vector<MultiFab*> themselves.

Thanks Miren, the Extraction* classes are all dependent on particle interpolators, so I am unsure how you would like me to separate the two. Could you please clarify?

@mirenradia
Member

> Thanks Miren, the Extraction* classes are all dependent on particle interpolators, so I am unsure how you would like me to separate the two. Could you please clarify?

Yes that's correct but the dependency is just in one direction. None of the rest of the particle interpolator infrastructure depends on the Extraction* classes. The extraction classes can be moved in a separate PR that has this one as a pre-requisite (i.e. this would need to be merged first). Does that make sense?

tamaraevst and others added 4 commits November 17, 2025 12:50
These classes and the complete structure of extraction+interpolation can
be seen on the newly created branch particle-interp-extraction
@github-actions

This PR modifies the following files which are ignored by .lint-ignore:

Source/AMRInterpolator/DerivativeSetup.hpp
Source/ParticleInterpolators/InterpolationQueryParticle.hpp
Source/ParticleInterpolators/LagrangeInterpolation.hpp
Source/ParticleInterpolators/Make.package
Source/ParticleInterpolators/ParticleInterpolators.hpp
Source/ParticleInterpolators/ParticleInterpolators.impl.hpp

Please consider removing the corresponding patterns from .lint-ignore so that these files can be linted.

@tamaraevst
Author

> > Thanks Miren, the Extraction* classes are all dependent on particle interpolators, so I am unsure how you would like me to separate the two. Could you please clarify?
>
> Yes that's correct but the dependency is just in one direction. None of the rest of the particle interpolator infrastructure depends on the Extraction* classes. The extraction classes can be moved in a separate PR that has this one as a pre-requisite (i.e. this would need to be merged first). Does that make sense?

Cool, thanks. So I have removed all of the Extraction* classes and tidied up the debugging statements. I still have some linter issues, but I hope this will not be a bottleneck for the review.

All of the Extraction* related stuff has been moved to a new branch. Weyl4 extraction seems to be working now using the new ability in amrex to derive vars at all levels, but we will be addressing this in a separate PR.

@mirenradia
Member

I have now updated my comment above with the missing profiling info. Sorry for the delay on this.

Member

@mirenradia mirenradia left a comment

OK, I have finally gone through all of the changes in this PR. It took quite a long time!

Sorry for the very large number of comments. Many of them are small or straightforward to fix (e.g. just changing variable names).

I think the largest changes, which will involve the most refactoring, are unsurprisingly in the ParticleInterpolators files, so I would probably look at those files first and start with the most involved changes.

With my suggested change for this class to compute derived quantities, I'm wondering how much sense the InterpolationQueryParticle class structure makes. I guess the user still needs to provide the coordinates and the output pointers, but specifying the component indices is a bit weird. I haven't thought of a better idea though...
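
To make the question above concrete, here is one possible shape for a name-based query. This is only a sketch under assumptions: NamedInterpolationQuery, set_coords and add_quantity are hypothetical names, not the InterpolationQueryParticle class in this PR, and the lookup from a name such as Weyl4 to actual data (e.g. via Amr::derive) is left out entirely.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: a query that identifies quantities by name
// rather than by component index. Not the class used in this PR.
class NamedInterpolationQuery
{
  public:
    // One requested quantity and the user-owned buffer to fill.
    struct Request
    {
        std::string name; // e.g. "Weyl4"
        double *out;      // must have room for num_points() values
    };

    explicit NamedInterpolationQuery(std::size_t num_points)
        : m_num_points(num_points)
    {
    }

    // Register the coordinate array for one spatial direction.
    // 'coords' must point to num_points() values owned by the caller.
    NamedInterpolationQuery &set_coords(int dir, const double *coords)
    {
        m_coords.emplace_back(dir, coords);
        return *this;
    }

    // Request a quantity by name and say where the result should go.
    NamedInterpolationQuery &add_quantity(std::string name, double *out)
    {
        m_requests.push_back({std::move(name), out});
        return *this;
    }

    std::size_t num_points() const { return m_num_points; }
    const std::vector<Request> &requests() const { return m_requests; }
    const std::vector<std::pair<int, const double *>> &coords() const
    {
        return m_coords;
    }

  private:
    std::size_t m_num_points;
    std::vector<std::pair<int, const double *>> m_coords;
    std::vector<Request> m_requests;
};
```

With something like this, the user would still supply coordinates and output pointers, e.g. `query.set_coords(0, x).add_quantity("Weyl4", out);`, but would not need to know component indices.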

In a couple of files, there are a few changes (e.g. to whitespace) that we should remove from this PR as they're irrelevant (I've commented on the individual files). Note that if you're using pre-commit (which will force a whitespace check), you will need to pass --no-verify to skip the pre-commit checks when committing the restored files.

Comment on lines +45 to 58

// we do not need std::unique_ptr here strictly speaking. We can actually omit
// this step.
void GRAMR::convert_derived_multifabs(
    const amrex::Vector<std::unique_ptr<amrex::MultiFab>> &inputs,
    amrex::Vector<const amrex::MultiFab *> &fields)
{
    fields.clear();
    fields.reserve(inputs.size());
    for (auto const &level_content : inputs)
    {
        fields.push_back(level_content.get());
    }
}
Member

Can we remove this function? We can instead use amrex::GetVecOfConstPtrs() or amrex::GetVecOfPtrs() defined in AMReX_Vector.H.
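
For context, the conversion in question just collects non-owning const pointers from a vector of unique_ptrs. Below is a minimal, AMReX-free sketch of that behaviour. FakeMultiFab and get_vec_of_const_ptrs are hypothetical stand-ins so that the example compiles on its own; they are not AMReX code.

```cpp
#include <memory>
#include <vector>

// Stand-in for amrex::MultiFab so the sketch compiles without AMReX.
struct FakeMultiFab
{
    int id;
};

// Stand-in mirroring the behaviour suggested above for
// amrex::GetVecOfConstPtrs(): return non-owning const pointers,
// in the same order as the owning unique_ptrs.
template <class T>
std::vector<const T *>
get_vec_of_const_ptrs(const std::vector<std::unique_ptr<T>> &owners)
{
    std::vector<const T *> ptrs;
    ptrs.reserve(owners.size());
    for (const auto &owner : owners)
    {
        ptrs.push_back(owner.get());
    }
    return ptrs;
}

// With AMReX available, the call site would presumably become
//   auto fields = amrex::GetVecOfConstPtrs(inputs);
// in place of GRAMR::convert_derived_multifabs(inputs, fields).
```

The returned pointers are only valid while the owning vector is alive, which is the same lifetime constraint the current helper has.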

Member

Do we need another parameter file? Can we just remove it?

Labels

enhancement New feature or request

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Port AMRInterpolator

3 participants