Write metadata to a dedicated JSON file#737

Merged
svchb merged 76 commits into trixi-framework:dev from marcelschurer:write2json
Sep 12, 2025
Conversation

@marcelschurer
Contributor

@marcelschurer marcelschurer commented Mar 24, 2025

This PR adds the new function write_meta_data(), which exports simulation_info and system_data to a dedicated JSON file at the start of the simulation. Previously, metadata was stored within the VTK output files. The structure and design of the new functionality closely follow the existing implementation in write_vtk.jl for consistency.

This is what the JSON file looks like for the hydrostatic_water_column_2d.jl example:

meta.json:

{
  "simulation_info": {
    "julia_version": "1.11.5",
    "technical_setup": {
      "#threads": 8,
      "parallelization_backend": "PolyesterBackend"
    },
    "time_integrator": {
      "start_time": 0.0,
      "reltol": 0.001,
      "integrator_type": "RDPK3SpFSAL35",
      "adaptive": true,
      "final_time": 1.0,
      "controller": "PIDController",
      "abstol": 1.0e-6
    },
    "solver_version": "v0.2.6-67-g64fd67424",
    "solver_name": "TrixiParticles.jl"
  },
  "system_data": {
    "_fluid_1": {
      "acceleration": [
        0.0,
        -9.81
      ],
      "pressure_acceleration_formulation": "pressure_acceleration_continuity_density",
      "state_equation": {
        "model": "StateEquationCole",
        "reference_density": 1000.0,
        "background_pressure": 0.0,
        "exponent": 7.0
      },
      "density_calculator": "ContinuityDensity",
      "particle_spacing": 0.05,
      "viscosity_model": {
        "model": "ArtificialViscosityMonaghan",
        "alpha": 0.02,
        "epsilon": 0.01,
        "beta": 0.0
      },
      "system_type": "WeaklyCompressibleSPHSystem",
      "smoothing_length": 0.06,
      "sound_speed": 10.0,
      "smoothing_kernel": "SchoenbergCubicSplineKernel"
    },
    "_boundary_1": {
      "boundary_model": {
        "state_equation": {
          "model": "StateEquationCole",
          "reference_density": 1000.0,
          "background_pressure": 0.0,
          "exponent": 7.0
        },
        "density_calculator": "AdamiPressureExtrapolation",
        "model": "BoundaryModelDummyParticles",
        "smoothing_length": 0.06,
        "smoothing_kernel": "SchoenbergCubicSplineKernel"
      },
      "particle_spacing": 0.05,
      "system_type": "BoundarySPHSystem",
      "adhesion_coefficient": 0.0
    }
  }
}
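
As a minimal sketch of how such a file could be consumed outside Julia, the following Python snippet parses an excerpt of the metadata above and looks up a few fields. The excerpt is embedded as a string so the example is self-contained; in practice one would read meta.json from the simulation output directory, whose location is not specified here. The helper name `system_summary` is hypothetical and not part of TrixiParticles.jl.

```python
import json

# Excerpt of the metadata shown above, embedded so the example is self-contained.
META_JSON = """
{
  "simulation_info": {
    "solver_name": "TrixiParticles.jl",
    "time_integrator": {"start_time": 0.0, "final_time": 1.0, "adaptive": true}
  },
  "system_data": {
    "_fluid_1": {
      "system_type": "WeaklyCompressibleSPHSystem",
      "smoothing_kernel": "SchoenbergCubicSplineKernel",
      "particle_spacing": 0.05
    },
    "_boundary_1": {
      "system_type": "BoundarySPHSystem",
      "particle_spacing": 0.05
    }
  }
}
"""


def system_summary(meta):
    """Map each system name to its system type (hypothetical helper)."""
    return {name: data["system_type"] for name, data in meta["system_data"].items()}


meta = json.loads(META_JSON)
summary = system_summary(meta)
for name, system_type in summary.items():
    print(f"{name}: {system_type}")
```

Because the file is plain JSON, the same lookup works in any language with a JSON parser, which is the accessibility argument made later in this thread.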

Remark: Unit tests for this function will be added in a follow-up PR. In that upcoming PR, I will implement the functionality to read meta_data alongside the simulation data in read_vtk.jl.
See issue #848

@marcelschurer marcelschurer changed the title write2json Write metadata to a dedicated JSON file Mar 24, 2025
@codecov

codecov bot commented May 19, 2025

Codecov Report

❌ Patch coverage is 2.84553% with 239 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (dev@6010689). Learn more about missing BASE report.

Files with missing lines                        Patch %   Lines
src/io/io.jl                                    0.90%     220 Missing ⚠️
src/io/write_vtk.jl                             27.77%    13 Missing ⚠️
src/callbacks/post_process.jl                   0.00%     3 Missing ⚠️
src/callbacks/solution_saving.jl                0.00%     2 Missing ⚠️
src/preprocessing/particle_packing/system.jl    0.00%     1 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##             dev     #737   +/-   ##
======================================
  Coverage       ?   69.04%           
======================================
  Files          ?      108           
  Lines          ?     7386           
  Branches       ?        0           
======================================
  Hits           ?     5100           
  Misses         ?     2286           
  Partials       ?        0           
Flag Coverage Δ
unit 69.04% <2.84%> (?)

@LasNikas LasNikas marked this pull request as ready for review May 21, 2025 08:47
Comment thread src/io/write_vtk.jl Outdated
Comment thread src/preprocessing/particle_packing/system.jl Outdated
Comment thread src/TrixiParticles.jl Outdated
Comment thread src/io/io.jl Outdated
Comment thread src/callbacks/post_process.jl Outdated
Comment thread src/io/io.jl Outdated
@marcelschurer marcelschurer requested a review from LasNikas June 4, 2025 07:38
@LasNikas LasNikas requested review from efaulhaber and svchb June 4, 2025 07:53
@LasNikas
Collaborator

LasNikas commented Jun 4, 2025

/run-gpu-tests

Comment thread src/io/io.jl Outdated
Comment thread src/io/io.jl Outdated
Comment thread src/io/io.jl Outdated
Comment thread src/io/io.jl
Comment thread src/io/io.jl Outdated
Comment thread src/io/io.jl Outdated
Comment thread src/io/io.jl Outdated
Comment thread src/io/io.jl
@sloede
Member

sloede commented Sep 6, 2025

Out of curiosity - what's the purpose of having JSON files? Is this for some post processing or just to make simulations "reproducible"? If the postprocessing is done by Julia, why not serialize the relevant TrixiP objects into a JLD2 file or something similar?

I'm just asking since before, JSON would also have been my go-to move, but given that it can never fully capture what we can do in Julia itself (e.g., it can never accurately represent an initial condition we only supply as a function), it is also somewhat limiting.
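
The limitation described here can be illustrated with a small sketch. It uses Python's standard `json` module rather than the Julia JSON package used by the PR, and the function `initial_velocity` and the settings dictionaries are made-up stand-ins: plain numbers round-trip without loss, while a function has no JSON representation at all.

```python
import json


def initial_velocity(x, y):
    """Stand-in for an initial condition supplied as a function."""
    return (0.0, -9.81 * y)


plain_settings = {"sound_speed": 10.0, "smoothing_length": 0.06}
settings_with_function = {**plain_settings, "initial_velocity": initial_velocity}

# Plain numbers and strings serialize and round-trip without loss.
round_tripped = json.loads(json.dumps(plain_settings))

# A function has no JSON representation, so the encoder raises TypeError.
try:
    json.dumps(settings_with_function)
    function_serialized = True
except TypeError:
    function_serialized = False

# The most JSON can keep is a label such as the function's name.
name_only = json.dumps({"initial_velocity": initial_velocity.__name__})
```

Storing only a name records which function was used, but not what it computes, so the metadata alone cannot reconstruct it.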

@efaulhaber
Member

The main goal is to restart a simulation. For this, we need the particle data, which we can read from the exported VTK files, and the additional data like which smoothing kernel was used. This additional data is supposed to be stored here.

But this is a good point. For example, for boundary movement we want to be able to store a Julia function.

@LasNikas
Collaborator

LasNikas commented Sep 6, 2025

Currently, we write all meta information into fields of every file of the VTK series. The idea was to store the static meta information in a separate file at the start of the simulation, also to improve reproducibility.
However, I agree with you that for an exact restart, it would be more practical to handle this with JLD2.
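
The difference between the two layouts can be sketched as follows. This is only an illustration under assumed names: the function `write_run` and the `step_<i>.json` files are hypothetical stand-ins for the VTK series, and the real implementation lives in the Julia source files touched by this PR.

```python
import json
import tempfile
from pathlib import Path


def write_run(output_dir, static_meta, n_steps):
    """Write static metadata once, then one small record per time step.

    Hypothetical layout: `meta.json` plus `step_<i>.json` files standing in
    for the VTK series.
    """
    output_dir = Path(output_dir)
    (output_dir / "meta.json").write_text(json.dumps(static_meta, indent=2))
    for step in range(n_steps):
        record = {"step": step, "time": 0.1 * step}  # only time-dependent data
        (output_dir / f"step_{step}.json").write_text(json.dumps(record))


static_meta = {"solver_name": "TrixiParticles.jl", "smoothing_length": 0.06}

with tempfile.TemporaryDirectory() as tmp:
    write_run(tmp, static_meta, n_steps=3)
    written = sorted(p.name for p in Path(tmp).iterdir())
    reloaded = json.loads((Path(tmp) / "meta.json").read_text())
```

The static information is written exactly once, so the per-step files carry only data that actually changes over time.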

@sloede
Member

sloede commented Sep 9, 2025

Why not follow the Trixi.jl strategy for restarts:

Just store the elixir along with your results, and to restart, you replace the part that creates the ODE problem with some initial conditions by a call to something like load_restart_file that loads the simulation state from a restart file.

This way you have all information you need and can even change things up in an easy manner if you plan on modifying some of the methods used. IMHO this is much less opaque than loading a JLD2 file (which is potentially harmful since it executes unseen code).

@efaulhaber
Member

We had this discussion a while ago, but I don't remember the reason. @svchb?

I think the reasoning was the following. If you modified an example file to run a simulation, you would have to do the exact same modification again in order to load the particle data to do interpolation and/or restart the simulation. With the metadata approach, all you need to load/restart is the output data.

Maybe we need to discuss this again, @LasNikas?

@LasNikas
Collaborator

LasNikas commented Sep 9, 2025

The main reason I’d like to have a separate meta file is to provide all simulation metadata in a general, accessible file format. Up to now, this information has been stored in the VTK files for every time step, but since it’s embedded in the binary dump, it’s not easy to quickly inspect or use for postprocessing or reproducibility. By writing the metadata to a dedicated file, we make it much easier to access and use this information.

So again (sorry), the primary motivation here is to store metadata separately and in a clear, human-readable way.
Regarding restart functionality, I agree that we need to discuss further how best to handle this.

@sloede
Member

sloede commented Sep 10, 2025

If it's really just about storing some metadata for further processing (which does not necessarily take place in Julia), then writing a JSON file seems reasonable - especially if it is not intended to be used as the mechanism for restarting simulations 👍

However, IMHO it's a common misconception that JSON is a human-readable file format. Technically it is, yes, but it clearly wasn't designed for that purpose, and as soon as the data gets just a little more complex, actual human readability is lost.

If you want human readability, why not go for TOML? For example, the meta.toml from above would look like this, which IMHO is much nicer to parse as a human being (plus you can add comments!):

[simulation_info]
julia_version = "1.11.5"
solver_version = "v0.2.6-67-g64fd67424"
solver_name = "TrixiParticles.jl"

    [simulation_info.technical_setup]
    "#threads" = 8
    parallelization_backend = "PolyesterBackend"

    [simulation_info.time_integrator]
    start_time = 0.0
    reltol = 0.001
    integrator_type = "RDPK3SpFSAL35"
    adaptive = true
    final_time = 1.0
    controller = "PIDController"
    abstol = 1.0e-6

[system_data._fluid_1]
acceleration = [0.0, -9.81]
pressure_acceleration_formulation = "pressure_acceleration_continuity_density"
density_calculator = "ContinuityDensity"
particle_spacing = 0.05
system_type = "WeaklyCompressibleSPHSystem"
smoothing_length = 0.06
sound_speed = 10.0
smoothing_kernel = "SchoenbergCubicSplineKernel"

    [system_data._fluid_1.state_equation]
    model = "StateEquationCole"
    reference_density = 1000.0
    background_pressure = 0.0
    exponent = 7.0

    [system_data._fluid_1.viscosity_model]
    model = "ArtificialViscosityMonaghan"
    alpha = 0.02
    epsilon = 0.01
    beta = 0.0

[system_data._boundary_1]
particle_spacing = 0.05
system_type = "BoundarySPHSystem"
adhesion_coefficient = 0.0

    [system_data._boundary_1.boundary_model]
    density_calculator = "AdamiPressureExtrapolation"
    model = "BoundaryModelDummyParticles"
    smoothing_length = 0.06
    smoothing_kernel = "SchoenbergCubicSplineKernel"

        [system_data._boundary_1.boundary_model.state_equation]
        model = "StateEquationCole"
        reference_density = 1000.0
        background_pressure = 0.0
        exponent = 7.0

@LasNikas
Collaborator

Thanks a lot, @sloede, for pointing us to this! To be honest, I haven’t given much thought yet to which format would be best.
Since we’ve been exporting data as JSON in the postprocess callback so far, I thought it would also make sense to use JSON for the meta information.
However, .toml does sound appealing. @marcelschurer, would it be possible to adapt the metadata export to this format?
What do @efaulhaber and @svchb think?

@svchb
Collaborator

svchb commented Sep 11, 2025

  1. This PR was never about restart.
  2. This PR was about moving the meta-data out of the VTK files.

The point was not to create a human-readable format. I have the task of putting everything into our RDM platform (at Hereon), which works with JSON-LD, an even less human-readable format, so I suggested JSON since, in my opinion, it is the only format that is relevant in this space. A more human-readable format such as HTML or TOML can easily be generated from JSON by numerous converters.
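
To illustrate how little is needed for such a conversion, here is a minimal hand-written sketch in Python that turns parsed JSON metadata into TOML text. It is not one of the existing converters and covers only the value types that appear in the metadata in this PR (strings, numbers, booleans, flat lists, nested tables); the helper names `toml_key`, `toml_value`, and `to_toml` are hypothetical, and JSON `null` is deliberately rejected because TOML has no null value.

```python
import json
import re

BARE_KEY = re.compile(r"^[A-Za-z0-9_-]+$")


def toml_key(key):
    """Quote keys that are not valid TOML bare keys, e.g. "#threads"."""
    return key if BARE_KEY.match(key) else json.dumps(key)


def toml_value(value):
    """Format a scalar or a flat list as a TOML value."""
    if isinstance(value, bool):  # check bool before numbers: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return json.dumps(value)
    if isinstance(value, list):
        return "[" + ", ".join(toml_value(v) for v in value) + "]"
    raise TypeError(f"unsupported value: {value!r}")


def to_toml(table, prefix=""):
    """Emit scalars of a table first, then each nested table under a dotted header."""
    lines = []
    for key, value in table.items():
        if not isinstance(value, dict):
            lines.append(f"{toml_key(key)} = {toml_value(value)}")
    for key, value in table.items():
        if isinstance(value, dict):
            header = prefix + toml_key(key)
            lines.append("")
            lines.append(f"[{header}]")
            lines.extend(to_toml(value, header + "."))
    return lines


# Small excerpt of the metadata from this PR, parsed from JSON.
meta = json.loads(
    '{"simulation_info": {"julia_version": "1.11.5",'
    ' "technical_setup": {"#threads": 8},'
    ' "time_integrator": {"adaptive": true, "reltol": 0.001}},'
    ' "system_data": {"_fluid_1": {"acceleration": [0.0, -9.81]}}}'
)
toml_text = "\n".join(to_toml(meta)).strip()
print(toml_text)
```

Emitting all scalar keys of a table before its sub-tables matters: in TOML, a key written after a `[table]` header belongs to that table, so scalars placed after a nested header would end up in the wrong table.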

Comment thread src/io/io.jl Outdated
@sloede
Member

sloede commented Sep 12, 2025

> 1. This PR was never about restart.
> 2. This PR was about moving the meta-data out of the VTK files.
>
> The point was not to create a human readable format. As I have the task to put everything into our RDM (at Hereon) platform, which works on the even less human readable format of JSON-LD, I suggested to use JSON since in my opinion this is the only format that is relevant in this space. A more human readable format can easily be generated in HTML or TOML by numerous converters from JSON.

Ah, thanks for the clarification. Indeed, in this case I see no urgent reason to use a different format than JSON.

@LasNikas LasNikas requested a review from svchb September 12, 2025 14:50
@svchb svchb merged commit 2d2b437 into trixi-framework:dev Sep 12, 2025
13 checks passed
Labels

breaking changes This change will break the public API and requires a new major release

5 participants