Merged
23 commits
245e19d
Add JSON schema for openPMD
franzpoeschel Jun 26, 2023
a860269
Add convert-json-toml tool
franzpoeschel Jun 26, 2023
4c957b4
Add script for checking openPMD file against the schema
franzpoeschel Jun 26, 2023
a510957
Don't use spaces in SerialIOTest attribute names
franzpoeschel Jun 26, 2023
facfa15
Fix bugs detected by this verifier
franzpoeschel Jun 26, 2023
3749056
Add GitHub workflow
franzpoeschel Jun 26, 2023
ba3f5b1
Shorthand attributes
franzpoeschel Aug 7, 2023
a719bb6
Add dataset template mode
franzpoeschel Aug 7, 2023
3a5fc19
Fix path
franzpoeschel Jul 16, 2024
334f2d5
Fix reading from stdin
franzpoeschel Jul 16, 2024
90ce201
toml11 4.0 compatibility
franzpoeschel Aug 5, 2024
779cf12
Only check for existing Iterations in writeOnly mode
franzpoeschel Feb 17, 2025
f6b1f24
Some additions to schema
franzpoeschel Feb 17, 2025
f18a88b
Remove deprecated jsonschema.validators.RefResolver
franzpoeschel Feb 17, 2025
5e4a870
Use most recent version of jsonschema
franzpoeschel Mar 3, 2025
252c0d4
Allow empty variable-based series
franzpoeschel Mar 3, 2025
65aa12d
Use if-then-else for better-steered parsing
franzpoeschel Mar 3, 2025
83ed23a
hmm
franzpoeschel Mar 26, 2025
579e7b0
Remove json cfg after test
franzpoeschel Apr 7, 2025
81533b0
Update documentation, rename convert-toml-json tool
franzpoeschel Jul 15, 2025
985d505
Apply suggestions from code review
franzpoeschel Jul 15, 2025
3a2929f
Add reference to openPMD-validator
franzpoeschel Jul 15, 2025
d578167
Update README.md
franzpoeschel Jul 18, 2025
18 changes: 17 additions & 1 deletion .github/workflows/linux.yml
@@ -260,7 +260,7 @@ jobs:
- name: Install
run: |
sudo apt-get update
sudo apt-get install g++ libopenmpi-dev libhdf5-openmpi-dev python3 python3-numpy python3-mpi4py python3-pandas python3-h5py-mpi
sudo apt-get install g++ libopenmpi-dev libhdf5-openmpi-dev python3 python3-numpy python3-mpi4py python3-pandas python3-h5py-mpi python3-pip
# TODO ADIOS2
- name: Build
env: {CXXFLAGS: -Werror, PKG_CONFIG_PATH: /usr/lib/x86_64-linux-gnu/pkgconfig}
@@ -275,6 +275,22 @@ jobs:
cmake --build build --parallel 4
ctest --test-dir build --output-on-failure

python3 -m pip install jsonschema==4.* referencing
cd share/openPMD/json_schema
PATH="../../../build/bin:$PATH" make -j 2
# We need to exclude the thetaMode example since that has a different
# meshesPath and the JSON schema needs to hardcode that.
Comment on lines +281 to +282
Member

Are we able to patch this in check.py?

Contributor Author

Not very easily. The JSON schema lives on the file system, and the individual .json files refer to each other by file name. Changing this would require, at runtime of check.py, (1) traversing the entire JSON schema and overriding the meshes path, the particles path and the references, and (2) setting up python-jsonschema to cross-reference in-memory schemas, which I am not sure it supports.

find ../../../build/samples/ \
! -path '*thetaMode*' \
! -path '*many_iterations/*' \
! -name 'profiling.json' \
! -name '*config.json' \
-iname '*.json' \
| while read i; do
echo "Checking $i"
./check.py "$i"
done

musllinux_py10:
runs-on: ubuntu-22.04
if: github.event.pull_request.draft == false
6 changes: 5 additions & 1 deletion CMakeLists.txt
@@ -685,11 +685,12 @@ set(openPMD_TEST_NAMES
# command line tools
set(openPMD_CLI_TOOL_NAMES
ls
convert-toml-json
)
set(openPMD_PYTHON_CLI_TOOL_NAMES
pipe
)
set(openPMD_PYTHON_CLI_MODULE_NAMES ${openPMD_CLI_TOOL_NAMES})
set(openPMD_PYTHON_CLI_MODULE_NAMES ls)
Member

This line change looks like a hack?

Contributor Author

Not really. Until now, both openPMD_CLI_TOOL_NAMES and openPMD_PYTHON_CLI_MODULE_NAMES were identical since both contained only openpmd-ls.
But they are not identical in general, and now we additionally have openpmd-convert-toml-json as a CLI tool, which is not written in Python.

# examples
set(openPMD_EXAMPLE_NAMES
1_structure
@@ -894,6 +895,9 @@ if(openPMD_BUILD_CLI_TOOLS)
endif()

target_link_libraries(openpmd-${toolname} PRIVATE openPMD)
target_include_directories(openpmd-${toolname} SYSTEM PRIVATE
$<TARGET_PROPERTY:openPMD::thirdparty::nlohmann_json,INTERFACE_INCLUDE_DIRECTORIES>
$<TARGET_PROPERTY:openPMD::thirdparty::toml11,INTERFACE_INCLUDE_DIRECTORIES>)
endforeach()
endif()

15 changes: 12 additions & 3 deletions include/openPMD/auxiliary/JSON_internal.hpp
@@ -219,16 +219,25 @@ namespace json
* @param options as a parsed JSON object.
* @param considerFiles If yes, check if `options` refers to a file and read
* from there.
* @param convertLowercase If yes, lowercase conversion is applied
* recursively to keys and values, except for some hardcoded places
* that should be left untouched.
*/
ParsedConfig parseOptions(std::string const &options, bool considerFiles);
ParsedConfig parseOptions(
std::string const &options,
bool considerFiles,
bool convertLowercase = true);

#if openPMD_HAVE_MPI

/**
* Parallel version of parseOptions(). MPI-collective.
*/
ParsedConfig
parseOptions(std::string const &options, MPI_Comm comm, bool considerFiles);
ParsedConfig parseOptions(
std::string const &options,
MPI_Comm comm,
bool considerFiles,
bool convertLowercase = true);

#endif

15 changes: 15 additions & 0 deletions share/openPMD/json_schema/Makefile
@@ -0,0 +1,15 @@
convert := openpmd-convert-toml-json

json_files = attribute_defs.json attributes.json dataset_defs.json iteration.json mesh.json mesh_record_component.json particle_patches.json particle_species.json patch_record.json record.json record_component.json series.json

.PHONY: all
all: $(json_files)

# The target file should only be created if the conversion succeeded
$(json_files): %.json: %.toml
	$(convert) @$^ > $@.tmp
	mv $@.tmp $@

.PHONY: clean
clean:
	for file in $(json_files); do rm -f "$$file" "$$file.tmp"; done
47 changes: 47 additions & 0 deletions share/openPMD/json_schema/README.md
@@ -0,0 +1,47 @@
# JSON Validation

This folder contains a JSON schema for validation of openPMD files written as `.json` files.

## Usage

### Generating the JSON schema

For improved readability, maintainability and documentation purposes, the JSON schema is written in `.toml` format and needs to be "compiled" to `.json` files before use.
To do this, the openPMD-api installs a tool named `openpmd-convert-toml-json`, which converts between JSON and TOML files in both directions, e.g.:

```bash
openpmd-convert-toml-json @series.toml > series.json
```

A `Makefile` is provided in this folder to automate generating the needed JSON files from the TOML files.

### Verifying a file against the JSON schema

In theory, the JSON schema should be usable with any JSON validator. However, it is split across multiple files, and most validators require special care to set up the links between them. A Python script `check.py` is provided in this folder; it sets up the [Python jsonschema](https://python-jsonschema.readthedocs.io) library and verifies a file against the schema, e.g.:

```bash
./check.py path/to/my/dataset.json
```

For further usage notes, see the script's own documentation via `./check.py --help`.

## Caveats

The openPMD standard is not entirely expressible in terms of a JSON schema:

* Many semantic dependencies, e.g., that the `position/x` and `position/y` vectors of a particle species need to be of the same size, or that the `axisLabels` have the same dimensionality as the dataset itself, will go unchecked.
* The `meshesPath` is assumed to be `meshes/` and the `particlesPath` is assumed to be `particles/`. This dependency cannot be expressed.
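
Checks of the first kind have to be written outside the schema. As a rough sketch, the following Python compares the number of values held by each component of one record. The input layout (a plain mapping from component name to a list of values) and the function name `components_same_size` are assumptions for illustration; they are neither the layout written by the JSON backend nor part of `check.py`.

```python
from typing import Mapping, Sequence


def components_same_size(components: Mapping[str, Sequence]) -> bool:
    """Return True if all components hold the same number of values."""
    sizes = {len(values) for values in components.values()}
    # An empty record or a single distinct size counts as consistent.
    return len(sizes) <= 1


if __name__ == "__main__":
    # Hypothetical data: x and y differ in length, so the check fails.
    position = {"x": [0.0, 1.0, 2.0], "y": [0.5, 1.5]}
    print(components_same_size(position))
```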

While a large part of the openPMD standard can be verified against a static JSON schema, the standard is large enough that this approach reaches its limits. Validating against a JSON schema is similar to running a naive recursive-descent parser: error messages may become unexpectedly verbose and uninformative, especially for disjunctive statements such as "A Record is either a scalar Record Component or a vector of non-scalar Record Components". We have taken care to decide disjunctive statements early, e.g. with json-schema's support for `if` statements, but even tiny mistakes far down in the parse tree can still produce unwieldy error messages.
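
The effect of deciding a disjunction early can be illustrated without any schema library. The sketch below is plain Python, not json-schema and not the logic of `check.py`; the discriminating key `data` and all function names are hypothetical, chosen only to show why a decided branch yields a shorter error report than trying every alternative.

```python
def check_scalar(record: dict) -> list[str]:
    """Errors for the 'scalar Record Component' alternative (simplified)."""
    return [] if "data" in record else ["scalar: missing key 'data'"]


def check_vector(record: dict) -> list[str]:
    """Errors for the 'vector of Record Components' alternative (simplified)."""
    errors = []
    for name, component in record.items():
        if not isinstance(component, dict) or "data" not in component:
            errors.append(f"vector: component '{name}' has no 'data'")
    return errors


def validate_any_of(record: dict) -> list[str]:
    """Try every alternative; on failure, report the errors of all of them."""
    scalar, vector = check_scalar(record), check_vector(record)
    return [] if not scalar or not vector else scalar + vector


def validate_if_then_else(record: dict) -> list[str]:
    """Decide the alternative first, then report only that branch's errors."""
    looks_scalar = not any(isinstance(v, dict) for v in record.values())
    return check_scalar(record) if looks_scalar else check_vector(record)


if __name__ == "__main__":
    # A vector record with one broken component.
    broken = {"x": {"data": [1, 2]}, "y": {}}
    print(validate_any_of(broken))
    print(validate_if_then_else(broken))
```

With the decided branch, only the error about the broken component is reported; the unrelated complaint from the scalar alternative is never produced.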

The layout of attributes is assumed to be that which is created by the JSON backend of the openPMD-api. Both the longhand and shorthand forms are recognized:

```json
"meshesPath": {
"datatype": "STRING",
"value": "meshes/"
},
"particlesPath": "particles/"
```
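
A consumer of such files has to accept both forms. Below is a minimal sketch of a normalizing helper, assuming the longhand form is always an object containing the `datatype` and `value` keys shown above. The name `attribute_value` is hypothetical and not part of the openPMD-api or `check.py`.

```python
import json


def attribute_value(attribute):
    """Return the plain value of an attribute in longhand or shorthand form."""
    if (
        isinstance(attribute, dict)
        and "datatype" in attribute
        and "value" in attribute
    ):
        return attribute["value"]  # longhand: {"datatype": ..., "value": ...}
    return attribute  # shorthand: the value itself


if __name__ == "__main__":
    attributes = json.loads(
        '{"meshesPath": {"datatype": "STRING", "value": "meshes/"},'
        ' "particlesPath": "particles/"}'
    )
    for name, attribute in attributes.items():
        print(name, attribute_value(attribute))
```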

For a custom-written verification of openPMD datasets, also consider using the [openPMD-validator](https://github.com/openPMD/openPMD-validator).