-
Notifications
You must be signed in to change notification settings - Fork 54
Add JSON schema #1426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add JSON schema #1426
Conversation
46fcfcd to
78fc678
Compare
638c575 to
4f4be4b
Compare
199a223 to
5b09f81
Compare
b6b057b to
e99b0fb
Compare
3fb1ea9 to
e866604
Compare
ax3l
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, thank you!
Wondering a bit if validation of an XML backend would be easier.... not sure.
| # We need to exclude the thetaMode example since that has a different | ||
| # meshesPath and the JSON schema needs to hardcode that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we able to patch this in check.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not very easily. The JSON schema is on the file system and the single .json files refer to each other by their file names. Changing this would require (1) traversing the entire JSON schema and overriding the meshes path, the particles path and the references and (2) somehow setting up python-jsonschema to cross-reference in-memory schemas which I don't even know if it supports that, both at runtime of check.py.
The major issue of the JSON schema approach is that it is not a predictive parser (https://en.wikipedia.org/wiki/Recursive_descent_parser), meaning that it deals with If we want to seriously consider XML as an alternative, my opinion is that it must at least solve this particular issue .. which I doubt it can properly, since this involves a bit of parser theory and grammar transformation. This job is better done by handwritten parsers, which the openPMD-api and the openPMD-validator are. However, if we add an XML backend at some point, an XML schema is definitely at least worth a consideration. |
7a5fee0 to
e66de14
Compare
c9fb0c1 to
4df3d58
Compare
4df3d58 to
54a2850
Compare
54a2850 to
2a36992
Compare
2a36992 to
43b7bea
Compare
43b7bea to
785dae7
Compare
785dae7 to
321130a
Compare
686d5af to
58497f1
Compare
4193a3c to
cbecd5b
Compare
Written as .toml files for ease of documentation, maintailability and readability.
Needed for "compiling" the schema to JSON Also add a Makefile to further simplify this
Workflow documented in README.md
The JSON schema verification package does not like that
Both of the form "data not found in places where data was expected"
Verify all JSON-openPMD files written by testing against the schema
1. Support UNDEFINED datasets in template mode 2. gridUnitSI may now be a vector
Apparently it's better to make everything 100 times more complicated
anyOf and oneOf now only used for trivial distinctions, this makes schemas much more robust since errors can be caught early and error messages become actually useful.
Otherwise CI thinks this is an openPMD file
cbecd5b to
579e7b0
Compare
share/openPMD/json_schema/README.md
Outdated
| openpmd_convert-json-toml @series.toml > series.json | ||
| ``` | ||
|
|
||
| A `Makefile` is provided in this folder to simplify the application of this conversion tool. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makefile? Isn't this just built with CMake as one of the CLI tools?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. The makefile does the conversion of the schema toml files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| A `Makefile` is provided in this folder to simplify the application of this conversion tool. | |
| A `Makefile` is provided in this folder to simplify the application of this conversion tool to the `.toml` files in this folder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a bit clearer: A Makefile is provided in this folder to automate generating the needed JSON files from the TOML files.
?
| pipe | ||
| ) | ||
| set(openPMD_PYTHON_CLI_MODULE_NAMES ${openPMD_CLI_TOOL_NAMES}) | ||
| set(openPMD_PYTHON_CLI_MODULE_NAMES ls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line change looks like a hack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. Until now, both openPMD_CLI_TOOL_NAMES and openPMD_PYTHON_CLI_MODULE_NAMES were identical since both contained only openpmd-ls.
But they're not identical in general; and now, we additionally have openpmd-convert-json-toml as a CLI tool, but it is not written in Python.
CMakeLists.txt
Outdated
| # command line tools | ||
| set(openPMD_CLI_TOOL_NAMES | ||
| ls | ||
| convert-json-toml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add this to setup.py -> entry_points -> console_scripts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a Python module, but written in C++, so I doubt that will work?
share/openPMD/json_schema/Makefile
Outdated
| @@ -0,0 +1,15 @@ | |||
| convert := openpmd-convert-json-toml | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming it openpmd-convert-toml-json seems to be more intuitive for what we use it here, but you mentioned anyway that it works bi-directional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is bi-directional, but we can flip the name, since that's the intended use here.
share/openPMD/json_schema/README.md
Outdated
| * Many semantic dependencies, e.g. that the `position/x` and `position/y` vector of a particle species be of the same size, or that the `axisLabels` have the same dimensionality as the dataset itself, will go unchecked. | ||
| * The `meshesPath` is assumed to be `meshes/` and the `particlesPath` is assumed to be `particles/`. This dependency cannot be expressed. | ||
|
|
||
| While a large part of the openPMD standard can indeed be verified by checking against a JSON schema, the standard is generally large enough to make this approach come to its limits. Verification of a JSON schema is similar to the use of a naive recursive-descent parser. Error messages will often be unexpectedly verbose and not very informative. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might clarify a bit:
| While a large part of the openPMD standard can indeed be verified by checking against a JSON schema, the standard is generally large enough to make this approach come to its limits. Verification of a JSON schema is similar to the use of a naive recursive-descent parser. Error messages will often be unexpectedly verbose and not very informative. | |
| While a large part of the openPMD standard can indeed be verified by checking against a static JSON schema, the standard is generally large enough to make this approach come to its limits. Verification of a JSON schema is similar to the use of a naive recursive-descent parser. Error messages will often be unexpectedly verbose and not very informative. |
share/openPMD/json_schema/README.md
Outdated
| } | ||
| ``` | ||
|
|
||
| Support for an abbreviated notation such as `"meshesPath": "meshes/"` is currently not (yet) available. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this section, you might want to advertise & link openPMD-validator again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. We might additionally advertise the JSON schema in the validator. Adding the JSON validation to our CI brought up a number of bugs after all.
ax3l
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
TODO:
Maybe add support for abbreviated JSON representationsNot nowI have added a CLI tool
openpmd-convert_json_toml, exposing internally existing functionality to convert between TOML and JSON. I think it's better to define and maintain the schema in terms of TOML (it can be commented and is more legible), and just compile it to JSON for verification.x-ref #1436