JSON update #1043

Merged
merged 23 commits into from
Dec 18, 2021
Changes from 22 commits
6da2f4d
New error type: BackendConfigSchema
franzpoeschel Sep 13, 2021
2ce67dc
Update to JSON.hpp auxiliary header (now JSON_internal.hpp)
franzpoeschel Jul 8, 2021
b0b31b9
Use TracingJSON from the start already
franzpoeschel Aug 19, 2021
af40454
Use JSON to set dataset transform in ADIOS1
franzpoeschel Sep 13, 2021
65624fe
Remove Dataset::compression, ::transform and ::chunksize
franzpoeschel Sep 3, 2021
46b2b56
Use JSON config in MPI benchmarks
franzpoeschel Sep 14, 2021
e9ec75f
Improvements to ADIOS2 and HDF5
franzpoeschel Sep 13, 2021
399686b
Series global keys: backend and iteration_encoding
franzpoeschel Sep 13, 2021
241f9db
Documentation
franzpoeschel Sep 13, 2021
9d2ffcc
Use fancy C++ strings
franzpoeschel Sep 13, 2021
49a980d
Correct precedence in ADIOS2: env var vs. JSON param
franzpoeschel Oct 11, 2021
67ca661
Lower case transformation: Ignore some paths in JSON
franzpoeschel Oct 11, 2021
f2fac75
Add json::merge, including test
franzpoeschel Oct 21, 2021
3273b0c
Use {"backend": <backend_name>} in tests
franzpoeschel Oct 21, 2021
276c50e
Warn if using contradicting filename extension to backend key
franzpoeschel Oct 21, 2021
99a307c
Move JSON test to separate binary
franzpoeschel Dec 16, 2021
2508a78
Apply suggestions from code review
franzpoeschel Dec 16, 2021
b7312eb
Remove duplicate friend declarations
franzpoeschel Dec 16, 2021
6065d56
HDF5 fix (to be rebased)
franzpoeschel Dec 16, 2021
965bc07
Fix verbatim chevrons in Doxygen
ax3l Dec 17, 2021
bd99789
Some commenting on backend_via_json test
franzpoeschel Dec 17, 2021
f9c2f8c
Add breaking changes to NEWS.rst
franzpoeschel Dec 17, 2021
a69af6c
Doxygen: Warn Unused JSON Params
ax3l Dec 17, 2021
11 changes: 11 additions & 0 deletions CMakeLists.txt
@@ -581,6 +581,11 @@ if(openPMD_HAVE_ADIOS1)
target_compile_definitions(openPMD.ADIOS1.Parallel PRIVATE openPMD_HAVE_MPI=0)
endif()

target_include_directories(openPMD.ADIOS1.Serial SYSTEM PRIVATE
$<TARGET_PROPERTY:openPMD::thirdparty::nlohmann_json,INTERFACE_INCLUDE_DIRECTORIES>)
target_include_directories(openPMD.ADIOS1.Parallel SYSTEM PRIVATE
$<TARGET_PROPERTY:openPMD::thirdparty::nlohmann_json,INTERFACE_INCLUDE_DIRECTORIES>)

set_target_properties(openPMD.ADIOS1.Serial PROPERTIES
POSITION_INDEPENDENT_CODE ON
CXX_VISIBILITY_PRESET hidden
@@ -772,6 +777,7 @@ set(openPMD_TEST_NAMES
Auxiliary
SerialIO
ParallelIO
JSON
)
# command line tools
set(openPMD_CLI_TOOL_NAMES
@@ -857,6 +863,11 @@ if(openPMD_BUILD_TESTING)
else()
target_link_libraries(${testname}Tests PRIVATE CatchMain)
endif()

if(${testname} STREQUAL JSON)
target_include_directories(${testname}Tests SYSTEM PRIVATE
$<TARGET_PROPERTY:openPMD::thirdparty::nlohmann_json,INTERFACE_INCLUDE_DIRECTORIES>)
endif()
endforeach()
endif()

4 changes: 4 additions & 0 deletions NEWS.rst
@@ -9,6 +9,10 @@ Upgrade Guide
Python 3.10 is now supported.
openPMD-api now depends on `toml11 <https://github.com/ToruNiina/toml11>`__ 3.7.0+.

The following backend-specific members of the ``Dataset`` class have been removed: ``Dataset::setChunkSize()``, ``Dataset::setCompression()``, ``Dataset::setCustomTransform()``, ``Dataset::chunkSize``, ``Dataset::compression``, ``Dataset::transform``.
They are replaced by backend-specific options in the JSON-based backend configuration.
This can be passed in ``Dataset::options``.
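
For readers migrating older scripts, the replacement can be sketched in Python as follows. The helper `legacy_to_options` is hypothetical and not part of openPMD-api; it only assumes the mapping that the updated examples in this PR use (`setCompression` to an ADIOS2 operator with a `clevel` parameter, `setCustomTransform` to the ADIOS1 `transform` key).

```python
import json


def legacy_to_options(compression=None, level=None, transform=None):
    """Hypothetical helper: express the removed Dataset settings as a JSON string."""
    config = {}
    if compression is not None:
        # Former setCompression(name, level): an ADIOS2 operator entry.
        operator = {"type": compression}
        if level is not None:
            operator["parameters"] = {"clevel": level}
        config["adios2"] = {"dataset": {"operators": [operator]}}
    if transform is not None:
        # Former setCustomTransform(string): the ADIOS1 transform key.
        config["adios1"] = {"dataset": {"transform": transform}}
    return json.dumps(config)


options = legacy_to_options(
    "zlib", 9, "blosc:compressor=zlib,shuffle=bit,lvl=1;nometa")
print(options)
```

The resulting string is what would be assigned to `Dataset::options` (`d.options` in Python).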


0.14.0
------
7 changes: 7 additions & 0 deletions docs/source/details/adios1.json
@@ -0,0 +1,7 @@
{
"adios1": {
"dataset": {
"transform": "blosc:compressor=zlib,shuffle=bit,lvl=1;nometa"
}
}
}
29 changes: 28 additions & 1 deletion docs/source/details/backendconfig.rst
@@ -14,7 +14,18 @@ The fundamental structure of this JSON configuration string is given as follows:

This structure allows keeping one configuration string for several backends at once, with the concrete backend configuration being chosen upon choosing the backend itself.

The configuration is read in a case-sensitive manner.
Options that can be configured via JSON are often also accessible via other means, e.g. environment variables.
The following list specifies the priority of these means, beginning with the lowest priority:

1. Default values
2. Automatically detected options, e.g. the backend being detected by inspection of the file extension
3. Environment variables
4. JSON configuration. For JSON, a dataset-specific configuration overwrites a global, Series-wide configuration.
5. Explicit API calls such as ``setIterationEncoding()``
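
The priority list above can be read as a stack of layers in which a later layer overrides an earlier one. The following Python sketch illustrates only that ordering; the function `resolve`, its parameter names, and the example values are assumptions made for illustration and do not mirror openPMD-api internals.

```python
def resolve(option, defaults, detected, environment,
            json_global, json_dataset, api_calls):
    """Return the value from the highest-priority layer that sets `option`."""
    # Ordered from lowest to highest priority, following the list above.
    # Within JSON, the dataset-specific layer comes after the global one.
    layers = [defaults, detected, environment,
              json_global, json_dataset, api_calls]
    value = None
    for layer in layers:
        if option in layer:
            value = layer[option]  # a later layer overrides an earlier one
    return value


chosen = resolve(
    "iteration_encoding",
    defaults={"iteration_encoding": "group_based"},  # illustrative default
    detected={},
    environment={},
    json_global={"iteration_encoding": "file_based"},
    json_dataset={},
    api_calls={},
)
print(chosen)  # file_based
```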

The configuration is read in a case-insensitive manner, keys as well as values.
Exceptions are string values that are forwarded to other libraries such as ADIOS1 and ADIOS2.
Those are read "as-is" and interpreted by the backend library.
Generally, keys of the configuration are *lower case*.
Parameters that are directly passed through to an external library and not interpreted within openPMD API (e.g. ``adios2.engine.parameters``) are unaffected by this and follow the respective library's conventions.
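
A minimal Python sketch of such a lower-case transformation is given below. It is not the library's implementation: the function `lower_case` and the way passthrough paths are represented (tuples of lower-cased keys) are assumptions, and `adios2.engine.parameters` is used as the passthrough path because the paragraph above names it.

```python
def lower_case(config, passthrough=(), path=()):
    """Lower-case keys and string values, leaving passthrough subtrees untouched."""
    if path in passthrough:
        return config  # forwarded as-is to the backend library
    if isinstance(config, dict):
        return {
            key.lower(): lower_case(value, passthrough, path + (key.lower(),))
            for key, value in config.items()
        }
    if isinstance(config, list):
        return [lower_case(item, passthrough, path) for item in config]
    if isinstance(config, str):
        return config.lower()
    return config  # numbers, booleans and null are unchanged


user_config = {
    "ADIOS2": {
        "Engine": {
            "Type": "BP4",
            "Parameters": {"NumAggregators": "2"},
        }
    }
}
normalized = lower_case(
    user_config, passthrough={("adios2", "engine", "parameters")})
print(normalized)
```

Here the keys `ADIOS2`, `Engine`, `Type` and the value `BP4` are lower-cased, while the content of `parameters` keeps its original spelling.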

@@ -36,6 +47,11 @@ For a consistent user interface, backends shall follow the following rules:
Backend-independent JSON configuration
--------------------------------------

The openPMD backend can be chosen via the JSON key ``backend`` which recognizes the alternatives ``["hdf5", "adios1", "adios2", "json"]``.

The iteration encoding can be chosen via the JSON key ``iteration_encoding`` which recognizes the alternatives ``["file_based", "group_based", "variable_based"]``.
Note that for file-based iteration encoding, specification of the expansion pattern in the file name (e.g. ``data_%T.json``) remains mandatory.
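
As an illustration of the two keys, the sketch below checks a configuration against the alternatives listed above before it is serialized. The function `check_global_keys` is hypothetical, and the regular expression for the expansion pattern (`%T`, optionally with a zero-padding width such as `%06T`) is an assumption about the accepted forms.

```python
import json
import re

BACKENDS = ["hdf5", "adios1", "adios2", "json"]
ITERATION_ENCODINGS = ["file_based", "group_based", "variable_based"]


def check_global_keys(config, filename):
    """Return a list of problems found in the backend-independent keys."""
    problems = []
    backend = config.get("backend")
    if backend is not None and backend.lower() not in BACKENDS:
        problems.append(f"unknown backend: {backend}")
    encoding = config.get("iteration_encoding")
    if encoding is not None:
        if encoding.lower() not in ITERATION_ENCODINGS:
            problems.append(f"unknown iteration encoding: {encoding}")
        elif encoding.lower() == "file_based" and not re.search(r"%\d*T", filename):
            # The expansion pattern stays mandatory for file-based encoding.
            problems.append("file_based encoding needs an expansion pattern in the file name")
    return problems


config = {"backend": "hdf5", "iteration_encoding": "file_based"}
print(check_global_keys(config, "data_%T.h5"))
print(check_global_keys(config, "data.h5"))
options = json.dumps(config)  # string form, as passed when opening a Series
```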

The key ``defer_iteration_parsing`` can be used to optimize the process of opening an openPMD Series (deferred/lazy parsing).
By default, a Series is parsed eagerly, i.e. opening a Series implies reading all available iterations.
Especially when a Series has many iterations, this can be a costly operation, and users may wish to defer parsing of iterations to a later point by adding ``{"defer_iteration_parsing": true}`` to their JSON configuration.
@@ -100,6 +116,17 @@ Explanation of the single keys:
``"none"`` can be used to disable chunking.
Chunking generally improves performance and only needs to be disabled in corner-cases, e.g. when heavily relying on independent, parallel I/O that non-collectively declares data records.

ADIOS1
^^^^^^

ADIOS1 allows configuring custom dataset transforms via JSON:

.. literalinclude:: adios1.json
:language: json

This configuration can be passed globally (i.e. for the ``Series`` object) to apply for all datasets.
Alternatively, it can also be passed for single ``Dataset`` objects to only apply for single datasets.
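
How a dataset-specific configuration could take precedence over a global one can be sketched as a recursive merge. The Python function `merge` below is an illustration under the assumption that nested objects are merged key by key and all other values are overwritten; it is not a port of the `json::merge` function added in this PR, and the transform value `zlib` is only an example.

```python
import copy


def merge(global_config, dataset_config):
    """Return global_config with dataset_config applied on top of it."""
    result = copy.deepcopy(global_config)
    for key, value in dataset_config.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)  # descend into objects
        else:
            result[key] = copy.deepcopy(value)  # dataset value wins
    return result


series_config = {
    "adios1": {
        "dataset": {
            "transform": "blosc:compressor=zlib,shuffle=bit,lvl=1;nometa"
        }
    },
    "adios2": {"dataset": {"operators": [{"type": "zlib"}]}},
}
single_dataset_config = {"adios1": {"dataset": {"transform": "zlib"}}}

effective = merge(series_config, single_dataset_config)
print(effective["adios1"]["dataset"]["transform"])  # zlib
```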


Other backends
^^^^^^^^^^^^^^
23 changes: 21 additions & 2 deletions examples/7_extended_write_serial.cpp
@@ -92,8 +92,27 @@ main()
// this describes the datatype and shape of data as it should be written to disk
io::Datatype dtype = io::determineDatatype(partial_mesh);
auto d = io::Dataset(dtype, io::Extent{2, 5});
d.setCompression("zlib", 9);
d.setCustomTransform("blosc:compressor=zlib,shuffle=bit,lvl=1;nometa");
std::string datasetConfig = R"END(
{
"adios1": {
"dataset": {
"transform": "blosc:compressor=zlib,shuffle=bit,lvl=1;nometa"
}
},
"adios2": {
"dataset": {
"operators": [
{
"type": "zlib",
"parameters": {
"clevel": 9
}
}
]
}
}
})END";
d.options = datasetConfig;
mesh["x"].resetDataset(d);

io::ParticleSpecies electrons = cur_it.particles["electrons"];
21 changes: 19 additions & 2 deletions examples/7_extended_write_serial.py
@@ -8,6 +8,7 @@
"""
from openpmd_api import Series, Access, Dataset, Mesh_Record_Component, \
Unit_Dimension
import json
import numpy as np


@@ -102,8 +103,24 @@
# component this describes the datatype and shape of data as it should be
# written to disk
d = Dataset(partial_mesh.dtype, extent=[2, 5])
d.set_compression("zlib", 9)
d.set_custom_transform("blosc:compressor=zlib,shuffle=bit,lvl=1;nometa")
dataset_config = {
"adios1": {
"dataset": {
"transform": "blosc:compressor=zlib,shuffle=bit,lvl=1;nometa"
}
},
"adios2": {
"dataset": {
"operators": [{
"type": "zlib",
"parameters": {
"clevel": 9
}
}]
}
}
}
d.options = json.dumps(dataset_config)
mesh["x"].reset_dataset(d)

electrons = cur_it.particles["electrons"]
8 changes: 6 additions & 2 deletions examples/8_benchmark_parallel.cpp
@@ -148,10 +148,14 @@ int main(
// * The number of iterations. Effectively, the benchmark will be repeated for this many
// times.
#if openPMD_HAVE_ADIOS1 || openPMD_HAVE_ADIOS2
benchmark.addConfiguration("", 0, "bp", dt, 10);
benchmark.addConfiguration(
R"({"adios2": {"dataset":{"operators":[{"type": "blosc"}]}}})",
Review comment (Member):

Maybe we need a test or option for this, since blosc is not always there.

On the other hand: ADIOS2 will just warn if it is missing and keep going without it.

My go-to compressor is parallel zstd in blosc though :)

Reply (Contributor Author, @franzpoeschel, Dec 16, 2021):

Just to be sure, you mean zstd instead of blosc? Or is there a way to pick zstd inside blosc? The current configuration will just select whatever defaults ADIOS2 defines.

Reply (Member, @ax3l, Dec 17, 2021):

> Just to be sure, you mean zstd instead of blosc? Or is there a way to pick zstd inside blosc?

The latter! blosc is a meta-compressor and implements threaded compression routines for zlib, zstd, lz4, blosclz, etc.:
https://www.blosc.org/pages/blosc-in-depth

Here is an example of how I use blosc with a threaded zstd compressor for openPMD output in WarpX:

And this is how I did it in ADIOS1 in PIConGPU:

See also Fig. 4 in https://arxiv.org/abs/1706.00522: the threaded compressors are all via blosc.

"bp",
dt,
10 );
#endif
#if openPMD_HAVE_HDF5
benchmark.addConfiguration("", 0, "h5", dt, 10);
benchmark.addConfiguration( "{}", "h5", dt, 10 );
#endif

// Execute all previously configured benchmarks. Will return a MPIBenchmarkReport object
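
Following the review discussion about selecting zstd inside blosc, a configuration of that kind might look like the Python sketch below. The structure (`adios2.dataset.operators` with `type` and `parameters`) is taken from the examples in this PR; the parameter names `compressor` and `doshuffle` and the value `BLOSC_BITSHUFFLE` are assumptions based on ADIOS2's blosc operator and should be checked against the ADIOS2 documentation for the installed version.

```python
import json

# Assumed blosc operator parameters; only "clevel" appears in this PR's examples.
blosc_zstd_config = {
    "adios2": {
        "dataset": {
            "operators": [
                {
                    "type": "blosc",
                    "parameters": {
                        "compressor": "zstd",
                        "clevel": 1,
                        "doshuffle": "BLOSC_BITSHUFFLE",
                    },
                }
            ]
        }
    }
}

config_string = json.dumps(blosc_zstd_config)
print(config_string)
```

The string could be passed as the first argument of `addConfiguration` in place of the plain `{"type": "blosc"}` operator, which relies on ADIOS2's defaults.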
6 changes: 0 additions & 6 deletions include/openPMD/Dataset.hpp
@@ -49,16 +49,10 @@ class Dataset
Dataset( Extent );

Dataset& extend(Extent newExtent);
Dataset& setChunkSize(Extent const&);
Dataset& setCompression(std::string const&, uint8_t const);
Dataset& setCustomTransform(std::string const&);

Extent extent;
Datatype dtype;
uint8_t rank;
Extent chunkSize;
std::string compression;
std::string transform;
std::string options = "{}"; //!< backend-dependent JSON configuration
};
} // namespace openPMD
9 changes: 9 additions & 0 deletions include/openPMD/Error.hpp
@@ -3,6 +3,7 @@
#include <exception>
#include <string>
#include <utility>
#include <vector>

namespace openPMD
{
@@ -62,5 +63,13 @@ namespace error
public:
WrongAPIUsage( std::string what );
};

class BackendConfigSchema : public Error
{
public:
std::vector< std::string > errorLocation;

BackendConfigSchema( std::vector< std::string >, std::string what );
};
}
}
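
To illustrate what the `errorLocation` member is for, here is a Python analogue of the new error type together with a hypothetical validation helper. Only the two constructor arguments (a list of keys locating the error, and a message) come from the header above; the message format and the function `require_string` are assumptions.

```python
class BackendConfigSchema(Exception):
    """Python analogue of error::BackendConfigSchema (illustrative only)."""

    def __init__(self, error_location, what):
        self.error_location = list(error_location)
        location = ".".join(self.error_location)
        # The message layout is an assumption, not the library's wording.
        super().__init__(f"Wrong JSON schema at '{location}': {what}")


def require_string(config, path):
    """Hypothetical helper: fetch a string at `path` or raise with its location."""
    node = config
    for key in path:
        if not isinstance(node, dict) or key not in node:
            raise BackendConfigSchema(path, "key not found")
        node = node[key]
    if not isinstance(node, str):
        raise BackendConfigSchema(path, "expected a string")
    return node


bad_config = {"adios1": {"dataset": {"transform": 42}}}
try:
    require_string(bad_config, ["adios1", "dataset", "transform"])
except BackendConfigSchema as err:
    print(err.error_location)
    print(err)
```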
5 changes: 3 additions & 2 deletions include/openPMD/IO/ADIOS/ADIOS1IOHandler.hpp
@@ -22,6 +22,7 @@

#include "openPMD/config.hpp"
#include "openPMD/auxiliary/Export.hpp"
#include "openPMD/auxiliary/JSON_internal.hpp"
#include "openPMD/IO/AbstractIOHandler.hpp"

#include <future>
@@ -42,7 +43,7 @@ namespace openPMD
friend class ADIOS1IOHandlerImpl;

public:
ADIOS1IOHandler(std::string path, Access);
ADIOS1IOHandler(std::string path, Access, json::TracingJSON );
~ADIOS1IOHandler() override;

std::string backendName() const override { return "ADIOS1"; }
@@ -61,7 +62,7 @@ namespace openPMD
friend class ADIOS1IOHandlerImpl;

public:
ADIOS1IOHandler(std::string path, Access);
ADIOS1IOHandler(std::string path, Access, json::TracingJSON );
~ADIOS1IOHandler() override;

std::string backendName() const override { return "DUMMY_ADIOS1"; }
2 changes: 1 addition & 1 deletion include/openPMD/IO/ADIOS/ADIOS1IOHandlerImpl.hpp
@@ -46,7 +46,7 @@ namespace openPMD
private:
using Base_t = CommonADIOS1IOHandlerImpl< ADIOS1IOHandlerImpl >;
public:
ADIOS1IOHandlerImpl(AbstractIOHandler*);
ADIOS1IOHandlerImpl(AbstractIOHandler*, json::TracingJSON);
virtual ~ADIOS1IOHandlerImpl();

virtual void init();
24 changes: 12 additions & 12 deletions include/openPMD/IO/ADIOS/ADIOS2IOHandler.hpp
@@ -28,7 +28,7 @@
#include "openPMD/IO/ADIOS/ADIOS2PreloadAttributes.hpp"
#include "openPMD/IO/IOTask.hpp"
#include "openPMD/IO/InvalidatableFile.hpp"
#include "openPMD/auxiliary/JSON.hpp"
#include "openPMD/auxiliary/JSON_internal.hpp"
#include "openPMD/auxiliary/Option.hpp"
#include "openPMD/backend/Writable.hpp"
#include "openPMD/config.hpp"
@@ -130,14 +130,14 @@ class ADIOS2IOHandlerImpl
ADIOS2IOHandlerImpl(
AbstractIOHandler *,
MPI_Comm,
nlohmann::json config,
json::TracingJSON config,
std::string engineType );

#endif // openPMD_HAVE_MPI

explicit ADIOS2IOHandlerImpl(
AbstractIOHandler *,
nlohmann::json config,
json::TracingJSON config,
std::string engineType );


@@ -277,15 +277,15 @@ class ADIOS2IOHandlerImpl

std::vector< ParameterizedOperator > defaultOperators;

auxiliary::TracingJSON m_config;
static auxiliary::TracingJSON nullvalue;
json::TracingJSON m_config;
static json::TracingJSON nullvalue;

void
init( nlohmann::json config );
init( json::TracingJSON config );

template< typename Key >
auxiliary::TracingJSON
config( Key && key, auxiliary::TracingJSON & cfg )
json::TracingJSON
config( Key && key, json::TracingJSON & cfg )
{
if( cfg.json().is_object() && cfg.json().contains( key ) )
{
@@ -298,7 +298,7 @@ class ADIOS2IOHandlerImpl
}

template< typename Key >
auxiliary::TracingJSON
json::TracingJSON
config( Key && key )
{
return config< Key >( std::forward< Key >( key ), m_config );
@@ -312,7 +312,7 @@ class ADIOS2IOHandlerImpl
* operators have been configured
*/
auxiliary::Option< std::vector< ParameterizedOperator > >
getOperators( auxiliary::TracingJSON config );
getOperators( json::TracingJSON config );

// use m_config
auxiliary::Option< std::vector< ParameterizedOperator > >
@@ -1398,15 +1398,15 @@ friend class ADIOS2IOHandlerImpl;
std::string path,
Access,
MPI_Comm,
nlohmann::json options,
json::TracingJSON options,
std::string engineType );

#endif

ADIOS2IOHandler(
std::string path,
Access,
nlohmann::json options,
json::TracingJSON options,
std::string engineType );

std::string backendName() const override { return "ADIOS2"; }
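
The switch from `nlohmann::json` to `json::TracingJSON` suggests that the handlers record which configuration keys they read, so that unused keys can be reported (compare the commit "Doxygen: Warn Unused JSON Params"). The Python class below sketches that idea only; `TracingConfig` and its methods are hypothetical and are not a port of `TracingJSON`.

```python
def _leaf_paths(data, prefix=()):
    """Collect the key paths of all leaves in a nested dict."""
    paths = set()
    for key, value in data.items():
        path = prefix + (key,)
        if isinstance(value, dict) and value:
            paths |= _leaf_paths(value, path)
        else:
            paths.add(path)
    return paths


class TracingConfig:
    """Wrap a nested dict and remember every key path that was read."""

    def __init__(self, data, accessed=None, path=()):
        self._data = data
        self._accessed = accessed if accessed is not None else set()
        self._path = path

    def __contains__(self, key):
        return key in self._data

    def __getitem__(self, key):
        child_path = self._path + (key,)
        self._accessed.add(child_path)
        value = self._data[key]
        if isinstance(value, dict):
            # Children share the access record with their parent.
            return TracingConfig(value, self._accessed, child_path)
        return value

    def unused(self):
        """Return the leaf paths below this node that were never read."""
        return sorted(_leaf_paths(self._data, self._path) - self._accessed)


config = TracingConfig({"adios2": {"engine": {"type": "bp4"}, "typo_key": 1}})
engine_type = config["adios2"]["engine"]["type"]
print(config.unused())  # [('adios2', 'typo_key')]
```

A backend could emit a warning for each path returned by `unused()` once it has finished reading its configuration.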
5 changes: 5 additions & 0 deletions include/openPMD/IO/ADIOS/CommonADIOS1IOHandler.hpp
@@ -27,6 +27,7 @@
#include "openPMD/IO/AbstractIOHandler.hpp"
#include "openPMD/auxiliary/Filesystem.hpp"
#include "openPMD/auxiliary/DerefDynamicCast.hpp"
#include "openPMD/auxiliary/JSON_internal.hpp"
#include "openPMD/auxiliary/Memory.hpp"
#include "openPMD/auxiliary/StringManip.hpp"
#include "openPMD/IO/AbstractIOHandlerImpl.hpp"
@@ -89,11 +90,15 @@ namespace openPMD
std::unordered_map< std::shared_ptr< std::string >, ADIOS_FILE* > m_openReadFileHandles;
std::unordered_map< ADIOS_FILE*, std::vector< ADIOS_SELECTION* > > m_scheduledReads;
std::unordered_map< int64_t, std::unordered_map< std::string, Attribute > > m_attributeWrites;
// config options
std::string m_defaultTransform;
/**
* Call this function to get adios file id for a Writable. Will create one if does not exist
* @return returns an adios file id.
*/
int64_t GetFileHandle(Writable*);

void initJson( json::TracingJSON );
}; // ParallelADIOS1IOHandlerImpl
} // openPMD

5 changes: 3 additions & 2 deletions include/openPMD/IO/ADIOS/ParallelADIOS1IOHandler.hpp
@@ -22,6 +22,7 @@

#include "openPMD/config.hpp"
#include "openPMD/auxiliary/Export.hpp"
#include "openPMD/auxiliary/JSON_internal.hpp"
#include "openPMD/IO/AbstractIOHandler.hpp"

#include <future>
@@ -42,9 +43,9 @@ namespace openPMD

public:
# if openPMD_HAVE_MPI
ParallelADIOS1IOHandler(std::string path, Access, MPI_Comm);
ParallelADIOS1IOHandler(std::string path, Access, json::TracingJSON , MPI_Comm);
# else
ParallelADIOS1IOHandler(std::string path, Access);
ParallelADIOS1IOHandler(std::string path, Access, json::TracingJSON);
# endif
~ParallelADIOS1IOHandler() override;
