
Support json serialization into std::ostream and Asio stream #10414


Open
yhabteab wants to merge 5 commits into master from support-json-serialization-into-ostream

Conversation

yhabteab
Member

@yhabteab yhabteab commented Apr 17, 2025

This PR essentially rewrites the whole JSON serialization process, allowing the JSON tokens to be written directly to either a std::ostream or an Asio stream. For the former, the nlohmann/json library already provides full support; the latter is new and entirely implemented by us. It satisfies the nlohmann::detail::output_adapter_protocol<> interface and can be used like any other output adapter for the nlohmann/json library. Please refer to the inline documentation of the individual classes for more details. You can also find an example of how to use the new encoder with an Asio stream in 5354178, and that is basically how we're going to use it for the /v1/objects endpoint. I know that @julianbrost didn't want us to perform all the JSON serialization ourselves, but the cost of not doing so is even higher, since we would have to transform our Value types into nlohmann::json objects in order to use the library's existing JSON serialization. I was too naive in thinking that my previously proposed solution would actually bring an improvement; in fact it was even worse.
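
To make that more concrete, here is a minimal sketch of what such an Asio-backed output adapter can look like. This is illustrative only and not the exact code in this PR; the member names and the unbuffered per-call async_write are assumptions for the sketch.

```cpp
// Minimal sketch (not the exact implementation in this PR): an output adapter
// that forwards everything the serializer emits to an Asio stream. The member
// names and the unbuffered per-call async_write are assumptions for the sketch.
#include <boost/asio/buffer.hpp>
#include <boost/asio/spawn.hpp>
#include <boost/asio/write.hpp>
#include <nlohmann/json.hpp>

template<class AsyncWriteStream>
class AsioStreamAdapter : public nlohmann::detail::output_adapter_protocol<char>
{
public:
	AsioStreamAdapter(AsyncWriteStream& stream, boost::asio::yield_context yc)
		: m_Stream(stream), m_Yc(yc)
	{ }

	void write_character(char c) override
	{
		// May suspend the calling coroutine and resume it later,
		// possibly on another thread of the io_context.
		boost::asio::async_write(m_Stream, boost::asio::const_buffer(&c, 1), m_Yc);
	}

	void write_characters(const char* s, std::size_t length) override
	{
		boost::asio::async_write(m_Stream, boost::asio::const_buffer(s, length), m_Yc);
	}

private:
	AsyncWriteStream& m_Stream;
	boost::asio::yield_context m_Yc;
};
```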

There are some open questions left regarding this though:

  • Since the AsioStreamAdapter<> class asynchronously writes each character to the stream, potentially causing the coroutine to yield at any time and resume later on a different thread, how do we handle the object lock needed for serializing containers such as Dictionary or Array? I don't think the objects passed to the encoder will ever be accessed or modified by another thread, since we always use only temporary objects, but we still need to make sure we don't end up with undefined behaviour (which may not necessarily manifest as a deadlock). The current implementation caches the container's begin and end iterators and releases the lock explicitly before writing the JSON tokens to the stream; the lock is only required while obtaining the iterators, so it is released before the container is even traversed (see the sketch after this list).
  • Is everyone happy with the new encoder interface? It is a bit more complex than before, but it does exactly what we need and is much more flexible. Even though the JsonEncoder only supports writing to a stream, the previous helper function JsonEncode() is still available; it wraps the encoder and provides a simple interface where that is all you need.
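
To illustrate the first point, here is a rough sketch of the locking approach described above (simplified, not the exact code in this PR): the iterators are cached while the lock is held, and the lock is released before the first potentially suspending write.

```cpp
// Sketch of the approach described above (simplified, not the exact code in
// this PR): cache the iterators under the lock, release the lock, then write.
void EncodeArray(const Array::Ptr& arr, std::size_t indent)
{
	Array::Iterator begin, end;

	{
		// The lock is only needed to safely obtain the iterators ...
		ObjectLock olock(arr);
		begin = arr->Begin();
		end = arr->End();
	} // ... and is released here, before any write can suspend the coroutine.

	m_Writer->write_character('[');

	for (auto it (begin); it != end; ++it) {
		if (it != begin) {
			m_Writer->write_character(',');
		}

		Encode(*it, indent);
	}

	m_Writer->write_character(']');
}
```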

Test cases for 5354178:

$ curl -k -s -S -i -u root:icinga -H 'Accept: application/json' \
 -X POST 'https://localhost:5667/v1/events' \
 -d '{ "queue": "myqueue", "types": [ "CheckResult" ] }'
HTTP/1.1 200 OK
Server: Icinga/v2.14.0-580-g4d1a2a95e
Content-Type: application/json

{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"random","execution_end":1744878369.045693,"execution_start":1744878369.044834,"exit_status":0,"output":"Hello from mbp-yhabteab. Icinga 2 has been running for 863 milliseconds. Version: v2.14.0-580-g4d1a2a95e","performance_data":[{"counter":false,"crit":null,"label":"time","max":null,"min":null,"type":"PerfdataValue","unit":"","value":1744878369.045537,"warn":null},{"counter":false,"crit":null,"label":"value","max":null,"min":null,"type":"PerfdataValue","unit":"","value":554,"warn":null},{"counter":false,"crit":null,"label":"value_1m","max":null,"min":null,"type":"PerfdataValue","unit":"","value":498.6,"warn":null},{"counter":false,"crit":null,"label":"value_5m","max":null,"min":null,"type":"PerfdataValue","unit":"","value":443.20000000000005,"warn":null},{"counter":false,"crit":null,"label":"uptime","max":null,"min":null,"type":"PerfdataValue","unit":"","value":0.863487958908081,"warn":null}],"previous_hard_state":2,"schedule_end":1744878369.045693,"schedule_start":1744878369.0384912,"scheduling_source":"mbp-yhabteab","state":3,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":3,"state_type":1},"vars_before":{"attempt":1,"reachable":false,"state":2,"state_type":1}},"downtime_depth":0,"host":"big-switch-server-334","service":"agent-check-17","timestamp":1744878369.046833,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"random","execution_end":1744878369.28083,"execution_start":1744878369.279451,"exit_status":0,"output":"Hello from mbp-yhabteab. Icinga 2 has been running for 1 second. Version: v2.14.0-580-g4d1a2a95e","performance_data":[{"counter":false,"crit":null,"label":"time","max":null,"min":null,"type":"PerfdataValue","unit":"","value":1744878369.280713,"warn":null},{"counter":false,"crit":null,"label":"value","max":null,"min":null,"type":"PerfdataValue","unit":"","value":554,"warn":null},{"counter":false,"crit":null,"label":"value_1m","max":null,"min":null,"type":"PerfdataValue","unit":"","value":498.6,"warn":null},{"counter":false,"crit":null,"label":"value_5m","max":null,"min":null,"type":"PerfdataValue","unit":"","value":443.20000000000005,"warn":null},{"counter":false,"crit":null,"label":"uptime","max":null,"min":null,"type":"PerfdataValue","unit":"","value":1.0986640453338623,"warn":null}],"previous_hard_state":2,"schedule_end":1744878369.28083,"schedule_start":1744878369.27222,"scheduling_source":"mbp-yhabteab","state":3,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":3,"state_type":1},"vars_before":{"attempt":1,"reachable":false,"state":2,"state_type":1}},"downtime_depth":0,"host":"big-switch-server-27","service":"agent-check-5","timestamp":1744878369.282143,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"random","execution_end":1744878369.477017,"execution_start":1744878369.475539,"exit_status":0,"output":"Hello from mbp-yhabteab. Icinga 2 has been running for 1 second. Version: v2.14.0-580-g4d1a2a95e","performance_data":[{"counter":false,"crit":null,"label":"time","max":null,"min":null,"type":"PerfdataValue","unit":"","value":1744878369.476928,"warn":null},{"counter":false,"crit":null,"label":"value","max":null,"min":null,"type":"PerfdataValue","unit":"","value":554,"warn":null},{"counter":false,"crit":null,"label":"value_1m","max":null,"min":null,"type":"PerfdataValue","unit":"","value":498.6,"warn":null},{"counter":false,"crit":null,"label":"value_5m","max":null,"min":null,"type":"PerfdataValue","unit":"","value":443.20000000000005,"warn":null},{"counter":false,"crit":null,"label":"uptime","max":null,"min":null,"type":"PerfdataValue","unit":"","value":1.2948789596557617,"warn":null}],"previous_hard_state":0,"schedule_end":1744878369.477017,"schedule_start":1744878369.4682555,"scheduling_source":"mbp-yhabteab","state":3,"ttl":0,"type":"CheckResult","vars_after":{"attempt":2,"reachable":true,"state":3,"state_type":0},"vars_before":{"attempt":1,"reachable":true,"state":1,"state_type":0}},"downtime_depth":0,"host":"big-switch-server-373","service":"agent-check-12","timestamp":1744878369.478105,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"random","execution_end":1744878369.579804,"execution_start":1744878369.578383,"exit_status":0,"output":"Hello from mbp-yhabteab. Icinga 2 has been running for 1 second. Version: v2.14.0-580-g4d1a2a95e","performance_data":[{"counter":false,"crit":null,"label":"time","max":null,"min":null,"type":"PerfdataValue","unit":"","value":1744878369.57971,"warn":null},{"counter":false,"crit":null,"label":"value","max":null,"min":null,"type":"PerfdataValue","unit":"","value":554,"warn":null},{"counter":false,"crit":null,"label":"value_1m","max":null,"min":null,"type":"PerfdataValue","unit":"","value":498.6,"warn":null},{"counter":false,"crit":null,"label":"value_5m","max":null,"min":null,"type":"PerfdataValue","unit":"","value":443.20000000000005,"warn":null},{"counter":false,"crit":null,"label":"uptime","max":null,"min":null,"type":"PerfdataValue","unit":"","value":1.3976609706878662,"warn":null}],"previous_hard_state":1,"schedule_end":1744878369.579804,"schedule_start":1744878369.5709357,"scheduling_source":"mbp-yhabteab","state":3,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":3,"state_type":1},"vars_before":{"attempt":1,"reachable":false,"state":1,"state_type":1}},"downtime_depth":0,"host":"big-switch-server-247","service":"sshd","timestamp":1744878369.581242,"type":"CheckResult"}
...

fixes #10408
closes #10400

@yhabteab yhabteab added the area/api REST API label Apr 17, 2025
@cla-bot cla-bot bot added the cla/signed label Apr 17, 2025
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from e10ebdd to edc4112 on April 17, 2025 08:34
@Al2Klimov Al2Klimov left a comment
Member


Initially, I was thinking of adding another overload that allows writing the JSON directly into an Asio stream,

❤️

but since the json library we use doesn't provide such an implementation, we'd have to implement quite a lot of unusual things.

Actually, you'd just need a yc-aware streambuf (for ostream) which writes to a stream of your choice.
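
For illustration, a rough sketch of such a yield_context-aware streambuf (all names are made up for this example; this is not code from the PR):

```cpp
// Sketch of a yield_context-aware std::streambuf, as suggested above.
// All names here are illustrative; this is not code from the PR.
#include <boost/asio/buffer.hpp>
#include <boost/asio/spawn.hpp>
#include <boost/asio/write.hpp>
#include <streambuf>
#include <vector>

template<class AsyncWriteStream>
class AsioStreamBuf : public std::streambuf
{
public:
	AsioStreamBuf(AsyncWriteStream& stream, boost::asio::yield_context yc)
		: m_Stream(stream), m_Yc(yc), m_Buffer(4096)
	{
		setp(m_Buffer.data(), m_Buffer.data() + m_Buffer.size());
	}

protected:
	int_type overflow(int_type c) override
	{
		Flush();

		if (!traits_type::eq_int_type(c, traits_type::eof())) {
			*pptr() = traits_type::to_char_type(c);
			pbump(1);
		}

		return traits_type::not_eof(c);
	}

	int sync() override
	{
		Flush();
		return 0;
	}

private:
	void Flush()
	{
		std::size_t len = pptr() - pbase();

		if (len > 0) {
			// May suspend the coroutine; it resumes once the write has completed.
			boost::asio::async_write(m_Stream, boost::asio::buffer(pbase(), len), m_Yc);
			setp(m_Buffer.data(), m_Buffer.data() + m_Buffer.size());
		}
	}

	AsyncWriteStream& m_Stream;
	boost::asio::yield_context m_Yc;
	std::vector<char> m_Buffer;
};
```

Wrapping this in a std::ostream would then let any ostream-based serializer, including nlohmann/json's, write through the coroutine without knowing about Asio.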

@julianbrost
Contributor

So for my understanding: The nlohmann::json type is actually quite similar to icinga::Value in the sense that it contains a union to store the different types of JSON values:

/// the value of the current element
json_value m_value = {};

////////////////////////
// JSON value storage //
////////////////////////
/*!
@brief a JSON value
The actual storage for a JSON value of the @ref basic_json class. This
union combines the different storage types for the JSON value types
defined in @ref value_t.
JSON type | value_t type | used type
--------- | --------------- | ------------------------
object | object | pointer to @ref object_t
array | array | pointer to @ref array_t
string | string | pointer to @ref string_t
boolean | boolean | @ref boolean_t
number | number_integer | @ref number_integer_t
number | number_unsigned | @ref number_unsigned_t
number | number_float | @ref number_float_t
binary | binary | pointer to @ref binary_t
null | null | *no value is stored*
@note Variable-length types (objects, arrays, and strings) are stored as
pointers. The size of the union should not exceed 64 bits if the default
value types are used.
@since version 1.0.0
*/
union json_value
{
    /// object (stored with pointer to save storage)
    object_t* object;
    /// array (stored with pointer to save storage)
    array_t* array;
    /// string (stored with pointer to save storage)
    string_t* string;
    /// binary (stored with pointer to save storage)
    binary_t* binary;
    /// boolean
    boolean_t boolean;
    /// number (integer)
    number_integer_t number_integer;
    /// number (unsigned integer)
    number_unsigned_t number_unsigned;
    /// number (floating-point)
    number_float_t number_float;

Now what does the nlohmann::adl_serializer<icinga::Value> implementation you provide actually do? Doesn't that basically transform its input into a copy represented as nlohmann::json?

In other words: doesn't this just add a new copy of the JSON to potentially save a copy in another place?

Even then, it wouldn't make any sense, I mean, why would we ever want to (async)write each and every JSON token individually into an Asio stream? Instead, we can just serialize the JSON into an asio::streambuf and pass that to any of the Asio write methods (see 4d1a2a9 for example usages).

I don't really see how asio::streambuf helps here. Conceptually, the usage there looks pretty similar to what it would look like when using a std::stringstream. So if you did the same thing for /v1/objects/* responses, wouldn't this still allocate the full JSON at once?

why would we ever want to (async)write each and every JSON token individually into an Asio stream?

We don't want to send them to the network individually. But that sounds exactly like the situation buffered streams are made for.
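
As a sketch of that idea (illustrative names, not code from this PR): keep the per-token writer interface, but buffer the tokens and only touch the network once a threshold is reached or the document is finished.

```cpp
// Sketch: a buffering writer with the same two-method interface. Tokens are
// accumulated in memory and written to the Asio stream in larger chunks.
// All names are illustrative, not identifiers from this PR.
#include <boost/asio/buffer.hpp>
#include <boost/asio/spawn.hpp>
#include <boost/asio/write.hpp>
#include <vector>

template<class AsyncWriteStream>
class BufferedAsioWriter
{
public:
	BufferedAsioWriter(AsyncWriteStream& stream, boost::asio::yield_context yc)
		: m_Stream(stream), m_Yc(yc)
	{ }

	void write_character(char c) { write_characters(&c, 1); }

	void write_characters(const char* s, std::size_t length)
	{
		m_Buffer.insert(m_Buffer.end(), s, s + length);

		if (m_Buffer.size() >= 64 * 1024) {
			Flush();
		}
	}

	// Must be called once after the last token to send the remainder.
	void Flush()
	{
		if (!m_Buffer.empty()) {
			boost::asio::async_write(m_Stream, boost::asio::buffer(m_Buffer), m_Yc);
			m_Buffer.clear();
		}
	}

private:
	AsyncWriteStream& m_Stream;
	boost::asio::yield_context m_Yc;
	std::vector<char> m_Buffer;
};
```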

@yhabteab
Member Author

Now what does the nlohmann::adl_serializer<icinga::Value> implementation you provide actually do? Doesn't that basically transform its input into a copy represented as nlohmann::json?

Well, I guess you already answered your first question. Yes, it would hold a copy of the given Value, but if we don't want to perform the low-level JSON serialization ourselves, then we need to either transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class, as was done previously.
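
For reference, the "transform into nlohmann::json" route looks roughly like this; the Value accessors used below are simplified assumptions, but the structure shows where the extra copy comes from:

```cpp
// Sketch of an adl_serializer specialization that converts icinga::Value into
// an nlohmann::json by copying it. The accessor names are simplified
// assumptions, not the exact icinga::Value API.
#include <nlohmann/json.hpp>

namespace nlohmann {

template<>
struct adl_serializer<icinga::Value>
{
	static void to_json(json& j, const icinga::Value& v)
	{
		switch (v.GetType()) {
			case icinga::ValueEmpty:
				j = nullptr;
				break;
			case icinga::ValueBoolean:
				j = v.ToBool();
				break;
			case icinga::ValueNumber:
				j = static_cast<double>(v);
				break;
			case icinga::ValueString:
				j = static_cast<icinga::String>(v).GetData();
				break;
			case icinga::ValueObject:
				// Arrays and Dictionaries would be converted element by
				// element here; this is exactly the extra copy in question.
				break;
		}
	}
};

} // namespace nlohmann
```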

So if you did the same thing for /v1/objects/* responses, wouldn't this still allocate the full JSON at once?
We don't want to send them to the network individually. But that sounds exactly like the situation buffered streams are made for.

Hm, then I will add that overload once the other questions are answered regarding how the Value type should be serialized.

@Al2Klimov
Member

if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Not entirely. In the current implementation, everything ends up in m_Result, which is written by just three methods:

icinga2/lib/base/json.cpp

Lines 458 to 476 in 520aed6

void JsonEncoder<prettyPrint>::AppendChar(char c)
{
	m_Result.emplace_back(c);
}

template<bool prettyPrint>
template<class Iterator>
inline
void JsonEncoder<prettyPrint>::AppendChars(Iterator begin, Iterator end)
{
	m_Result.insert(m_Result.end(), begin, end);
}

template<bool prettyPrint>
inline
void JsonEncoder<prettyPrint>::AppendJson(nlohmann::json json)
{
	nlohmann::detail::serializer<nlohmann::json>(nlohmann::detail::output_adapter<char>(m_Result), ' ').dump(std::move(json), prettyPrint, true, 0);
}

(AppendChar could even call AppendChars, making them only two.)

Simple enough to overload, I guess.

@yhabteab
Member Author

yhabteab commented Apr 17, 2025

Well, I guess you already answered your first question. Yes, it would hold a copy of the given Value type but if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Alternatively, since we're shipping the json library in this repo, we could directly patch it to support our Value type, similar to what other users of that library do. That way, we wouldn't have to transform the value beforehand, nor would we need to implement a custom serializer.

Edit: Sorry, they don't use a patched version of the library, just a newer version than the one we currently use.

@julianbrost
Contributor

if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Not entirely. In the current implementation, everything ends up in m_Result which is written by barely three methods:

Well, isn't that just the "or [...] like how it's done previously" part?

Alternatively, since we're shipping the json library in this repo, we can just directly patch it to support our Value type similar to how other users of that library are doing. That way, we wouldn't have to transform the value beforehand and doesn't require us to implement a custom serializer as well.

I'm not sure what to look for in the code search you've linked. Forking the library (that's basically what you're suggesting) is possible, but sounds like something I'd rather try to avoid.

@yhabteab
Member Author

I'm not sure what to look for in the code search you've linked. Forking the library (that's basically what you're suggesting) is possible, but sounds like something I'd rather try to avoid.

I've already updated my previous comment: they don't actually use a patched version of the library, they convert their custom type to a json object just like I am doing in this PR.

Edit: Sorry, they don't use a patched version of the library, just a newer version than the one we currently use.

@Al2Klimov
Member

if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Not entirely. In the current implementation, everything ends up in m_Result which is written by barely three methods:

Well, isn't that just the "or [...] like how it's done previously" part?

No, it's an implementation detail (shown to illustrate how little effort it would take).

@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch 2 times, most recently from 5354178 to d088a16 on April 22, 2025 07:59
@yhabteab
Member Author

I think I've now implemented all the points requested in #10408 and also updated the PR description, including the two remaining open questions. So please have another look!

@yhabteab yhabteab requested a review from julianbrost April 22, 2025 08:08
@yhabteab yhabteab changed the title Support json serialization into std::ostream Support json serialization into std::ostream and Asio stream Apr 22, 2025
auto end(arr->End());
// Release the object lock if we're writing to an Asio stream, i.e. every write operation
// is asynchronous which may cause the coroutine to yield and later resume on another thread.
RELEASE_OLOCK_IF_ASYNC_WRITER(m_Writer, olock);
Member


You can't just release the lock...

Comment on lines +76 to +83
for (auto it(begin); it != end; ++it) {
	if (it != begin) {
		m_Writer->write_characters(",\n", 2);
	}
	m_Writer->write_characters(m_IndentStr.c_str(), newIndent);
	Encode(*it, newIndent);
Member


... and do stuff with the iterator then!

Consider using GetLength() and Get() instead of the iterator or pinning the coroutine to the thread or something (idk).
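
For comparison, the index-based variant suggested here would look roughly like this (illustrative only, reusing the surrounding encoder's names):

```cpp
// Sketch of index-based traversal: no iterator has to stay valid across a
// suspension point, but each Get() returns the element by value (a copy).
for (decltype(arr->GetLength()) i = 0; i < arr->GetLength(); ++i) {
	if (i > 0) {
		m_Writer->write_characters(",\n", 2);
	}

	m_Writer->write_characters(m_IndentStr.c_str(), newIndent);
	Encode(arr->Get(i), newIndent);
}
```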

Member Author


... and do stuff with the iterator then!

Why not? Also, I've already listed this as a TBD question in the PR description:

There are some open questions left regarding this though:

  • Since the AsioStreamAdapter<> class asynchronously writes each character to the stream, potentially causing the coroutine to yield at any time and resume later on a different thread, how do we handle the object lock needed for serializing containers such as Dictionary or Array? I don't think the objects passed to the encoder will ever be accessed or modified by another thread, since we always use only temporary objects, but we still need to make sure we don't end up with undefined behaviour (which may not necessarily manifest as a deadlock). The current implementation caches the container's begin and end iterators and releases the lock explicitly before writing the JSON tokens to the stream; the lock is only required while obtaining the iterators, so it is released before the container is even traversed.

Consider using GetLength() and Get() instead of the iterator

How does that make any difference? In fact, using GetLength() and Get() would actually worsen the performance, since every Array::Get() call copies the value, which I want to avoid in any case.

pinning the coroutine to the thread

And how should I do that?

Member


I've already listed this as a TBD question in the PR description:

There are some open questions left regarding this though:

  • Since the AsioStreamAdapter<> class asynchronously writes each character to the stream, potentially causing the coroutine to yield at any time and resume later on a different thread, how do we handle the object lock needed for serializing containers such as Dictionary or Array?

we still need to make sure we don't end up with undefined behaviour (which may not necessarily manifest as a deadlock).

Oh, I've overlooked this 🐘.

pinning the coroutine to the thread

And how should I do that?

This was just an idea.

It's just a wrapper around the `JsonEncoder` class to simplify its usage.
Instead of serializing the events into a JSON `String` and transforming
it into an Asio buffer, we can directly write the JSON into an Asio
stream using the new `JsonEncoder`.
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from d088a16 to 6afd670 on April 23, 2025 09:09
@Al2Klimov
Member

Also, this whole adventure is only suitable for /v1/events, unless the other API endpoints can live without a pre-known content length. But idk whether Boost Beast supports chunked encoding or whether we'd have to live with connections being closed after each request.
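
For the record, Beast does support chunked transfer encoding; a rough sketch of how a chunked response could be produced (not code from this PR; the stream type, yc and jsonFragment are placeholders):

```cpp
// Sketch: chunked HTTP/1.1 response with Boost.Beast (not part of this PR).
// The stream type, yc and jsonFragment are placeholders for illustration.
#include <boost/asio/spawn.hpp>
#include <boost/asio/write.hpp>
#include <boost/beast/core/tcp_stream.hpp>
#include <boost/beast/http.hpp>
#include <string>

namespace http = boost::beast::http;

void SendChunked(boost::beast::tcp_stream& stream, boost::asio::yield_context yc, const std::string& jsonFragment)
{
	http::response<http::empty_body> res{http::status::ok, 11};
	res.set(http::field::content_type, "application/json");
	res.chunked(true); // Transfer-Encoding: chunked, no Content-Length required

	http::response_serializer<http::empty_body> sr{res};
	http::async_write_header(stream, sr, yc);

	// Every serialized JSON fragment becomes one chunk ...
	boost::asio::async_write(stream, http::make_chunk(boost::asio::buffer(jsonFragment)), yc);

	// ... and the final zero-length chunk terminates the message.
	boost::asio::async_write(stream, http::make_chunk_last(), yc);
}
```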

@yhabteab
Member Author

Also, this whole adventure is only suitable for /v1/events, unless the other API endpoints can live without a pre-known content length. But idk whether Boost Beast supports chunked encoding or whether we'd have to live with connections being closed after each request.

Chunked encoding or anything else of that sort is out of scope for this PR! So, if you want to help, please focus on answering the remaining open questions from the PR description instead of bringing up somewhat unrelated topics.
