Skip to content

Conversation

@yhabteab
Copy link
Member

@yhabteab yhabteab commented Apr 17, 2025

This PR kinda rewrites the whole JSON serialization process allowing now to directly write the JSON tokens to whatever output stream you want, including std::ostream, std::string, or any other stream that implements the nlohmann::detail::output_adapter_protocol<> interface. This allows for more flexibility and performance improvements when dealing with JSON data, especially in our HTTP handlers where we can now just write the JSON directly to the HTTP response body without the need for intermediate buffering and perfectly suitable for chunked transfer encoding. I know that @julianbrost didn't want to perform all the JSON serialization by ourselves, but the cost of not doing so is just even higher, since we would have to transform our Value types into nlohmann::json objects to be able to use the existing JSON serialization of the library. I was just too naive to think that my previously proposed solution would actually bring any improvement, but in fact it was even worse.

There are some open questions left regarding this though:

  • Are all happy with the new encoder interface? It is now a bit more complex than before, but it does exactly what we need and is much more flexible. Even though the JsonEncoder only supports writing to a stream, there's still the previous helper function JsonEncode() which wraps the encoder and provides a simple interface if required.

Test cases with #10491:

$ curl -ksSiu root:icinga -H 'Accept: application/json' \
 -X POST 'https://localhost:5667/v1/events' \
 -d '{ "queue": "myqueue", "types": [ "CheckResult" ] }'

HTTP/1.1 200 OK
Server: Icinga/v2.15.0-27-g94f3f2f3a
Content-Type: application/json
Transfer-Encoding: chunked

{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.290213,"execution_start":1751635745.290213,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.290238,"schedule_start":1751635745.280238,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-17","timestamp":1751635745.291581,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.499443,"execution_start":1751635745.499443,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.499476,"schedule_start":1751635745.491047,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-29","timestamp":1751635745.501003,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.547113,"execution_start":1751635745.547113,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.547127,"schedule_start":1751635745.54,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-67","timestamp":1751635745.549277,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.549259,"execution_start":1751635745.549259,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":2,"schedule_end":1751635745.549272,"schedule_start":1751635745.54,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"test","timestamp":1751635745.551912,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.549242,"execution_start":1751635745.549242,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.549253,"schedule_start":1751635745.54,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-2","timestamp":1751635745.552046,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.551865,"execution_start":1751635745.551865,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.551875,"schedule_start":1751635745.54,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-78","timestamp":1751635745.553246,"type":"CheckResult"}
...

fixes #10408
closes #10400

@yhabteab yhabteab added the area/api REST API label Apr 17, 2025
@cla-bot cla-bot bot added the cla/signed label Apr 17, 2025
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from e10ebdd to edc4112 Compare April 17, 2025 08:34
Copy link
Member

@Al2Klimov Al2Klimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, I was thinking to add another overload that allows to directly write the JSON into an Asio stream,

❤️

but since the json library we use doesn't provide such an implementation we'd have to implement quite a lot of unusual things.

Actually, you'd just need a yc-aware streambuf (for ostream) which writes to a stream of your choice.

@julianbrost
Copy link
Contributor

So for my understanding: The nlohmann::json type is actually quite similar to icinga::Value in the sense that it contains a union to store the different types of JSON values:

/// the value of the current element
json_value m_value = {};

////////////////////////
// JSON value storage //
////////////////////////
/*!
@brief a JSON value
The actual storage for a JSON value of the @ref basic_json class. This
union combines the different storage types for the JSON value types
defined in @ref value_t.
JSON type | value_t type | used type
--------- | --------------- | ------------------------
object | object | pointer to @ref object_t
array | array | pointer to @ref array_t
string | string | pointer to @ref string_t
boolean | boolean | @ref boolean_t
number | number_integer | @ref number_integer_t
number | number_unsigned | @ref number_unsigned_t
number | number_float | @ref number_float_t
binary | binary | pointer to @ref binary_t
null | null | *no value is stored*
@note Variable-length types (objects, arrays, and strings) are stored as
pointers. The size of the union should not exceed 64 bits if the default
value types are used.
@since version 1.0.0
*/
union json_value
{
/// object (stored with pointer to save storage)
object_t* object;
/// array (stored with pointer to save storage)
array_t* array;
/// string (stored with pointer to save storage)
string_t* string;
/// binary (stored with pointer to save storage)
binary_t* binary;
/// boolean
boolean_t boolean;
/// number (integer)
number_integer_t number_integer;
/// number (unsigned integer)
number_unsigned_t number_unsigned;
/// number (floating-point)
number_float_t number_float;

Now what does the nlohmann::adl_serializer<icinga::Value> implementation you provide actually do? Doesn't that basically transform its input into a copy represented as nlohmann::json?

In other words: doesn't this just add a new copy of the JSON to potentially save a copy in another place?

Even then, it wouldn't make any sense, I mean, why would we ever want to (async)write every and each of the JSON tokens individually into an Asio stream? Instead, we can just serialize the JSON into an asio::streambuf and pass that to any Asio write methods (see 4d1a2a9 for example usages).

I don't really see how asio::streambuf helps here. Conceptually, the usage there looks pretty similar to what it would look like when using a std::stringstream. So if you did the same thing for /v1/objects/* responses, wouldn't this still allocate the full JSON at once?

why would we ever want to (async)write every and each of the JSON tokens individually into an Asio stream?

We don't want to send them to the network individually. But that sounds exactly like the situation buffered streams are made for.

@yhabteab
Copy link
Member Author

Now what does the nlohmann::adl_serializer<icinga::Value> implementation you provide actually do? Doesn't that basically transform its input into a copy represented as nlohmann::json?

Well, I guess you already answered your first question. Yes, it would hold a copy of the given Value type but if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

So if you did the same thing for /v1/objects/* responses, wouldn't this still allocate the full JSON at once?
We don't want to send them to the network individually. But that sounds exactly like the situation buffered streams are made for.

Hm, then I will add that overload once the other questions are answered regarding how the Value type should be serialized.

@Al2Klimov
Copy link
Member

if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Not entirely. In the current implementation, everything ends up in m_Result which is written by barely three methods:

icinga2/lib/base/json.cpp

Lines 458 to 476 in 520aed6

void JsonEncoder<prettyPrint>::AppendChar(char c)
{
m_Result.emplace_back(c);
}
template<bool prettyPrint>
template<class Iterator>
inline
void JsonEncoder<prettyPrint>::AppendChars(Iterator begin, Iterator end)
{
m_Result.insert(m_Result.end(), begin, end);
}
template<bool prettyPrint>
inline
void JsonEncoder<prettyPrint>::AppendJson(nlohmann::json json)
{
nlohmann::detail::serializer<nlohmann::json>(nlohmann::detail::output_adapter<char>(m_Result), ' ').dump(std::move(json), prettyPrint, true, 0);
}

(AppendChar could even call AppendChars, making them only two.)

Simple enough to overload, I guess.

@yhabteab
Copy link
Member Author

yhabteab commented Apr 17, 2025

Well, I guess you already answered your first question. Yes, it would hold a copy of the given Value type but if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Alternatively, since we're shipping the json library in this repo, we can just directly patch it to support our Value type similar to how other users of that library are doing. That way, we wouldn't have to transform the value beforehand and doesn't require us to implement a custom serializer as well.

Edit: Sorry, they don't use a patched version of the library but a newer version than that we currently use.

@julianbrost
Copy link
Contributor

if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Not entirely. In the current implementation, everything ends up in m_Result which is written by barely three methods:

Well, isn't that just the "or [...] like how it's done previously" part?

Alternatively, since we're shipping the json library in this repo, we can just directly patch it to support our Value type similar to how other users of that library are doing. That way, we wouldn't have to transform the value beforehand and doesn't require us to implement a custom serializer as well.

I'm not sure what to look for in the code search you've linked. Forking the library (that's basically what you're suggesting) is possible, but sounds like something I'd rather try to avoid.

@yhabteab
Copy link
Member Author

I'm not sure what to look for in the code search you've linked. Forking the library (that's basically what you're suggesting) is possible, but sounds like something I'd rather try to avoid.

I've already updated my previous comment and actually they don't use a patched version of the library but they convert their custom type to the json object just like how I am doing in this PR.

Edit: Sorry, they don't use a patched version of the library but a newer version than that we currently use.

@Al2Klimov
Copy link
Member

if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Not entirely. In the current implementation, everything ends up in m_Result which is written by barely three methods:

Well, isn't that just the "or [...] like how it's done previously" part?

No, it's an implementation detail (to visualize the low effort).

@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch 2 times, most recently from 5354178 to d088a16 Compare April 22, 2025 07:59
@yhabteab
Copy link
Member Author

I think, I've implemented now all the requested points in #10408 and also updated the PR description, including listing two remaining open questions. So, please have a look again!

@yhabteab yhabteab requested a review from julianbrost April 22, 2025 08:08
@yhabteab yhabteab changed the title Support json serialization into std::ostream Support json serialization into std::ostream and Asio stream Apr 22, 2025
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from d088a16 to 6afd670 Compare April 23, 2025 09:09
@Al2Klimov

This comment was marked as off-topic.

@yhabteab

This comment was marked as off-topic.

@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from 6afd670 to a79b2aa Compare June 20, 2025 11:19
@yhabteab yhabteab added the enhancement New feature or request label Jun 20, 2025
@yhabteab yhabteab added this to the 2.16.0 milestone Jun 20, 2025
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch 2 times, most recently from 2d5fade to 4c77b12 Compare June 23, 2025 06:58
@yhabteab

This comment was marked as outdated.

@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch 3 times, most recently from f2a7ce6 to 1309169 Compare July 10, 2025 09:52
@yhabteab yhabteab requested a review from julianbrost July 10, 2025 09:54
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from 1309169 to e0d49e5 Compare July 10, 2025 13:57
yhabteab added 6 commits July 10, 2025 18:09
It's just a wrapper around the `JsonEncoder` class to simplify its usage.
Replacing invalid UTF-8 characters beforehand by our selves doesn't make
any sense, the serializer can literally perform the same replacement ops
with the exact same Unicode replacement character (U+FFFD) on its own.
So, why not just use it directly? Instead of wasting memory on a temporary
`String` object to always UTF-8 validate every and each value, we just
use the serializer to directly to dump the replaced char (if any) into
the output writer. No memory waste, no fuss!
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from e0d49e5 to 89418f3 Compare July 10, 2025 16:10
@yhabteab yhabteab requested a review from julianbrost July 10, 2025 16:10
@yhabteab yhabteab requested a review from julianbrost July 11, 2025 12:08
Copy link
Contributor

@julianbrost julianbrost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one more thing, and it's even my own fault :(

This commit intruduces a small helper class that wraps any writer and
provides a flush operation that performs the corresponding action if the
writer is an AsyncJsonWriter and does nothing otherwise.
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from 0df1e9b to 1f15f0f Compare July 11, 2025 14:10
@yhabteab yhabteab requested a review from julianbrost July 11, 2025 14:12
@julianbrost
Copy link
Contributor

Thank you for being so patient with me!

@yhabteab
Copy link
Member Author

Thank you for being so patient with me!

In the end we all want to achieve the same goal in one way or another, so no worries. :)
I actually learned a bit more about floating-numbers due to all the discussions, so it was worth it!

@julianbrost julianbrost merged commit 8d15e7f into master Jul 15, 2025
25 checks passed
@julianbrost julianbrost deleted the support-json-serialization-into-ostream branch July 15, 2025 07:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/api REST API cla/signed enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

JSON encoding: allow writing to streams directly

5 participants