Skip to content

Support JSON serialization into any stream like objects #10414

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

yhabteab
Copy link
Member

@yhabteab yhabteab commented Apr 17, 2025

This PR kinda rewrites the whole JSON serialization process allowing now to directly write the JSON tokens to whatever output stream you want, including std::ostream, std::string, or any other stream that implements the nlohmann::detail::output_adapter_protocol<> interface. This allows for more flexibility and performance improvements when dealing with JSON data, especially in our HTTP handlers where we can now just write the JSON directly to the HTTP response body without the need for intermediate buffering and perfectly suitable for chunked transfer encoding. I know that @julianbrost didn't want to perform all the JSON serialization by ourselves, but the cost of not doing so is just even higher, since we would have to transform our Value types into nlohmann::json objects to be able to use the existing JSON serialization of the library. I was just too naive to think that my previously proposed solution would actually bring any improvement, but in fact it was even worse.

There are some open questions left regarding this though:

  • Are all happy with the new encoder interface? It is now a bit more complex than before, but it does exactly what we need and is much more flexible. Even though the JsonEncoder only supports writing to a stream, there's still the previous helper function JsonEncode() which wraps the encoder and provides a simple interface if required.

Test cases with #10491:

$ curl -ksSiu root:icinga -H 'Accept: application/json' \
 -X POST 'https://localhost:5667/v1/events' \
 -d '{ "queue": "myqueue", "types": [ "CheckResult" ] }'

HTTP/1.1 200 OK
Server: Icinga/v2.15.0-27-g94f3f2f3a
Content-Type: application/json
Transfer-Encoding: chunked

{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.290213,"execution_start":1751635745.290213,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.290238,"schedule_start":1751635745.280238,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-17","timestamp":1751635745.291581,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.499443,"execution_start":1751635745.499443,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.499476,"schedule_start":1751635745.491047,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-29","timestamp":1751635745.501003,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.547113,"execution_start":1751635745.547113,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.547127,"schedule_start":1751635745.54,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-67","timestamp":1751635745.549277,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.549259,"execution_start":1751635745.549259,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":2,"schedule_end":1751635745.549272,"schedule_start":1751635745.54,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"test","timestamp":1751635745.551912,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.549242,"execution_start":1751635745.549242,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.549253,"schedule_start":1751635745.54,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-2","timestamp":1751635745.552046,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"mbp-yhabteab","command":"dummy","execution_end":1751635745.551865,"execution_start":1751635745.551865,"exit_status":0,"output":"I'm just testing something","performance_data":[],"previous_hard_state":99,"schedule_end":1751635745.551875,"schedule_start":1751635745.54,"scheduling_source":"mbp-yhabteab","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"Host-78","timestamp":1751635745.553246,"type":"CheckResult"}
...

fixes #10408
closes #10400

@yhabteab yhabteab added the area/api REST API label Apr 17, 2025
@cla-bot cla-bot bot added the cla/signed label Apr 17, 2025
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from e10ebdd to edc4112 Compare April 17, 2025 08:34
Copy link
Member

@Al2Klimov Al2Klimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, I was thinking to add another overload that allows to directly write the JSON into an Asio stream,

❤️

but since the json library we use doesn't provide such an implementation we'd have to implement quite a lot of unusual things.

Actually, you'd just need a yc-aware streambuf (for ostream) which writes to a stream of your choice.

@julianbrost
Copy link
Contributor

So for my understanding: The nlohmann::json type is actually quite similar to icinga::Value in the sense that it contains a union to store the different types of JSON values:

/// the value of the current element
json_value m_value = {};

////////////////////////
// JSON value storage //
////////////////////////
/*!
@brief a JSON value
The actual storage for a JSON value of the @ref basic_json class. This
union combines the different storage types for the JSON value types
defined in @ref value_t.
JSON type | value_t type | used type
--------- | --------------- | ------------------------
object | object | pointer to @ref object_t
array | array | pointer to @ref array_t
string | string | pointer to @ref string_t
boolean | boolean | @ref boolean_t
number | number_integer | @ref number_integer_t
number | number_unsigned | @ref number_unsigned_t
number | number_float | @ref number_float_t
binary | binary | pointer to @ref binary_t
null | null | *no value is stored*
@note Variable-length types (objects, arrays, and strings) are stored as
pointers. The size of the union should not exceed 64 bits if the default
value types are used.
@since version 1.0.0
*/
union json_value
{
/// object (stored with pointer to save storage)
object_t* object;
/// array (stored with pointer to save storage)
array_t* array;
/// string (stored with pointer to save storage)
string_t* string;
/// binary (stored with pointer to save storage)
binary_t* binary;
/// boolean
boolean_t boolean;
/// number (integer)
number_integer_t number_integer;
/// number (unsigned integer)
number_unsigned_t number_unsigned;
/// number (floating-point)
number_float_t number_float;

Now what does the nlohmann::adl_serializer<icinga::Value> implementation you provide actually do? Doesn't that basically transform its input into a copy represented as nlohmann::json?

In other words: doesn't this just add a new copy of the JSON to potentially save a copy in another place?

Even then, it wouldn't make any sense, I mean, why would we ever want to (async)write every and each of the JSON tokens individually into an Asio stream? Instead, we can just serialize the JSON into an asio::streambuf and pass that to any Asio write methods (see 4d1a2a9 for example usages).

I don't really see how asio::streambuf helps here. Conceptually, the usage there looks pretty similar to what it would look like when using a std::stringstream. So if you did the same thing for /v1/objects/* responses, wouldn't this still allocate the full JSON at once?

why would we ever want to (async)write every and each of the JSON tokens individually into an Asio stream?

We don't want to send them to the network individually. But that sounds exactly like the situation buffered streams are made for.

@yhabteab
Copy link
Member Author

Now what does the nlohmann::adl_serializer<icinga::Value> implementation you provide actually do? Doesn't that basically transform its input into a copy represented as nlohmann::json?

Well, I guess you already answered your first question. Yes, it would hold a copy of the given Value type but if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

So if you did the same thing for /v1/objects/* responses, wouldn't this still allocate the full JSON at once?
We don't want to send them to the network individually. But that sounds exactly like the situation buffered streams are made for.

Hm, then I will add that overload once the other questions are answered regarding how the Value type should be serialized.

@Al2Klimov
Copy link
Member

if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Not entirely. In the current implementation, everything ends up in m_Result which is written by barely three methods:

icinga2/lib/base/json.cpp

Lines 458 to 476 in 520aed6

void JsonEncoder<prettyPrint>::AppendChar(char c)
{
m_Result.emplace_back(c);
}
template<bool prettyPrint>
template<class Iterator>
inline
void JsonEncoder<prettyPrint>::AppendChars(Iterator begin, Iterator end)
{
m_Result.insert(m_Result.end(), begin, end);
}
template<bool prettyPrint>
inline
void JsonEncoder<prettyPrint>::AppendJson(nlohmann::json json)
{
nlohmann::detail::serializer<nlohmann::json>(nlohmann::detail::output_adapter<char>(m_Result), ' ').dump(std::move(json), prettyPrint, true, 0);
}

(AppendChar could even call AppendChars, making them only two.)

Simple enough to overload, I guess.

@yhabteab
Copy link
Member Author

yhabteab commented Apr 17, 2025

Well, I guess you already answered your first question. Yes, it would hold a copy of the given Value type but if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Alternatively, since we're shipping the json library in this repo, we can just directly patch it to support our Value type similar to how other users of that library are doing. That way, we wouldn't have to transform the value beforehand and doesn't require us to implement a custom serializer as well.

Edit: Sorry, they don't use a patched version of the library but a newer version than that we currently use.

@julianbrost
Copy link
Contributor

if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Not entirely. In the current implementation, everything ends up in m_Result which is written by barely three methods:

Well, isn't that just the "or [...] like how it's done previously" part?

Alternatively, since we're shipping the json library in this repo, we can just directly patch it to support our Value type similar to how other users of that library are doing. That way, we wouldn't have to transform the value beforehand and doesn't require us to implement a custom serializer as well.

I'm not sure what to look for in the code search you've linked. Forking the library (that's basically what you're suggesting) is possible, but sounds like something I'd rather try to avoid.

@yhabteab
Copy link
Member Author

I'm not sure what to look for in the code search you've linked. Forking the library (that's basically what you're suggesting) is possible, but sounds like something I'd rather try to avoid.

I've already updated my previous comment and actually they don't use a patched version of the library but they convert their custom type to the json object just like how I am doing in this PR.

Edit: Sorry, they don't use a patched version of the library but a newer version than that we currently use.

@Al2Klimov
Copy link
Member

if we don't want to perform the low level JSON serialization stuff then we need to transform the Value class into a nlohmann::basic_json<> instance or implement a custom serializer for the Value class like how it's done previously.

Not entirely. In the current implementation, everything ends up in m_Result which is written by barely three methods:

Well, isn't that just the "or [...] like how it's done previously" part?

No, it's an implementation detail (to visualize the low effort).

@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch 2 times, most recently from 5354178 to d088a16 Compare April 22, 2025 07:59
@yhabteab
Copy link
Member Author

I think, I've implemented now all the requested points in #10408 and also updated the PR description, including listing two remaining open questions. So, please have a look again!

@yhabteab yhabteab requested a review from julianbrost April 22, 2025 08:08
@yhabteab yhabteab changed the title Support json serialization into std::ostream Support json serialization into std::ostream and Asio stream Apr 22, 2025
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from d088a16 to 6afd670 Compare April 23, 2025 09:09
@Al2Klimov

This comment was marked as off-topic.

@yhabteab

This comment was marked as off-topic.

@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from 6afd670 to a79b2aa Compare June 20, 2025 11:19
@yhabteab yhabteab added the enhancement New feature or request label Jun 20, 2025
@yhabteab yhabteab added this to the 2.16.0 milestone Jun 20, 2025
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch 2 times, most recently from 2d5fade to 4c77b12 Compare June 23, 2025 06:58
@yhabteab
Copy link
Member Author

I've now adjusted the encoder to incrementally build JSON objects and arrays, which allows for more efficient memory usage and avoids the need to store large intermediate results in memory. I've modified the ObjectQueryHandler to use this new encoder and would roughly look like this (the generated response doesn't satisfy the HTTP specification, but it gives you an idea of how the encoder works):

diff --git a/lib/remote/httpserverconnection.cpp b/lib/remote/httpserverconnection.cpp
index 17e61f160..75356479f 100644
--- a/lib/remote/httpserverconnection.cpp
+++ b/lib/remote/httpserverconnection.cpp
@@ -458,7 +458,7 @@ bool ProcessRequest(
                return false;
        }

-       http::async_write(stream, response, yc);
+       //http::async_write(stream, response, yc);
        stream.async_flush(yc);

        return true;
diff --git a/lib/remote/objectqueryhandler.cpp b/lib/remote/objectqueryhandler.cpp
index f6f049e4e..685d5d928 100644
--- a/lib/remote/objectqueryhandler.cpp
+++ b/lib/remote/objectqueryhandler.cpp
@@ -3,6 +3,7 @@
 #include "remote/objectqueryhandler.hpp"
 #include "remote/httputility.hpp"
 #include "remote/filterutility.hpp"
diff --git a/lib/remote/httpserverconnection.cpp b/lib/remote/httpserverconnection.cpp
index 17e61f160..75356479f 100644
--- a/lib/remote/httpserverconnection.cpp
+++ b/lib/remote/httpserverconnection.cpp
@@ -458,7 +458,7 @@ bool ProcessRequest(
                return false;
        }

-       http::async_write(stream, response, yc);
+       //http::async_write(stream, response, yc);
        stream.async_flush(yc);

        return true;
diff --git a/lib/remote/objectqueryhandler.cpp b/lib/remote/objectqueryhandler.cpp
index f6f049e4e..685d5d928 100644
--- a/lib/remote/objectqueryhandler.cpp
+++ b/lib/remote/objectqueryhandler.cpp
@@ -3,6 +3,7 @@
 #include "remote/objectqueryhandler.hpp"
 #include "remote/httputility.hpp"
 #include "remote/filterutility.hpp"
+#include "base/json.hpp"
 #include "base/serializer.hpp"
 #include "base/dependencygraph.hpp"
 #include "base/configtype.hpp"
@@ -166,9 +167,6 @@ bool ObjectQueryHandler::HandleRequest(
                return true;
        }

-       ArrayData results;
-       results.reserve(objs.size());
-
        std::set<String> joinAttrs;
        std::set<String> userJoinAttrs;

@@ -194,6 +192,10 @@ bool ObjectQueryHandler::HandleRequest(
        std::unordered_map<Type*, std::pair<bool, std::unique_ptr<Expression>>> typePermissions;
        std::unordered_map<Object*, bool> objectAccessAllowed;

+       auto adapter(std::make_shared<AsioStreamAdapter<AsioTlsStream>>(stream, yc));
+       JsonEncoder encoder(adapter, params && HttpUtility::GetLastParameter(params, "pretty"));
+
+       bool started = false;
        for (ConfigObject::Ptr obj : objs) {
                DictionaryData result1{
                        { "name", obj->GetName() },
@@ -311,15 +313,18 @@ bool ObjectQueryHandler::HandleRequest(

                result1.emplace_back("joins", new Dictionary(std::move(joins)));

-               results.push_back(new Dictionary(std::move(result1)));
-       }
-
-       Dictionary::Ptr result = new Dictionary({
-               { "results", new Array(std::move(results)) }
-       });
+               if (!started) {
+                       started = true;
+                       response.result(http::status::ok);
+                       response.set(http::field::content_type, "application/json");
+                       http::response_serializer<http::string_body> serializer(response);
+                       http::async_write_header(stream, serializer, yc);

-       response.result(http::status::ok);
-       HttpUtility::SendJsonBody(response, params, result);
+                       encoder.Encode(JsonEncoder::BeginObject, JsonEncoder::MakeField("results", JsonEncoder::BeginArray));
+               }
+               encoder.EncodeItem(new Dictionary(std::move(result1)));
+       }

+       encoder.Encode(JsonEncoder::Flush);
        return true;
 }

@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from 256fc5c to 10b47ad Compare July 3, 2025 08:43
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from 10b47ad to 85d2fc7 Compare July 3, 2025 11:52
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch 2 times, most recently from b810837 to c9a2434 Compare July 4, 2025 07:36
@yhabteab yhabteab requested a review from Al2Klimov July 4, 2025 07:52
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from c9a2434 to 25dae46 Compare July 4, 2025 08:06
Al2Klimov
Al2Klimov previously approved these changes Jul 4, 2025
Copy link
Member

@Al2Klimov Al2Klimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional:

@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from 25dae46 to 09de342 Compare July 4, 2025 09:34
@yhabteab yhabteab requested a review from Al2Klimov July 4, 2025 09:50
Al2Klimov
Al2Klimov previously approved these changes Jul 4, 2025
@yhabteab yhabteab changed the title Support json serialization into std::ostream and Asio stream Support JSON serialization into any stream like objects Jul 4, 2025
yhabteab and others added 10 commits July 4, 2025 14:23
It's just a wrapper around the `JsonEncoder` class to simplify its usage.
Replacing invalid UTF-8 characters beforehand by our selves doesn't make
any sense, the serializer can literally perform the same replacement ops
with the exact same Unicode replacement character (U+FFFD) on its own.
So, why not just use it directly? Instead of wasting memory on a temporary
`String` object to always UTF-8 validate every and each value, we just
use the serializer to directly to dump the replaced char (if any) into
the output writer. No memory waste, no fuss!
@yhabteab yhabteab force-pushed the support-json-serialization-into-ostream branch from 09de342 to a33b9a2 Compare July 4, 2025 15:01
@yhabteab
Copy link
Member Author

yhabteab commented Jul 4, 2025

Sorry! As @julianbrost already found out, we don't need to introduce our own std::string output adapter, since the library already provides one on its own. Besides, apart from introducing that new StringOutputAdapter class, I didn't even make use of it in the JsonEncoder class, so it's now completely removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api REST API cla/signed enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JSON encoding: allow writing to streams directly
4 participants