feat(kv-ir): Add UnstructuredIrDeserializerImpl implementation for unstructured IR deserialization (resolves #2096).#2227
Conversation
…structured ir deserialization.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughFactory now validates protocol version and selects either the existing KvIrDeserializerImpl (Supported) or a new templated UnstructuredIrDeserializerImpl (BackwardCompatible) supporting four- and eight-byte encodings; related decoding helpers, error codes, build files, and tests were added or adjusted. ChangesUnstructured IR deserialization feature
Sequence Diagram(s)sequenceDiagram
actor Client
participant Factory as Deserializer Factory
participant PV as Protocol Validator
participant KvImpl as KvIrDeserializerImpl
participant UnstructImpl as UnstructuredIrDeserializerImpl
participant Reader as ReaderInterface
participant Schema as SchemaTree
Client->>Factory: create_generic(version, encoding_type, metadata, ...)
Factory->>PV: validate_protocol_version(version)
alt Supported
PV-->>Factory: Supported
Factory->>KvImpl: construct KvIrDeserializerImpl(...)
KvImpl-->>Factory: instance
else BackwardCompatible
PV-->>Factory: BackwardCompatible
Factory->>UnstructImpl: UnstructuredIrDeserializerImpl::create(metadata)
UnstructImpl-->>Factory: unique_ptr<UnstructuredIrDeserializerImpl>
end
Factory-->>Client: Deserializer instance
Client->>UnstructImpl: get_next_ir_unit_type(Reader)
alt Pending schema insertions exist
UnstructImpl-->>Client: SchemaTreeNodeInsertion (message/timestamp)
else
UnstructImpl->>Reader: read protocol tag
Reader-->>UnstructImpl: tag
alt tag == LogEvent
UnstructImpl->>Schema: resolve node IDs for message & timestamp
Schema-->>UnstructImpl: NodeLocator info
UnstructImpl-->>Client: KeyValuePairLogEvent
else tag == EndOfStream
UnstructImpl-->>Client: EndOfStream
else tag == UtcOffsetChange
UnstructImpl-->>Client: UtcOffsetChange
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. 👉 Get your free trial and get 200 agent minutes per Slack user (a $50 value). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp`:
- Around line 199-241: The code currently increments m_previous_timestamp before
running fallible steps; instead compute a local new_timestamp =
m_previous_timestamp + timestamp_delta and use that local value in
build_encoded_text_ast/when creating the timestamp Value and in
KeyValuePairLogEvent::create, but do not assign it back to m_previous_timestamp
until after build_encoded_text_ast, resolve_required_node_ids and
KeyValuePairLogEvent::create succeed (i.e., just prior to the successful
return). Update references in this block (deserialize_log_event,
build_encoded_text_ast, resolve_required_node_ids, KeyValuePairLogEvent::create)
to use the local new_timestamp and only mutate m_previous_timestamp on the
success path.
- Around line 48-74: Move the doxygen comment block so it immediately precedes
the template declaration for build_encoded_text_ast; specifically place the
entire /** ... */ docblock above the line "template <ir::EncodedVariableTypeReq
encoded_variable_t>" so the documentation is attached to the templated function
declaration (build_encoded_text_ast) rather than following the template line,
leaving the function signature and body (including
EncodedTextAst<encoded_variable_t>::create usage) unchanged.
- Around line 155-169: Replace the verbose has_error()/error()/value() handling
with the project macro to be concise and move-friendly: use
YSTDLIB_ERROR_HANDLING_TRYX when calling deserialize_tag (currently held in
tag_result) and for the two encoded_text_ast_result usages; specifically wrap
the deserialize_tag(reader) call with YSTDLIB_ERROR_HANDLING_TRYX to return its
error or extract the tag, and do the same for the encoded_text_ast_result
occurrences so they return early on error and yield the inner value consistently
with the rest of the file (keep existing semantics for
cProtocol::Eof/cProtocol::Payload checks and the final IrUnitType returns).
In `@components/core/tests/test-ir_encoding_methods.cpp`:
- Around line 1389-1390: Remove the redundant manual EOF push: delete the call
that does ir_buf.push_back(clp::ffi::ir_stream::cProtocol::Eof) because
serialize_log_events already appends cProtocol::Eof; ensure tests rely on the
helper-produced EOF (used by is_stream_completed() and decode_ir_complete) and
keep only the serialize_log_events<TestType>(...) call followed by assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c4ea8206-10f2-4989-8ebc-ae5df0a2d712
📒 Files selected for processing (6)
components/core/CMakeLists.txtcomponents/core/src/clp/ffi/ir_stream/Deserializer.hppcomponents/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cppcomponents/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.hppcomponents/core/src/clp_s/CMakeLists.txtcomponents/core/tests/test-ir_encoding_methods.cpp
| REQUIRE(serialize_log_events<TestType>(test_log_events, preamble_ts, ir_buf, encoded_logtypes)); | ||
| ir_buf.push_back(clp::ffi::ir_stream::cProtocol::Eof); |
There was a problem hiding this comment.
Redundant Eof byte appended after serialize_log_events.
serialize_log_events already appends cProtocol::Eof at line 288, so line 1390 pushes a second EOF that the deserializer never reaches (it halts at the first one and is_stream_completed() becomes true, which is exactly what the assertion at line 1452 then exercises). The byte is harmless but obscures the fact that the EOF is produced by the helper, and it's inconsistent with decode_ir_complete (lines 1001-1064), which relies solely on the helper's EOF.
🧹 Suggested cleanup
REQUIRE(serialize_log_events<TestType>(test_log_events, preamble_ts, ir_buf, encoded_logtypes));
- ir_buf.push_back(clp::ffi::ir_stream::cProtocol::Eof);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| REQUIRE(serialize_log_events<TestType>(test_log_events, preamble_ts, ir_buf, encoded_logtypes)); | |
| ir_buf.push_back(clp::ffi::ir_stream::cProtocol::Eof); | |
| REQUIRE(serialize_log_events<TestType>(test_log_events, preamble_ts, ir_buf, encoded_logtypes)); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/core/tests/test-ir_encoding_methods.cpp` around lines 1389 - 1390,
Remove the redundant manual EOF push: delete the call that does
ir_buf.push_back(clp::ffi::ir_stream::cProtocol::Eof) because
serialize_log_events already appends cProtocol::Eof; ensure tests rely on the
helper-produced EOF (used by is_stream_completed() and decode_ir_complete) and
keep only the serialize_log_events<TestType>(...) call followed by assertions.
| std::vector<SchemaTree::NodeLocator> m_pending_schema_insertions; | ||
| std::optional<SchemaTree::Node::id_t> m_optional_message_node_id; | ||
| std::optional<SchemaTree::Node::id_t> m_optional_timestamp_node_id; | ||
| ir::epoch_time_ms_t m_previous_timestamp{0}; |
There was a problem hiding this comment.
This member shouldn't exist for eight-byte encoded vars. Check:
| template <ir::EncodedVariableTypeReq encoded_variable_t> | ||
| /** | ||
| * Builds an encoded text AST from the components of an unstructured IR log event. | ||
| * @tparam encoded_variable_t | ||
| * @param logtype | ||
| * @param dict_vars | ||
| * @param encoded_vars | ||
| * @return A result containing the encoded text AST on success, or an error code indicating the | ||
| * failure: | ||
| * - Forwards `EncodedTextAst::create`'s return values on failure. | ||
| */ | ||
| [[nodiscard]] auto build_encoded_text_ast( | ||
| std::string const& logtype, | ||
| std::vector<std::string> const& dict_vars, | ||
| std::vector<encoded_variable_t> encoded_vars | ||
| ) -> ystdlib::error_handling::Result<EncodedTextAst<encoded_variable_t>> { | ||
| StringBlob string_blob; | ||
| for (auto const& dict_var : dict_vars) { | ||
| string_blob.append(dict_var); | ||
| } | ||
| string_blob.append(logtype); | ||
|
|
||
| return EncodedTextAst<encoded_variable_t>::create( | ||
| std::move(encoded_vars), | ||
| std::move(string_blob) | ||
| ); | ||
| } |
There was a problem hiding this comment.
We shouldn't need this. The existing encoded text AST method can be used to deserialize unstructured IR, which uses string blob directly.
| m_pending_schema_insertions.pop_back(); | ||
|
|
||
| key_name.clear(); | ||
| return std::pair{false, node_locator}; |
There was a problem hiding this comment.
A design decsion I forgot to mention: the timestamp key should be considered auto-generated to match our convention in the kv-ir logging library. I think the ``pending_schema_insertionsshould have a pair ofis_auto_gen, node_locator` instead.
| std::vector<std::string> dict_vars; | ||
| ir::epoch_time_ms_t timestamp_delta{}; | ||
|
|
||
| if (auto const err{deserialize_log_event<ir::four_byte_encoded_variable_t>( |
There was a problem hiding this comment.
As mentioned in the previous comment, I think we should use the latest version of deserialize_encoded_text_ast directly as kv-ir, and then deserialize the timestamp/timestamp-delta manually from the stream. Check this implementation for details.
| // Constructors | ||
| explicit UnstructuredIrDeserializerImpl( | ||
| std::vector<SchemaTree::NodeLocator> initial_insertions | ||
| ); | ||
|
|
||
| UnstructuredIrDeserializerImpl( | ||
| std::vector<SchemaTree::NodeLocator> initial_insertions, | ||
| ir::epoch_time_ms_t reference_timestamp | ||
| ); |
There was a problem hiding this comment.
We should add constraints to only enable the first method for eight-byte encoded, and the second for four-byte encoded. Check std::enable_if_t: https://en.cppreference.com/cpp/types/enable_if.
| template <ir::EncodedVariableTypeReq encoded_variable_t> | ||
| UnstructuredIrDeserializerImpl<encoded_variable_t>::UnstructuredIrDeserializerImpl( | ||
| std::vector<SchemaTree::NodeLocator> initial_insertions | ||
| ) | ||
| : m_pending_schema_insertions{std::move(initial_insertions)} {} | ||
|
|
||
| template <ir::EncodedVariableTypeReq encoded_variable_t> | ||
| UnstructuredIrDeserializerImpl<encoded_variable_t>::UnstructuredIrDeserializerImpl( | ||
| std::vector<SchemaTree::NodeLocator> initial_insertions, | ||
| ir::epoch_time_ms_t reference_timestamp | ||
| ) | ||
| : m_pending_schema_insertions{std::move(initial_insertions)}, | ||
| m_previous_timestamp{reference_timestamp} {} |
There was a problem hiding this comment.
Please pay attention to the symbol declaration ordering. The constructor is declared after other functions, then its definition should follow the declaration order.
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
There was a problem hiding this comment.
♻️ Duplicate comments (5)
components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp (5)
39-46:⚠️ Potential issue | 🟠 MajorSchema insertions still don't distinguish auto-generated vs user-generated keys.
create_schema_insertionsreturns astd::vector<NodeLocator>(line 39) anddeserialize_ir_unit_schema_tree_node_insertionalways returns{false, node_locator}(line 178), so bothmessageandtimestampare emitted as user-generated insertions. Per the kv-ir convention,timestampshould be auto-generated; the pending stack should hold(is_auto_gen, NodeLocator)pairs andresolve_required_node_ids(lines 132-137) should look uptimestampagainstauto_gen_keys_schema_treerather thanuser_gen_keys_schema_tree. As-is, downstream consumers expecting the kv-ir layout will see a mismatched data model.Also applies to: 167-179
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp` around lines 39 - 46, create_schema_insertions currently produces NodeLocator entries but doesn't mark which keys are auto-generated, and deserialize_ir_unit_schema_tree_node_insertion always returns {false, node_locator} so timestamp is treated as user-generated; update create_schema_insertions to mark the timestamp NodeLocator as auto-generated, change deserialize_ir_unit_schema_tree_node_insertion to return a pair (is_auto_gen, NodeLocator) with true for timestamp, update the pending stack type to hold (bool, SchemaTree::NodeLocator) and modify resolve_required_node_ids to consult auto_gen_keys_schema_tree (not user_gen_keys_schema_tree) when resolving the timestamp key so auto-generated keys are stored/checked in the auto-gen table.
210-217: 🧹 Nitpick | 🔵 TrivialUse
YSTDLIB_ERROR_HANDLING_TRYXfor theencoded_text_ast_resultreturns.The
tagdeserialization at line 155 already usesYSTDLIB_ERROR_HANDLING_TRYX, but the twoencoded_text_ast_resultsites (lines 215-217 and 271-273) still use the verbosehas_error()/error()/value()pattern. Switching them to the macro keeps this file consistent withDeserializer.hppand the rest of the file, and is more move-friendly.♻️ Proposed change (apply to both specializations)
- auto encoded_text_ast_result{build_encoded_text_ast<ir::four_byte_encoded_variable_t>( - logtype, - dict_vars, - std::move(encoded_vars) - )}; - if (encoded_text_ast_result.has_error()) { - return encoded_text_ast_result.error(); - } + auto encoded_text_ast{YSTDLIB_ERROR_HANDLING_TRYX( + build_encoded_text_ast<ir::four_byte_encoded_variable_t>( + logtype, + dict_vars, + std::move(encoded_vars) + ) + )}; @@ - user_gen_pairs.emplace(message_node_id, Value{std::move(encoded_text_ast_result.value())}); + user_gen_pairs.emplace(message_node_id, Value{std::move(encoded_text_ast)});Also applies to: 266-273
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp` around lines 210 - 217, Replace the verbose has_error()/error()/value() handling for the results of build_encoded_text_ast in UnstructuredIrDeserializerImpl.cpp with the YSTDLIB_ERROR_HANDLING_TRYX macro: locate the two spots that call build_encoded_text_ast<ir::four_byte_encoded_variable_t> (the encoded_text_ast_result at ~lines 215-217 and the other specialization at ~266-273) and wrap/unwrap the result using YSTDLIB_ERROR_HANDLING_TRYX so the error is propagated and the value is retrieved in a move-friendly way, matching how tag deserialization uses the macro elsewhere in this file.
195-237:⚠️ Potential issue | 🟠 Major
m_previous_timestampis mutated before fallible work completes.Line 208 advances
m_previous_timestampbeforebuild_encoded_text_ast(line 210) andresolve_required_node_ids(line 220) run. If either fails, the persistent timestamp has already moved forward, so any subsequent diagnostic or recovery path observes a corrupted running timestamp. Compute the new timestamp into a local, use the local for the emittedValue(line 227) andKeyValuePairLogEvent::create, and only assign back tom_previous_timestampon the success path immediately before the finalreturn.🛡️ Proposed reordering
- m_previous_timestamp += timestamp_delta; - auto encoded_text_ast_result{build_encoded_text_ast<ir::four_byte_encoded_variable_t>( logtype, dict_vars, std::move(encoded_vars) )}; if (encoded_text_ast_result.has_error()) { return encoded_text_ast_result.error(); } auto const [message_node_id, timestamp_node_id]{ YSTDLIB_ERROR_HANDLING_TRYX(this->resolve_required_node_ids(user_gen_keys_schema_tree)) }; + auto const new_timestamp{m_previous_timestamp + timestamp_delta}; + KeyValuePairLogEvent::NodeIdValuePairs user_gen_pairs; user_gen_pairs.emplace(message_node_id, Value{std::move(encoded_text_ast_result.value())}); user_gen_pairs.emplace( timestamp_node_id, - Value{static_cast<value_int_t>(m_previous_timestamp)} + Value{static_cast<value_int_t>(new_timestamp)} ); - return KeyValuePairLogEvent::create( + auto result{KeyValuePairLogEvent::create( auto_gen_keys_schema_tree, user_gen_keys_schema_tree, {}, std::move(user_gen_pairs), utc_offset - ); + )}; + if (false == result.has_error()) { + m_previous_timestamp = new_timestamp; + } + return result;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp` around lines 195 - 237, The code advances m_previous_timestamp before fallible operations; compute the new timestamp into a local variable (e.g., new_timestamp = m_previous_timestamp + timestamp_delta), use that local for the Value placed into user_gen_pairs and for the utc_offset/KeyValuePairLogEvent::create call, run build_encoded_text_ast and resolve_required_node_ids as currently written, and only on the success path (just before returning the created KeyValuePairLogEvent) assign m_previous_timestamp = new_timestamp; references to help locate the spots: m_previous_timestamp, timestamp_delta, build_encoded_text_ast<ir::four_byte_encoded_variable_t>, resolve_required_node_ids, and KeyValuePairLogEvent::create.
152-164: 🧹 Nitpick | 🔵 TrivialInconsistent
std::pairusage in return statements.Line 164 was updated to use braced-list return (
return {IrUnitType::LogEvent, tag};), but lines 152, 158, and 161 still wrap their returns instd::pair{...}. Make them consistent for clarity.♻️ Proposed change
- if (false == m_pending_schema_insertions.empty()) { - return std::pair{IrUnitType::SchemaTreeNodeInsertion, encoded_tag_t{0}}; - } + if (false == m_pending_schema_insertions.empty()) { + return {IrUnitType::SchemaTreeNodeInsertion, encoded_tag_t{0}}; + } auto const tag{YSTDLIB_ERROR_HANDLING_TRYX(deserialize_tag(reader))}; if (cProtocol::Eof == tag) { - return std::pair{IrUnitType::EndOfStream, tag}; + return {IrUnitType::EndOfStream, tag}; } if (cProtocol::Payload::UtcOffsetChange == tag) { - return std::pair{IrUnitType::UtcOffsetChange, tag}; + return {IrUnitType::UtcOffsetChange, tag}; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp` around lines 152 - 164, The returns in UnstructuredIrDeserializerImpl.cpp are inconsistent: some use std::pair{...} while one uses braced-list return {IrUnitType::LogEvent, tag};; make them consistent by replacing the std::pair{...} returns (the SchemaTreeNodeInsertion and UtcOffsetChange/EndOfStream branches) with the same braced-list style used for the LogEvent return so all return statements return {IrUnitType::..., tag} (referencing IrUnitType, encoded_tag_t, deserialize_tag, YSTDLIB_ERROR_HANDLING_TRYX, and cProtocol::Eof / cProtocol::Payload::UtcOffsetChange to locate the branches).
48-74: 🛠️ Refactor suggestion | 🟠 Major
build_encoded_text_asthelper is redundant.The existing
deserialize_encoded_text_astpath used by kv-ir already produces anEncodedTextAstfrom aStringBlob; the unstructured deserializer can call it directly after manually deserializing the timestamp/timestamp-delta from the stream (seedecoding_methods.cpp). Wrapping that into a separate helper here is unnecessary and duplicates assembly logic. Removing it would also let you drop thedict_vars/encoded_vars/logtypetriple buffered locally indeserialize_ir_unit_kv_pair_log_event.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp` around lines 48 - 74, The build_encoded_text_ast helper is redundant—remove build_encoded_text_ast and instead call the existing deserialize_encoded_text_ast (the kv-ir path) directly after you deserialize the timestamp/timestamp-delta in deserialize_ir_unit_kv_pair_log_event; assemble a StringBlob in-place (as done in decoding_methods.cpp) and pass it to deserialize_encoded_text_ast to get an EncodedTextAst, and drop the temporary dict_vars/encoded_vars/logtype buffers previously created in deserialize_ir_unit_kv_pair_log_event so the unstructured deserializer reuses the shared assembly logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp`:
- Around line 39-46: create_schema_insertions currently produces NodeLocator
entries but doesn't mark which keys are auto-generated, and
deserialize_ir_unit_schema_tree_node_insertion always returns {false,
node_locator} so timestamp is treated as user-generated; update
create_schema_insertions to mark the timestamp NodeLocator as auto-generated,
change deserialize_ir_unit_schema_tree_node_insertion to return a pair
(is_auto_gen, NodeLocator) with true for timestamp, update the pending stack
type to hold (bool, SchemaTree::NodeLocator) and modify
resolve_required_node_ids to consult auto_gen_keys_schema_tree (not
user_gen_keys_schema_tree) when resolving the timestamp key so auto-generated
keys are stored/checked in the auto-gen table.
- Around line 210-217: Replace the verbose has_error()/error()/value() handling
for the results of build_encoded_text_ast in UnstructuredIrDeserializerImpl.cpp
with the YSTDLIB_ERROR_HANDLING_TRYX macro: locate the two spots that call
build_encoded_text_ast<ir::four_byte_encoded_variable_t> (the
encoded_text_ast_result at ~lines 215-217 and the other specialization at
~266-273) and wrap/unwrap the result using YSTDLIB_ERROR_HANDLING_TRYX so the
error is propagated and the value is retrieved in a move-friendly way, matching
how tag deserialization uses the macro elsewhere in this file.
- Around line 195-237: The code advances m_previous_timestamp before fallible
operations; compute the new timestamp into a local variable (e.g., new_timestamp
= m_previous_timestamp + timestamp_delta), use that local for the Value placed
into user_gen_pairs and for the utc_offset/KeyValuePairLogEvent::create call,
run build_encoded_text_ast and resolve_required_node_ids as currently written,
and only on the success path (just before returning the created
KeyValuePairLogEvent) assign m_previous_timestamp = new_timestamp; references to
help locate the spots: m_previous_timestamp, timestamp_delta,
build_encoded_text_ast<ir::four_byte_encoded_variable_t>,
resolve_required_node_ids, and KeyValuePairLogEvent::create.
- Around line 152-164: The returns in UnstructuredIrDeserializerImpl.cpp are
inconsistent: some use std::pair{...} while one uses braced-list return
{IrUnitType::LogEvent, tag};; make them consistent by replacing the
std::pair{...} returns (the SchemaTreeNodeInsertion and
UtcOffsetChange/EndOfStream branches) with the same braced-list style used for
the LogEvent return so all return statements return {IrUnitType::..., tag}
(referencing IrUnitType, encoded_tag_t, deserialize_tag,
YSTDLIB_ERROR_HANDLING_TRYX, and cProtocol::Eof /
cProtocol::Payload::UtcOffsetChange to locate the branches).
- Around line 48-74: The build_encoded_text_ast helper is redundant—remove
build_encoded_text_ast and instead call the existing
deserialize_encoded_text_ast (the kv-ir path) directly after you deserialize the
timestamp/timestamp-delta in deserialize_ir_unit_kv_pair_log_event; assemble a
StringBlob in-place (as done in decoding_methods.cpp) and pass it to
deserialize_encoded_text_ast to get an EncodedTextAst, and drop the temporary
dict_vars/encoded_vars/logtype buffers previously created in
deserialize_ir_unit_kv_pair_log_event so the unstructured deserializer reuses
the shared assembly logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 12f62508-3b04-497e-9e49-468aa14fcef3
📒 Files selected for processing (1)
components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
components/core/tests/test-ir_encoding_methods.cpp (1)
1389-1390:⚠️ Potential issue | 🟡 MinorDrop the extra EOF append.
serialize_log_events()already writescProtocol::Eof, so Line 1390 leaves trailing unread bytes and weakens what this test is actually proving about end-of-stream handling.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/tests/test-ir_encoding_methods.cpp` around lines 1389 - 1390, Remove the redundant append of cProtocol::Eof after calling serialize_log_events<TestType>(...) because serialize_log_events already writes the end-of-stream marker; specifically, delete the ir_buf.push_back(clp::ffi::ir_stream::cProtocol::Eof) call so the test verifies EOF handling as produced by serialize_log_events rather than leaving trailing unread bytes.components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp (1)
117-147:⚠️ Potential issue | 🟡 MinorDelay updating
m_previous_timestampuntil the success path.Line 127 mutates persistent state before
resolve_required_node_ids(...)andKeyValuePairLogEvent::create(...)can still fail. If either returns an error, the deserializer has advanced its running timestamp without having produced a log event.Suggested fix
ir::epoch_time_ms_t timestamp_delta{}; if (auto const err{deserialize_timestamp<ir::four_byte_encoded_variable_t>( reader, timestamp_tag, timestamp_delta )}; IRErrorCode_Success != err) { return ir_error_code_to_errc(err); } - m_previous_timestamp += timestamp_delta; + auto const timestamp{m_previous_timestamp + timestamp_delta}; auto const [message_node_id, timestamp_node_id]{YSTDLIB_ERROR_HANDLING_TRYX( this->resolve_required_node_ids(auto_gen_keys_schema_tree, user_gen_keys_schema_tree) )}; KeyValuePairLogEvent::NodeIdValuePairs auto_gen_pairs; auto_gen_pairs.emplace( timestamp_node_id, - Value{static_cast<value_int_t>(m_previous_timestamp)} + Value{static_cast<value_int_t>(timestamp)} ); KeyValuePairLogEvent::NodeIdValuePairs user_gen_pairs; user_gen_pairs.emplace(message_node_id, Value{std::move(encoded_text_ast)}); - return KeyValuePairLogEvent::create( + auto result = KeyValuePairLogEvent::create( auto_gen_keys_schema_tree, user_gen_keys_schema_tree, std::move(auto_gen_pairs), std::move(user_gen_pairs), utc_offset ); + if (result.has_error()) { + return result.error(); + } + m_previous_timestamp = timestamp; + return result;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp` around lines 117 - 147, Compute the new timestamp in a local variable instead of mutating m_previous_timestamp immediately: call deserialize_timestamp to get timestamp_delta, then compute auto new_timestamp = m_previous_timestamp + timestamp_delta and use new_timestamp when populating the timestamp Value (e.g. Value{static_cast<value_int_t>(new_timestamp)}) and when calling KeyValuePairLogEvent::create; only after resolve_required_node_ids(...) and KeyValuePairLogEvent::create(...) succeed, assign m_previous_timestamp = new_timestamp. This ensures deserialize_timestamp, resolve_required_node_ids, and KeyValuePairLogEvent::create can fail without advancing the persistent m_previous_timestamp.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/tests/test-ir_encoding_methods.cpp`:
- Around line 1412-1443: The test loop currently validates timestamp and message
but omits verifying UTC offset round-trip; update the assertions in the loop
that iterates deserialized_log_events (the block using deserialized_log_event,
user_gen_schema_tree, auto_gen_node_id_value_pairs, user_gen_node_id_value_pairs
and serialized_json_result) to also assert that
deserialized_log_event.get_utc_offset() equals the original offset from the
matching test_log_events entry (created by create_test_log_events()), ensuring
UtcOffsetChange handling is validated; add a REQUIRE comparing
deserialized_log_event.get_utc_offset() to
test_log_events.at(idx).get_utc_offset() (or equivalent getter) immediately
after the existing timestamp/message checks.
---
Duplicate comments:
In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp`:
- Around line 117-147: Compute the new timestamp in a local variable instead of
mutating m_previous_timestamp immediately: call deserialize_timestamp to get
timestamp_delta, then compute auto new_timestamp = m_previous_timestamp +
timestamp_delta and use new_timestamp when populating the timestamp Value (e.g.
Value{static_cast<value_int_t>(new_timestamp)}) and when calling
KeyValuePairLogEvent::create; only after resolve_required_node_ids(...) and
KeyValuePairLogEvent::create(...) succeed, assign m_previous_timestamp =
new_timestamp. This ensures deserialize_timestamp, resolve_required_node_ids,
and KeyValuePairLogEvent::create can fail without advancing the persistent
m_previous_timestamp.
In `@components/core/tests/test-ir_encoding_methods.cpp`:
- Around line 1389-1390: Remove the redundant append of cProtocol::Eof after
calling serialize_log_events<TestType>(...) because serialize_log_events already
writes the end-of-stream marker; specifically, delete the
ir_buf.push_back(clp::ffi::ir_stream::cProtocol::Eof) call so the test verifies
EOF handling as produced by serialize_log_events rather than leaving trailing
unread bytes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 99537c36-6661-490a-9c5e-fcc70f334315
📒 Files selected for processing (5)
components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cppcomponents/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.hppcomponents/core/src/clp/ffi/ir_stream/decoding_methods.cppcomponents/core/src/clp/ffi/ir_stream/decoding_methods.hppcomponents/core/tests/test-ir_encoding_methods.cpp
| for (size_t idx{0}; idx < deserialized_log_events.size(); ++idx) { | ||
| auto const& deserialized_log_event{deserialized_log_events.at(idx)}; | ||
| auto const& user_gen_schema_tree{deserialized_log_event.get_user_gen_keys_schema_tree()}; | ||
| auto const& auto_gen_node_id_value_pairs{ | ||
| deserialized_log_event.get_auto_gen_node_id_value_pairs() | ||
| }; | ||
| auto const& user_gen_node_id_value_pairs{ | ||
| deserialized_log_event.get_user_gen_node_id_value_pairs() | ||
| }; | ||
|
|
||
| REQUIRE((2 == user_gen_schema_tree.get_size())); | ||
| REQUIRE((user_gen_schema_tree.has_node( | ||
| clp::ffi::SchemaTree::NodeLocator{ | ||
| clp::ffi::SchemaTree::cRootId, | ||
| "message", | ||
| clp::ffi::SchemaTree::Node::Type::Str | ||
| } | ||
| ))); | ||
| REQUIRE((1 == auto_gen_node_id_value_pairs.size())); | ||
| REQUIRE((1 == user_gen_node_id_value_pairs.size())); | ||
|
|
||
| auto const serialized_json_result{deserialized_log_event.serialize_to_json()}; | ||
| REQUIRE_FALSE(serialized_json_result.has_error()); | ||
| auto const& [auto_gen_json_obj, user_gen_json_obj]{serialized_json_result.value()}; | ||
| REQUIRE( | ||
| (test_log_events.at(idx).get_timestamp() | ||
| == auto_gen_json_obj.at("timestamp").template get<epoch_time_ms_t>()) | ||
| ); | ||
| REQUIRE( | ||
| (test_log_events.at(idx).get_message() | ||
| == user_gen_json_obj.at("message").template get<string>()) | ||
| ); |
There was a problem hiding this comment.
Assert the UTC offset round-trip here as well.
create_test_log_events() deliberately changes UTC offsets, but this loop only checks timestamp and message. A regression in UtcOffsetChange handling would still pass this new legacy-IR serde test unless you also verify deserialized_log_event.get_utc_offset().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/core/tests/test-ir_encoding_methods.cpp` around lines 1412 - 1443,
The test loop currently validates timestamp and message but omits verifying UTC
offset round-trip; update the assertions in the loop that iterates
deserialized_log_events (the block using deserialized_log_event,
user_gen_schema_tree, auto_gen_node_id_value_pairs, user_gen_node_id_value_pairs
and serialized_json_result) to also assert that
deserialized_log_event.get_utc_offset() equals the original offset from the
matching test_log_events entry (created by create_test_log_events()), ensuring
UtcOffsetChange handling is validated; add a REQUIRE comparing
deserialized_log_event.get_utc_offset() to
test_log_events.at(idx).get_utc_offset() (or equivalent getter) immediately
after the existing timestamp/message checks.
LinZhihao-723
left a comment
There was a problem hiding this comment.
Reviewed the details of unstructured deserializer impl.
There was a problem hiding this comment.
Instead of reusing the existing one, can we make a result-based version for deserialize_timestamp that returns Result<ir::epoch_time_ms_t>?
|
|
||
| template <ir::EncodedVariableTypeReq encoded_variable_t> | ||
| auto UnstructuredIrDeserializerImpl<encoded_variable_t>::create( | ||
| EncodingType encoding_type, |
There was a problem hiding this comment.
This param is redundant since the encoding type is already indicated by the template parameter, right? Let's remove it.
| } | ||
| auto const ref_timestamp_iter{metadata.find(cProtocol::Metadata::ReferenceTimestampKey)}; | ||
| if (metadata.end() == ref_timestamp_iter || false == ref_timestamp_iter->is_string()) { | ||
| return std::errc::protocol_error; |
There was a problem hiding this comment.
We should use our own error code, not std::errc, right?
| auto const [is_auto_gen, node_locator]{m_pending_schema_insertions.back()}; | ||
| m_pending_schema_insertions.pop_back(); | ||
|
|
||
| key_name.clear(); |
There was a problem hiding this comment.
I just realized this is confusing. The original purpose of passing key_name is to provide a buffer for key name deserialization. In our case, it is not used. I think it might be worth renaming the interface's key_name to key_name_buffer and indicating that it might not be used, depending on the implementation.
| SchemaTree::NodeLocator{SchemaTree::cRootId, cTimestampKey, SchemaTree::Node::Type::Int} | ||
| )}; | ||
| if (false == message_node_id.has_value() || false == timestamp_node_id.has_value()) { | ||
| return std::errc::protocol_error; |
There was a problem hiding this comment.
Again, we should use our own error code.
| auto const message_node_id{user_gen_keys_schema_tree->try_get_node_id( | ||
| SchemaTree::NodeLocator{SchemaTree::cRootId, cMessageKey, SchemaTree::Node::Type::Str} | ||
| )}; | ||
| auto const timestamp_node_id{auto_gen_keys_schema_tree->try_get_node_id( | ||
| SchemaTree::NodeLocator{SchemaTree::cRootId, cTimestampKey, SchemaTree::Node::Type::Int} | ||
| )}; |
There was a problem hiding this comment.
These two locators appear twice in this file, how about drop the constexpr key name and make the locators themselves constexpr?
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp/ffi/ir_stream/IrDeserializationError.hpp`:
- Around line 19-26: The new enum members were inserted mid-list in
IrDeserializationErrorEnum, which changes the numeric values of existing members
(e.g., InvalidTag, UnsupportedVersion) and is a breaking change; restore
stability by either appending any new error members to the end of
IrDeserializationErrorEnum or by assigning explicit numeric values to every enum
member so existing numeric codes (including InvalidReferenceTimestampMetadata,
InvalidReferenceTimestampValue, InvalidTag, MissingRequiredSchemaNodes,
UnknownSchemaTreeNodeType, UnknownValueType, UnsupportedMetadataFormat,
UnsupportedVersion) keep their original values; update the enum declaration in
IrDeserializationErrorEnum accordingly and add a comment explaining that numeric
values are stable and must not be reordered.
In `@components/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cpp`:
- Around line 145-148: The NodeIdValuePairs initialization is inconsistent:
replace the emplace-based builds for KeyValuePairLogEvent::NodeIdValuePairs
(auto_gen_pairs and user_gen_pairs) with the same aggregate initialization
pattern used in the four-byte specialization—construct the containers with
brace-initialized entries (using timestamp_node_id/Value{...} and
message_node_id/Value{...} referencing absolute_timestamp and
std::move(encoded_text_ast)) so both specializations use identical
initialization style.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 75db059c-2aac-47d2-aeb2-bbdcf2c52ac3
📒 Files selected for processing (13)
components/core/CMakeLists.txtcomponents/core/src/clp/ffi/SchemaTree.hppcomponents/core/src/clp/ffi/ir_stream/Deserializer.hppcomponents/core/src/clp/ffi/ir_stream/DeserializerImpl.hppcomponents/core/src/clp/ffi/ir_stream/IrDeserializationError.cppcomponents/core/src/clp/ffi/ir_stream/IrDeserializationError.hppcomponents/core/src/clp/ffi/ir_stream/KvIrDeserializerImpl.cppcomponents/core/src/clp/ffi/ir_stream/KvIrDeserializerImpl.hppcomponents/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.cppcomponents/core/src/clp/ffi/ir_stream/UnstructuredIrDeserializerImpl.hppcomponents/core/src/clp/ffi/ir_stream/decoding_methods.cppcomponents/core/src/clp/ffi/ir_stream/decoding_methods.hppcomponents/core/src/clp_s/CMakeLists.txt
| InvalidReferenceTimestampMetadata, | ||
| InvalidReferenceTimestampValue, | ||
| InvalidTag, | ||
| UnsupportedMetadataFormat, | ||
| UnsupportedVersion, | ||
| MissingRequiredSchemaNodes, | ||
| UnknownSchemaTreeNodeType, | ||
| UnknownValueType, | ||
| UnsupportedMetadataFormat, | ||
| UnsupportedVersion, |
There was a problem hiding this comment.
Keep existing IrDeserializationErrorEnum values stable.
Lines 19-26 insert new members in the middle of a public uint8_t enum, which renumbers existing codes like InvalidTag and UnsupportedVersion. That is a breaking change for any caller that persists or branches on the numeric value.
Proposed fix
enum class IrDeserializationErrorEnum : uint8_t {
DuplicateKey = 1,
- EncodedTextAstDeserializationFailure,
- EndOfStream,
- IncompleteStream,
- InvalidKeyGroupOrdering,
- InvalidMagicNumber,
- InvalidReferenceTimestampMetadata,
- InvalidReferenceTimestampValue,
- InvalidTag,
- MissingRequiredSchemaNodes,
- UnknownSchemaTreeNodeType,
- UnknownValueType,
- UnsupportedMetadataFormat,
- UnsupportedVersion,
+ EncodedTextAstDeserializationFailure = 2,
+ EndOfStream = 3,
+ IncompleteStream = 4,
+ InvalidKeyGroupOrdering = 5,
+ InvalidMagicNumber = 6,
+ InvalidTag = 7,
+ UnknownSchemaTreeNodeType = 8,
+ UnknownValueType = 9,
+ UnsupportedMetadataFormat = 10,
+ UnsupportedVersion = 11,
+ InvalidReferenceTimestampMetadata = 12,
+ InvalidReferenceTimestampValue = 13,
+ MissingRequiredSchemaNodes = 14,
};🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/core/src/clp/ffi/ir_stream/IrDeserializationError.hpp` around
lines 19 - 26, The new enum members were inserted mid-list in
IrDeserializationErrorEnum, which changes the numeric values of existing members
(e.g., InvalidTag, UnsupportedVersion) and is a breaking change; restore
stability by either appending any new error members to the end of
IrDeserializationErrorEnum or by assigning explicit numeric values to every enum
member so existing numeric codes (including InvalidReferenceTimestampMetadata,
InvalidReferenceTimestampValue, InvalidTag, MissingRequiredSchemaNodes,
UnknownSchemaTreeNodeType, UnknownValueType, UnsupportedMetadataFormat,
UnsupportedVersion) keep their original values; update the enum declaration in
IrDeserializationErrorEnum accordingly and add a comment explaining that numeric
values are stable and must not be reordered.
LinZhihao-723
left a comment
There was a problem hiding this comment.
Functionality wise looks good to me. Validated that:
- Unstructured IR logs can be ingested into clp-s using
timestamp-key="@timestamp". - Unstructured IR logs can be searched by clp-s cli.
| * deserialize. | ||
| */ | ||
| template <ir::EncodedVariableTypeReq encoded_variable_t> | ||
| [[nodiscard]] auto deserialize_timestamp(ReaderInterface& reader, encoded_tag_t encoded_tag) |
There was a problem hiding this comment.
How about deserialize_timestamp_or_timestamp_delta?
| epoch_time_ms_t ts{}; | ||
| if (auto const error_code = deserialize_timestamp<encoded_variable_t>(reader, encoded_tag, ts); | ||
| IRErrorCode_Success != error_code) | ||
| { | ||
| if (IRErrorCode_Corrupted_IR == error_code) { | ||
| return IrDeserializationError{IrDeserializationErrorEnum::InvalidTag}; | ||
| } | ||
| return IrDeserializationError{IrDeserializationErrorEnum::IncompleteStream}; | ||
| } | ||
| return ts; |
There was a problem hiding this comment.
| epoch_time_ms_t ts{}; | |
| if (auto const error_code = deserialize_timestamp<encoded_variable_t>(reader, encoded_tag, ts); | |
| IRErrorCode_Success != error_code) | |
| { | |
| if (IRErrorCode_Corrupted_IR == error_code) { | |
| return IrDeserializationError{IrDeserializationErrorEnum::InvalidTag}; | |
| } | |
| return IrDeserializationError{IrDeserializationErrorEnum::IncompleteStream}; | |
| } | |
| return ts; | |
| if constexpr (is_same_v<encoded_variable_t, eight_byte_encoded_variable_t>) { | |
| if (cProtocol::Payload::TimestampVal != encoded_tag) { | |
| return IrDeserializationError{IrDeserializationErrorEnum::InvalidTag}; | |
| } | |
| return deserialize_int<epoch_time_ms_t>(reader); | |
| } else { | |
| if (cProtocol::Payload::TimestampDeltaByte == encoded_tag) { | |
| return static_cast<epoch_time_ms_t>( | |
| YSTDLIB_ERROR_HANDLING_TRYX(deserialize_int<int8_t>(reader)) | |
| ); | |
| } | |
| if (cProtocol::Payload::TimestampDeltaShort == encoded_tag) { | |
| return static_cast<epoch_time_ms_t>( | |
| YSTDLIB_ERROR_HANDLING_TRYX(deserialize_int<int16_t>(reader)) | |
| ); | |
| } | |
| if (cProtocol::Payload::TimestampDeltaInt == encoded_tag) { | |
| return static_cast<epoch_time_ms_t>( | |
| YSTDLIB_ERROR_HANDLING_TRYX(deserialize_int<int32_t>(reader)) | |
| ); | |
| } | |
| if (cProtocol::Payload::TimestampDeltaLong == encoded_tag) { | |
| return static_cast<epoch_time_ms_t>( | |
| YSTDLIB_ERROR_HANDLING_TRYX(deserialize_int<int64_t>(reader)) | |
| ); | |
| } | |
| return IrDeserializationError{IrDeserializationErrorEnum::InvalidTag}; | |
| } |
I think we should just rewrite it using the latest helpers. We should do a clean-up to remove all old/outdated helpers.
There was a problem hiding this comment.
Don't forget to update the docstring since we're now forwarding the error code from deserialize_int instead.
| * Virtual base class defining methods for deserializing IR units and log events | ||
| * called by `Deserializer`. |
There was a problem hiding this comment.
This change should be reverted (might be introduced by IDE auto-formatter).
Please check the diff everytime you make a commit to make sure all changes are intentional.
| * @return A result containing the deserialized UTC offset on success, or an | ||
| * error code indicating the failure: |
| if (metadata_json.contains(cProtocol::Metadata::UserDefinedMetadataKey) | ||
| && false == metadata_json.at(cProtocol::Metadata::UserDefinedMetadataKey).is_object()) | ||
| { | ||
| return std::errc::protocol_not_supported; | ||
| } |
There was a problem hiding this comment.
This is the metadata check. It should be supported for both kv-ir and the old IR format. Let's move it outside of this if-else cluster.
|
|
||
| case IrUnitType::SchemaTreeNodeInsertion: { | ||
| std::string key_name; | ||
| std::string key_name_buffer; |
There was a problem hiding this comment.
I think this buffer should be made as a member variable of Deserializer to avoid memory reallocation. Can you create an issue to track it? @coderabbitai
There was a problem hiding this comment.
🧩 Analysis chain
@LinZhihao-723, I've created a GitHub issue to track moving key_name_buffer to a member variable of Deserializer. The link will be provided once the script completes.
🏁 Script executed:
#!/bin/bash
gh issue create \
--repo y-scope/clp \
--title "refactor(kv-ir): Make \`key_name_buffer\` a member variable of \`Deserializer\` to avoid repeated memory reallocation" \
--body "## Summary
In \`components/core/src/clp/ffi/ir_stream/Deserializer.hpp\`, the \`key_name_buffer\` variable inside \`Deserializer::deserialize_next_ir_unit\` is currently declared as a local \`std::string\` within the \`IrUnitType::SchemaTreeNodeInsertion\` case block:
\`\`\`cpp
case IrUnitType::SchemaTreeNodeInsertion: {
std::string key_name_buffer;
...
}
\`\`\`
This causes a heap allocation on every call to \`deserialize_next_ir_unit\` that encounters a schema tree node insertion IR unit.
## Proposed Change
Move \`key_name_buffer\` to be a member variable of \`Deserializer\`, and clear it at the start of each use. This allows the allocated memory to be reused across calls, avoiding repeated heap allocations.
## References
- PR: https://github.com/y-scope/clp/pull/2227
- Comment: https://github.com/y-scope/clp/pull/2227#discussion_r3191785760
- Requested by: `@LinZhihao-723`
"
Length of output: 43
@LinZhihao-723, I've created the issue to track this: #2249
The issue captures the proposal to move key_name_buffer from a local variable inside the IrUnitType::SchemaTreeNodeInsertion case block to a member variable of Deserializer, so that the allocated memory can be reused across calls instead of being reallocated on every schema tree node insertion encountered during deserialization.
🧠 Learnings used
Learnt from: SharafMohamed
Repo: y-scope/clp PR: 2080
File: components/core/src/clp/ProfilerReporter.hpp:34-53
Timestamp: 2026-03-07T10:37:02.035Z
Learning: In components/core/src/clp/ProfilerReporter.hpp, do not suggest or implement resetting Profiler::reset_runtime_measurements() in ProfilerReporter constructors. The Reporter is intended to accumulate and report all runtime measurements up to its destruction, enabling mid-run reporting with a single reporter created at program start. When reviewing similar code, prefer preserving accumulated profiler state and only report/clear at explicit points if the design requires it; avoid constructor-time resets in reporter components.
| } else { | ||
| if (EncodingType::FourByte == encoding_type) { | ||
| deserializer_impl = YSTDLIB_ERROR_HANDLING_TRYX( | ||
| UnstructuredIrDeserializerImpl<ir::four_byte_encoded_variable_t>::create( |
There was a problem hiding this comment.
Need to update the docstring to reflect that it forwards UnstructuredIrDeserializerImpl::create's return values on failure.
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Description
Completes the implementation of #2096.
Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Bug Fixes / Diagnostics
Chores
Tests