Fix[BMQ]: unsafe schemalearner use #618

Merged
merged 6 commits into from
Jun 13, 2025

dorjesinpo
Collaborator

Cannot use bmqp::SchemaLearner in both FSM and event threads.
Instead, call the learner in the FSM thread and cache the result (the schema) in the generated event.
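
A minimal sketch of the pattern described above, assuming a learner that is not thread-safe and must stay confined to one thread. `Schema`, `Learner`, `Event`, `makeEvent` and `propertyCount` are hypothetical stand-ins for illustration only, not the real `bmqp` or `bmqimp` types:

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT the real bmqp types.
struct Schema {
    std::vector<std::string> d_propertyNames;
};
typedef std::shared_ptr<const Schema> SchemaPtr;

// Not thread-safe: only one thread (here, the FSM thread) may touch it.
class Learner {
    std::map<int, SchemaPtr> d_cache;  // schemaId -> learned schema

  public:
    SchemaPtr learn(int schemaId, const std::vector<std::string>& names)
    {
        SchemaPtr& slot = d_cache[schemaId];
        if (!slot) {
            // First time this id is seen: build and remember the schema.
            std::shared_ptr<Schema> schema = std::make_shared<Schema>();
            schema->d_propertyNames        = names;
            slot                           = schema;
        }
        return slot;
    }
};

// The event carries the result, so its consumer never needs the learner.
struct Event {
    int       d_schemaId;
    SchemaPtr d_schema_sp;  // cached by the producer
};

// Runs in the FSM thread: the only place where the learner is called.
Event makeEvent(Learner&                        learner,
                int                             schemaId,
                const std::vector<std::string>& names)
{
    Event event;
    event.d_schemaId  = schemaId;
    event.d_schema_sp = learner.learn(schemaId, names);
    return event;
}

// Runs in the event thread: reads only the immutable cached schema.
std::size_t propertyCount(const Event& event)
{
    return event.d_schema_sp ? event.d_schema_sp->d_propertyNames.size() : 0;
}
```

Because the cached pointer refers to a `const` schema, the consuming thread can read it without touching the learner's internal map.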

@dorjesinpo dorjesinpo added the bug Something isn't working label Feb 19, 2025
@dorjesinpo dorjesinpo requested a review from a team as a code owner February 19, 2025 16:32
@dorjesinpo dorjesinpo force-pushed the fix/unsafe-scheamalearner-use branch from a454b0d to 91d01b5 Compare February 19, 2025 17:33
@bmq-oss-ci bmq-oss-ci bot left a comment

Build 2484 of commit 91d01b5 has completed with FAILURE

@dorjesinpo dorjesinpo requested a review from 678098 February 19, 2025 18:58
@678098 678098 changed the title Fix unsafe scheamalearner use Fix unsafe schemalearner use Feb 24, 2025
@678098 678098 changed the title Fix unsafe schemalearner use Fix[BMQ]: unsafe schemalearner use Feb 24, 2025
@@ -60,6 +60,7 @@
// BMQ

#include <bmqa_queueid.h>
#include <bmqp_messageproperties.h>
Collaborator

The only change in this header is this new field:
bmqp::MessageProperties::SchemaPtr d_schema_sp

It is a pointer type, so we don't need to know the sizeof of the class in advance. We can replace the include of a heavy header with a forward declaration of this type. At some point in the future we might want to optimize includes to get better compilation times.

If you think it's helpful, you can make this change now.

Collaborator Author

We can. There is no nested MessageProperties::Schema then, though; it now has to be MessageProperties_Schema.
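
To illustrate the exchange above: a class at namespace scope can be forward-declared, while a nested class cannot be declared without the full definition of its enclosing class, which is why the type has to move out of `MessageProperties`. The sketch below is a simplified assumption, not the real headers; `MessageImpl` is reduced to one field and `numProperties` is a hypothetical helper:

```cpp
#include <memory>

// --- what the lightweight header would contain ---------------------------
// Forward declaration only; no heavy include is needed here.
class MessageProperties_Schema;

struct MessageImpl {
    // A shared_ptr member can be declared with an incomplete pointee type.
    // The complete type is needed only where the object is created or
    // dereferenced.
    std::shared_ptr<const MessageProperties_Schema> d_schema_sp;
};

// --- what the heavy header would contain ---------------------------------
// Simplified, hypothetical definition for the purpose of this sketch.
class MessageProperties_Schema {
  public:
    int d_numProperties;

    explicit MessageProperties_Schema(int numProperties)
    : d_numProperties(numProperties)
    {
    }
};

// Code that dereferences the pointer must see the full definition.
int numProperties(const MessageImpl& impl)
{
    return impl.d_schema_sp ? impl.d_schema_sp->d_numProperties : 0;
}
```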


const SchemaPtr& result = contextHandle->d_schema_sp;

if (!result) {
Collaborator

Can we assume that it's an unlikely scenario? Or is it better to evaluate the expression without such assumptions?

MessageProperties mps(d_allocator_p);
int rc = mps.streamIn(blob, input, result);

if (rc == 0) {
Collaborator

If rc != 0 we will silently return an empty SchemaPtr. Should we log something or skip the message?

bmqp::MessageProperties::SchemaPtr schema = schemaLearner().learn(
    schemaLearner().createContext(queueId),
    bmqp::MessagePropertiesInfo(info.d_header),
    bdlbb::Blob());
Collaborator

Looking at learn implementation:

SchemaLearner::SchemaPtr
SchemaLearner::learn(Context&                     context,
                     const MessagePropertiesInfo& input,
                     const bdlbb::Blob&           blob)
{
    const SchemaIdType inputId = input.schemaId();

    if (!isPresentAndValid(inputId)) {
        // Nothing to do
        return SchemaPtr();  // RETURN
    }

We return early if schemaId is absent or invalid. Does that happen for every message that has no message properties? If it can happen that frequently, we do a lot of unnecessary work here:

  • Pass args and call schemaLearner().createContext(queueId)
  • Within schemaLearner().createContext(queueId), perform a map lookup whose result we don't use
  • Construct a bdlbb::Blob() that we don't use
  • Pass args and call SchemaLearner::learn just to return early

Do you think it's worth avoiding these extra operations when we don't have a schemaId?
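
One way to avoid the work listed above is to test the schema id at the call site before touching the learner at all. The sketch below only illustrates that idea; `PropertiesInfo`, `hasUsableSchemaId`, `Learner` and `schemaFor` are hypothetical stand-ins, and the validity rule (id greater than zero) is an assumption, not the actual `isPresentAndValid` logic:

```cpp
#include <memory>

// Hypothetical stand-ins; these are NOT the real bmqp types.
struct Schema {};
typedef std::shared_ptr<const Schema> SchemaPtr;

struct PropertiesInfo {
    bool d_isPresent;  // does the message carry properties at all?
    int  d_schemaId;   // assumption for this sketch: 0 means "no schema"
};

inline bool hasUsableSchemaId(const PropertiesInfo& info)
{
    return info.d_isPresent && info.d_schemaId > 0;
}

struct Learner {
    int d_numCalls = 0;  // counts how often the expensive path runs

    SchemaPtr learn(const PropertiesInfo& info)
    {
        (void)info;
        ++d_numCalls;  // stands in for context lookup, blob handling, etc.
        return std::make_shared<Schema>();
    }
};

// Cheap check first; the learner is consulted only when it can help.
SchemaPtr schemaFor(Learner& learner, const PropertiesInfo& info)
{
    if (!hasUsableSchemaId(info)) {
        return SchemaPtr();  // nothing to learn, no lookup, no temporaries
    }
    return learner.learn(info);
}
```

Whether this is worth it depends on how often messages arrive without a schema id, which is the question the comment raises.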

@dorjesinpo dorjesinpo force-pushed the fix/unsafe-scheamalearner-use branch from 91d01b5 to f0622b7 Compare March 11, 2025 00:13
@bmq-oss-ci bmq-oss-ci bot left a comment

Build 2518 of commit f0622b7 has completed with FAILURE

@dorjesinpo dorjesinpo closed this Apr 8, 2025
@dorjesinpo dorjesinpo deleted the fix/unsafe-scheamalearner-use branch April 8, 2025 22:06
@dorjesinpo dorjesinpo restored the fix/unsafe-scheamalearner-use branch April 8, 2025 22:18
@dorjesinpo dorjesinpo reopened this Apr 8, 2025
@dorjesinpo dorjesinpo force-pushed the fix/unsafe-scheamalearner-use branch from f0622b7 to bbb706a Compare May 12, 2025 15:28

if (mps.streamIn(appData, input.isExtended()) == 0) {
    // Learn new schema.
    *schemaHolder = schema = mps.makeSchema(d_allocator_p);
Collaborator

Same concern about the number of SchemaPtr copies.

@bmq-oss-ci bmq-oss-ci bot left a comment

Build 2657 of commit 5b46679 has completed with SUCCESS

@dorjesinpo dorjesinpo force-pushed the fix/unsafe-scheamalearner-use branch 2 times, most recently from 7fc32f5 to 2ea8627 Compare May 19, 2025 19:41
@bmq-oss-ci bmq-oss-ci bot left a comment

Build 2676 of commit 2ea8627 has completed with FAILURE

@dorjesinpo dorjesinpo force-pushed the fix/unsafe-scheamalearner-use branch from 2ea8627 to efcc512 Compare June 5, 2025 15:10
@@ -117,6 +121,8 @@ struct MessageImpl {
/// SubscriptionHandle this message is associated with
bmqt::SubscriptionHandle d_subscriptionHandle;

bsl::shared_ptr<const bmqp::MessageProperties_Schema> d_schema_sp;
Collaborator

This looks like a private component of MessageProperties, we should probably be referring to it through the typedef

Suggested change
- bsl::shared_ptr<const bmqp::MessageProperties_Schema> d_schema_sp;
+ bmqp::MessageProperties::SchemaPtr d_schema_sp;

Collaborator Author

See #618 (comment)

We can return to that version, sure

@bmq-oss-ci bmq-oss-ci bot left a comment

Build 2723 of commit efcc512 has completed with SUCCESS

@678098 678098 assigned dorjesinpo and unassigned 678098 Jun 11, 2025
@678098
Collaborator

678098 commented Jun 11, 2025

Hi @dorjesinpo, do you want to make any more changes to this PR?

@dorjesinpo
Collaborator Author

> Hi @dorjesinpo, do you want to make any more changes to this PR?

Hi. Other than the typedef change, no. I was waiting for the next round of your review.

Collaborator

@678098 678098 left a comment

Last comment

bmqp::EventUtilQueueInfo(msgIterator.header(),
                         subQueueInfos[0].id(),
                         msgIterator.applicationDataSize(),
                         schema));
Collaborator

There are still extra schema copies here; you can make the same change you made in the other file:

    bmqp::MessageProperties::SchemaPtr* schema_p = 0;

    schema_p = d_schemaLearner.observe(d_schemaLearner.createContext(queueId),
                                       input);

    if (schema_p) {
        if (schema_p->get() == 0) {
            // Learn new Schema by reading all MessageProperties.
            bmqp::MessageProperties mps(d_allocator_p);

            if (mps.streamIn(d_appData, input.isExtended()) == 0) {
                // Learn new schema.
                *schema_p = mps.makeSchema(d_allocator_p);
            }
        }
    }

@dorjesinpo dorjesinpo force-pushed the fix/unsafe-scheamalearner-use branch from efcc512 to ef0bd0d Compare June 11, 2025 21:31
@dorjesinpo dorjesinpo assigned 678098 and unassigned dorjesinpo Jun 11, 2025
@bmq-oss-ci bmq-oss-ci bot left a comment

Build 2742 of commit ef0bd0d has completed with FAILURE

bmqp::MessageProperties::SchemaPtr schema;

if (schema_p) {
    if (!schema_p->get() == 0) {
Collaborator

Is this ! a bug?

Suggested change
- if (!schema_p->get() == 0) {
+ if (schema_p->get() == 0) {

Collaborator Author

> Is this ! a bug?

It is! Amended.
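
For readers wondering why the stray `!` matters: unary `!` binds tighter than `==`, so `!schema_p->get() == 0` parses as `(!schema_p->get()) == 0`, which is true when the pointer is not null. That inverts the intended "schema not learned yet" check. A small sketch with a stand-in pointer type (`IntPtr` and both function names are hypothetical, used only for illustration):

```cpp
#include <memory>

typedef std::shared_ptr<const int> IntPtr;  // stand-in for SchemaPtr

// As first written. The parentheses spell out how the compiler parses
// '!p->get() == 0': the negation is applied before the comparison, so the
// result is true when the pointer is NOT null.
bool buggyNeedsLearning(const IntPtr* p)
{
    return (!p->get()) == 0;
}

// As amended: true when the slot is still empty and a schema has to be
// learned.
bool fixedNeedsLearning(const IntPtr* p)
{
    return p->get() == 0;
}
```

With an empty slot the buggy version returns false, so the schema would never be learned; with a filled slot it returns true, so the schema would be re-learned every time.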

@678098 678098 assigned dorjesinpo and unassigned 678098 Jun 12, 2025
Signed-off-by: dorjesinpo <[email protected]>
@dorjesinpo dorjesinpo force-pushed the fix/unsafe-scheamalearner-use branch from ef0bd0d to b7531cc Compare June 12, 2025 21:31
@dorjesinpo dorjesinpo assigned 678098 and unassigned dorjesinpo Jun 12, 2025
@bmq-oss-ci bmq-oss-ci bot left a comment

Build 2748 of commit b7531cc has completed with FAILURE

Collaborator

@678098 678098 left a comment

Looks good

BSLS_ASSERT_SAFE(d_rawEvent.isPushEvent());
BSLS_ASSERT_SAFE(0 <= position);
BSLS_ASSERT_SAFE(static_cast<int>(d_correlationIds.size()) > position);
BSLS_ASSERT_SAFE(static_cast<int>(d_contexts.size()) > position);
Collaborator

Would prefer to reorder so the smaller value is on the left

Suggested change
- BSLS_ASSERT_SAFE(static_cast<int>(d_contexts.size()) > position);
+ BSLS_ASSERT_SAFE(position < static_cast<int>(d_contexts.size()));

@678098 678098 assigned dorjesinpo and unassigned 678098 Jun 13, 2025
@dorjesinpo dorjesinpo merged commit 495dc34 into main Jun 13, 2025
39 of 40 checks passed
@678098 678098 deleted the fix/unsafe-scheamalearner-use branch June 13, 2025 14:34
Labels
bug Something isn't working

3 participants