Enhancements to the pub sub client reconnect unsubscribe publish with time out #128

Open
wants to merge 15 commits into
base: main
from

Conversation

@moamenvx moamenvx commented Apr 10, 2025

Describe your changes

Issue ticket number and link

Checklist - Manual tasks

  • Examples are executing successfully

  • Created/updated unit tests. Code Coverage percentage on new code shall be >= 80%.

  • Created/updated integration tests.

  • Devcontainer can be opened successfully

  • Devcontainer can be opened successfully behind a corporate proxy

  • Devcontainer can be re-built successfully

  • Extended the documentation (e.g. README.md, CONTRIBUTING.md, Velocitas Docs)

Member

@BjoernAtBosch BjoernAtBosch left a comment

Hi moamenvx,
thx for your 1st contribution. Looks meaningful.
However, there are some findings to be fixed.
Regards, Björn

Comment on lines 62 to 70
/**
 * @brief Status of an MQTT publish operation
 */
enum PublishStatus {
    Success, // Message was published successfully
    Timeout, // Publish operation timed out
    Failure  // Publish operation failed (e.g., exception thrown)
};

Member

Not used here. Should be moved to IPubSubClient.h

Comment on lines +107 to +113
/**
 * @brief Reconnect the client to the broker.
 * @param timeout_ms maximum time to wait for the reconnection attempt to complete, in
 * milliseconds.
 */
virtual void reconnect(int timeout_ms) = 0;

Member

This should not be visible to the app "user" code: Instead, reconnection should be done "under the hood" within the PubSubClient after calling connect() and until calling disconnect().
Would be cool if you could achieve that. It has also been on our minds for a while that the app currently terminates if no MQTT connection is available at startup. Better behaviour would be for the app to wait silently and connect to the MQTT broker once it becomes available.
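
For illustration, a minimal sketch of what such "under the hood" reconnection could look like, assuming a simple retry loop with a doubling, capped delay. The names `connectWithRetry`, `tryConnect` and `stopRequested` are hypothetical and not part of the SDK or of this PR; a real client would presumably run such a loop on an internal worker between connect() and disconnect().

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not SDK API): keeps calling `tryConnect` until it
// succeeds or `stopRequested` returns true (e.g. because disconnect() was called).
// The wait between attempts starts at `initialDelay` and doubles up to `maxDelay`.
inline bool connectWithRetry(const std::function<bool()>& tryConnect,
                             const std::function<bool()>& stopRequested,
                             std::chrono::milliseconds initialDelay,
                             std::chrono::milliseconds maxDelay) {
    auto delay = initialDelay;
    while (!stopRequested()) {
        if (tryConnect()) {
            return true; // broker reachable, connection established
        }
        std::this_thread::sleep_for(delay);     // wait silently instead of terminating
        delay = std::min(delay * 2, maxDelay);  // back off, but never beyond maxDelay
    }
    return false; // stopped before a connection could be established
}
```

With this shape, the app code never sees a reconnect() call: it only calls connect() once, and the same loop can be reused when the client notices that an established connection was lost.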

Member

You wrote back:

When testing the original component, I ran into a problem where, after shutting down the connection and bringing it back up, the client didn’t detect the restored connection and remained down. So, I needed a way to explicitly attempt a reconnect.

Ok, but how do you detect that situation (to trigger the reconnect)? I mean, a (maybe naive) idea would be to detect it during a publish call, e.g. because it's failing, and then try to trigger the reconnect. But of course this wouldn't help with previously established subscriptions, because the missing notifications cannot be detected ...

My problem here is, that I'd like to avoid inserting this function, because removing it later (when finding a "better" solution) would be an incompatible change.

Comment on lines 21 to 25
<<<<<<< HEAD
|lxml|5.3.2|New BSD|
=======
|lxml|5.3.1|New BSD|
>>>>>>> ad99c59d1940332b4c003f40072511903b40cacd
Member

Please fix this merge conflict
You can get the required content from the output of the failing workflow. There is a section where you can copy the content from. Just paste it over the existing content of this file. Make sure there is exactly one single FF after the last license entry!

*
* @param topic the MQTT topic to publish the message to
* @param data the payload to send as the message
* @param timeout_ms maximum time to wait for the publish to complete, in milliseconds
Member

Please document the corner cases: What does a timeout of 0 ms mean? What about negative numbers?

Author

moamenvx commented Apr 10, 2025 via email

@moamenvx moamenvx marked this pull request as draft April 15, 2025 11:53
@moamenvx moamenvx marked this pull request as ready for review April 15, 2025 12:08
@moamenvx moamenvx requested a review from BjoernAtBosch April 16, 2025 07:01
Member

@BjoernAtBosch BjoernAtBosch left a comment

Hi moamenvx,
please revert your changes regarding indentation and include order. This is handled by our tooling.
Furthermore, it makes it hard to understand the real code changes. (In general, those kinds of changes should not be intermixed with real behavioral changes.)
Will continue reviewing once those changes are reverted.

@moamenvx moamenvx requested a review from BjoernAtBosch April 16, 2025 11:19
Contributor

MP91 commented May 9, 2025

I quickly fixed the licenses again. Sorry for the annoying workflow, let's see if we can improve this in the future.

Member

@BjoernAtBosch BjoernAtBosch left a comment

Sorry for the long silence, but we were busy with releasing the Conan 2 migration (and other things), which is now available.

After rebasing your branch, you're "on Conan 2". This might require rebuilding your devcontainer.
If you already have an app repo based on our app template, you'll also need to migrate it to Conan 2. We've put a migration guide at the bottom of the template's readme.

Comment on lines +62 to +70
/**
 * @brief Status of a publish operation
 */
enum PublishStatus {
    Success, // Message was published successfully
    Timeout, // Publish operation timed out
    Failure  // Publish operation failed (e.g., exception thrown)
};

Member

Yes, please move this to IPubSubClient.h and include that header in VehicleApp.h

auto range = m_subscriberMap.equal_range(topic);
m_subscriberMap.erase(range.first, range.second);
}

Member

See finding above

Suggested change
void unsubscribeTopic(const AsyncSubscriptionPtr_t<std::string>& subscription) override {
    for (auto it = m_subscriberMap.begin(); it != m_subscriberMap.end(); ++it) {
        if (it->second == subscription) {
            auto topic = it->first;
            m_subscriberMap.erase(it);
            if (m_subscriberMap.count(topic) == 0) {
                m_client.unsubscribe(topic)->wait();
            }
            return;
        }
    }
}

* @param topic The topic to unsubscribe from.
*/
virtual void unsubscribeTopic(const std::string& topic) = 0;

Member

There could be different subscriptions to the same topic. Therefore we should also offer the option to recall just a specific subscription:

Suggested change
/**
 * @brief Recall a specific subscription
 *
 * @param subscription The subscription to recall; this is an object earlier returned by subscribeTopic.
 */
virtual void unsubscribeTopic(const AsyncSubscriptionPtr_t<std::string>& subscription = {}) = 0;
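
To make the "several subscriptions per topic" case concrete, here is a small self-contained sketch of the bookkeeping, assuming subscriptions are kept in a `std::multimap` keyed by topic. `Subscription`, `SubscriptionPtr` and `SubscriptionRegistry` are hypothetical stand-ins for `AsyncSubscriptionPtr_t<std::string>` and the client's internal map; the broker call is only indicated by the return value.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the SDK's subscription handle.
struct Subscription {};
using SubscriptionPtr = std::shared_ptr<Subscription>;

class SubscriptionRegistry {
public:
    // Registers a new subscription for `topic` and returns its handle.
    SubscriptionPtr add(const std::string& topic) {
        auto subscription = std::make_shared<Subscription>();
        m_subscriberMap.emplace(topic, subscription);
        return subscription;
    }

    // Removes exactly one subscription. Returns true only if it was the last
    // one for its topic, i.e. when a real client would also unsubscribe the
    // topic at the broker.
    bool remove(const SubscriptionPtr& subscription) {
        for (auto it = m_subscriberMap.begin(); it != m_subscriberMap.end(); ++it) {
            if (it->second == subscription) {
                const std::string topic = it->first; // copy before erasing
                m_subscriberMap.erase(it);
                return m_subscriberMap.count(topic) == 0;
            }
        }
        return false; // unknown handle: nothing to do
    }

    std::size_t count(const std::string& topic) const {
        return m_subscriberMap.count(topic);
    }

private:
    std::multimap<std::string, SubscriptionPtr> m_subscriberMap;
};
```

Removing by handle leaves other subscribers of the same topic untouched, which an unsubscribe-by-topic call cannot guarantee.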

* Failure
*/
virtual PublishStatus publishOnTopic(const std::string& topic, const std::string& data,
int timeout_ms) = 0;
Member

I'd prefer using some unsigned int to make clear that negative numbers are not useful.

Suggested change
- int timeout_ms) = 0;
+ unsigned int timeout_ms) = 0;

* Success, Timeout, or Failure
*/
virtual PublishStatus publishOnTopic(const std::string& topic, const std::string& data,
int timeout_ms);
Member

Suggested change
- int timeout_ms);
+ unsigned int timeout_ms);

Comment on lines +98 to +103
if (timeout_ms > MAX_TIMEOUT_MS) {
    logger().warn("Timeout capped to {} ms (requested: {} ms)",
                  MAX_TIMEOUT_MS, timeout_ms);
    timeout_ms = MAX_TIMEOUT_MS;
}

Member

Don't know if this limitation is required. I would leave it to the user to decide about this.

Comment on lines +133 to +137
if (timeout_ms > MAX_TIMEOUT_MS) {
    logger().warn("Timeout capped to {} ms (requested: {} ms)",
                  MAX_TIMEOUT_MS, timeout_ms);
    timeout_ms = MAX_TIMEOUT_MS;
}
Member

See above

@moamenvx moamenvx force-pushed the Enhancements_to_the_PubSub_Client_Reconnect_Unsubscribe_PublishWithTimeOut branch from 0d1ea52 to 087f193 Compare May 28, 2025 08:38

3 participants