Skip to content

Request Response Duplicate Correlation Token Validation#400

Closed
sbSteveK wants to merge 15 commits into
mainfrom
rr-dupe-token
Closed

Request Response Duplicate Correlation Token Validation#400
sbSteveK wants to merge 15 commits into
mainfrom
rr-dupe-token

Conversation

@sbSteveK
Copy link
Copy Markdown
Contributor

@sbSteveK sbSteveK commented Jun 12, 2025

If a Request-Response Operation attempts to use an already in-use correlation token it will now fail validation on submission of the operation to the RR Client.

Add a test for failing validation when a duplicate correlation token is submitted while the token is already in use.
Add a test for successful validation of a reused correlation token if the previous operation with the same correlation token has already completed.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jun 12, 2025

Codecov Report

Attention: Patch coverage is 96.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 84.71%. Comparing base (0024652) to head (f6d1761).

Files with missing lines Patch % Lines
source/request-response/request_response_client.c 96.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #400      +/-   ##
==========================================
+ Coverage   84.70%   84.71%   +0.01%     
==========================================
  Files          25       25              
  Lines       10492    10517      +25     
==========================================
+ Hits         8887     8910      +23     
- Misses       1605     1607       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sbSteveK sbSteveK changed the title Duplicate Correlation Token on RR Operation Error Handling Request Response Duplicate Correlation Token Validation Jun 16, 2025
Comment thread source/request-response/request_response_client.c Outdated
Comment thread source/request-response/request_response_client.c Outdated
Comment thread source/request-response/request_response_client.c
Comment on lines +889 to 891
s_remove_correlation_token_from_in_use(operation);

if (operation->state == AWS_MRROS_PENDING_DESTROY) {
Copy link
Copy Markdown
Contributor

@sfod sfod Jun 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think s_remove_correlation_token_from_in_use should be called only once in s_remove_operation_from_client_tables. I.e. an operation's token should become available for reuse only when there is no possibility for the operation to do something with it.

With current implementation, it's hard to reason about if it's possible or not for the previous operation to do something with its token when the token is already assigned to the next operation.

For example, take the following scenario.

Imagine there is an existing operation op_1 with a token token.

  • a new operation op_2 with a token token started by a user thread, so aws_mqtt_request_response_client_submit_request executes:
    • schedule a task (for the next event loop iteration) which will perform this new operation initialization

event loop iteration N:

  • a PUBLISH arrives for op_1, so s_complete_operation_with_correlation_token executes:
    • rm token from correlation_tokens_in_use
    • ... perform the rest of the function
  • Initialize op_2, i.e. execute s_mqtt_rr_client_submit_operation function:
    • add token to correlation_tokens_in_use
    • ... and do other stuff

event loop iteration N + 1:

  • the second PUBLISH arrives for op_1 (or alternatively there was an op failure on the previous iteration)
    • rm token from correlation_tokens_in_use

Now there is op_2 with a token token, but no token record in correlation_tokens_in_use.

To address this hypothetical scenario, we could just execute s_remove_correlation_token_from_in_use(operation); only if state not AWS_MRROS_PENDING_DESTROY. But I feel that this will be still kinda fragile.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reason I went with the way I did was that we need to have a separate check in s_complete_request_operation_with_failure() to insure we aren't removing a correlation token when the operation is cleaning up due to a duplicate correlation token error. I'll trace the logic and see if we can't combine it and work through your scenario.

@sbSteveK sbSteveK closed this Jun 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants