ADR-58: JetStream durable stream sourcing/mirroring #389
@roeschter Please add comments inline using the review function, otherwise it's extremely difficult to thread and track replies to individual comments.
roeschter
left a comment
Missing specifications?
- Will the sourcing side verify that the consumer has the right AckPolicy of AckFlowControl?
- What will happen when sourcing from a WorkQueue if NO consumer is set? We shouldn't fail (compatibility), but should we at least print a warning?
adr/ADR-57.md
Outdated
Some additional tooling will be required to create the durable consumer with the proper configuration. But through the
use of new fields/values on the consumer configuration, the server will be able to help enforce the correct
configuration.
This is too vague. Which fields? What will be enforced? We need to at least reference where the details are specified.
Updated to:
... But through the use of a new AckPolicy=AckFlowControl field, the server will be able to help enforce the correct configuration.
The enforcement it performs is then described in the "Consumer configuration" section, specifically:
- Requires `FlowControl` and `Heartbeat` to be set.
adr/ADR-57.md
Outdated
The durable consumer used for stream sourcing/mirroring will need to be just as performant as the current ephemeral
variant. The current ephemeral consumer configuration uses `AckNone` which is problematic for WorkQueue and Interest
streams. A different `AckPolicy` will need to be used to ensure that messages are not lost.
A new AckPolicy=AckFlowControl will be introduced.
This is just introductory text explaining that AckNone can't be used; the new field is mentioned below:
- Uses `AckPolicy=AckFlowControl` instead of `AckNone`
adr/ADR-57.md
Outdated
- `AckPolicy=AckFlowControl` will function like `AckAll` although the server will not use the ack reply on the received
  messages.
- The server responds to the flow control messages, including the stream sequence (`Nats-Last-Stream`) and delivery
  sequence (`Nats-Last-Consumer`) as headers to specify which messages have been successfully stored.
The receiving stream responds with a flow control message, which includes the stream sequence (Nats-Last-Stream) and delivery sequence (Nats-Last-Consumer) as headers to signal which messages have been successfully stored.
Slightly updated to:
The receiving server responds to the flow control messages, which includes the stream sequence (Nats-Last-Stream) and delivery sequence (Nats-Last-Consumer) as headers to signal which messages have been successfully stored.
Specifically the server that's receiving the flow control message responds to it. It doesn't send flow control messages itself.
adr/ADR-57.md
Outdated
  messages.
- The server responds to the flow control messages, including the stream sequence (`Nats-Last-Stream`) and delivery
  sequence (`Nats-Last-Consumer`) as headers to specify which messages have been successfully stored.
- The server receiving the flow control response will ack messages based on these stream/delivery sequences.
The server receiving the flow control response will internally acknowledge all prior messages in the consumer according to its retention policy. For WorkQueue and Interest streams this may result in message deletion.
I have kept the addition of "For WorkQueue and Interest streams this may result in messages deletion." But I am inclined not to change the former. A response to a flow control message does not mean ALL prior messages are to be acknowledged. Only those messages that are below the stream/delivery sequences are acked.
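To make the distinction concrete, here is a minimal Go sketch of how the sourced side might advance its ack floor from a flow control response. The header names come from the ADR text; the `ackFloor` type, the `applyFlowControlResponse` function, the inclusive treatment of the reported sequences (as `AckAll` does), and the rule that a stale response is ignored are all assumptions made for illustration, not the server's actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
)

// Header names as given in the ADR text.
const (
	hdrLastStream   = "Nats-Last-Stream"
	hdrLastConsumer = "Nats-Last-Consumer"
)

// ackFloor is a hypothetical view of a consumer's acknowledgement floor.
type ackFloor struct {
	Stream   uint64 // highest acknowledged stream sequence
	Consumer uint64 // highest acknowledged delivery sequence
}

// applyFlowControlResponse moves the floor up to the sequences reported in a
// flow control response. Messages above those sequences stay pending.
// Assumption: a response whose delivery sequence does not exceed the current
// floor is stale and leaves the floor unchanged.
func applyFlowControlResponse(floor ackFloor, hdr map[string]string) (ackFloor, error) {
	sseq, err := strconv.ParseUint(hdr[hdrLastStream], 10, 64)
	if err != nil {
		return floor, fmt.Errorf("invalid %s header: %w", hdrLastStream, err)
	}
	dseq, err := strconv.ParseUint(hdr[hdrLastConsumer], 10, 64)
	if err != nil {
		return floor, fmt.Errorf("invalid %s header: %w", hdrLastConsumer, err)
	}
	if dseq <= floor.Consumer {
		return floor, nil
	}
	return ackFloor{Stream: sseq, Consumer: dseq}, nil
}
```

With a floor at stream sequence 10 and delivery sequence 5, a response reporting 20 and 9 moves the floor to exactly those values; anything delivered after delivery sequence 9 remains unacknowledged until a later response covers it.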
adr/ADR-57.md
Outdated
- The consumer will be reset, resembling the delivery state of creating a new consumer with `opt_start_seq` set to the
  specified sequence.
- The pending and redelivered messages will always be reset.
- The delivered stream/consumer sequences will always be reset.
The delivered consumer sequence will always be reset.
Have updated this to delivered stream and consumer sequence. These are two independent sequences that need to be reset.
LGTM
Opened a draft PR on the server with a working version for this. I'll update this ADR with some refinements soon.
This ADR is now also updated to reflect the initial implementation. Requiring to specify
f741e0a to
ed4847e
Compare
c381cc7 to
06e8ab3
Compare
ripienaar
left a comment
LGTM, feels like we need a proper ok from @derekcollison also maybe?
06e8ab3 to
00aef99
Compare
Signed-off-by: Maurice van Veen <[email protected]>
00aef99 to
e7c3980
Compare
adr/ADR-58.md
Outdated
The durable consumer used for stream sourcing/mirroring will need to be just as performant as the current ephemeral
variant. The current ephemeral consumer configuration uses `AckNone` which is problematic for WorkQueue and Interest
streams. A different `AckPolicy` (`AckFlowControl`) will need to be used to ensure that messages are not lost.
Trivial change:
A new AckPolicy (AckFlowControl) will need to be used to ensure that performance is on par with an OrderedConsumer and the messages are not lost.
Done
adr/ADR-58.md
Outdated
- Requires `FlowControl` and `Heartbeat` to be set.
- Uses `AckPolicy=AckFlowControl` instead of `AckNone`.
- `AckPolicy=AckFlowControl` will function like `AckAll` although the receiving server will not use the current ack
  reply format by acknowledging individual messages.
This is hard to understand. Suggestion:
AckPolicy=AckFlowControl will function like AckAll. The flow control messages (see below) will ack messages, rather than the current ack reply format.
Done
adr/ADR-58.md
Outdated
- Uses `AckPolicy=AckFlowControl` instead of `AckNone`.
- `AckPolicy=AckFlowControl` will function like `AckAll` although the receiving server will not use the current ack
  reply format by acknowledging individual messages.
- The receiving server responds to the flow control messages, which includes the stream sequence (`Nats-Last-Stream`)
"With a" rather than "to a":
The receiving server responds with a flow control message
Done
  and delivery sequence (`Nats-Last-Consumer`) as headers to signal which messages have been successfully stored.
- The server receiving the flow control response will ack messages based on these stream/delivery sequences. For
  WorkQueue and Interest streams this may result in messages deletion.
- Acknowledgements happen based on flow control limits, usually a data window size. But if the stream is idle the
I vaguely remember there is a reason for moving the ack floor. In the next section we talk about flow control acks being done according to window size. So this one is for very slow-moving streams?
But if the source/mirror is idle (low message rate) the Heartbeat will also trigger a flow control message to move the acknowledgement floor up even if the window size has not been reached.
Indeed, if a stream receives small messages very slowly, it will not trigger the flow control messages often. So, if the stream hasn't received messages for Heartbeat amount of time, then a flow control message will also be sent. This ensures that a WorkQueue/Interest stream can remove messages after ack in such a slow/idle stream case.
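The triggers discussed in this thread and in the ADR text quoted right after it (a data window, `MaxAckPending`, and an idle `Heartbeat` interval) can be sketched as a single decision function. This is only an illustration in Go: the `fcState` and `fcLimits` types, their field names, and the reason strings are hypothetical, and the real server may track these conditions quite differently.

```go
package main

import "time"

// fcState is a hypothetical snapshot of what the sending side tracks.
type fcState struct {
	BytesSinceFC int           // bytes delivered since the last flow control message
	PendingMsgs  int           // messages delivered but not yet acknowledged
	SinceLastMsg time.Duration // time elapsed since the last delivered message
}

// fcLimits holds the hypothetical thresholds for each trigger.
type fcLimits struct {
	WindowBytes   int           // data window size
	MaxAckPending int           // maximum number of pending messages
	Heartbeat     time.Duration // idle heartbeat interval
}

// shouldSendFlowControl reports whether a flow control message is due and why.
func shouldSendFlowControl(s fcState, l fcLimits) (bool, string) {
	switch {
	case s.BytesSinceFC >= l.WindowBytes:
		// Normal case: the data window has filled up.
		return true, "window"
	case l.MaxAckPending > 0 && s.PendingMsgs >= l.MaxAckPending:
		// Sourcing would pause, so ask for acknowledgement right away.
		return true, "max_ack_pending"
	case s.SinceLastMsg >= l.Heartbeat:
		// Slow or idle stream: move the ack floor up without waiting for the window.
		return true, "idle"
	}
	return false, ""
}
```

The idle case is what lets a WorkQueue or Interest stream remove acknowledged messages even when small messages trickle in too slowly to ever fill the window.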
  `MaxAckPending` setting. `MaxAckPending` determines the maximum number of pending messages that can be sent before the
  sourcing pauses. A flow control message will be automatically sent (no need to wait for a `Heartbeat`) so these
  messages are acknowledged and new messages can be sent as soon as possible.
- Since acknowledgements happen based on dynamic flow control, it being determined either by data size, `MaxAckPending`
MUST HAVE is fine, but is this being verified? Will AckWait or BackOff settings result in an error? A warning? Or undefined behavior?
Anything that's not allowed will error. Have added that here.
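As a rough illustration of "anything that's not allowed will error", the following Go sketch validates a consumer configuration that uses the new policy. The `consumerConfig` struct is a trimmed-down, hypothetical stand-in rather than the server's real type. The `FlowControl` and `Heartbeat` requirements come from the ADR text quoted earlier in this thread; treating `AckWait` and `BackOff` as disallowed is an assumption based on the question above, since the exact list is in ADR text not shown here.

```go
package main

import (
	"errors"
	"time"
)

type ackPolicy int

const (
	ackNone ackPolicy = iota
	ackAll
	ackExplicit
	ackFlowControl // the new policy proposed by the ADR
)

// consumerConfig is a hypothetical subset of a consumer configuration.
type consumerConfig struct {
	AckPolicy   ackPolicy
	FlowControl bool
	Heartbeat   time.Duration
	AckWait     time.Duration
	BackOff     []time.Duration
}

// validateFlowControlConsumer rejects configurations that do not fit
// AckFlowControl. Other ack policies are not checked here.
func validateFlowControlConsumer(cfg consumerConfig) error {
	if cfg.AckPolicy != ackFlowControl {
		return nil
	}
	if !cfg.FlowControl {
		return errors.New("AckFlowControl requires FlowControl to be set")
	}
	if cfg.Heartbeat <= 0 {
		return errors.New("AckFlowControl requires Heartbeat to be set")
	}
	// Assumption: settings tied to per-message ack timing are rejected
	// outright instead of being ignored or only logged as a warning.
	if cfg.AckWait != 0 || len(cfg.BackOff) > 0 {
		return errors.New("AckFlowControl does not allow AckWait or BackOff")
	}
	return nil
}
```

Returning an error at creation time, instead of a warning, keeps a misconfigured consumer from silently behaving like a regular `AckAll` consumer.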
Importantly, the server should also handle the case where a user manually resets the consumer that's used for sourcing.
The server should handle this gracefully and ensure no messages are lost. However, the user could also reset the
consumer such that it moves ahead in the stream. The server should also handle this by properly skipping over those
How will the receiving server detect that this is an intentional skip rather than a gap? Please document.
Done. The consumer delivery sequence goes back to 1, and the gap is the difference between the received stream sequence and the previously received stream sequence.
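One way to read this rule is sketched below in Go. The `cursor` type, the `resetKind` values, and `classifyReset` are hypothetical names for illustration. Counting the skipped messages as the difference between the two stream sequences minus one is an interpretation of the sentence above, and what the server does after a backward move is not specified here, so the sketch only classifies the situation.

```go
package main

// cursor is a hypothetical pair of sequences carried by a sourced message.
type cursor struct {
	Stream   uint64 // stream sequence in the origin stream
	Delivery uint64 // consumer delivery sequence
}

type resetKind int

const (
	noReset           resetKind = iota // delivery sequence did not restart
	resetNoGap                         // reset, continues right after the last message
	resetSkippedAhead                  // reset that moved forward in the stream
	resetMovedBack                     // reset that moved to an earlier position
)

// classifyReset compares the last stored position with the next received
// message. A delivery sequence that returns to 1 after earlier deliveries is
// taken as the signal that the consumer was reset.
func classifyReset(last, next cursor) (resetKind, uint64) {
	if next.Delivery != 1 || last.Delivery == 0 {
		return noReset, 0
	}
	switch {
	case next.Stream > last.Stream+1:
		// Intentional skip: report how many stream sequences were passed over.
		return resetSkippedAhead, next.Stream - last.Stream - 1
	case next.Stream <= last.Stream:
		return resetMovedBack, 0
	}
	return resetNoGap, 0
}
```

For example, with the last stored message at stream sequence 10 and a next message at stream sequence 20 carrying delivery sequence 1, the sketch reports a skip of 9 messages instead of treating it as a gap.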
Client should support the `$JS.API.CONSUMER.RESET.<STREAM>.<CONSUMER>` reset API. Clients should not rely on this call
to be initiated by the client process, but it potentially being called by another process, by the CLI for example.
Importantly, clients should not fail when the consumer delivery sequence is not monotonic, except when needed for the
Even with a reset the consumer delivery sequence should still be monotonic. BUT the stream sequence may move backwards!
Resetting the consumer also resets the consumer delivery sequence back to 1, so in that case it's not monotonic anymore. But it's intended to be monotonic after that (until another reset), and a gap or unordered delivery is detected if it's not monotonically increasing.
The stream sequence may move arbitrarily forward or backward, which doesn't matter for the clients' ordered consumer implementations; that is also why it's not mentioned here.
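From the client's side, the behaviour described above could look roughly like the following Go sketch. The `seqCheck` values and `checkDeliverySeq` are hypothetical, not part of any client library; the sketch assumes a client that tracks only the consumer delivery sequence, accepts a restart at 1 as a reset, and flags anything else that is not the next expected value.

```go
package main

type seqCheck int

const (
	seqOK    seqCheck = iota // next expected delivery sequence
	seqReset                 // sequence restarted at 1, consumer was reset
	seqGap                   // neither the next value nor a restart
)

// checkDeliverySeq compares the last seen consumer delivery sequence with the
// one on the newly received message. The stream sequence is deliberately not
// an input, since it may move forward or backward after a reset.
func checkDeliverySeq(last, got uint64) seqCheck {
	switch {
	case got == last+1:
		return seqOK
	case got == 1:
		return seqReset
	default:
		return seqGap
	}
}
```

A client that does not need ordering can treat `seqReset` as harmless and carry on, while an ordered consumer implementation would react only to `seqGap`.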
Signed-off-by: Maurice van Veen <[email protected]>
Stream sourcing/mirroring on WorkQueue or Interest streams is not reliable due to the use of an ephemeral `AckNone` consumer. This ADR proposes a design for durable stream sourcing/mirroring that works reliably on all stream retention types.
Signed-off-by: Maurice van Veen [email protected]