-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Allow custom StreamSource heartbeat #4095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
|
Except for the failing CI (for which TBH I am a bit perplexed), feel free to give me some feedback about the value of the PR (and the relative issue). If you think it is basically fine, I will add some tests for the new behavior. |
|
The data race in TestJetStreamClusterSourceWithOptStartTime is legit. |
|
@derekcollison can you give me a hint about a possible cause? The reason I don't understand is that, without specifying any |
|
The stack for the datarace will give you the info needed, in terms of when it violates and what was previous operation, etc.. |
ec31e80 to
eba8999
Compare
|
@derekcollison Thanks for the patient! There was just a single point of data race it slipped through. Tests are now green. Let me know if the change is reasonable for you, in that case I can add a few tests. |
derekcollison
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable, may need to be targeted at dev branch for 2.10 vs main which is 2.9.x atm.
Let's add some tests etc.
Thanks!
|
@derekcollison Sorry to bother, but I am banging my head against the tests and I probably need a hint from someone who knows the codebase. The ideal test would be to use both (relatively) small and large custom heartbeat and monitor the behavior of the consumers in order to check everything is working as expected. For instance, I should expect a failing test if I change back the heartbeat value to the previously default one in the checks for stalled streams. The issue I am facing is I am unable to verify this situation. For instance, the test I honestly do not know if I am missing something obvious or the situation is really a bit tricky, in any case I think people with the knowledge of the codebase could give me very useful information. Even if my changes have a small impact, I think that there are a couple of sharp edges that really need some additional tests. Thank you in advance! |
|
Yes can be tricky in there, the consumer is gettable, but a bit nuanced. Can you point me to a line in your test where you would need direct access and I can probably help. |
eba8999 to
776fd99
Compare
|
Sure, and thank you for helping me! 😊 I just pushed a wip commit, which is pretty much stolen from TestJetStreamPushConsumerIdleHeartbeats. As you can see I am trying to replicate the subscription to the consumer in order to get the heartbeat messages. However (and pretty obviously), in the original test the subject is used as a delivery subject for the consumer. However, in my case the subject of the consumer is changed a bit and other parameters are set -- and my noobness with the codebase does not let me grasp what is the specific reason I am unable to see the the heartbeat messages from the inbox subscription 😅. I would be really happy to solve this without bothering you, but I think I need some help 😞. Thank you in advance for all the support! |
|
We have a customer that is looking at this functionality so we will introduce for 2.10 in some for or fashion. |
776fd99 to
3feeb31
Compare
| } | ||
|
|
||
| func (mset *stream) getMirrorHeartbeat() time.Duration { | ||
| if mset.cfg.Mirror.Heartbeat == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dodomorandi My guess is that the data race is in this method since no read lock is being used (unlike all of the other methods on mset when fields are accessed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually.. most places the lock is already held except for here: https://github.com/nats-io/nats-server/pull/4095/files#diff-2f4991438bb868a8587303cde9107f83127e88ad70bd19d5c6a31c238a20c299R1874
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data race now is fixed, the only thing is that I keep using the initial value of the heartbeat. AFAIK this should not be an issue because I don't expect the value to change across locking blocks. Is my assumption correct or it would be better to use the current value of the heartbeat here?
255b205 to
b4d89b2
Compare
|
Looks like the final bit to remove is the local |
Instead of using the default heartbeat for stream sources and mirror, allow more customizability. The mechanism of the health check keeps working in a linear fashion. Every time the heartbeat is not specified, things work just like before.
4d33449 to
02a59d9
Compare
|
@bruth Sorry for the delay. I just resynced the repo. Regarding to your last comment, I think that this PR needs a change in If you want I can open a PR in |
|
The approach of #4352 looks interesting. Since this didn't make it in 2.10 we're going to patch nats-server in our Yocto build by hard-coding a higher value for heartbeat interval. We'll see whether this option makes sense to be configured globally or at a per stream-by-stream level. |
Has this been introduced? |
Resolves #NNNgit pull --rebase origin main)Main issue: #4094
Changes proposed in this pull request:
Heartbeatfield toStreamSourcegetMirrorHeartbeathelper functiongetStreamSourceHeartbeatandgetSourceHeartbeathelper functionsgetMirrorHeartbeatto define mirror heartbeat and related health checksgetStreamSourceHeartbeatandgetSourceHeartbeatto define stream sources heartbeat and related health checksCC @paolobarbolini