websocket proxy: ping/pong downstream clients and proactively disconnect #264
Conversation
I am not a Rust expert and I am not familiar with the ping/pong conventions in Rust, but I think the code can be improved in a couple of ways. Meanwhile, in your implementation, even if the pong fails, …

In this case, we could have cleaner code with no periodic healthchecks every 2 seconds.
More context on the architecture: we will have 5 conductors and 5 rollup boosts (1:1 mapping), and conductor health is managed separately by the conductor health-check logic, which means unhealthy conductors won't become leaders anyway. If we only log warnings, we can avoid false positives: network hiccup → missed pong → connection killed → healthy conductor unnecessarily disconnected.
I agree with the …
@angel-ding-cb I don't think that is true for the websocket proxy. We could have a huge number of clients.
It's important to note here that pings can definitely continue sending without the client ponging back. It's the same as our flashblocks messages continuing to send to them even if they're not actually processing/active. The cases in which this happens are the same: device put to sleep, CPU throttling processes, browsers sleeping their tabs, etc. So it's not good enough to just assume that the next ping will fail; the healthcheck therefore expects a pong, or it forces a disconnect.
I'm considering perhaps making these part of config/env instead of hardcoding them so we can play with it. I'm not too sure what the right values should be at the moment either.
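For illustration, env-driven values could look something like the sketch below. The struct, env var names, and defaults here are hypothetical, not taken from this PR; the 2s/10s fallbacks just mirror the values discussed in the thread:

```rust
use std::env;
use std::time::Duration;

/// Hypothetical holder for the healthcheck intervals; names are illustrative.
struct HealthcheckConfig {
    ping_interval: Duration,
    pong_timeout: Duration,
}

impl HealthcheckConfig {
    fn from_env() -> Self {
        // Fall back to the values discussed in the thread (2s ping, 10s timeout)
        // when the env vars are unset or unparseable.
        let secs = |key: &str, default: u64| {
            env::var(key)
                .ok()
                .and_then(|v| v.parse().ok())
                .unwrap_or(default)
        };
        Self {
            ping_interval: Duration::from_secs(secs("WS_PING_INTERVAL_SECS", 2)),
            pong_timeout: Duration::from_secs(secs("WS_PONG_TIMEOUT_SECS", 10)),
        }
    }
}
```

A caller would then build it once at startup with `HealthcheckConfig::from_env()` and thread it into the per-connection tasks.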
@danyalprout I agree that for the websocket proxy we could have a huge number of clients. That's why the conductor will close out stale/unresponsive connections gracefully. Since in the final state rollup boost will only stream messages from the builder to the conductor, I am worried that closing connections will interrupt the flow when there is a false positive. But this is not a huge issue, since the conductor will indefinitely retry connecting to rollup boost if it ever loses the connection. So I will defer to you, @haardikk21.
@angel-ding-cb I believe this code is for managing connections to downstream clients from the websocket proxy (i.e. the nodes). Unless I'm missing something, I don't think it'll impact the conductor setup at all.
This PR has the ws proxy send `ping` frames to downstream clients periodically. If a `pong` isn't received back within a specified timeout (10s), it disconnects the client.

Previously, we disconnected clients if message sending failed, but that only works for disconnecting on one-way networking issues.
There are a variety of cases where clients can continue to receive packets but aren't actually (able to) process them, adding unnecessary overhead to this service. Devices going to sleep, battery-saving/low-priority modes in browsers, processing getting blocked on the client side by slow code, etc. can all leave the TCP buffer open and receiving packets while the client is unable to process the data.
By pinging, we ensure that clients are not just receiving messages but also actively processing them, since they need to respond with a pong.
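As a rough sketch of the mechanism described above, assuming a tokio + tokio-tungstenite stack (and a tungstenite version where `Message::Ping` carries a `Vec<u8>` payload); the function name, constants, and structure are illustrative, not this PR's actual code:

```rust
use std::time::{Duration, Instant};

use futures_util::{SinkExt, StreamExt};
use tokio::net::TcpStream;
use tokio::time::interval;
use tokio_tungstenite::{tungstenite::Message, WebSocketStream};

const PING_INTERVAL: Duration = Duration::from_secs(2);
const PONG_TIMEOUT: Duration = Duration::from_secs(10);

/// Drives one downstream client connection: sends a ping on a fixed
/// interval and force-disconnects the client if no pong has arrived
/// within the timeout, even while sends are still succeeding.
async fn healthcheck_loop(mut ws: WebSocketStream<TcpStream>) {
    let mut ticker = interval(PING_INTERVAL);
    let mut last_pong = Instant::now();

    loop {
        tokio::select! {
            _ = ticker.tick() => {
                // Sends can keep succeeding while the client is asleep or
                // throttled, so the disconnect keys off missing pongs.
                if last_pong.elapsed() > PONG_TIMEOUT {
                    let _ = ws.close(None).await;
                    return;
                }
                if ws.send(Message::Ping(Vec::new())).await.is_err() {
                    // Send failure: the pre-existing one-way disconnect path.
                    return;
                }
            }
            msg = ws.next() => match msg {
                Some(Ok(Message::Pong(_))) => last_pong = Instant::now(),
                Some(Ok(_)) => {}   // other client frames are not our concern here
                _ => return,        // stream closed or errored
            },
        }
    }
}
```

The proxy would spawn one such task per accepted client; keying the disconnect off `last_pong` rather than send errors is exactly what distinguishes this from the previous send-failure-only behavior.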