
Conversation

@richard-ramos (Member) commented Sep 4, 2025

closes: #1642

github-actions bot (Contributor) commented Sep 4, 2025

🏁 Performance Summary

Commit: d0df6e552f1095d00e99b79ea9834f5d4e4ca6fa

| Scenario | Nodes | Total messages sent | Total messages received | Latency min (ms) | Latency max (ms) | Latency avg (ms) |
|---|---|---|---|---|---|---|
| Base test | 10 | 100 | 900 | 0.296 | 1.811 | 0.867 |
| Low Bandwidth rate 256kbit burst 8kbit limit 5000 | 10 | 100 | 900 | 0.206 | 37.458 | 5.617 |
| Packet Reorder 15% 40% with 2ms delay | 10 | 100 | 900 | 0.253 | 5.549 | 2.880 |
| Queue Limit 5 | 10 | 100 | 900 | 0.259 | 2.170 | 0.849 |
| Latency 100ms 20ms | 10 | 100 | 900 | 39.796 | 238.521 | 115.322 |
| Burst Loss 8% 30% | 10 | 100 | 900 | 0.253 | 2.125 | 0.851 |
| Duplication 2% | 10 | 100 | 900 | 0.282 | 2.159 | 0.901 |
| Corruption 0.5% | 10 | 100 | 900 | 0.289 | 2.195 | 0.869 |
| Packet Loss 5% | 10 | 100 | 900 | 0.300 | 2.160 | 0.897 |
| Combined Network Conditions | 10 | 100 | 900 | 0.226 | 299.581 | 127.329 |

📊 View Latency History and full Container Resources in the Workflow Summary

@richard-ramos (Member, Author)

Interop's green again :)

@arnetheduck (Contributor) commented Sep 5, 2025

So .. I only have a very vague recollection of why the EOF waiting made sense, and maybe it has been solved in another way since. But the way things were back then: you would close a stream, but if it still had unread data associated with it - such as the "virtual" EOF marker or any in-flight, unconsumed buffered data - it would linger in memory and take up resources in the stream multiplexer, because "reading" from the stream is what drives "forward motion" in its operation. Without anything reading, the multiplexer (mplex in particular) would eventually get stuck: it runs out of buffer and starts blocking reads on other streams.

closeWithEOF provided exactly that: a read loop that consumes everything from the stream as part of the close, which helps it "drive" shutdown in an orderly way and have its resources released (see the sketch below).

This is similar to how, when you close a BSD socket, the kernel maintains a thread that consumes the rest of the socket and makes sure all the FINs and ACKs and so on happen - we don't have a "reading" thread in general, but the problem remains relevant. The other relevant thing here is to handle the case where the application has an ongoing readXxx and close gets called from another async task.

What's tricky about problems like this is that they manifest as "long-term" leaks: there's a slow buildup until suddenly things stop working. If you change these parts, make sure to run long-term tests on an active network (like nimbus-eth2) and monitor for resource leaks and blocked streams.
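
For illustration, a minimal, self-contained sketch of that drain-on-close idea, assuming chronos for async; `SketchStream`, `readOnce`, `closeImpl` and the buffering model are made-up stand-ins, not the actual LPStream API:

```nim
# Illustrative only: a toy stream that models EOF as "no more buffered data".
import chronos

type SketchStream = ref object
  pending: seq[byte]   # unread, buffered data still associated with the stream
  closed: bool

proc readOnce(s: SketchStream, n: int): Future[seq[byte]] {.async.} =
  ## Return up to `n` buffered bytes; an empty result means EOF.
  if s.pending.len == 0:
    return newSeq[byte](0)
  let take = min(n, s.pending.len)
  result = s.pending[0 ..< take]
  s.pending = s.pending[take .. ^1]

proc closeImpl(s: SketchStream) {.async.} =
  ## Release the stream's resources (placeholder).
  s.closed = true

proc closeWithEOF(s: SketchStream) {.async.} =
  ## Drain everything still buffered before closing, so the multiplexer
  ## can keep making progress and the stream's resources get released.
  while true:
    let chunk = await s.readOnce(4096)
    if chunk.len == 0:
      break
  await s.closeImpl()

when isMainModule:
  let s = SketchStream(pending: @[byte 1, 2, 3])
  waitFor s.closeWithEOF()
  doAssert s.closed and s.pending.len == 0
```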

@richard-ramos (Member, Author)

I see.
I think in this case the proper 'fix' is to reset the stream instead of doing a close or a closeWithEOF, since the remote side will not be sending any further data. I will convert this PR back to draft and work on this. Thank you
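
Under the same toy model as the sketch above (made-up names, not the actual nim-libp2p API), a reset would discard any unread data and release the stream immediately, with no drain loop needed since the remote won't send anything further:

```nim
# Illustrative only: reset tears the stream down right away.
import chronos

type SketchStream = ref object
  pending: seq[byte]   # unread, buffered data
  closed: bool

proc reset(s: SketchStream) {.async.} =
  ## Abruptly terminate the stream: discard unread data and release it
  ## immediately instead of reading it to EOF first.
  s.pending.setLen(0)
  s.closed = true

when isMainModule:
  let s = SketchStream(pending: @[byte 1, 2, 3])
  waitFor s.reset()
  doAssert s.closed and s.pending.len == 0
```

The trade-off is that a reset is abrupt: any data still in flight is dropped, which is acceptable here precisely because the remote is done sending.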

@richard-ramos richard-ramos marked this pull request as draft September 5, 2025 12:21

Development

Successfully merging this pull request may close these issues.

quic: transport interop test with zig is failing
