System XLS: Peer to Peer Disconnect Handshake #338
Tapanito started this conversation in XLS Proposals
Abstract
This specification proposes an enhancement to the XRP Ledger peer-to-peer protocol to include diagnostic headers in connection termination messages. Currently, when a peer closes a connection, it provides no reason, making it difficult to differentiate between network failures and application-level issues like resource exhaustion. By adding headers that specify the reason for closure and the peer's last known ledger state, this proposal aims to provide crucial data for debugging and improving network stability.
Motivation
Network operators frequently observe peer disconnections, with a leading hypothesis being that remote peers drop connections due to high resource usage on the local server. Verifying this is challenging because peer servers act as black boxes; without access to their debug logs, it's nearly impossible to determine the root cause of a disconnection.
This proposal introduces a simple, non-intrusive mechanism for peers to communicate the reason for termination. By sending a small set of diagnostic headers when closing a connection—similar to the handshake performed when opening one—peers can provide valuable context. This data will enable operators to more effectively diagnose and resolve issues related to resource limits, idle timeouts, and ledger synchronization problems.
Specification
When a rippled server decides to terminate a p2p connection, it should send a final message containing a set of HTTP-style headers before closing the underlying TCP socket. This message is intended for diagnostic purposes and helps the remote peer understand the reason for the disconnection.
The following headers are included in the termination message:
- Connection: Close. Explicitly indicates that the server is closing the connection.
- X-Reason
- Closed-Ledger
- Previous-Ledger
- Network-Time
X-Reason Codes
To standardize reporting, the following initial values for the X-Reason header are proposed:
- resource_limit_exceeded: The connection was terminated because the remote peer was consuming excessive resources (e.g., CPU, memory, I/O) or sending too many messages.
- idle_timeout: The connection was closed after a period of inactivity.
- protocol_violation: The remote peer violated the p2p protocol.
- shutdown: The server is shutting down cleanly.
- unspecified: The server is closing the connection for a reason not covered by other codes.
Rationale
The design of this feature mirrors the existing handshake process for new peer connections, which already uses HTTP-style headers. Reusing this format simplifies implementation and maintains consistency within the peer protocol. The chosen headers provide a balance between delivering valuable diagnostic information and minimizing the amount of data transmitted upon disconnection. The inclusion of ledger and time information, already sent during the initial handshake, is critical for correlating disconnection events with potential ledger state or sync-related problems.
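As an illustration of the termination message described under Specification, here is a minimal Python sketch that assembles the five headers into HTTP-style text. It is not taken from rippled. The names CloseReason and build_close_headers, the CRLF line endings, the blank-line terminator, and the value encodings (ledger hashes passed as strings, network time as an integer) are all assumptions, since the proposal does not define a wire format.

```python
from enum import Enum


class CloseReason(Enum):
    """Initial X-Reason values proposed in the specification."""

    RESOURCE_LIMIT_EXCEEDED = "resource_limit_exceeded"
    IDLE_TIMEOUT = "idle_timeout"
    PROTOCOL_VIOLATION = "protocol_violation"
    SHUTDOWN = "shutdown"
    UNSPECIFIED = "unspecified"


def build_close_headers(reason, closed_ledger, previous_ledger, network_time):
    """Return the termination headers as CRLF-delimited text.

    Assumption: values are plain strings (network_time is stringified) and
    the header block ends with an empty line, as in an HTTP message.
    """
    headers = [
        ("Connection", "Close"),
        ("X-Reason", reason.value),
        ("Closed-Ledger", closed_ledger),
        ("Previous-Ledger", previous_ledger),
        ("Network-Time", str(network_time)),
    ]
    lines = [f"{name}: {value}" for name, value in headers]
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Placeholder ledger hashes and time, for illustration only.
    print(build_close_headers(CloseReason.IDLE_TIMEOUT, "AB" * 32, "CD" * 32, 0))
```

Restricting X-Reason to an enumeration keeps the reported values generic, which matches the proposal's intent of not exposing specific metrics.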
Backwards Compatibility
This change is fully backwards compatible.
No protocol breakage or adverse effects are anticipated.
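The proposal does not describe how a receiving peer should process the termination message. As a hedged sketch only, a receiver could parse the headers leniently and map any missing or unrecognized X-Reason value to unspecified, so that reason codes added later would not break older peers. The function names parse_close_headers and close_reason, the CRLF framing, and the case-insensitive header lookup are assumptions, not part of the specification.

```python
# Initial X-Reason values listed in the proposal.
KNOWN_REASONS = {
    "resource_limit_exceeded",
    "idle_timeout",
    "protocol_violation",
    "shutdown",
    "unspecified",
}


def parse_close_headers(text):
    """Parse CRLF-delimited 'Name: value' lines into a dict.

    Header names are lowercased (assumption: names are case-insensitive,
    as in HTTP). Parsing stops at the first empty line; lines without a
    colon are skipped rather than treated as errors.
    """
    headers = {}
    for line in text.split("\r\n"):
        if not line:
            break
        name, sep, value = line.partition(":")
        if not sep:
            continue
        headers[name.strip().lower()] = value.strip()
    return headers


def close_reason(headers):
    """Return a known reason code, falling back to 'unspecified'."""
    reason = headers.get("x-reason", "unspecified")
    return reason if reason in KNOWN_REASONS else "unspecified"


if __name__ == "__main__":
    received = "Connection: Close\r\nX-Reason: idle_timeout\r\n\r\n"
    print(close_reason(parse_close_headers(received)))
```

Under this approach a peer would log the reason for diagnostics and then proceed with its normal disconnect handling, so the message never changes control flow.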
Test Plan
An implementation of this specification requires the following tests:
X-Reason codes.
Security Considerations
The primary security consideration for this feature is the potential for minor information disclosure. The X-Reason header reveals a peer's internal state (e.g., resource_limit_exceeded implies the server is under high load). An attacker could potentially use this information to identify and target stressed nodes in the network.
To mitigate this risk, the provided reasons are generic and do not expose specific metrics. The other headers (Closed-Ledger, Previous-Ledger, Network-Time) expose information that is already publicly available or exchanged during the initial handshake. The overall security risk is assessed as low.