Fix reassembly of ASTM frames split across multiple TCP reads#78
Open
xispa wants to merge 3 commits into
Open
Conversation
TCP does not preserve message boundaries, so a single ASTM frame can be
delivered across several reads (and, conversely, several frames/control
bytes can arrive in one read). This is the norm with serial-to-LAN
gateways (Moxa NPort, Lantronix UDS) that forward a slow serial stream in
small TCP segments.
data_received() previously treated each read as one complete frame: a
fragment such as b'\x021H|\\^&|||A' was checksum-validated on its own,
failed, and was NAKed, while the trailing bytes ('RCHITECT^...') could
not be dispatched -- the instrument retransmitted and the session never
completed. Confirmed against an Abbott ARCHITECT ci4100 behind a
Lantronix gateway.
data_received() now appends to a per-connection buffer and iter_tokens()
yields only complete tokens: a single control byte (ENQ/ACK/NAK/EOT), or
a whole frame up to STX...(ETX|ETB) + 2 checksum bytes, consuming an
optional trailing CR/LF. Partial frames stay buffered until the rest
arrives. Downstream handling is unchanged, so this is transparent to both
server (listen) and client (connect) modes.
The simulator writes each frame in a single send over loopback, so tests
never fragmented. Adds protocol tests that feed a frame one byte at a time
and multiple tokens in a single read.
Contributor
|
Seems to be the same as this one: #77 |
Raising inside an asyncio protocol's data_received() is fatal: it drops the connection. Several handlers raised on input that instruments legitimately send, so a stray byte forced a full disconnect/reconnect (5s) instead of being shrugged off: - on_eot() raised InvalidState when an <EOT> arrived outside a transfer, but instruments emit stray <EOT> (and <EOT>/<ENQ> bursts) between sessions and while establishing -- observed live from an Abbott ARCHITECT ci4100, which churned the client connection repeatedly. - on_ack()/on_nak() raised NotAccepted on any <ACK>/<NAK>; we are the receiver and should simply ignore these. Make stray/unexpected control bytes non-fatal: on_eot() outside a transfer and unexpected on_ack()/on_nak() now log and ignore instead of raising. As a backstop, data_received() wraps handle_data() in try/except so no single malformed token can ever drop the link. Removes the now-unused InvalidState import. Adds protocol tests: a stray <EOT> (and an <EOT>/<ENQ>/<EOT> burst) does not drop the connection, and unexpected <ACK>/<NAK> are ignored.
Member
Author
Yes, both do essentially the same, except that #77 lacks on the control-byte hardening added with 798bdeb This PR comes with #24 and #61. I've been working on this ARCHITECT integration for a while and has been quite challenging to be honest. Also because there is a serial-to-LAN converter in the middle (Moxa does not work in this case, but Lantronix). I am now focused on make this long-standing integration fully work and will port the differences to 2.x afterwards, once all the fundamentals are in place. Will then be possible for me to fully test this with 2.x |
Member
Author
Member
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description of the issue/feature this PR addresses
Make the ASTM protocol robust against real serial-to-LAN gateways, which — unlike the loopback simulator used in tests — deliver the byte stream in arbitrary chunks and emit stray control bytes between sessions. Two related fixes:
TCP does not preserve message boundaries, so a single ASTM frame can be delivered across several reads (and, conversely, several frames/control bytes can arrive in one read). This is the norm with serial-to-LAN gateways (Moxa NPort, Lantronix UDS) that forward a slow serial stream in small TCP segments.
data_received() previously treated each read as one complete frame: a fragment such as
b'\x021H|\\^&|||A' was checksum-validated on its own, failed, and was NAKed, while the trailing bytes ('RCHITECT^...') could not be dispatched -- the instrument retransmitted and the session never completed. Confirmed against an Abbott ARCHITECT ci4100 behind a Lantronix gateway.data_received()now appends to a per-connection buffer anditer_tokens()yields only complete tokens: a single control byte (ENQ/ACK/NAK/EOT), or a whole frame up toSTX...(ETX|ETB)+ 2 checksum bytes, consuming an optional trailingCR/LF. Partial frames stay buffered until the rest arrives. Downstream handling is unchanged, so this is transparent to both server (listen) and client (connect) modes.Raising inside an asyncio protocol's
data_received()is fatal -- it drops the connection. Several handlers raised on input that instruments legitimately send, so a stray byte forced a full disconnect/reconnect instead of being shrugged off:on_eot()raisedInvalidStatewhen an<EOT>arrived outside a transfer, but instruments emit stray<EOT>(and<EOT>/<ENQ>bursts) between sessions and while establishing -- observed live from the ARCHITECT, which churned the client connection (disconnect + 5s reconnect) repeatedly.on_ack()/on_nak()raisedNotAcceptedon any<ACK>/<NAK>; as the receiver we should simply ignore these.These now log and ignore instead of raising, and
data_received()wrapshandle_data()in try/except as a backstop so no single malformed token can drop the link. Removes the now-unusedInvalidStateimport.The simulator writes each frame in a single send over loopback, so tests never fragmented and never sent stray control bytes. Adds protocol tests: a frame fed one byte at a time, multiple tokens in a single read, a stray
<EOT>(and<EOT>/<ENQ>/<EOT>burst) that must not drop the connection, and unexpected<ACK>/<NAK>that are ignored.Current behavior before PR
The protocol assumed one TCP read = one complete ASTM frame.
data_received()ran each chunk straight throughhandle_data(), checksum-validating it as if it were a whole frame, and raised on unexpected control bytes. Both assumptions break against real serial-to-LAN gateways:b'\x021H|\\^&|||A'thenb'RCHITECT^...') fails the checksum on its first fragment and is NAKed; the trailing bytes can't be dispatched. The instrument retransmits, the fragmentation repeats, and the session never completes — no message is received.ENQimmediately followed by a frame) were not all processed.<EOT>outside a transfer (on_eot→InvalidState), or any<ACK>/<NAK>(on_ack/on_nak→NotAccepted), raised insidedata_received()and tore down the connection, forcing a reconnect.Confirmed end-to-end against an Abbott ARCHITECT ci4100 behind a Lantronix gateway: connection established,
<ENQ>/<ACK>fine, but every data frame was NAKed, and stray<EOT>/<ENQ>bursts repeatedly crashed the connection.Desired behavior after PR is merged
The protocol reassembles frames regardless of how TCP splits or coalesces the byte stream, and tolerates stray traffic:
data_received()buffers, anditer_tokens()emits only complete tokens — a single control byte, or a whole frame up toSTX…(ETX|ETB)+ 2 checksum bytes (optional trailingCR/LF). Partial frames stay buffered; multiple tokens in one read are each dispatched.on_eot()outside a transfer and unexpectedon_ack()/on_nak()log and ignore, and handle_data() is wrapped so no single token can drop the connection.Result: fragmented frames are received and ACKed correctly and the session completes — verified against the Abbott ARCHITECT ci4100 via Lantronix (full
H/P/O/R/C/Lmessage parsed) — and stray<EOT>/<ENQ>bursts no longer cause reconnect churn. Transparent to both server (--listen) and client (--connect) modes--
I confirm I have tested this PR thoroughly and coded it according to PEP8
and Plone's Python styleguide standards.