fix: improve TCP DNS client interactions with TCP actor#32635
fix: improve TCP DNS client interactions with TCP actor#32635leviramsey wants to merge 2 commits intoakka:mainfrom
Conversation
johanandren
left a comment
There was a problem hiding this comment.
Left some comments but will probably address myself to not have to wait for American TZ-morning.
| log.debug("DNS response truncated, falling back to TCP") | ||
| inflightRequests.get(msg.id) match { | ||
| case Some((_, msg)) => | ||
| tcpRequests = tcpRequests.incl(msg.id) |
There was a problem hiding this comment.
We only clear this on failure, so for the happy path the set will just accumulate ids, so needs a remove also on success. But, to avoid that, could we make it a boolean flag on the value in inFlightRequest instead and avoid the extra set?
| val connection = sender() | ||
| context.become(ready(connection)) | ||
| writer = new Writing(connection, log) | ||
| context.become(ready()) |
There was a problem hiding this comment.
Mixing a mutable field with the become-style already present here is not great for readability. Let's use just context become closing over state instead.
| def maybeWriteMessage(msg: Message): Writer = { | ||
| writeMessage(msg) | ||
|
|
||
| new Buffering(connection, log) |
There was a problem hiding this comment.
I'm not sure we really want to allow just the one in flight write message between this actor and the TCP manager
There was a problem hiding this comment.
Ok, that's actually how we show the ack based protocol in the docs, so might be fine.
There was a problem hiding this comment.
Yeah the TCP manager only allows one write at a time. I originally went for the "NACK-based with write-suspending" approach (assume the write will be accepted until you get the backpressure signal, then resend the ones that were backpressured), but the complexity there got a bit out of hand and I wasn't sure that actually gave us that much.
|
Superseeded by #32636 |
The TCP DNS client responds to a backpressure signal (a
CommandFailed(Write(...))) from the TCP connection actor by shutting itself down (which is, in its own way a form of backpressure), but means that two requests requiring TCP in fast enough succession (the second arriving before the connection actor has been able to hand the data off to the socket) will cause the DNS client to stop before it can receive a response to the first. DNS resolutions (e.g. through Discovery) will typically retry after some time, but if there are two retries being resolved that failed sufficiently close to each other the scheduler resolution (default 10 millis) will tend to mean that they retry at effectively the same time: it's thus unlikely that this retry loop will ever be broken.This change implements ack-based throttling, it will not issue a subsequent TCP write until the connection actor has written the data to the socket and sent back an ack (this does not require any response from the other end, to be clear). Since there typically should not be that many in-flight DNS resolutions happening (the main use is for service discovery), the impact on applications should be minimal.
Also improves the logging within the TCP DNS client.