Description
Describe the bug
I have a third-party server that is extremely picky about its requests, and a certain social platform that shall remain nameless insists on injecting extra query string arguments into URLs shared on it. The third-party server should be ignoring these query string arguments, but it isn't, and I don't have control over it. So I created a simple reverse proxy in ASP.NET Core that forwards requests on to the third-party server, altered to remove the unwanted arguments.
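For readers unfamiliar with this kind of proxy, the argument-stripping step can be sketched roughly as follows. This is only an illustration, not the code from my project: the class name QueryFilter, the method StripUnwanted, and the parameter name "injected_id" are all hypothetical stand-ins, since the real injected names are not listed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryFilter
{
    // Hypothetical parameter name, used purely for illustration.
    private static readonly HashSet<string> Unwanted =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "injected_id" };

    // Takes a raw query string (with or without the leading '?') and returns it
    // with the unwanted parameters removed. The remaining pairs are passed
    // through unchanged, so their original encoding is preserved.
    public static string StripUnwanted(string query)
    {
        if (string.IsNullOrEmpty(query))
            return string.Empty;

        var kept = query.TrimStart('?')
            .Split('&', StringSplitOptions.RemoveEmptyEntries)
            .Where(pair => !Unwanted.Contains(Uri.UnescapeDataString(pair.Split('=')[0])));

        string result = string.Join("&", kept);
        return result.Length == 0 ? string.Empty : "?" + result;
    }
}
```

In middleware, something like QueryFilter.StripUnwanted(context.Request.QueryString.Value) would be applied before building the upstream URL.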
This third-party server has an endpoint that produces a dynamically generated MJPEG stream. As such, this endpoint has no specific Content-Length and streams data as it arrives. To accommodate this, my reverse proxy is implemented as middleware that ends with the following loop:
    while (true)
    {
        int bytesRead = await responseStream.ReadAsync(buffer, 0, buffer.Length);
        if (bytesRead < 0)
            throw new Exception("I/O error");
        if (bytesRead == 0)
            break;
        await context.Response.Body.WriteAsync(buffer, 0, bytesRead);
        _statusConsoleSender.NotifyRequestProgress(requestID, bytesRead);
    }
The third-party server will continue sending data as long as the client continues to receive it. As such, my loop must behave the same way: the only way it knows that it should terminate and stop sending data is if the client disconnects, so that context.Response.Body.WriteAsync() is no longer possible.
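One variation I can think of, sketched here under assumptions rather than offered as a confirmed fix, is to pass a cancellation token into both the read and the write so that the loop has an explicit signal to stop on. ASP.NET Core exposes HttpContext.RequestAborted for this purpose; whether that token actually fires in the scenario described in this report is something I have not verified. The StreamPump class and PumpAsync method are hypothetical names for illustration.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class StreamPump
{
    // Copies source to destination until the source ends or clientGone is
    // cancelled. Returns the total number of bytes forwarded.
    public static async Task<long> PumpAsync(
        Stream source, Stream destination, CancellationToken clientGone)
    {
        var buffer = new byte[16 * 1024];
        long total = 0;

        try
        {
            while (true)
            {
                int bytesRead = await source.ReadAsync(buffer, 0, buffer.Length, clientGone);
                if (bytesRead == 0)
                    break; // upstream closed the stream

                await destination.WriteAsync(buffer, 0, bytesRead, clientGone);
                await destination.FlushAsync(clientGone);
                total += bytesRead;
            }
        }
        catch (OperationCanceledException)
        {
            // The token fired: stop pumping instead of reading upstream forever.
        }

        return total;
    }
}
```

In the middleware this would be called as something like await StreamPump.PumpAsync(responseStream, context.Response.Body, context.RequestAborted). The explicit FlushAsync is there so that each chunk is pushed toward the client promptly, which matters for a live stream.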
What I am observing, though, is that the loop keeps running, pumping data from the third-party service and sending it ... nowhere? I am able to see this because of the "Status Console" tied in by the last line in the loop body. After closing a browser tab that is receiving this MJPEG stream, "Status Console" shows bytes continuing to be pumped indefinitely. I left it running yesterday and it had gotten to over 6 GB before I terminated the server. But netstat -an showed no connection from the browser to the proxy any more, and the proxy's process memory footprint had not grown, so clearly the data is just going nowhere.
I have only had the opportunity to test this with Kestrel.
To Reproduce
I have created a new project that minimally exercises precisely the functionality where I encountered this problem, and it does indeed reproduce it:
https://github.com/logiclrd/ReproduceMiddlewareStreamBug
I noticed that when browsing to the stream with Chrome, Chrome takes some time (10-ish seconds?) before it decides to show the text/plain data as it arrives.
EDIT by @Rick-Anderson add <!-- comment out version info -->