Description
This spins out of the "Questions about the Fetch API" thread on the whatwg list, in particular @willchan's reply around here and the subsequent follow-ups. We had a video-chat conversation that helped clarify things, and I want to capture the results here.
Our current piping algorithm essentially says: whenever the source has data, and the dest is not exerting backpressure, read from the source and write to dest. @willchan calls this "push", because the dest does not really get to decide how it consumes from source. He explains that a "pull" model would be better, wherein you give source to the underlying sink implementation (probably via dest), which then grabs data out of it as it determines necessary.
This is most important for high-performance binary streams which will allow reading of specific amounts of bytes (#111), because the writable stream implementer (e.g., the UA) could then use smart algorithms to figure out exactly what size chunks they want to try transferring, depending on e.g. how well the streaming has gone so far, what type of network they are on, and other such factors.
This functionality will not be useful for most writable streams: only those that know how to be smart about consuming the data. Object streams in particular are unlikely to want to use this.
The tentative idea I had for solving this was something like the following:
- Introduce `WritableStream.prototype.pipeFrom`, and move all of the existing code in `ReadableStream.prototype.pipeTo` into it. This becomes the default pipe-from implementation. This is possible since the pipe code does not depend on any internals, just the public API; in fact it could be a standalone function.
- Have `ReadableStream.prototype.pipeTo(dest)` become essentially `dest.pipeFrom(this)`, so that `pipeTo` is just a convenience allowing authors to write things in left-to-right order.
- Allow writable streams with advanced use cases to subclass `WritableStream` and override the `pipeFrom` method to include custom logic.
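To make the shape of this concrete, here is a toy model of the refactoring. This is not the real Streams API: `read()` resolving to `{ value, done }` and `write()` resolving when the chunk is accepted are simplifying assumptions, made just to show `pipeTo` delegating to a pull-driven `pipeFrom`.

```javascript
// Toy model of the proposal, NOT the real Streams API.
class ReadableStream {
  constructor(chunks) { this._chunks = [...chunks]; }
  // Simplified read(): resolves to { value, done }.
  read() {
    if (this._chunks.length === 0) return Promise.resolve({ done: true });
    return Promise.resolve({ value: this._chunks.shift(), done: false });
  }
  // pipeTo becomes sugar for dest.pipeFrom(this), preserving
  // left-to-right authoring order.
  pipeTo(dest) { return dest.pipeFrom(this); }
}

class WritableStream {
  constructor() { this.written = []; }
  // Simplified write(): resolving signals the chunk was accepted.
  write(chunk) { this.written.push(chunk); return Promise.resolve(); }
  close() { this.closed = true; return Promise.resolve(); }
  // Default pull-style pipe: the dest drives reads from the source,
  // awaiting its own write() so backpressure is respected.
  async pipeFrom(source) {
    while (true) {
      const { value, done } = await source.read();
      if (done) return this.close();
      await this.write(value);
    }
  }
}
```

A subclass could override `pipeFrom` to, say, read larger or smaller amounts per iteration, while plain `source.pipeTo(dest)` keeps working unchanged.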
This actually solves a number of other problems:
It helps give a framework for streams to "recognize" each other, e.g. for off-main-thread piping (#97) via things like splice: a writable stream that understands file descriptors can recognize that a readable stream representing a file descriptor is being piped to it, and then do splicing instead of the usual algorithm, falling back to `super(...args)` if it does not recognize the stream.
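A sketch of that recognize-or-fall-back pattern follows. Every name here (`FileDescriptorReadableStream`, `SocketWritableStream`, the `usedSplice` flag) is an illustrative assumption, and the "splice" branch just records that the fast path ran rather than moving bytes kernel-side.

```javascript
// Toy base writable stream with a generic pull-based pipeFrom.
class WritableStream {
  constructor() { this.written = []; }
  async pipeFrom(source) {
    let r;
    while (!(r = await source.read()).done) this.written.push(r.value);
  }
}

// Hypothetical readable stream backed by a file descriptor.
class FileDescriptorReadableStream {
  constructor(fd, chunks) { this.fd = fd; this._chunks = [...chunks]; }
  async read() {
    return this._chunks.length
      ? { value: this._chunks.shift(), done: false }
      : { done: true };
  }
}

// A writable stream that recognizes fd-backed sources; unrecognized
// sources fall back to super(...args), i.e. the default algorithm.
class SocketWritableStream extends WritableStream {
  async pipeFrom(...args) {
    const [source] = args;
    if (source instanceof FileDescriptorReadableStream) {
      // A real UA could call splice(2) here to move bytes kernel-side;
      // this toy model just marks the fast path and drains the chunks.
      this.usedSplice = true;
      let r;
      while (!(r = await source.read()).done) this.written.push(r.value);
      return;
    }
    return super.pipeFrom(...args); // fallback: default pipe algorithm
  }
}
```

The key property is that overriding `pipeFrom` is purely additive: sources the dest does not recognize still get the stock behavior via `super`.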
It also allows dest streams to apply any other stream-specific logic. For example, a stream representing an HTTP request body to be sent out could recognize a file descriptor stream being piped to it, get the file's length, and then set that as its `Content-Length` header. The popular `request` package in Node.js makes extensive use of these sorts of tricks.
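That Content-Length trick could look something like the following. Again everything here is hypothetical: the `length` property on the source and the `headers` object are stand-ins invented for the sketch, not part of any proposed API.

```javascript
// Toy writable stream representing an outgoing HTTP request body.
class HttpRequestBodyStream {
  constructor() { this.headers = {}; this.body = []; }
  async pipeFrom(source) {
    // If the source knows its total size up front (e.g. it wraps a
    // file), announce it before streaming the body.
    if (typeof source.length === "number") {
      this.headers["content-length"] = source.length;
    }
    let r;
    while (!(r = await source.read()).done) this.body.push(r.value);
  }
}

// A stand-in for a file-backed readable stream with a known length.
const fileStream = {
  length: 5,
  _chunks: ["hel", "lo"],
  async read() {
    return this._chunks.length
      ? { value: this._chunks.shift(), done: false }
      : { done: true };
  },
};
```

Without a dest-side hook like `pipeFrom`, the request stream never sees the source object itself, only its chunks, and so has no chance to make this kind of inference.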
These could be done the other way around, too, via an overridden `pipeTo`, but this kind of recognition seems more the dest's responsibility than the source's.
I am optimistic that this does not really add any complexity to the default case, while adding good flexibility for the fastest-possible implementations for high-performance binary streams.