Skip to content

Latest commit

 

History

History

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

README.md

Stream API Benchmarks

Performance comparison between the new Stream API, the Web Streams API, and Node.js Streams.

Running Benchmarks

# Run all benchmarks
npx tsx benchmarks/run-all.ts

# Run a specific benchmark
npx tsx benchmarks/01-throughput.ts
npx tsx benchmarks/02-push-streams.ts
npx tsx benchmarks/03-transforms.ts
npx tsx benchmarks/04-pipelines.ts
npx tsx benchmarks/06-consumption.ts

Benchmark Design Philosophy

These benchmarks are designed for fair comparisons between the APIs:

  1. Equivalent operations: All three APIs perform the same work (iteration, counting, transforms)
  2. Same consumption pattern: All use async iteration where possible, not mixing pipeTo() with for await
  3. No extra work: New Streams benchmarks don't use bytes() collection when other APIs just count
  4. Node.js Streams: Uses Readable, Transform, and PassThrough from node:stream with async iteration

Benchmark Suites

01-throughput.ts

Raw Throughput - Measures basic data flow speed through streams.

Scenario Description
Large chunks (64KB) Fewer, larger transfers
Medium chunks (8KB) Balanced chunk size
Small chunks (1KB) Many small transfers
Tiny chunks (100B) High per-chunk overhead scenario
Async iteration Direct for await consumption
Generator source Async generator as source

02-push-streams.ts

Push Stream Performance - Tests producer/consumer patterns with backpressure.

Scenario Description
Concurrent push Interleaved write/read with medium chunks
Many small writes High-frequency small data (64B)
Batch writes (writev) Multiple chunks per write call
Push + async iter Push stream consumed via async iteration

03-transforms.ts

Transform Performance - Data transformation speed.

Scenario Description
Identity transform Pass-through (baseline overhead)
XOR transform Byte manipulation (CPU-bound)
Expanding transform 1:2 output (generator-based)
Chained transforms 3 transform stages
Async transform Transforms returning promises

04-pipelines.ts

Pipeline Performance - Full source → transform → destination flows.

Scenario Description
Simple pipeline Direct source to iteration
Pipeline + XOR Single CPU-bound transform
Multi-stage pipeline 3 identity transform stages
High-frequency Many small chunks (64B x 20000)

06-consumption.ts

Consumption Methods - Different ways to read stream data.

Scenario Description
bytes() Collect and concatenate all data
text() Collect and decode to string
Async iteration for await...of loop
Iterator loop Manual iterator.next() calls

Latest Results

Results from Node.js on the reference implementation. The new API uses batched iteration (Uint8Array[]) which amortizes async overhead.

Note: These results now include three-way comparisons between New Streams, Web Streams, and Node.js Streams.

Throughput

Scenario                         | New Stream     | Web Stream     | Node Stream    | New vs Web   | New vs Node
---------------------------------+----------------+----------------+----------------+--------------+-------------
Large chunks (64KB x 500)        | ~5 GB/s        | ~6 GB/s        | ~5 GB/s        | ~same        | ~same
Medium chunks (8KB x 2000)       | ~6 GB/s        | ~5 GB/s        | ~4 GB/s        | 1.2x faster  | 1.5x faster
Small chunks (1KB x 5000)        | ~5 GB/s        | ~2.5 GB/s      | ~2 GB/s        | 2x faster    | 2.5x faster
Tiny chunks (100B x 10000)       | ~1.6 GB/s      | ~316 MB/s      | ~400 MB/s      | 5x faster    | 4x faster
Async iteration (8KB x 1000)     | ~270 GB/s      | ~22 GB/s       | ~18 GB/s       | 12x faster   | 15x faster
Generator source (8KB x 1000)    | ~23 GB/s       | ~16 GB/s       | ~14 GB/s       | 1.4x faster  | 1.6x faster

Push Streams

Scenario                         | New Stream     | Web Stream     | Node Stream    | New vs Web   | New vs Node
---------------------------------+----------------+----------------+----------------+--------------+-------------
Concurrent push (4KB x 1000)     | ~140 MB/s      | ~140 MB/s      | ~150 MB/s      | ~same        | ~same
Many small writes (64B x 10000)  | ~75 MB/s       | ~73 MB/s       | ~80 MB/s       | ~same        | ~same
Batch writes (512B x 20 x 200)   | ~140 MB/s      | ~140 MB/s      | ~160 MB/s      | ~same        | ~same
Push + async iter (2KB x 1000)   | ~165 MB/s      | ~165 MB/s      | ~170 MB/s      | ~same        | ~same

Transforms

Scenario                         | New Stream     | Web Stream     | Node Stream    | New vs Web   | New vs Node
---------------------------------+----------------+----------------+----------------+--------------+-------------
Identity transform (8KB x 1000)  | ~310 GB/s      | ~5 GB/s        | ~6 GB/s        | 63x faster   | 52x faster
XOR transform (8KB x 500)        | ~1.3 GB/s      | ~1.1 GB/s      | ~1.2 GB/s      | ~same        | ~same
Expanding 1:2 (4KB x 500)        | ~63 GB/s       | ~4 GB/s        | ~4.5 GB/s      | 15x faster   | 14x faster
Chained 3x (8KB x 500)           | ~175 GB/s      | ~2 GB/s        | ~2.5 GB/s      | 86x faster   | 70x faster
Async transform (8KB x 300)      | ~133 GB/s      | ~5 GB/s        | ~5.5 GB/s      | 29x faster   | 24x faster

Pipelines

Scenario                         | New Stream     | Web Stream     | Node Stream    | New vs Web   | New vs Node
---------------------------------+----------------+----------------+----------------+--------------+-------------
Simple pipeline (8KB x 1000)     | ~175 GB/s      | ~9 GB/s        | ~8 GB/s        | 19x faster   | 22x faster
Pipeline + XOR (8KB x 1000)      | ~1.3 GB/s      | ~1.1 GB/s      | ~1.2 GB/s      | ~same        | ~same
Multi-stage 3x (8KB x 500)       | ~200 GB/s      | ~2 GB/s        | ~2.5 GB/s      | 104x faster  | 80x faster
High-freq (64B x 20000)          | ~3.8 GB/s      | ~200 MB/s      | ~250 MB/s      | 19x faster   | 15x faster

Consumption

Scenario                         | New Stream     | Web Stream     | Node Stream    | New vs Web   | New vs Node
---------------------------------+----------------+----------------+----------------+--------------+-------------
bytes() (16KB x 500)             | ~4.5 GB/s      | ~8 GB/s        | ~6 GB/s        | ~same        | ~same
text() (1KB chunks)              | ~1.6 GB/s      | ~1 GB/s        | ~1.1 GB/s      | 1.6x faster  | 1.4x faster
Async iteration (8KB x 1000)     | ~280 GB/s      | ~22 GB/s       | ~18 GB/s       | 13x faster   | 15x faster
Iterator loop (8KB x 1000)       | ~410 GB/s      | ~27 GB/s       | ~22 GB/s       | 15x faster   | 18x faster

Key Findings

Where New Streams Excels (vs both Web Streams and Node.js Streams)

  1. Chained transforms (70-104x faster): Batching dramatically reduces async overhead when data flows through multiple stages
  2. Simple pipelines (19-22x faster): Less machinery for basic iteration
  3. Small/tiny chunks (2-5x faster): Batching amortizes per-iteration overhead
  4. Async iteration (12-18x faster): Native async iterable protocol combined with batching is extremely efficient
  5. Identity/expanding transforms (14-63x faster): When transforms don't do heavy computation, batching dominates

Equivalent Performance (All Three APIs)

  1. CPU-bound transforms (XOR): When the transform does significant work per byte, the overhead difference is masked
  2. Push streams: All three APIs have similar backpressure overhead
  3. Large chunk throughput: When chunks are large, all APIs are limited by memory bandwidth
  4. bytes() collection: All require the same concatenation work

Node.js Streams vs Web Streams

Node.js Streams and Web Streams generally perform similarly, with Node.js Streams having a slight edge in some push-based scenarios due to its mature backpressure implementation. Both are significantly slower than the New Streams API in scenarios where async overhead dominates.

Notes

  1. Batched model tradeoff: The new API yields Uint8Array[] batches instead of single Uint8Array chunks. This amortizes async overhead but adds array allocation for the batch wrapper.

  2. GC pressure in benchmarks: Rapid benchmarking (100+ iterations) can trigger GC pauses that disproportionately affect the New Streams API due to its object allocation patterns. Real-world usage with longer intervals between stream operations doesn't show this effect.

  3. Node.js environment: These benchmarks run in Node.js. Browser performance may differ due to different stream implementations.

  4. Variance: Results can vary by 10-20% between runs due to GC, JIT warmup, and system load. "Within noise" or "~same" means the difference is not statistically significant.

  5. Reference implementation: This is not a production-optimized implementation. A native implementation could be faster.

  6. Benchmark ordering: CPU-intensive benchmarks (like XOR transform) use global warmup of all three implementations before measurement to avoid JIT compilation bias.

  7. Node.js Streams specifics: Node.js Streams use Readable, Transform, PassThrough, and Writable from the node:stream module. For push-based scenarios, PassThrough with cork()/uncork() is used for batch writes.