Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: remove Array.p.shift primordial from hotpath of RS.read() #50340

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

debadree25
Copy link
Member

@debadree25 debadree25 commented Oct 23, 2023

Refs: nodejs/performance#82

A small improvement noticed in performance of read():

                                                   confidence improvement accuracy (*)   (**)  (***)
webstreams/readable-async-iterator.js n=100000                     0.75 %       ±1.93% ±2.57% ±3.35%
webstreams/readable-read.js type='byob' n=100000                   0.69 %       ±1.21% ±1.61% ±2.10%
webstreams/readable-read.js type='normal' n=100000        ***      8.81 %       ±1.55% ±2.07% ±2.70%

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. web streams labels Oct 23, 2023
@debadree25 debadree25 force-pushed the ft/remove-aps-readablestream branch from 7795772 to a836b44 Compare October 23, 2023 14:27
@debadree25
Copy link
Member Author

cc @nodejs/performance @nodejs/whatwg-stream

@anonrig anonrig added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. request-ci Add this label to start a Jenkins CI on a PR. labels Oct 23, 2023
@anonrig
Copy link
Member

anonrig commented Oct 23, 2023

cc @nodejs/primordials

@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Oct 23, 2023
@nodejs-github-bot
Copy link
Collaborator

@debadree25
Copy link
Member Author

benchmark ci https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1453/ (hope i started it correctly 😅)

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ljharb
Copy link
Member

ljharb commented Oct 23, 2023

Since shift is O(n), rather than removing a primordial, could this be made more performant by conceptually reversing the array throughout, so that this shift operation can become arr[arr.length - 1]; arr.length -= 1;?

@debadree25
Copy link
Member Author

debadree25 commented Oct 23, 2023

Since shift is O(n), rather than removing a primordial, could this be made more performant by conceptually reversing the array throughout, so that this shift operation can become arr[arr.length - 1]; arr.length -= 1;?

Can give that a try in a follow up although dont you think that might break the spec? i feel like it might, will report back with an investigation 👍

@ljharb
Copy link
Member

ljharb commented Oct 23, 2023

I'm hoping/assuming it's an internal implementation detail, but I look forward to your attempt!

@debadree25
Copy link
Member Author

Benchmark CI results:

23:08:36 webstreams/creation.js kind='ReadableStream.tee' n=50000                               1.81 %       ±2.93% ±3.89%  ±5.07%
23:08:36 webstreams/creation.js kind='ReadableStream' n=50000                                   1.54 %       ±4.85% ±6.46%  ±8.43%
23:08:36 webstreams/creation.js kind='ReadableStreamBYOBReader' n=50000                         0.16 %       ±6.40% ±8.52% ±11.09%
23:08:36 webstreams/creation.js kind='ReadableStreamDefaultReader' n=50000                     -3.95 %       ±5.36% ±7.14%  ±9.32%
23:08:36 webstreams/creation.js kind='TransformStream' n=50000                                 -3.64 %       ±3.77% ±5.02%  ±6.54%
23:08:36 webstreams/creation.js kind='WritableStream' n=50000                                   1.90 %       ±4.71% ±6.27%  ±8.18%
23:08:36 webstreams/js_transfer.js n=10000 payload='ReadableStream'                             1.45 %       ±1.76% ±2.34%  ±3.05%
23:08:36 webstreams/js_transfer.js n=10000 payload='TransformStream'                            1.14 %       ±1.33% ±1.77%  ±2.30%
23:08:36 webstreams/js_transfer.js n=10000 payload='WritableStream'                            -0.66 %       ±1.73% ±2.31%  ±3.00%
23:08:36 webstreams/pipe-to.js highWaterMarkW=1024 highWaterMarkR=1024 n=500000                 0.72 %       ±2.01% ±2.68%  ±3.49%
23:08:36 webstreams/pipe-to.js highWaterMarkW=1024 highWaterMarkR=2048 n=500000                -0.44 %       ±1.90% ±2.53%  ±3.30%
23:08:36 webstreams/pipe-to.js highWaterMarkW=1024 highWaterMarkR=4096 n=500000                -0.26 %       ±2.03% ±2.70%  ±3.51%
23:08:36 webstreams/pipe-to.js highWaterMarkW=1024 highWaterMarkR=512 n=500000                 -0.91 %       ±2.16% ±2.88%  ±3.76%
23:08:36 webstreams/pipe-to.js highWaterMarkW=2048 highWaterMarkR=1024 n=500000                 0.97 %       ±1.79% ±2.38%  ±3.10%
23:08:36 webstreams/pipe-to.js highWaterMarkW=2048 highWaterMarkR=2048 n=500000                 1.12 %       ±1.47% ±1.96%  ±2.55%
23:08:36 webstreams/pipe-to.js highWaterMarkW=2048 highWaterMarkR=4096 n=500000                 0.95 %       ±1.82% ±2.42%  ±3.15%
23:08:36 webstreams/pipe-to.js highWaterMarkW=2048 highWaterMarkR=512 n=500000                  0.75 %       ±2.23% ±2.97%  ±3.87%
23:08:36 webstreams/pipe-to.js highWaterMarkW=4096 highWaterMarkR=1024 n=500000                 1.17 %       ±1.87% ±2.48%  ±3.23%
23:08:36 webstreams/pipe-to.js highWaterMarkW=4096 highWaterMarkR=2048 n=500000                 0.07 %       ±2.04% ±2.71%  ±3.52%
23:08:36 webstreams/pipe-to.js highWaterMarkW=4096 highWaterMarkR=4096 n=500000                 0.32 %       ±1.79% ±2.39%  ±3.11%
23:08:36 webstreams/pipe-to.js highWaterMarkW=4096 highWaterMarkR=512 n=500000                 -0.72 %       ±2.17% ±2.89%  ±3.76%
23:08:36 webstreams/pipe-to.js highWaterMarkW=512 highWaterMarkR=1024 n=500000                 -0.59 %       ±1.88% ±2.50%  ±3.25%
23:08:36 webstreams/pipe-to.js highWaterMarkW=512 highWaterMarkR=2048 n=500000                 -1.26 %       ±2.06% ±2.74%  ±3.57%
23:08:36 webstreams/pipe-to.js highWaterMarkW=512 highWaterMarkR=4096 n=500000                  0.01 %       ±2.24% ±2.98%  ±3.87%
23:08:36 webstreams/pipe-to.js highWaterMarkW=512 highWaterMarkR=512 n=500000                   0.32 %       ±2.37% ±3.16%  ±4.13%
23:08:36 webstreams/readable-async-iterator.js n=100000                                        -1.20 %       ±3.09% ±4.16%  ±5.49%
23:08:36 webstreams/readable-read.js type='byob' n=100000                                *      3.50 %       ±2.83% ±3.76%  ±4.90%
23:08:36 webstreams/readable-read.js type='normal' n=100000                              *      8.67 %       ±6.71% ±8.93% ±11.65%

@nodejs-github-bot
Copy link
Collaborator

@targos
Copy link
Member

targos commented Oct 23, 2023

Benchmark CI results aren't really convincing: two small positive results when we should expect 1.4 false positives

@debadree25
Copy link
Member Author

debadree25 commented Oct 24, 2023

two small positive results when we should expect 1.4 false positives

Didn't get you 😅 could you explain?

edit: oh ok so this could be within statistical range of error? hmm it seems i was able to get higher confidence running it locally 🤔

@debadree25 debadree25 removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Oct 24, 2023
@aduh95
Copy link
Contributor

aduh95 commented Oct 26, 2023

oh ok so this could be within statistical range of error? hmm it seems i was able to get higher confidence running it locally 🤔

Maybe you are running the benchmark on a different platform (CPU arch, OS, etc.) than our CI, so that means either that change does not affect the benchmark CI, or either yours or the CI's result is a statistical error. Re-running the benchmarks with more data points would be a way to settle the question, but I would encourage you to investigate #50340 (comment) first, as there's a potential to get a greater perf win while preserving the robustness of the code.

@debadree25
Copy link
Member Author

i am investtigating #50340 (comment) turning this into draft for now

@debadree25 debadree25 marked this pull request as draft October 26, 2023 13:04
@debadree25

This comment was marked as outdated.

@debadree25
Copy link
Member Author

Hmm so i tried inverting the array doesnt seem to yield any performance benefits, tried something like this

diff --git a/lib/internal/webstreams/readablestream.js b/lib/internal/webstreams/readablestream.js
index a5f95596e3..f7dc3157ff 100644
--- a/lib/internal/webstreams/readablestream.js
+++ b/lib/internal/webstreams/readablestream.js
@@ -6,6 +6,7 @@ const {
   ArrayBufferPrototypeSlice,
   ArrayPrototypePush,
   ArrayPrototypeShift,
+  ArrayPrototypeUnshift,
   DataView,
   FunctionPrototypeBind,
   FunctionPrototypeCall,
@@ -2082,7 +2083,8 @@ function readableStreamFulfillReadRequest(stream, chunk, done) {
     reader,
   } = stream[kState];
   assert(reader[kState].readRequests.length);
-  const readRequest = ArrayPrototypeShift(reader[kState].readRequests);
+  const readRequest = reader[kState].readRequests[reader[kState].readRequests.length - 1];
+  reader[kState].readRequests.length -= 1;
 
   // TODO(@jasnell): It's not clear under what exact conditions done
   // will be true here. The spec requires this check but none of the
@@ -2110,7 +2112,7 @@ function readableStreamFulfillReadIntoRequest(stream, chunk, done) {
 function readableStreamAddReadRequest(stream, readRequest) {
   assert(readableStreamHasDefaultReader(stream));
   assert(stream[kState].state === 'readable');
-  ArrayPrototypePush(stream[kState].reader[kState].readRequests, readRequest);
+  stream[kState].reader[kState].readRequests = [readRequest, ...stream[kState].reader[kState].readRequests];
 }
 
 function readableStreamAddReadIntoRequest(stream, readIntoRequest) {
@@ -3156,9 +3158,11 @@ function readableByteStreamControllerProcessReadRequestsUsingQueue(controller) {
     if (queueTotalSize === 0) {
       return;
     }
+    const shifted = reader[kState].readRequests[reader[kState].readRequests.length - 1];
+    reader[kState].readRequests.length -= 1;
     readableByteStreamControllerFillReadRequestFromQueue(
       controller,
-      ArrayPrototypeShift(reader[kState].readRequests),
+      shifted,
     );
   }
 }

and the local benchmark gives:

                                                   confidence improvement accuracy (*)   (**)  (***)
webstreams/readable-async-iterator.js n=100000              *     -1.41 %       ±1.27% ±1.70% ±2.22%
webstreams/readable-read.js type='byob' n=100000                   0.11 %       ±2.14% ±2.84% ±3.70%
webstreams/readable-read.js type='normal' n=100000         **      1.47 %       ±1.03% ±1.37% ±1.78%

Be aware that when doing many comparisons the risk of a false-positive result increases.
In this case, there are 3 comparisons, you can thus expect the following amount of false-positive results:
  0.15 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.03 false positives, when considering a   1% risk acceptance (**, ***),
  0.00 false positives, when considering a 0.1% risk acceptance (***)

@benjamingr
Copy link
Member

fwiw the correct data structure for this is probably a deque

@ljharb
Copy link
Member

ljharb commented Oct 28, 2023

That array spread will kill speed, what if you use concat for that?

@debadree25
Copy link
Member Author

That array spread will kill speed, what if you use concat for that?

With concat stream[kState].reader[kState].readRequests = [readRequest].concat(stream[kState].reader[kState].readRequests);

                                                   confidence improvement accuracy (*)   (**)   (***)
webstreams/readable-async-iterator.js n=100000                     2.26 %       ±5.33% ±7.10%  ±9.27%
webstreams/readable-read.js type='byob' n=100000                   1.35 %       ±2.98% ±3.97%  ±5.18%
webstreams/readable-read.js type='normal' n=100000        ***    -12.67 %       ±6.52% ±8.70% ±11.36%

fwiw the correct data structure for this is probably a deque

This is interesting a linkedlist based deque could work here I guess!

@debadree25
Copy link
Member Author

Is there any internal implementation of a deque that can be reused? @benjamingr

@rluvaton
Copy link
Member

Is there any internal implementation of a deque that can be reused?

There was BufferList that @ronag just removed:

I had a PR to change from array to BufferList though and it did not have any change IIRC

also, @ronag moved away from it, and it made it faster

@debadree25
Copy link
Member Author

Ahh got it hmmm I havent gotten time yet to test but it does seem that doing array prototype shift frequently cases a bit of perf issue theres even an issue regarding it https://bugs.chromium.org/p/v8/issues/detail?id=12730

@ljharb
Copy link
Member

ljharb commented Oct 30, 2023

In your example, if you use unshift instead of concat, what do the benchmarks look like?

@debadree25
Copy link
Member Author

Similarish results as #50340 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-benchmark-ci PR that need a benchmark CI run. needs-ci PRs that need a full CI run. web streams
Projects
None yet
Development

Successfully merging this pull request may close these issues.