Skip to content

fix: Put qlog behind a feature flag#3005

Closed
larseggert wants to merge 17 commits into
mozilla:mainfrom
larseggert:feat-1894
Closed

fix: Put qlog behind a feature flag#3005
larseggert wants to merge 17 commits into
mozilla:mainfrom
larseggert:feat-1894

Conversation

@larseggert
Copy link
Copy Markdown
Collaborator

@larseggert larseggert commented Sep 22, 2025

Also remove qlog::add_event_data_now, since we need to pass a timestamp around more places now anyhow. And rename QlogMetric to qlog::Metric.

This is pretty intrusive. Anyone got an idea how to make this cleaner? Edit: It's a bit better now.

Fixes #1894

Also remove `qlog::add_event_data_now`, since we need to pass a timestamp around more places now anyhow.

This is pretty intrusive. Anyone got an idea how to make this cleaner?

Fixes mozilla#1894
@mxinden
Copy link
Copy Markdown
Member

mxinden commented Sep 22, 2025

Thanks for exploring the change.

This adds a lot of noise. Thus I suggest only proceeding here in case it has a significant performance win.

What will we do on the Firefox side? Currently one can dynamically enable qlog logging at runtime. Would we always enable the feature in Firefox? If so, how we run our Neqo tests would diverge to how we run Neqo in Firefox.

@larseggert
Copy link
Copy Markdown
Collaborator Author

I would disable in release builds. So we ever ask for qlogs for bugzilla tickets?

Copy link
Copy Markdown
Member

@martinthomson martinthomson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems like the most invasive action here is to make some Instant arguments conditional. I would not worry about that: we can pass them through all the way. There are relatively few places where we need to invoke Instant::now() to have that work and the argument lists don't bloat too much as a result.

Overall, the way I'd deal with this is to limit the conditional compilation to the qlog wrapper. That can make the inner object conditional and stub out all the functions.

Because most of the work is deferred and uses closures, the compiler will be able to recognize that the closures aren't run, so it can erase that code. I don't know how far up the call tree that will go in terms of killing code, but it should be far enough.

Comment thread neqo-http3/src/connection.rs Outdated
@martinthomson
Copy link
Copy Markdown
Member

Do we ever ask for qlogs for bugzilla tickets?

Sometimes, yes. Though maybe we can deliver a custom build in those cases.

@larseggert
Copy link
Copy Markdown
Collaborator Author

larseggert commented Sep 23, 2025

Because most of the work is deferred and uses closures, the compiler will be able to recognize that the closures aren't run, so it can erase that code. I don't know how far up the call tree that will go in terms of killing code, but it should be far enough.

That's what I was wondering, because if I only stub it out in neqo-common, only that crate will gain a qlog feature and hence all the other crates should be unaware of what that is set to and generate code for qlogging, even if that feature is off?

Let me see what it would look like if we also stubbed out qlog in the other crates.

@codecov
Copy link
Copy Markdown

codecov Bot commented Sep 23, 2025

Codecov Report

❌ Patch coverage is 92.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.10%. Comparing base (b9c32c7) to head (88508a1).
⚠️ Report is 91 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3005      +/-   ##
==========================================
- Coverage   93.41%   93.10%   -0.32%     
==========================================
  Files         124      124              
  Lines       36234    36254      +20     
  Branches    36234    36254      +20     
==========================================
- Hits        33847    33753      -94     
- Misses       1540     1655     +115     
+ Partials      847      846       -1     
Components Coverage Δ
neqo-common 95.64% <ø> (-1.68%) ⬇️
neqo-crypto 83.25% <ø> (-0.48%) ⬇️
neqo-http3 92.88% <91.66%> (-0.42%) ⬇️
neqo-qpack 94.18% <ø> (ø)
neqo-transport 94.33% <100.00%> (-0.15%) ⬇️
neqo-udp 78.94% <ø> (-0.48%) ⬇️
mtu 85.76% <ø> (ø)

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Sep 23, 2025

🐰 Bencher Report

Branchfeat-1894
TestbedOn-prem

🚨 1 Alert

IterationBenchmarkMeasure
Units
ViewBenchmark Result
(Result Δ%)
Upper Boundary
(Limit %)
2neqo vs. google (cubic, paced)Latency
milliseconds (ms)
📈 plot
🚷 threshold
🚨 alert (🔔)
779.04 ms
(+2.54%)Baseline: 759.76 ms
775.08 ms
(100.51%)

Click to view all benchmark results
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
google vs. neqo (cubic, paced)📈 view plot
🚷 view threshold
279.27 ms
(+0.33%)Baseline: 278.35 ms
282.74 ms
(98.78%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
msquic vs. neqo (cubic, paced)📈 view plot
🚷 view threshold
228.25 ms
(+14.54%)Baseline: 199.27 ms
236.95 ms
(96.33%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
neqo vs. google (cubic, paced)📈 view plot
🚷 view threshold
🚨 view alert (🔔)
779.04 ms
(+2.54%)Baseline: 759.76 ms
775.08 ms
(100.51%)

BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
neqo vs. msquic (cubic, paced)📈 view plot
🚷 view threshold
157.95 ms
(+0.11%)Baseline: 157.78 ms
160.60 ms
(98.35%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
neqo vs. neqo (cubic)📈 view plot
🚷 view threshold
93.01 ms
(+1.60%)Baseline: 91.55 ms
96.86 ms
(96.03%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
neqo vs. neqo (cubic, paced)📈 view plot
🚷 view threshold
92.23 ms
(-0.71%)Baseline: 92.90 ms
98.08 ms
(94.04%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
neqo vs. neqo (reno)📈 view plot
🚷 view threshold
91.84 ms
(+0.34%)Baseline: 91.53 ms
96.68 ms
(95.00%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
neqo vs. neqo (reno, paced)📈 view plot
🚷 view threshold
94.95 ms
(+2.33%)Baseline: 92.79 ms
97.78 ms
(97.11%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
neqo vs. quiche (cubic, paced)📈 view plot
🚷 view threshold
194.30 ms
(+0.34%)Baseline: 193.64 ms
196.96 ms
(98.65%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
neqo vs. s2n (cubic, paced)📈 view plot
🚷 view threshold
220.83 ms
(-0.14%)Baseline: 221.14 ms
224.10 ms
(98.54%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
quiche vs. neqo (cubic, paced)📈 view plot
🚷 view threshold
156.69 ms
(+2.33%)Baseline: 153.13 ms
158.47 ms
(98.88%)
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
s2n vs. neqo (cubic, paced)📈 view plot
🚷 view threshold
169.97 ms
(-2.19%)Baseline: 173.77 ms
178.02 ms
(95.47%)
🐰 View full continuous benchmarking report in Bencher

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Sep 23, 2025

🐰 Bencher Report

Branchfeat-1894
TestbedOn-prem
Click to view all benchmark results
BenchmarkLatencyBenchmark Result
nanoseconds (ns)
(Result Δ%)
Upper Boundary
nanoseconds (ns)
(Limit %)
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client📈 view plot
🚷 view threshold
212,540,000.00 ns
(+2.78%)Baseline: 206,793,948.13 ns
216,504,270.13 ns
(98.17%)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client📈 view plot
🚷 view threshold
205,260,000.00 ns
(+2.16%)Baseline: 200,914,005.76 ns
211,157,536.76 ns
(97.21%)
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client📈 view plot
🚷 view threshold
38,943,000.00 ns
(+22.84%)Baseline: 31,702,706.05 ns
43,003,109.02 ns
(90.56%)
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client📈 view plot
🚷 view threshold
291,060,000.00 ns
(-0.10%)Baseline: 291,350,518.73 ns
303,600,377.09 ns
(95.87%)
1-streams/each-1000-bytes/simulated-time📈 view plot
🚷 view threshold
119,170,000.00 ns
(+0.42%)Baseline: 118,677,463.98 ns
120,711,446.87 ns
(98.72%)
1-streams/each-1000-bytes/wallclock-time📈 view plot
🚷 view threshold
582,310.00 ns
(-1.74%)Baseline: 592,610.29 ns
615,409.62 ns
(94.62%)
1000-streams/each-1-bytes/simulated-time📈 view plot
🚷 view threshold
2,331,000,000.00 ns
(-82.64%)Baseline: 13,426,680,979.83 ns
23,178,179,829.53 ns
(10.06%)
1000-streams/each-1-bytes/wallclock-time📈 view plot
🚷 view threshold
12,703,000.00 ns
(-8.19%)Baseline: 13,836,132.56 ns
15,087,187.42 ns
(84.20%)
1000-streams/each-1000-bytes/simulated-time📈 view plot
🚷 view threshold
16,161,000,000.00 ns
(-13.33%)Baseline: 18,645,951,008.65 ns
20,643,862,366.60 ns
(78.28%)
1000-streams/each-1000-bytes/wallclock-time📈 view plot
🚷 view threshold
51,328,000.00 ns
(+1.29%)Baseline: 50,675,239.19 ns
57,058,656.37 ns
(89.96%)
RxStreamOrderer::inbound_frame()📈 view plot
🚷 view threshold
109,790,000.00 ns
(+0.08%)Baseline: 109,697,665.71 ns
111,585,865.13 ns
(98.39%)
coalesce_acked_from_zero 1+1 entries📈 view plot
🚷 view threshold
89.47 ns
(+0.66%)Baseline: 88.87 ns
90.14 ns
(99.25%)
coalesce_acked_from_zero 10+1 entries📈 view plot
🚷 view threshold
105.81 ns
(-0.25%)Baseline: 106.08 ns
107.22 ns
(98.69%)
coalesce_acked_from_zero 1000+1 entries📈 view plot
🚷 view threshold
91.80 ns
(+1.63%)Baseline: 90.33 ns
94.94 ns
(96.69%)
coalesce_acked_from_zero 3+1 entries📈 view plot
🚷 view threshold
106.37 ns
(-0.20%)Baseline: 106.59 ns
107.67 ns
(98.79%)
decode 1048576 bytes, mask 3f📈 view plot
🚷 view threshold
1,762,400.00 ns
(+7.75%)Baseline: 1,635,675.79 ns
1,811,574.79 ns
(97.29%)
decode 1048576 bytes, mask 7f📈 view plot
🚷 view threshold
5,059,000.00 ns
(-0.15%)Baseline: 5,066,441.79 ns
5,112,867.74 ns
(98.95%)
decode 1048576 bytes, mask ff📈 view plot
🚷 view threshold
3,015,100.00 ns
(-0.47%)Baseline: 3,029,272.33 ns
3,053,892.36 ns
(98.73%)
decode 4096 bytes, mask 3f📈 view plot
🚷 view threshold
6,244.30 ns
(-15.05%)Baseline: 7,350.41 ns
10,365.47 ns
(60.24%)
decode 4096 bytes, mask 7f📈 view plot
🚷 view threshold
19,607.00 ns
(-0.98%)Baseline: 19,801.38 ns
20,462.67 ns
(95.82%)
decode 4096 bytes, mask ff📈 view plot
🚷 view threshold
11,338.00 ns
(-0.24%)Baseline: 11,365.34 ns
12,518.55 ns
(90.57%)
sent::Packets::take_ranges📈 view plot
🚷 view threshold
4,545.50 ns
(-3.70%)Baseline: 4,719.98 ns
4,959.22 ns
(91.66%)
transfer/pacing-false/same-seed/simulated-time/run📈 view plot
🚷 view threshold
25,234,000,000.00 ns
(-0.69%)Baseline: 25,409,179,710.14 ns
26,027,111,806.36 ns
(96.95%)
transfer/pacing-false/same-seed/wallclock-time/run📈 view plot
🚷 view threshold
24,493,000.00 ns
(-4.62%)Baseline: 25,679,956.52 ns
27,106,506.30 ns
(90.36%)
transfer/pacing-false/varying-seeds/simulated-time/run📈 view plot
🚷 view threshold
25,193,000,000.00 ns
(+0.07%)Baseline: 25,175,544,927.54 ns
25,224,734,939.74 ns
(99.87%)
transfer/pacing-false/varying-seeds/wallclock-time/run📈 view plot
🚷 view threshold
25,125,000.00 ns
(-2.48%)Baseline: 25,763,759.42 ns
27,409,475.32 ns
(91.67%)
transfer/pacing-true/same-seed/simulated-time/run📈 view plot
🚷 view threshold
25,301,000,000.00 ns
(-1.07%)Baseline: 25,575,881,159.42 ns
25,885,357,263.99 ns
(97.74%)
transfer/pacing-true/same-seed/wallclock-time/run📈 view plot
🚷 view threshold
26,212,000.00 ns
(-3.01%)Baseline: 27,026,626.09 ns
28,600,519.26 ns
(91.65%)
transfer/pacing-true/varying-seeds/simulated-time/run📈 view plot
🚷 view threshold
24,998,000,000.00 ns
(+0.01%)Baseline: 24,995,269,565.22 ns
25,043,744,345.85 ns
(99.82%)
transfer/pacing-true/varying-seeds/wallclock-time/run📈 view plot
🚷 view threshold
25,369,000.00 ns
(-3.41%)Baseline: 26,264,742.03 ns
27,991,141.52 ns
(90.63%)
🐰 View full continuous benchmarking report in Bencher

@larseggert larseggert marked this pull request as ready for review September 23, 2025 10:50
Copy link
Copy Markdown
Member

@martinthomson martinthomson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've picked up some improvements that probably need to be taken up separately.

Comment thread neqo-bin/src/client/mod.rs Outdated
Comment thread neqo-common/src/qlog.rs
Comment thread neqo-common/src/qlog.rs Outdated
Comment thread neqo-common/src/qlog.rs
Comment thread neqo-common/src/qlog.rs
Comment thread neqo-transport/src/qlog.rs
Comment thread test-fixture/src/lib.rs
Comment thread test-fixture/src/sim/http3_connection.rs Outdated
Comment thread neqo-http3/src/qlog.rs
Comment thread neqo-bin/src/lib.rs
larseggert added a commit to larseggert/neqo that referenced this pull request Sep 24, 2025
So `qlog` always logs the time we maintain. Broken out of mozilla#3005.
larseggert added a commit to larseggert/neqo that referenced this pull request Sep 24, 2025
Copilot AI review requested due to automatic review settings October 1, 2025 15:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR puts the qlog functionality behind a feature flag to make it optional, addressing issue #1894. The changes reorganize QLOG-related code to be conditionally compiled only when the qlog feature is enabled.

  • Adds qlog feature flags to all relevant Cargo.toml files with appropriate dependency chains
  • Wraps QLOG-related code with #[cfg(feature = "qlog")] attributes throughout the codebase
  • Renames QlogMetric to qlog::Metric for better namespacing
  • Updates build commands and CI workflows to include the qlog feature when needed

Reviewed Changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/test.sh Adds qlog feature to cargo build command
test-fixture/src/lib.rs Conditionally compiles qlog imports and usage
test-fixture/Cargo.toml Makes qlog dependency optional with feature flag
qns/Dockerfile Adds qlog feature to docker build
neqo-transport/src/server.rs Adds feature flags around qlog imports
neqo-transport/src/rtt.rs Updates QlogMetric references to qlog::Metric
neqo-transport/src/recovery/mod.rs Updates metric type references
neqo-transport/src/qlog.rs Adds extensive feature gating for qlog functionality
neqo-transport/src/packet/metadata.rs Feature gates qlog-related methods
neqo-transport/src/connection/tests/mod.rs Conditionally compiles qlog test setup
neqo-transport/src/connection/tests/migration.rs Updates qlog usage in tests
neqo-transport/src/cc/classic_cc.rs Feature gates qlog calls and updates metric types
neqo-transport/Cargo.toml Makes qlog optional dependency
neqo-qpack/Cargo.toml Makes qlog optional dependency
neqo-http3/tests/webtransport.rs Adds timestamp parameter to webtransport calls
neqo-http3/src/server_events.rs Adds timestamp parameter to create_stream method
neqo-http3/src/qlog.rs Feature gates qlog imports
neqo-http3/src/features/extended_connect/webtransport_session.rs Adds timestamp parameter
neqo-http3/src/features/extended_connect/tests/webtransport/sessions.rs Updates test calls with timestamp
neqo-http3/src/features/extended_connect/tests/webtransport/mod.rs Updates helper methods with timestamp
neqo-http3/src/features/extended_connect/session.rs Adds timestamp parameter to add_stream
neqo-http3/src/connection_server.rs Threads timestamp through webtransport calls
neqo-http3/src/connection_client.rs Threads timestamp through webtransport calls
neqo-http3/src/connection.rs Updates webtransport stream creation with timestamps
neqo-http3/src/buffered_send_stream.rs Minor formatting change
neqo-http3/Cargo.toml Makes qlog optional dependency
neqo-common/src/qlog.rs Extensive feature gating and stub implementation
neqo-common/Cargo.toml Makes qlog optional dependency
neqo-bin/src/server/mod.rs Feature gates qlog error conversion
neqo-bin/src/server/http3.rs Fixes unused parameter warning
neqo-bin/src/lib.rs Updates test conditional compilation
neqo-bin/src/client/mod.rs Feature gates qlog functionality
neqo-bin/src/client/http3.rs Simplifies qlog setup call
neqo-bin/Cargo.toml Makes qlog optional dependency
README.md Updates build instructions to mention qlog feature
.github/workflows/sanitize.yml Adds qlog feature to CI
.github/workflows/check.yml Adds qlog feature to CI

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment thread neqo-transport/src/qlog.rs Outdated
Comment thread neqo-transport/src/cc/classic_cc.rs Outdated
Copilot AI review requested due to automatic review settings October 1, 2025 17:05
Copy link
Copy Markdown
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am hesitant to disable qlog in Firefox at compile time. We plan to invest more time into our congestion controller. qlog is currently our only way to debug congestion control issues beyond basic logging, e.g. by visualizing the congestion window over time. I think qlog will become a valuable tool in this work.

If we don't disable qlog in Firefox by default, I don't think we should do so in Neqo either, as Neqo benchmarks should match Firefox as closely as possible.

I acknowledge the problem. I have seen qlog show up in various profiles, especially its usage of Rc. I would expect the reference counting to be relatively cheap, compared to say our crypto operations. It is counterintuitive to me. Does someone have a good theory? Is there an alternative to disabling it?

Comment thread neqo-http3/src/features/extended_connect/session.rs
@larseggert
Copy link
Copy Markdown
Collaborator Author

I am hesitant to disable qlog in Firefox at compile time.

We'd need to carefully explain to a user how to generate a qlog anyway. So I think @martinthomson's suggestion to ship them a custom build is therefore workable. Do you see a reason that wouldn't work?

@mxinden
Copy link
Copy Markdown
Member

mxinden commented Oct 8, 2025

Would work, yes. That said, I think we should make recording a qlog as low-friction as possible. Currently one has to simply set a pref. Shipping a custom build instead, of their specific Firefox version, for their specific OS version, is significantly more involved than that.

Yes qlog has an overhead. But compared to the many other low-hanging fruits we have in Firefox, is the above complexity worth the gain?

I am still puzzled why it shows up in profiles at all. We use Rc in various other hot paths. How is the Rc usage of qlog different?

@larseggert
Copy link
Copy Markdown
Collaborator Author

One way forward would be to merge this with a default to "on", so we can more easily flip that later.

Copilot AI review requested due to automatic review settings October 9, 2025 11:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 38 out of 38 changed files in this pull request and generated 6 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment thread neqo-http3/tests/webtransport.rs
Comment thread neqo-http3/tests/webtransport.rs
Comment thread neqo-http3/tests/webtransport.rs
Comment thread neqo-http3/tests/webtransport.rs
Comment thread neqo-http3/tests/webtransport.rs
Comment thread neqo-bin/src/lib.rs Outdated
@larseggert
Copy link
Copy Markdown
Collaborator Author

One way forward would be to merge this with a default to "on", so we can more easily flip that later.

I did this now.

Copilot AI review requested due to automatic review settings October 14, 2025 12:19
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated no new comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@larseggert larseggert requested a review from mxinden October 15, 2025 14:43
Copy link
Copy Markdown
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favor of merging the changes around explicitly passing now.

I am indifferent to the introduction of the qlog feature. Given that it is on by default, and given that we never disable it, the disabled code path is effectively dead code without a near-term use case. (Please correct me if I am missing a use-case.) I am all for making qlog more performant. I don't think that performance exploration justifies a dead feature flag in main.

Feel free to merge here in case you think the qlog feature is indeed useful.

Comment thread README.md Outdated
Comment thread README.md Outdated
Copilot AI review requested due to automatic review settings October 16, 2025 17:16
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 35 out of 35 changed files in this pull request and generated no new comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@larseggert larseggert dismissed martinthomson’s stale review October 16, 2025 17:21

Tried to re-request the review but GitHub doesn't let me.

@larseggert larseggert requested a review from mxinden November 12, 2025 06:51
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Nov 12, 2025

CodSpeed Performance Report

Merging #3005 will improve performances by 11.65%

Comparing larseggert:feat-1894 (88508a1) with main (b9c32c7)

Summary

⚡ 1 improvement
✅ 22 untouched

Benchmarks breakdown

Mode Benchmark BASE HEAD Change
Simulation client 852.3 ms 763.4 ms +11.65%

@github-actions
Copy link
Copy Markdown
Contributor

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to b9c32c7.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

@github-actions
Copy link
Copy Markdown
Contributor

Benchmark results

Performance differences relative to b9c32c7.

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💔 Performance has regressed.
       time:   [204.94 ms 205.26 ms 205.59 ms]
       thrpt:  [486.41 MiB/s 487.18 MiB/s 487.94 MiB/s]
change:
       time:   [+1.7802% +2.0308% +2.2733%] (p = 0.00 < 0.05)
       thrpt:  [-2.2228% -1.9904% -1.7490%]

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: 💔 Performance has regressed.
       time:   [289.46 ms 291.06 ms 292.73 ms]
       thrpt:  [34.161 Kelem/s 34.357 Kelem/s 34.548 Kelem/s]
change:
       time:   [+1.0421% +1.8390% +2.7153%] (p = 0.00 < 0.05)
       thrpt:  [-2.6435% -1.8058% -1.0314%]

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
       time:   [38.742 ms 38.943 ms 39.174 ms]
       thrpt:  [25.527   B/s 25.679   B/s 25.812   B/s]
change:
       time:   [-0.2618% +0.4629% +1.2102%] (p = 0.23 > 0.05)
       thrpt:  [-1.1958% -0.4608% +0.2625%]

Found 11 outliers among 100 measurements (11.00%)
3 (3.00%) high mild
8 (8.00%) high severe

1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💔 Performance has regressed.
       time:   [212.24 ms 212.54 ms 212.88 ms]
       thrpt:  [469.74 MiB/s 470.50 MiB/s 471.17 MiB/s]
change:
       time:   [+1.2497% +1.4713% +1.6921%] (p = 0.00 < 0.05)
       thrpt:  [-1.6639% -1.4500% -1.2343%]

Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [11.307 µs 11.338 µs 11.377 µs]
       change: [-1.1068% -0.4029% +0.1505%] (p = 0.24 > 0.05)

Found 18 outliers among 100 measurements (18.00%)
2 (2.00%) low severe
6 (6.00%) low mild
3 (3.00%) high mild
7 (7.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [2.9970 ms 3.0151 ms 3.0417 ms]
       change: [-0.1969% +0.5183% +1.3522%] (p = 0.24 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
11 (11.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [19.561 µs 19.607 µs 19.659 µs]
       change: [-0.7654% -0.3012% +0.1011%] (p = 0.18 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low severe
3 (3.00%) high mild
9 (9.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [5.0379 ms 5.0590 ms 5.0856 ms]
       change: [-0.2910% +0.2424% +0.9024%] (p = 0.41 > 0.05)

Found 17 outliers among 100 measurements (17.00%)
17 (17.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [6.2086 µs 6.2443 µs 6.2874 µs]
       change: [-0.0871% +0.4587% +1.0706%] (p = 0.13 > 0.05)

Found 7 outliers among 100 measurements (7.00%)
7 (7.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.7578 ms 1.7624 ms 1.7700 ms]
       change: [-0.1523% +0.1878% +0.6080%] (p = 0.46 > 0.05)

Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high severe

1-streams/each-1000-bytes/wallclock-time: Change within noise threshold.
       time:   [581.63 µs 582.31 µs 582.98 µs]
       change: [-1.8570% -1.2554% -0.7608%] (p = 0.00 < 0.05)
1-streams/each-1000-bytes/simulated-time: No change in performance detected.
       time:   [118.96 ms 119.17 ms 119.38 ms]
       thrpt:  [8.1801 KiB/s 8.1947 KiB/s 8.2095 KiB/s]
change:
       time:   [-0.0123% +0.2533% +0.5218%] (p = 0.06 > 0.05)
       thrpt:  [-0.5190% -0.2527% +0.0123%]

Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild

1000-streams/each-1-bytes/wallclock-time: No change in performance detected.
       time:   [12.659 ms 12.703 ms 12.747 ms]
       change: [-0.7218% -0.1683% +0.3682%] (p = 0.57 > 0.05)

Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1000-streams/each-1-bytes/simulated-time
time: [2.3275 s 2.3310 s 2.3345 s]
thrpt: [428.36 B/s 429.01 B/s 429.65 B/s]
change:
time: [-0.3193% -0.1129% +0.1008%] (p = 0.30 > 0.05)
thrpt: [-0.1007% +0.1130% +0.3204%]
No change in performance detected.

1000-streams/each-1000-bytes/wallclock-time: Change within noise threshold.
       time:   [51.202 ms 51.328 ms 51.454 ms]
       change: [+0.6878% +1.0238% +1.3794%] (p = 0.00 < 0.05)
1000-streams/each-1000-bytes/simulated-time: No change in performance detected.
       time:   [15.909 s 16.161 s 16.410 s]
       thrpt:  [59.509 KiB/s 60.426 KiB/s 61.385 KiB/s]
change:
       time:   [-3.4056% -1.1156% +1.2662%] (p = 0.33 > 0.05)
       thrpt:  [-1.2503% +1.1282% +3.5256%]

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [89.170 ns 89.465 ns 89.762 ns]
       change: [-1.0811% -0.3994% +0.1991%] (p = 0.23 > 0.05)

Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [106.04 ns 106.37 ns 106.73 ns]
       change: [-0.5475% -0.0752% +0.3487%] (p = 0.75 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low mild
1 (1.00%) high mild
11 (11.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [105.32 ns 105.81 ns 106.58 ns]
       change: [-0.9208% -0.1078% +0.9276%] (p = 0.83 > 0.05)

Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [91.629 ns 91.800 ns 91.980 ns]
       change: [-0.6966% -0.0793% +0.5928%] (p = 0.81 > 0.05)

Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [109.63 ms 109.79 ms 110.01 ms]
       change: [-0.3515% -0.1905% +0.0026%] (p = 0.03 < 0.05)

Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe

sent::Packets::take_ranges: No change in performance detected.
       time:   [4.4468 µs 4.5455 µs 4.6355 µs]
       change: [-5.2416% -2.0241% +1.0287%] (p = 0.23 > 0.05)
transfer/pacing-false/varying-seeds/wallclock-time/run: Change within noise threshold.
       time:   [25.087 ms 25.125 ms 25.163 ms]
       change: [+0.8368% +1.0845% +1.3192%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

transfer/pacing-false/varying-seeds/simulated-time/run: No change in performance detected.
       time:   [25.157 s 25.193 s 25.229 s]
       thrpt:  [162.36 KiB/s 162.59 KiB/s 162.82 KiB/s]
change:
       time:   [-0.2776% -0.0585% +0.1646%] (p = 0.59 > 0.05)
       thrpt:  [-0.1643% +0.0586% +0.2784%]

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

transfer/pacing-true/varying-seeds/wallclock-time/run: No change in performance detected.
       time:   [25.309 ms 25.369 ms 25.433 ms]
       change: [-0.4367% -0.0774% +0.2626%] (p = 0.68 > 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

transfer/pacing-true/varying-seeds/simulated-time/run: No change in performance detected.
       time:   [24.960 s 24.998 s 25.036 s]
       thrpt:  [163.61 KiB/s 163.86 KiB/s 164.10 KiB/s]
change:
       time:   [-0.1913% +0.0276% +0.2323%] (p = 0.80 > 0.05)
       thrpt:  [-0.2317% -0.0276% +0.1917%]
transfer/pacing-false/same-seed/wallclock-time/run: Change within noise threshold.
       time:   [24.471 ms 24.493 ms 24.517 ms]
       change: [-1.6039% -1.4213% -1.2631%] (p = 0.00 < 0.05)

Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe

transfer/pacing-false/same-seed/simulated-time/run: No change in performance detected.
       time:   [25.234 s 25.234 s 25.234 s]
       thrpt:  [162.32 KiB/s 162.32 KiB/s 162.32 KiB/s]
change:
       time:   [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
       thrpt:  [+0.0000% +0.0000% +0.0000%]
transfer/pacing-true/same-seed/wallclock-time/run: Change within noise threshold.
       time:   [26.184 ms 26.212 ms 26.245 ms]
       change: [+0.5857% +0.7167% +0.8679%] (p = 0.00 < 0.05)

Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe

transfer/pacing-true/same-seed/simulated-time/run: No change in performance detected.
       time:   [25.301 s 25.301 s 25.301 s]
       thrpt:  [161.89 KiB/s 161.89 KiB/s 161.89 KiB/s]
change:
       time:   [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
       thrpt:  [+0.0000% +0.0000% +0.0000%]

Download data for profiler.firefox.com or download performance comparison data.

@github-actions
Copy link
Copy Markdown
Contributor

Client/server transfer results

Performance differences relative to b9c32c7.

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

Client vs. server (params) Mean ± σ Min Max MiB/s ± σ Δ main Δ main
google vs. google 476.2 ± 3.6 470.2 487.6 67.2 ± 8.9
google vs. neqo (cubic, paced) 279.3 ± 4.3 270.1 288.4 114.6 ± 7.4 💚 -2.6 -0.9%
msquic vs. msquic 207.2 ± 83.3 142.7 549.5 154.5 ± 0.4
msquic vs. neqo (cubic, paced) 228.2 ± 82.6 157.4 582.1 140.2 ± 0.4 5.0 2.2%
neqo vs. google (cubic, paced) 779.0 ± 3.5 772.6 789.5 41.1 ± 9.1 💔 1.6 0.2%
neqo vs. msquic (cubic, paced) 158.0 ± 4.6 150.6 166.6 202.6 ± 7.0 0.5 0.3%
neqo vs. neqo (cubic) 93.0 ± 4.2 86.1 103.5 344.0 ± 7.6 -1.2 -1.3%
neqo vs. neqo (cubic, paced) 92.2 ± 4.1 81.9 108.2 346.9 ± 7.8 -0.8 -0.8%
neqo vs. neqo (reno) 91.8 ± 4.5 83.6 105.0 348.4 ± 7.1 💚 -2.0 -2.2%
neqo vs. neqo (reno, paced) 94.9 ± 5.1 87.5 118.2 337.0 ± 6.3 -0.3 -0.3%
neqo vs. quiche (cubic, paced) 194.3 ± 5.1 187.5 216.6 164.7 ± 6.3 💔 1.8 0.9%
neqo vs. s2n (cubic, paced) 220.8 ± 4.7 213.4 230.8 144.9 ± 6.8 💚 -2.2 -1.0%
quiche vs. neqo (cubic, paced) 156.7 ± 4.7 143.1 167.6 204.2 ± 6.8 💔 1.9 1.2%
quiche vs. quiche 144.9 ± 4.6 137.2 156.9 220.9 ± 7.0
s2n vs. neqo (cubic, paced) 170.0 ± 4.2 161.7 182.7 188.3 ± 7.6 0.1 0.0%
s2n vs. s2n 247.4 ± 28.1 231.8 344.6 129.3 ± 1.1

Download data for profiler.firefox.com or download performance comparison data.

mxinden added a commit to mxinden/neqo that referenced this pull request Nov 12, 2025
Currently qlog is defined as:

```rust
pub struct Qlog {
    inner: Rc<RefCell<Option<SharedStreamer>>>,
}
```

mozilla#1894 documents our finding, that
dereferencing the `Rc<RefCell<_>>` is costly, i.e. showing up in our CPU
profiles.

When writing to the qlog fails, we set the (inner) `Option` to `None`. Thus on
each consecutive logging atttempt we dereference the `Rc<RefCell<_>>` only to
find out that logging is disabled.

This patch still sets the inner `Option` to `None` to inform all other clones of
the `Rc` that logging is disabled. In addition, it introduces an outer `Option`,
which as well is set to `None`, thus avoiding the dereferencing of the
`Rc<RefCell<_>>` on each logging attempt once logging has failed.

---

Fixes mozilla#1894.
Alternative for mozilla#3005.
@mxinden
Copy link
Copy Markdown
Member

mxinden commented Nov 12, 2025

In regards to feature flagging, @larseggert what do you think of #3129 instead? It is less intrusive, and should provide the same performance characteristics as this pull request.

If you agree, I suggest merging the changes related to passing now here, but not the feature flagging. I would then clean up #3129.

@larseggert
Copy link
Copy Markdown
Collaborator Author

Closing in favor of #3129.

@larseggert larseggert closed this Dec 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Put qlog behind a feature

4 participants