refactor(perf): Pool ActiveQuerys in the query stack #629

Merged
merged 5 commits into from
Mar 17, 2025

Conversation

Veykril
Member

@Veykril Veykril commented Dec 13, 2024

ActiveQuery is 192 bytes and contains a couple of collections whose backing allocations we can potentially reuse. Pooling those, instead of creating, pushing and popping the query from the stack each time, saves us from re-allocating unnecessarily in a bunch of cases.
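A minimal sketch of the pooling idea, not the code from this PR: the fields of `ActiveQuery`, the `QueryStack` layout and the `reset` helper below are hypothetical stand-ins chosen for illustration. Popped entries stay in the backing `Vec`, so a later push can reuse their collections' allocations instead of building a new query from scratch.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for salsa's `ActiveQuery`; the real struct has
/// different fields. Only the owned collections matter for this sketch.
#[derive(Default)]
struct ActiveQuery {
    key: u32,
    input_outputs: Vec<u32>,
    seen: HashSet<u32>,
}

impl ActiveQuery {
    /// Prepare the entry for reuse: the collections are emptied,
    /// but their allocated capacity is kept.
    fn reset(&mut self, key: u32) {
        self.key = key;
        self.input_outputs.clear();
        self.seen.clear();
    }
}

/// `stack[..len]` holds the live queries, `stack[len..]` is the pool
/// of previously popped entries waiting to be reused.
#[derive(Default)]
struct QueryStack {
    stack: Vec<ActiveQuery>,
    len: usize,
}

impl QueryStack {
    fn push(&mut self, key: u32) -> &mut ActiveQuery {
        if self.len == self.stack.len() {
            // Pool exhausted: this is the only place a new entry is created.
            self.stack.push(ActiveQuery::default());
        }
        let index = self.len;
        self.len += 1;
        let query = &mut self.stack[index];
        query.reset(key);
        query
    }

    /// Logically pops the top query; the entry itself stays in the pool.
    fn pop(&mut self) -> Option<u32> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        Some(self.stack[self.len].key)
    }
}

fn main() {
    let mut stack = QueryStack::default();
    stack.push(1).input_outputs.extend(0..100);
    let capacity = stack.stack[0].input_outputs.capacity();
    stack.pop();

    // The second push reuses the pooled entry and its allocation.
    let query = stack.push(2);
    assert!(query.input_outputs.is_empty());
    assert_eq!(query.input_outputs.capacity(), capacity);
}
```

The sketch relies on `Vec::clear` and `HashSet::clear` keeping their allocated memory, which is what makes the second push allocation-free here.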


netlify bot commented Dec 13, 2024

Deploy Preview for salsa-rs canceled.

Name Link
🔨 Latest commit 59912e8
🔍 Latest deploy log https://app.netlify.com/sites/salsa-rs/deploys/67d7f60c8b9b590008cbc9e1


codspeed-hq bot commented Dec 13, 2024

CodSpeed Performance Report

Merging #629 will improve performance by 9.72%

Comparing Veykril:veykril/push-vznumyusmzww (59912e8) with master (d758691)

Summary

⚡ 6 improvements
✅ 6 untouched benchmarks

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| new[Input] | 11.4 µs | 10.4 µs | +9.72% |
| new[SupertypeInput] | 17.5 µs | 16.3 µs | +7.33% |
| mutating[10] | 14.7 µs | 13.6 µs | +8.05% |
| mutating[20] | 15 µs | 13.9 µs | +7.92% |
| mutating[30] | 15.1 µs | 14 µs | +7.86% |
| converge_diverge | 147.8 µs | 142.6 µs | +3.63% |

@Veykril Veykril changed the title Do not pass ownership of the QueryStack in `Runtime::block_on_or_un… Pool ActiveQuerys in the query stack Dec 13, 2024
@Veykril Veykril marked this pull request as ready for review December 14, 2024 10:50
@MichaReiser
Contributor

Do you have any perf numbers from ra that confirm the improvement?

@Veykril
Member Author

Veykril commented Dec 23, 2024

Not yet, which is why I am fine with waiting on these PRs until salsa can be used for r-a as is, so it's easier for me to test.

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from acbb79e to e9f723b Compare January 4, 2025 10:09
@Veykril
Member Author

Veykril commented Jan 4, 2025

Huh that's odd, all I did was rebase on latest master and now we have a perf regression?

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch 2 times, most recently from 74e58b4 to 8a90d94 Compare January 4, 2025 10:36
@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from 8a90d94 to b3da8e1 Compare January 4, 2025 11:03
@Veykril
Member Author

Veykril commented Jan 4, 2025

Okay, locally I see (compared to #626):

  • +2% on Mutating Inputs/new/InternedInput
  • -10% on Mutating Inputs/amortized/InternedInput
  • +19% on Mutating Inputs/new/Input
  • -10% on Mutating Inputs/amortized/Input
  • +9% on many_tracked_structs

@Veykril
Member Author

Veykril commented Jan 4, 2025

Looking at the profiling graph in codspeed, this regression seems to come from different allocation behavior (within unrelated code), so I don't think this is a direct regression?

@MichaReiser
Contributor

Codspeed can be flaky at times, but 22% suggests that there's something wrong. The crossbeam_queue call now takes significantly longer... I think we should look into why.

@Veykril
Member Author

Veykril commented Jan 4, 2025

It can be flaky, but this report has been consistent across my 3 or so earlier rebases. My PR does change allocation patterns, so we might just be getting unlucky with the allocator here now. The SegQueue itself doesn't look special, it merely allocates in blocks of 31 items + pointer.

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from b3da8e1 to 05f5469 Compare January 4, 2025 13:37
@Veykril
Member Author

Veykril commented Jan 4, 2025

Okay, after fixing the benchmarks, this regression seems a lot more workable (and reasonable regarding my changes).

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from 05f5469 to f4d973b Compare January 4, 2025 13:49
@Veykril
Member Author

Veykril commented Jan 4, 2025

Okay, my latest drain changes are the culprit for that one. I do want to say that this PR will generally regress perf over our benches, as none of them do multiple queries in succession (which is where this PR shines). They tend to do one or two, which means this PR strictly does more allocation work than before.
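To illustrate why a benchmark that runs only one or two queries cannot show the benefit, here is a rough sketch under stated assumptions: `fresh_buffers` is a hypothetical helper, not salsa code, and it only counts how often a dependency buffer starts out without any capacity. It does not model the extra bookkeeping cost of the pool itself.

```rust
/// Runs `queries` queries back to back, each recording `deps` dependencies
/// (`deps` is assumed to be non-zero), and returns how many times a buffer
/// had to be grown from zero capacity.
fn fresh_buffers(queries: usize, deps: u32, pooled: bool) -> usize {
    // Hypothetical one-slot pool holding the buffer of the last finished query.
    let mut pool: Vec<u32> = Vec::new();
    let mut fresh = 0;
    for _ in 0..queries {
        // Pooled: take the previous buffer. Unpooled: always start empty.
        let mut buf = if pooled {
            std::mem::take(&mut pool)
        } else {
            Vec::new()
        };
        if buf.capacity() == 0 {
            fresh += 1;
        }
        buf.extend(0..deps);
        // `clear` drops the elements but keeps the allocation.
        buf.clear();
        if pooled {
            pool = buf;
        }
    }
    fresh
}

fn main() {
    // A single query: pooling saves nothing.
    assert_eq!(fresh_buffers(1, 8, true), fresh_buffers(1, 8, false));
    // Many queries in succession: one fresh buffer instead of one per query.
    assert_eq!(fresh_buffers(100, 8, true), 1);
    assert_eq!(fresh_buffers(100, 8, false), 100);
}
```

With a single query both variants pay for the same first allocation, so any pool overhead shows up as a net loss; the saving only appears once queries repeat.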

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from f4d973b to f3f6cc4 Compare January 4, 2025 13:50
@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from f3f6cc4 to 5d9e78c Compare February 17, 2025 12:50
@Veykril Veykril marked this pull request as draft February 17, 2025 12:52
@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from 5d9e78c to c2e00ab Compare February 23, 2025 14:38
@Veykril Veykril marked this pull request as ready for review February 23, 2025 14:46
@Veykril Veykril marked this pull request as draft February 28, 2025 07:26
@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from c2e00ab to 7d0cb98 Compare March 14, 2025 10:00
@Veykril Veykril marked this pull request as ready for review March 14, 2025 10:00
@Veykril Veykril changed the title Pool ActiveQuerys in the query stack perf: Pool ActiveQuerys in the query stack Mar 14, 2025
@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch 2 times, most recently from 8f9c605 to cdf040e Compare March 14, 2025 10:37
@Veykril
Member Author

Veykril commented Mar 14, 2025

The regression is a bit confusing to me; I can't quite interpret it. It shouldn't be an allocation, as the stack ought to have enough capacity at that point from the setup. But overall this seems to have some nice reductions.

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch 3 times, most recently from 07a84ba to f475f58 Compare March 15, 2025 11:26
@Veykril
Member Author

Veykril commented Mar 15, 2025

Alright, figured out the cause of the regression, now it's all green 🎉

@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from facbe76 to 67c60c7 Compare March 16, 2025 09:50
@Veykril Veykril changed the title perf: Pool ActiveQuerys in the query stack refactor(perf): Pool ActiveQuerys in the query stack Mar 17, 2025
@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from 67c60c7 to 27312ba Compare March 17, 2025 07:57
@Veykril Veykril force-pushed the veykril/push-vznumyusmzww branch from 27312ba to 59912e8 Compare March 17, 2025 10:14
@Veykril Veykril enabled auto-merge March 17, 2025 10:17
@Veykril Veykril added this pull request to the merge queue Mar 17, 2025
Merged via the queue into salsa-rs:master with commit 67dd29a Mar 17, 2025
11 checks passed
@Veykril Veykril deleted the veykril/push-vznumyusmzww branch March 17, 2025 10:32
2 participants