[codex] Add container pool hysteresis#11850
Conversation
📝 WalkthroughWalkthroughAdds bucket-count and deallocation APIs to ChangesContainer retirement flow
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
1f8e2c4 to
71be00a
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a9a081c6-051d-4ded-add2-0b73d389ad36
📒 Files selected for processing (3)
source/core/slang-dictionary.hsource/slang/slang-container-pool.htools/slang-unit-test/unit-test-container-pool.cpp
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7c09d7a2-ebe9-4e9e-916c-323991d02a13
📒 Files selected for processing (1)
source/slang/slang-container-pool.h
| bool updateHashSetUnderuseStreakAndShouldRetire( | ||
| int objectIndex, | ||
| size_t liveCount, | ||
| size_t bucketCount) | ||
| { | ||
| // Hash sets normally keep their buckets after `clear()` so the next large use can reuse | ||
| // that storage. On long runs, though, a pool slot can become expensive if one large use is | ||
| // followed by many tiny uses: every return clears the same large table even though the live | ||
| // count stays small. Track that pattern per slot and only release the buckets after it | ||
| // repeats, so an occasional small use after a large pass does not cause allocation churn. | ||
| bool isUnderused = | ||
| bucketCount >= kContainerPoolHashSetMinRetireBucketCount && | ||
| liveCount <= bucketCount / kContainerPoolHashSetRetireUnderuseDivisor; | ||
| if (!isUnderused) | ||
| { | ||
| m_hashSetUnderuseStreaks[objectIndex] = 0; | ||
| return false; | ||
| } | ||
|
|
||
| if (m_hashSetUnderuseStreaks[objectIndex] < kContainerPoolHashSetRetireUnderuseCount) | ||
| m_hashSetUnderuseStreaks[objectIndex]++; | ||
|
|
||
| if (m_hashSetUnderuseStreaks[objectIndex] < kContainerPoolHashSetRetireUnderuseCount) | ||
| return false; | ||
|
|
||
| m_hashSetUnderuseStreaks[objectIndex] = 0; | ||
| return true; | ||
| } | ||
|
|
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add a "what" sentence and example to the function comment.
The comment explains the rationale well but jumps straight into the "why" without first stating what the function does (e.g., "Returns true if the pool slot at objectIndex has been underused for kContainerPoolHashSetRetireUnderuseCount consecutive returns and should have its hash-set buckets deallocated."). The guideline also calls for a concrete example for non-trivial logic, which this hysteresis decision is.
As per coding guidelines, "Comment functions in complete sentences: say what the function does first, then why if non-obvious; include a concrete example for non-trivial logic."
Suggested addition
+ // Returns true if the pool slot at `objectIndex` has now been underused for
+ // `kContainerPoolHashSetRetireUnderuseCount` consecutive returns and its bucket storage
+ // should be deallocated.
// Hash sets normally keep their buckets after `clear()` so the next large use can reuse
// that storage. On long runs, though, a pool slot can become expensive if one large use is
// followed by many tiny uses: every return clears the same large table even though the live
// count stays small. Track that pattern per slot and only release the buckets after it
// repeats, so an occasional small use after a large pass does not cause allocation churn.
+ //
+ // For example, with the default thresholds, a slot whose set grows to 4096+ buckets and is
+ // then returned twice in a row holding 512 or fewer live entries each time will have its
+ // buckets freed on the second return; a single underused return followed by heavy use again
+ // resets the streak and keeps the buckets.
bool isUnderused =📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| bool updateHashSetUnderuseStreakAndShouldRetire( | |
| int objectIndex, | |
| size_t liveCount, | |
| size_t bucketCount) | |
| { | |
| // Hash sets normally keep their buckets after `clear()` so the next large use can reuse | |
| // that storage. On long runs, though, a pool slot can become expensive if one large use is | |
| // followed by many tiny uses: every return clears the same large table even though the live | |
| // count stays small. Track that pattern per slot and only release the buckets after it | |
| // repeats, so an occasional small use after a large pass does not cause allocation churn. | |
| bool isUnderused = | |
| bucketCount >= kContainerPoolHashSetMinRetireBucketCount && | |
| liveCount <= bucketCount / kContainerPoolHashSetRetireUnderuseDivisor; | |
| if (!isUnderused) | |
| { | |
| m_hashSetUnderuseStreaks[objectIndex] = 0; | |
| return false; | |
| } | |
| if (m_hashSetUnderuseStreaks[objectIndex] < kContainerPoolHashSetRetireUnderuseCount) | |
| m_hashSetUnderuseStreaks[objectIndex]++; | |
| if (m_hashSetUnderuseStreaks[objectIndex] < kContainerPoolHashSetRetireUnderuseCount) | |
| return false; | |
| m_hashSetUnderuseStreaks[objectIndex] = 0; | |
| return true; | |
| } | |
| // Returns true if the pool slot at `objectIndex` has now been underused for | |
| // `kContainerPoolHashSetRetireUnderuseCount` consecutive returns and its bucket storage | |
| // should be deallocated. | |
| // Hash sets normally keep their buckets after `clear()` so the next large use can reuse | |
| // that storage. On long runs, though, a pool slot can become expensive if one large use is | |
| // followed by many tiny uses: every return clears the same large table even though the live | |
| // count stays small. Track that pattern per slot and only release the buckets after it | |
| // repeats, so an occasional small use after a large pass does not cause allocation churn. | |
| // | |
| // For example, with the default thresholds, a slot whose set grows to 4096+ buckets and is | |
| // then returned twice in a row holding 512 or fewer live entries each time will have its | |
| // buckets freed on the second return; a single underused return followed by heavy use again | |
| // resets the streak and keeps the buckets. | |
| bool isUnderused = | |
| bucketCount >= kContainerPoolHashSetMinRetireBucketCount && | |
| liveCount <= bucketCount / kContainerPoolHashSetRetireUnderuseDivisor; | |
| if (!isUnderused) | |
| { | |
| m_hashSetUnderuseStreaks[objectIndex] = 0; | |
| return false; | |
| } | |
| if (m_hashSetUnderuseStreaks[objectIndex] < kContainerPoolHashSetRetireUnderuseCount) | |
| m_hashSetUnderuseStreaks[objectIndex]++; | |
| if (m_hashSetUnderuseStreaks[objectIndex] < kContainerPoolHashSetRetireUnderuseCount) | |
| return false; | |
| m_hashSetUnderuseStreaks[objectIndex] = 0; | |
| return true; |
🧰 Tools
🪛 GitHub Actions: Check Formatting (comment /format to auto-fix) / 0_check-formatting.txt
[error] 94-94: Formatting check failed for C++ files. formatting.sh --check-only reported a diff in source/slang/slang-container-pool.h: 'bool isUnderused' line was reformatted (spacing/wrapping).
🪛 GitHub Actions: Check Formatting (comment /format to auto-fix) / check-formatting
[error] 94-97: Formatting check failed in ./extras/formatting.sh --check-only. Diff indicates clang-format (or formatting script) would reformat boolean initialization for isUnderused (line breaks/indentation).
Source: Coding guidelines
There was a problem hiding this comment.
wondering if the similar logic can also be used here?
There was a problem hiding this comment.
I think this "large use followed by small uses" cases should exist for all kinds of objects including List and Dictionary.
There was a problem hiding this comment.
I think Dictionary makes sense since it depends on hash-sets, but for List it's likely not helpful because the cost to clear a list is O(1)
There was a problem hiding this comment.
Verdict: ✅ Clean — no significant issues found
Adds hysteresis to ContainerPool so oversized pooled Dictionary/HashSet slots release their backing buckets after two consecutive underused returns (bucketCount ≥ 4096 and liveCount ≤ bucketCount/8). List pools are correctly left untouched. Streak arrays are pre-sized to kContainerPoolSize in the constructor; getObjectIndex() bounds-checks via SLANG_RELEASE_ASSERT; clearAndDeallocate() is invoked through the concrete typed pointer (before the void*/void* cast used for pool bookkeeping), avoiding any strict-aliasing concern with the shrink path.
Changes Overview
Container helpers (source/core/slang-dictionary.h)
- Adds
Dictionary::clearAndDeallocate()(swap with a freshInnerMapto release bucket storage) andDictionary::getBucketCount().HashSetBaseforwards both.Dictionary::clear()becomes a no-op when the map is already empty, avoiding amap.clear()call over a retained bucket array on repeated clears.
Pool hysteresis (source/slang/slang-container-pool.h)
- New
ObjectPool::getObjectIndex()computes the pool slot with aSLANG_RELEASE_ASSERTbounds check.ContainerPoolnow tracks per-slot underuse streaks for the dictionary and hash-set pools.free()measuresliveCount/bucketCountat return, increments the streak when underused (≥ 4096 buckets, ≤ 1/8 fill), resets it otherwise, and callsclearAndDeallocate()on the second consecutive underused return instead ofclear(). List pool path is unchanged.
Unit tests (tools/slang-unit-test/unit-test-container-pool.cpp, new)
- Covers:
clearAndDeallocatereleases buckets whileclearpreserves them; hash-set/dictionary slots retire only after the second underused return; a substantial use between two small uses resets the streak so retirement is deferred.
TLDR:
Improve management of the pre-allocated pool of containers by releasing overly large hash sets and dictionaries to avoid repeatedly performing expensive clears.
Improves code-gen performance on large codebases (e.g. Falcor) by around 15% since hash sets are heavily reused during simplification. Little effect on smaller shaders.
Motivation
Long compiler runs can repeatedly reuse the same IR container-pool hash-set or dictionary slot after one pass has grown it to a large bucket array. When later uses only insert a few entries, returning that container to the pool still pays the cost of clearing the large backing table. A single temporary spike can therefore leave a hot pool slot expensive for many later small uses.
The motivating case is a pooled hash table that grows past thousands of buckets, is returned to the pool, and is then reused for one-entry or otherwise small containers. The live count is small, but the clear cost still follows the retained bucket count.
Proposed Solution
Add conservative hysteresis to pooled hash sets and dictionaries. A return is considered underused when it has at least 4096 buckets and the live count is no more than one eighth of the bucket count. The pool trims that slot only after two consecutive underused returns for the same slot.
This keeps the fixed pool slot reusable while releasing oversized hash-table storage when the recent use pattern shows that the retained capacity is no longer helping. Lists are left unchanged because their clear behavior does not have the same large bucket-reset cost.
Change Summary
DictionaryandHashSetBase.Dictionary::clear()skip the underlying clear when the dictionary is already empty.ObjectPool::getObjectIndex()soContainerPoolcan track per-slot metadata.Concepts and Vocabulary
ContainerPooland reused across IR passes.Process Report
The core issue sits at the container reuse boundary rather than in any particular IR pass.
ContainerPool::getHashSet<T>()andContainerPool::getDictionary<T, U>()hand out fixed objects from their pools, and their matchingfree()paths clear those objects before returning the slots. If a previous use grew the underlying hash table to a large bucket array, a later small use inherits that capacity and pays for clearing it again on return.The fix therefore belongs in the
ContainerPool::free()paths, where both the final live count and the retained bucket count are available. The new logic records underuse per pool slot, not per call site, because the expensive retained storage also belongs to the slot. A single underused return still uses normalclear(), which preserves capacity for passes that alternate between large and small uses. The second consecutive underused return callsclearAndDeallocate()and resets the streak.clearAndDeallocate()is implemented in the core container wrapper by swapping with a fresh empty container. That gives the pool an explicit way to release backing storage instead of relying onclear(), whose normal contract is to remove entries while keeping capacity available for reuse. The newgetBucketCount()helper exposes only the capacity signal needed by the pool and by the tests.Hash sets are backed by dictionaries, and
ContainerPoolalso has a direct dictionary pool. Applying the same hysteresis to direct pooled dictionaries keeps the policy consistent for the two hash-table-backed container types while leaving list pools alone. Lists do not retain a bucket array, so applying this policy there would add bookkeeping without addressing the repeated hash-table clear cost that motivated the change.The empty guard in
Dictionary::clear()handles a related but narrower case: if a pooled dictionary or hash set is already empty, repeatedly clearing it should not ask the underlying map to reset a retained large table. This is a semantics-preserving fast path because an empty container remains empty.The focused unit tests grow a pooled hash set or dictionary, return it once with a tiny live count, and confirm the capacity is retained. They then repeat the underused return for the same slot and confirm the backing storage is released. A separate reset test performs a substantial use between two small uses to confirm the hysteresis streak resets instead of trimming a slot that is still plausibly benefiting from retained capacity.