fix(iris): preserve thread container isolation in LocalCluster #3637
claude[bot] merged 2 commits into main
…ites
LocalCluster created with threads=None falls back to the ambient default
ThreadContainer. On shutdown, stopping the controller/autoscaler would
stop unrelated threads sharing that container. Fix by creating a dedicated
ThreadContainer("local-cluster") at each call site.
Fixes the regression identified in #3632 review.
Co-authored-by: Russell Power <rjpower@users.noreply.github.com>
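A minimal sketch of the failure mode this commit message describes, using stand-in classes rather than the real Iris code. Only the names `ThreadContainer`, `LocalCluster`, `threads`, and `"local-cluster"` come from the PR; the `AMBIENT` global, the `spawn`/`stop` methods, and the helper `unrelated_thread_survives` are hypothetical simplifications, and the real signatures and shutdown mechanics may differ.

```python
import threading


class ThreadContainer:
    """Hypothetical stand-in: owns a group of threads and stops them together."""

    def __init__(self, name):
        self.name = name
        self._stop = threading.Event()
        self._threads = []

    def spawn(self, label):
        # Each managed thread simply blocks until the container is stopped.
        t = threading.Thread(
            target=self._stop.wait, name=f"{self.name}/{label}", daemon=True
        )
        t.start()
        self._threads.append(t)
        return t

    def stop(self):
        self._stop.set()
        for t in self._threads:
            t.join(timeout=5)


AMBIENT = ThreadContainer("ambient")  # assumed model of the ambient default


class LocalCluster:
    """Hypothetical pre-fix model: threads=None falls back to the ambient container."""

    def __init__(self, config, threads=None):
        self._threads = threads if threads is not None else AMBIENT
        self._threads.spawn("controller")

    def close(self):
        # Stops every thread in whichever container the cluster ended up using.
        self._threads.stop()


def unrelated_thread_survives(dedicated):
    """Return True if an unrelated ambient thread outlives cluster.close()."""
    global AMBIENT
    AMBIENT = ThreadContainer("ambient")
    unrelated = AMBIENT.spawn("unrelated")
    # The per-call-site fix: hand the cluster its own container.
    threads = ThreadContainer("local-cluster") if dedicated else None
    cluster = LocalCluster({}, threads=threads)
    cluster.close()
    survived = unrelated.is_alive()
    AMBIENT.stop()  # clean up the demo threads
    return survived


if __name__ == "__main__":
    print("ambient fallback:", unrelated_thread_survives(dedicated=False))
    print("dedicated container:", unrelated_thread_survives(dedicated=True))
```

In this model the unrelated thread is stopped when the cluster falls back to the ambient container and survives when the call site passes a dedicated one.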
yonromai left a comment
🤖 Request changes. The four production entry points from the original review are now isolated, but the LocalCluster constructor itself still defaults to the ambient thread container, so the regression remains for direct callers and future call sites.
I reproduced it on this branch with thread_container_scope("probe"): an unrelated managed thread is alive before cluster.close() and dead after. The fix should move into LocalCluster itself, e.g. create a private ThreadContainer("local-cluster") when threads is None.
Generated with Codex
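The probe described in this review can be approximated as follows. This is a sketch under stated assumptions, not the reviewer's actual script: `thread_container_scope` is modelled here as a hypothetical context manager that swaps the ambient container, and `ThreadContainer`, `LocalCluster`, `connect_cluster`, and `probe` are simplified stand-ins whose real signatures may differ.

```python
import contextlib
import threading


class ThreadContainer:
    """Hypothetical stand-in: owns a group of threads and stops them together."""

    def __init__(self, name):
        self.name = name
        self._stop = threading.Event()
        self._threads = []

    def spawn(self, label):
        t = threading.Thread(
            target=self._stop.wait, name=f"{self.name}/{label}", daemon=True
        )
        t.start()
        self._threads.append(t)
        return t

    def stop(self):
        self._stop.set()
        for t in self._threads:
            t.join(timeout=5)


_ambient = ThreadContainer("default")  # assumed model of the ambient default


@contextlib.contextmanager
def thread_container_scope(name):
    """Hypothetical model: make a fresh container the ambient one inside the block."""
    global _ambient
    previous, _ambient = _ambient, ThreadContainer(name)
    try:
        yield _ambient
    finally:
        _ambient.stop()
        _ambient = previous


class LocalCluster:
    """Hypothetical model of the constructor after the first commit only."""

    def __init__(self, config, threads=None):
        self._explicit = threads  # still None for direct callers
        self._container().spawn("controller")

    def _container(self):
        return self._explicit if self._explicit is not None else _ambient

    def close(self):
        self._container().stop()


def connect_cluster(config):
    """Entry point patched in the first commit: passes a dedicated container."""
    return LocalCluster(config, threads=ThreadContainer("local-cluster"))


def probe(make_cluster):
    """Return (alive_before_close, alive_after_close) for an unrelated thread."""
    with thread_container_scope("probe") as scope:
        unrelated = scope.spawn("unrelated")
        cluster = make_cluster({})
        before = unrelated.is_alive()
        cluster.close()
        return before, unrelated.is_alive()
```

Calling `probe(connect_cluster)` models a patched entry point, while `probe(LocalCluster)` models a direct caller; in this sketch only the direct caller loses the unrelated thread on `close()`.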
- cluster = LocalCluster(config)
+ threads = ThreadContainer("local-cluster")
+ cluster = LocalCluster(config, threads=threads)
🤖 This protects connect_cluster(), but it still leaves LocalCluster(...) unsafe by default. LocalCluster.__init__ still stores threads=None, so direct callers still fall back to the ambient container and cluster.close() still stops unrelated managed threads. I reproduced that on this branch with a thread_container_scope("probe") check.
Recommended fix: make LocalCluster allocate its own ThreadContainer("local-cluster") when threads is None, instead of relying on every caller to remember this wrapper.
Generated with Codex
@claude do you like 🍎?
Claude finished @yonromai's task in 12s
I appreciate the question, but I'm best suited for code-related tasks like reviewing PRs, answering technical questions, and implementing changes. 🍎 opinions are outside my area of expertise! 😄 If you'd like me to address the review feedback on this PR (moving the …
@claude go ahead.
Claude finished @yonromai's task in 7m 20s
Implementing review feedback
Moved the ThreadContainer("local-cluster") allocation into LocalCluster.__init__:
self._threads = threads or ThreadContainer("local-cluster")
This ensures all direct callers and future call sites get thread isolation by default, without needing to remember to wrap. Removed the per-site allocations.
Testing: Pre-commit passes. All 1324 Iris tests pass (1 skipped).
Move the ThreadContainer("local-cluster") allocation from each call site
into LocalCluster.__init__ (threads or ThreadContainer("local-cluster"))
so direct callers and future call sites are safe by default.
Co-authored-by: Romain Yon <yonromai@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
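A minimal sketch of the constructor-level default this commit describes, again with stand-in classes. The expression `threads or ThreadContainer("local-cluster")` is quoted from the PR; everything else (the `AMBIENT` global, `spawn`, `stop`, and the variable names in the demo) is hypothetical and only illustrates the intended effect.

```python
import threading


class ThreadContainer:
    """Hypothetical stand-in: owns a group of threads and stops them together."""

    def __init__(self, name):
        self.name = name
        self._stop = threading.Event()
        self._threads = []

    def spawn(self, label):
        t = threading.Thread(
            target=self._stop.wait, name=f"{self.name}/{label}", daemon=True
        )
        t.start()
        self._threads.append(t)
        return t

    def stop(self):
        self._stop.set()
        for t in self._threads:
            t.join(timeout=5)


AMBIENT = ThreadContainer("ambient")  # assumed model of the ambient default


class LocalCluster:
    """Hypothetical post-fix model: a private container unless one is supplied."""

    def __init__(self, config, threads=None):
        # Expression quoted in the PR; the rest of this class is illustrative.
        self._threads = threads or ThreadContainer("local-cluster")
        self._threads.spawn("controller")

    def close(self):
        self._threads.stop()


# A direct caller no longer touches the ambient container.
unrelated = AMBIENT.spawn("unrelated")
cluster = LocalCluster({})
cluster.close()
survived = unrelated.is_alive()

# A caller-supplied container is still honoured.
shared = ThreadContainer("caller-owned")
explicit = LocalCluster({}, threads=shared)
uses_caller_container = explicit._threads is shared
explicit.close()

AMBIENT.stop()  # clean up the demo thread
```

One design point worth noting: `or` tests truthiness rather than `is None`. If a container type defined `__len__` or `__bool__` and evaluated as falsy while empty, an explicitly passed container would be silently replaced. Whether that applies to the real `ThreadContainer` is not shown in this PR; `threads if threads is not None else ThreadContainer("local-cluster")` would sidestep the question.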
Summary
Create a dedicated ThreadContainer("local-cluster") at each LocalCluster call site to prevent shutdown from stopping unrelated managed threads sharing the ambient default container.
Changes
| File | Change |
| cluster/manager.py | Pass a ThreadContainer to LocalCluster in connect_cluster() |
| cli/main.py | Pass a ThreadContainer to LocalCluster in require_controller_url() |
| cli/cluster.py | Pass a ThreadContainer to LocalCluster in cluster_start() |
| client/client.py | Pass a ThreadContainer to LocalCluster in IrisClient.local() |
Test plan
Generated with Claude Code