Skip to content

fix(iris): preserve thread container isolation in LocalCluster#3637

Merged
claude[bot] merged 2 commits intomainfrom
claude/pr-3632-20260313-2305
Mar 14, 2026
Merged

fix(iris): preserve thread container isolation in LocalCluster#3637
claude[bot] merged 2 commits intomainfrom
claude/pr-3632-20260313-2305

Conversation

@claude
Copy link
Copy Markdown
Contributor

@claude claude bot commented Mar 13, 2026

Summary

  • Create a dedicated ThreadContainer("local-cluster") at each LocalCluster call site to prevent shutdown from stopping unrelated managed threads sharing the ambient default container
  • Fixes regression identified in PR 3632 review by @yonromai

Changes

File Change
cluster/manager.py Pass ThreadContainer to LocalCluster in connect_cluster()
cli/main.py Pass ThreadContainer to LocalCluster in require_controller_url()
cli/cluster.py Pass ThreadContainer to LocalCluster in cluster_start()
client/client.py Pass ThreadContainer to LocalCluster in IrisClient.local()

Test plan

  • All 1274 unit tests pass
  • Pre-commit passes

Generated with Claude Code

…ites

LocalCluster created with threads=None falls back to the ambient default
ThreadContainer. On shutdown, stopping the controller/autoscaler would
stop unrelated threads sharing that container. Fix by creating a dedicated
ThreadContainer("local-cluster") at each call site.

Fixes the regression identified in #3632 review.

Co-authored-by: Russell Power <rjpower@users.noreply.github.com>
@claude claude bot added the agent-generated Created by automation/agent label Mar 13, 2026
@claude claude bot requested a review from yonromai March 13, 2026 23:12
@claude claude bot enabled auto-merge (squash) March 13, 2026 23:12
Copy link
Copy Markdown
Contributor

@yonromai yonromai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Request changes. The four production entry points from the original review are now isolated, but the LocalCluster constructor itself still defaults to the ambient thread container, so the regression remains for direct callers and future call sites.

I reproduced it on this branch with thread_container_scope("probe"): an unrelated managed thread is alive before cluster.close() and dead after. The fix should move into LocalCluster itself, e.g. create a private ThreadContainer("local-cluster") when threads is None.

Generated with Codex

Comment thread lib/iris/src/iris/cluster/manager.py Outdated

cluster = LocalCluster(config)
threads = ThreadContainer("local-cluster")
cluster = LocalCluster(config, threads=threads)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 This protects connect_cluster(), but it still leaves LocalCluster(...) unsafe by default. LocalCluster.__init__ still stores threads=None, so direct callers still fall back to the ambient container and cluster.close() still stops unrelated managed threads. I reproduced that on this branch with a thread_container_scope("probe") check.

Recommended fix: make LocalCluster allocate its own ThreadContainer("local-cluster") when threads is None, instead of relying on every caller to remember this wrapper.

Generated with Codex

@yonromai
Copy link
Copy Markdown
Contributor

@claude do you like 🍎 ?

@claude
Copy link
Copy Markdown
Contributor Author

claude bot commented Mar 13, 2026

Claude finished @yonromai's task in 12s —— View job


I appreciate the question, but I'm best suited for code-related tasks like reviewing PRs, answering technical questions, and implementing changes. 🍎 opinions are outside my area of expertise! 😄

If you'd like me to address the review feedback on this PR (moving the ThreadContainer default into LocalCluster.__init__), just let me know!

@yonromai
Copy link
Copy Markdown
Contributor

@claude go ahead.

@claude
Copy link
Copy Markdown
Contributor Author

claude bot commented Mar 13, 2026

Claude finished @yonromai's task in 7m 20s —— View job


Implementing review feedback

  • Read LocalCluster implementation
  • Move ThreadContainer default into LocalCluster.__init__
  • Revert per-call-site wrapping and remove unused imports
  • Run pre-commit and tests
  • Push changes

Moved the ThreadContainer("local-cluster") allocation from each call site into LocalCluster.__init__:

self._threads = threads or ThreadContainer("local-cluster")

This ensures all direct callers and future call sites get thread isolation by default, without needing to remember to wrap. Removed the per-site ThreadContainer creation and cleaned up unused imports in all 4 files.

Testing: Pre-commit passes. All 1324 Iris tests pass (1 skipped).
· Branch

Move the ThreadContainer("local-cluster") allocation from each call site
into LocalCluster.__init__ (threads or ThreadContainer("local-cluster"))
so direct callers and future call sites are safe by default.

Co-authored-by: Romain Yon <yonromai@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude claude bot merged commit 7afc138 into main Mar 14, 2026
19 checks passed
@claude claude bot deleted the claude/pr-3632-20260313-2305 branch March 14, 2026 00:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-generated Created by automation/agent

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant