fix(iris): add SYS_PTRACE to controller pod for profiling#4418

Merged
rjpower merged 2 commits into main from claude/issue-4416-20260404-0352
Apr 4, 2026

Conversation

@ravwojdyla
Contributor

Adds SYS_PTRACE capability to the controller Deployment so py-spy and memray profiling works on the controller pod.

Closes #4416

Generated with Claude Code

The controller Deployment was missing the SYS_PTRACE capability that
task pods already have, causing py-spy and memray to fail with
"Permission denied (os error 13)" when profiling the controller.

Closes #4416

Co-authored-by: Rafal Wojdyla <ravwojdyla@users.noreply.github.com>
provider.shutdown()


def test_controller_deployment_has_sys_ptrace():
Contributor Author

remove this test

],
"ports": [{"containerPort": port}],
"env": s3_env_vars,
"securityContext": {"capabilities": {"add": ["SYS_PTRACE"]}},
Contributor Author

are there other capabilities present in the other providers that we should include here?

],
"ports": [{"containerPort": port}],
"env": s3_env_vars,
"securityContext": {"capabilities": {"add": ["SYS_PTRACE"]}},
Contributor Author

will this work in k8s?

@ravwojdyla
Contributor Author

@claude address the comments above please

@claude
Contributor

claude Bot commented Apr 4, 2026

Claude finished @ravwojdyla's task in 2m 35s


Addressing PR review comments

  • Remove test as requested
  • Check other providers for capabilities to include
  • Address whether SYS_PTRACE works in k8s
  • Run pre-commit and tests
  • Push changes

Changes made

Removed the standalone test (test_controller_deployment_has_sys_ptrace) as requested.

Answers to review questions

Other capabilities from other providers? tasks.py (the worker pod builder) adds SYS_PTRACE and, conditionally, SYS_RESOURCE (for TPU memlock ulimits). The controller doesn't run on TPU nodes, so SYS_PTRACE alone is the right choice here.
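
As a rough sketch of that selection logic (the helper names and the `needs_tpu_memlock` flag are hypothetical and are not taken from tasks.py), the capability list could be built along these lines:

```python
def added_capabilities(needs_tpu_memlock: bool = False) -> list[str]:
    """Return the Linux capabilities to add to a container (illustrative only)."""
    caps = ["SYS_PTRACE"]  # lets py-spy and memray attach to the process
    if needs_tpu_memlock:
        # Assumption: only TPU workloads need to raise memlock ulimits.
        caps.append("SYS_RESOURCE")
    return caps


def security_context(needs_tpu_memlock: bool = False) -> dict:
    """Build the container-level securityContext fragment."""
    return {"capabilities": {"add": added_capabilities(needs_tpu_memlock)}}


# The controller never runs on TPU nodes, so it takes the default.
controller_ctx = security_context()
worker_tpu_ctx = security_context(needs_tpu_memlock=True)
```

Keeping the list in one place would make it harder for the controller and worker manifests to drift apart, though the actual code may be organized differently.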

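Will this work in k8s? Yes. securityContext.capabilities.add is a standard Kubernetes container-level field (core/v1 API). It's already used the same way in tasks.py for worker pods, so there's no compatibility concern. The only prerequisite is that the cluster's admission policy allows adding SYS_PTRACE. Note that the Pod Security Standards baseline profile does not list SYS_PTRACE among its allowed added capabilities, so a namespace enforcing baseline or restricted would reject the pod; the privileged profile (or no enforcement) permits it. On older clusters that still use PodSecurityPolicy, the capability must appear in allowedCapabilities.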
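
One way to confirm from inside the pod that the capability actually landed is to inspect the effective capability mask the kernel reports. The following is a minimal sketch, assuming a Linux container whose process runs as root (so added capabilities show up in `CapEff`); the function names are hypothetical, and `19` is the bit index of `CAP_SYS_PTRACE` in the kernel's capability header:

```python
CAP_SYS_PTRACE = 19  # bit index of CAP_SYS_PTRACE in linux/capability.h


def has_capability(cap_mask_hex: str, cap_bit: int = CAP_SYS_PTRACE) -> bool:
    """Return True if the given bit is set in a hex capability mask."""
    return bool(int(cap_mask_hex, 16) & (1 << cap_bit))


def read_effective_caps(status_path: str = "/proc/self/status") -> str:
    """Read the CapEff hex mask for the current process (Linux only)."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError(f"CapEff not found in {status_path}")
```

Calling `has_capability(read_effective_caps())` in the controller container should return `True` once the Deployment is rolled out with the new securityContext. Other settings, such as the node's Yama `ptrace_scope`, can still affect whether a profiler is able to attach.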

Testing

  • ./infra/pre-commit.py --all-files --fix — ✅ all checks pass
  • pytest lib/iris/tests/cluster/providers/k8s/test_coreweave.py — ✅ 22 tests pass

The capability is already validated implicitly by
test_start_controller_creates_all_resources.

Co-authored-by: Rafal Wojdyla <ravwojdyla@users.noreply.github.com>
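
For anyone who wants to spot-check a rendered manifest by hand, a small helper along these lines could list the added capabilities per container. This is only a sketch: the function name, the example manifest, and the container name and image are all hypothetical, and it is not the check performed by test_start_controller_creates_all_resources.

```python
def added_caps_by_container(deployment: dict) -> dict[str, list[str]]:
    """Map each container name in a Deployment manifest to its added capabilities."""
    containers = deployment["spec"]["template"]["spec"]["containers"]
    return {
        c["name"]: c.get("securityContext", {}).get("capabilities", {}).get("add", [])
        for c in containers
    }


# Hypothetical, trimmed-down Deployment manifest used only for illustration.
example_deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "controller",
                        "image": "example/controller:latest",
                        "securityContext": {"capabilities": {"add": ["SYS_PTRACE"]}},
                    }
                ]
            }
        }
    },
}

caps = added_caps_by_container(example_deployment)
```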
@ravwojdyla ravwojdyla requested a review from rjpower April 4, 2026 04:26
Collaborator

@rjpower rjpower left a comment

Your guess is as good as mine! Off the top of my head ptrace is the one I recall adding to the docker runtime.

@rjpower rjpower merged commit 2399834 into main Apr 4, 2026
41 of 42 checks passed
@rjpower rjpower deleted the claude/issue-4416-20260404-0352 branch April 4, 2026 15:59
Helw150 pushed a commit that referenced this pull request Apr 8, 2026
Adds `SYS_PTRACE` capability to the controller Deployment so py-spy and
memray profiling works on the controller pod.

Closes #4416

Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Rafal Wojdyla <ravwojdyla@users.noreply.github.com>
Linked issue: [iris/CW] Add SYS_PTRACE to controller pod for profiling on K8s
