Add CoreWeave CI workflow for Iris PRs by rjpower · Pull Request #4174 · marin-community/marin

rjpower · 2026-03-26T16:36:48Z

Summary

New workflow .github/workflows/iris-coreweave-ci.yaml runs the Iris integration test suite (tests/integration/iris/) against a persistent CoreWeave cluster on every PR touching lib/iris/**
New cluster config lib/iris/examples/coreweave-ci.yaml with CPU and GPU×8 scale groups pinned at min=max=1 slice for warm starts
Unlike the existing CW smoke/canary workflows, nodepools are not torn down between runs — only controller and worker pods are reset via cluster controller restart
Concurrency limit of 1 (shared warm cluster); triggered by PR paths, /iris-ci-cw comment, or manual dispatch

Design

The workflow:

Deletes stale worker pods so autoscaler recreates them with fresh images
Runs iris cluster controller restart which rebuilds images and restarts the controller (or starts fresh on first run)
Opens a kubectl port-forward tunnel to the controller service
Runs the same integration test suite as iris-integration.yaml against the real CW cluster

Test plan

Verify workflow triggers correctly on PRs touching lib/iris/
Verify /iris-ci-cw comment trigger works
First run: controller starts fresh (no existing cluster)
Subsequent run: warm restart uses existing nodepools
Integration tests pass against the CW cluster

🤖 Generated with Claude Code

Analyze controller startup flow, scheduling loop, worker sync, autoscaler, and checkpoint systems. Identify all side-effectful operations that need gating for a --dry-run flag.

New workflow runs the Iris integration test suite against a warm CoreWeave cluster on every PR touching lib/iris/. Unlike the canary runs, nodepools persist between runs for fast startup — only controller and worker pods are reset via `cluster controller restart`. Files: - .github/workflows/iris-coreweave-ci.yaml: PR-triggered workflow - lib/iris/examples/coreweave-ci.yaml: cluster config with min=max=1 slices

claude · 2026-03-26T16:37:02Z

Claude finished @rjpower's task in 1m 4s —— View job

Review: PR #4174 — Add CoreWeave CI workflow for Iris PRs

Adds a GitHub Actions workflow to run the Iris integration test suite against a persistent CoreWeave cluster, plus a cluster config and a dry-run analysis doc. The workflow and cluster config are well-structured and consistent with the existing CW smoke workflow patterns.

Findings:

iris-controller-dry-run-analysis.md should not be committed to repo root. This is a research/analysis doc checked into the repository root. Per AGENTS.md, documentation should live in docs/ or a relevant subdirectory, and this file reads like a working scratchpad (with "Open questions" and implementation brainstorming). It should either be moved to .agents/projects/ if it's an active design artifact, filed as a GitHub issue, or removed from the PR entirely — it is not part of the CI workflow being added.
Port-forward race condition (workflow line 108-127). The health check loop polls curl -sf localhost:10000/healthz every 5s for up to 60 iterations (5 minutes). This is reasonable, but the kubectl port-forward process could silently die (e.g., if the controller pod restarts during the window). There is no check that $PF_PID is still alive during the polling loop. Consider adding kill -0 $PF_PID 2>/dev/null || { echo "port-forward died"; exit 1; } inside the loop.
$PF_PID in the cleanup step (line 137). The "Stop port-forward" step references $PF_PID via $GITHUB_ENV, which was set inside the "Run integration tests" step. If that step is skipped or fails before the echo line, $PF_PID will be empty and kill will receive no argument. This is harmless due to || true, but worth noting — could instead use kill %1 or pkill -f port-forward.
cancel-in-progress: false with single concurrency group (line 19-20). With cancel-in-progress: false, if two PRs trigger simultaneously, the second will queue indefinitely until the first completes (up to 60 min timeout). This is the intended design per the PR description (shared warm cluster), but it means CI can be blocked for extended periods. Consider documenting this behavior or adding a comment in the workflow about expected queue times for contributors.
No pull_request filter on types (line 4-6). The pull_request trigger fires on all activity types (opened, synchronize, reopened, labeled, etc.). For a workflow with a single concurrency group and potentially expensive runs, consider restricting to types: [opened, synchronize] to avoid unnecessary queuing on label/assignment changes.
Missing permissions: pull-requests for issue_comment trigger. The issue_comment trigger needs pull-requests: read to access github.event.issue.pull_request. This may work implicitly with the default token but is worth verifying — the existing smoke workflow has the same pattern so this is likely fine.
Cluster config: host_network: true (coreweave-ci.yaml line 18). This grants pods host networking. This matches the existing smoke config pattern, but in a persistent CI cluster it increases the blast radius if a test misbehaves. Flagging for awareness rather than as a bug.

Summary: The workflow is solid and follows established patterns from the existing CW smoke workflow. The main actionable item is removing or relocating iris-controller-dry-run-analysis.md from the repo root — it doesn't belong in this PR. The port-forward liveness check (item 2) and PR event type filtering (item 5) are worth addressing but not blockers.

Delete iris-controller-dry-run-analysis.md — research artifact that should not be in the PR.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 87830a3191

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-26T16:41:30Z

+          uv run pytest tests/integration/iris/ \
+            --controller-url "$IRIS_CONTROLLER_URL" \
+            -v --tb=short --timeout=600 \


Install test deps before running integration pytest

This step invokes uv run pytest ... --timeout=600 from the repo root, but the workflow never installs dev/test dependencies for the root workspace (unlike .github/workflows/iris-integration.yaml, which runs uv sync ... --group dev --extra=cpu --extra=dedup first). In this configuration, required pytest plugins/deps (notably pytest-timeout for --timeout) may be missing, so the job can fail before executing the integration suite.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-26T16:41:30Z

+      github.event_name == 'pull_request' ||
+      github.event_name == 'workflow_dispatch' ||


Restrict secret-dependent path for fork pull requests

The job runs unconditionally for pull_request events, but later steps require repo secrets (CW_KUBECONFIG, R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY). For PRs opened from forks, GitHub does not expose these secrets, so this path will fail systematically for external contributors; add a same-repo guard (as used in other workflows) or limit this trigger to trusted/manual entry points.

Useful? React with 👍 / 👎.

`cluster controller restart` first tries to tunnel to the existing controller, which hangs for 90s when the namespace doesn't exist (cold start). On K8s, `restart_controller()` just delegates to `start_controller()` anyway, so `cluster start` achieves the same result without the tunnel overhead.

- Label selector used `iris.managed` but Labels(iris-ci) generates `iris-iris-ci-managed` — pod reset was silently matching nothing. - Add fork guard to skip PR runs from forks (secrets would be empty). - Add --grace-period=0 to pod deletion for faster CI cleanup. - Fix misleading startup_grace_period comment.

The `iris cluster start` command uses `docker buildx build --push` with registry cache, which requires the buildx driver. GitHub Actions' default docker driver doesn't support cache export. Adding setup-buildx-action@v3 matches what docker-images.yaml does. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

rjpower · 2026-03-26T16:53:28Z

🤖 Fixed the docker buildx error: added docker/setup-buildx-action@v3 step before cluster start.

The iris cluster start command uses docker buildx build --push with registry cache export, which requires a non-default buildx driver. The GitHub Actions default docker driver doesn't support --cache-to. This is the same setup used in docker-images.yaml.

- Restrict pull_request trigger to types: [opened, synchronize] to avoid unnecessary runs on label/assignment changes - Add pull-requests: read permission for issue_comment trigger - Detect port-forward process death during health check loop - Use pkill fallback in cleanup step for robustness when PF_PID is empty Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The Iris controller exposes /health (not /healthz). The wrong endpoint caused the health poll to loop for 5 minutes then fail.

rjpower · 2026-03-26T17:33:55Z

🤖 Root cause of the CI stall: the health check was polling /healthz but the Iris controller exposes /health. Fixed — the endpoint now matches what's used in K8s liveness probes and throughout the codebase.

Verified by connecting to the running CoreWeave cluster:

Controller pod is Running (1/1)
curl /healthz → 404 Not Found
curl /health → {"status":"ok"}

rjpower · 2026-03-26T17:42:51Z

@claude the Coreweave CI test is failing in the port allocation test:

tests/integration/iris/test_iris_integration.py::test_port_allocation FAILED [ 33%]

=================================== FAILURES ===================================
_____________________________ test_port_allocation _____________________________
tests/integration/iris/test_iris_integration.py:95: in test_port_allocation
assert status.state == cluster_pb2.JOB_STATE_SUCCEEDED
E assert 5 == 4
E + where 5 = job_id: "/runner/itest-ports"\nstate: JOB_STATE_FAILED\nexit_code: 0\nerror: "Error"\nfinished_at {\n epoch_ms: 1774546698207\n}\nfailure_count: 1\npreemption_count: 0\ntasks {\n task_id: "/runner/itest-ports/0"\n state: TASK_STATE_FAILED\n worker_id: ""\n worker_address: ""\n exit_code: 1\n error: "Error"\n finished_at {\n epoch_ms: 1774546698207\n }\n current_attempt_id: 0\n attempts {\n attempt_id: 0\n worker_id: ""\n state: TASK_STATE_FAILED\n exit_code: 1\n error: "Error"\n finished_at {\n epoch_ms: 1774546698207\n }\n is_worker_failure: false\n }\n}\nname: "/runner/itest-ports"\nsubmitted_at {\n epoch_ms: 1774546688775\n}\nresources {\n cpu_millicores: 1000\n memory_bytes: 4294967296\n disk_bytes: 0\n}\npending_reason: ""\n.state
E + and 4 = cluster_pb2.JOB_STATE_SUCCEEDED
---------------------------- Captured stderr setup -----------------------------
I20260326 17:38:08 139810816472896 fray.v1.cluster.local_cluster Creating local cluster with spec: {}
I20260326 17:38:08 139810816472896 fray.v1.cluster.local_cluster Local cluster config: LocalClusterConfig(use_isolated_env=False)
------------------------------ Captured log setup ------------------------------
2026-03-26 17:38:08 INFO Creating local cluster with spec: {}
2026-03-26 17:38:08 INFO Local cluster config: LocalClusterConfig(use_isolated_env=False)
=============================== warnings summary ===============================
tests/integration/iris/test_kind_gpu_canary.py:87
/home/runner/work/marin/marin/tests/integration/iris/test_kind_gpu_canary.py:87: PytestUnknownMarkWarning: Unknown pytest.mark.e2e - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
pytestmark = pytest.mark.e2e

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/integration/iris/test_iris_integration.py::test_port_allocation - assert 5 == 4

where 5 = job_id: "/runner/itest-ports"\nstate: JOB_STATE_FAILED\nexit_code: 0\nerror: "Error"\nfinished_at {\n epoch_ms: 1774546698207\n}\nfailure_count: 1\npreemption_count: 0\ntasks {\n task_id: "/runner/itest-ports/0"\n state: TASK_STATE_FAILED\n worker_id: ""\n worker_address: ""\n exit_code: 1\n error: "Error"\n finished_at {\n epoch_ms: 1774546698207\n }\n current_attempt_id: 0\n attempts {\n attempt_id: 0\n worker_id: ""\n state: TASK_STATE_FAILED\n exit_code: 1\n error: "Error"\n finished_at {\n epoch_ms: 1774546698207\n }\n is_worker_failure: false\n }\n}\nname: "/runner/itest-ports"\nsubmitted_at {\n epoch_ms: 1774546688775\n}\nresources {\n cpu_millicores: 1000\n memory_bytes: 4294967296\n disk_bytes: 0\n}\npending_reason: ""\n.state
and 4 = cluster_pb2.JOB_STATE_SUCCEEDED
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!

let's just remove that test -- in practice we're using host networking in a VPC so we don't use ports.

claude · 2026-03-26T17:43:20Z

Claude finished @rjpower's task in 1m 17s —— View job

Removed `test_port_allocation` from integration tests

Read test file and identify the test to remove
Remove test_port_allocation and unused validate_ports import
Run pre-commit — all checks pass
Commit and push

Removed test_port_allocation and its validate_ports import since port allocation isn't used with host networking in a VPC. Pre-commit passes cleanly.

Port allocation is not used in practice since we use host networking in a VPC. The test was failing on CoreWeave CI. Co-authored-by: Russell Power <rjpower@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

rjpower · 2026-03-26T19:18:55Z

@claude it looks like exec_in_container isn't implemented correctly for the k8s provider:

Iris - CoreWeave CI / cw-ci-test (pull_request)
Iris - CoreWeave CI / cw-ci-test (pull_request)Failing after 3m

tests/integration/iris/test_iris_integration.py::test_profile_running_task SKIPPED [ 64%]
tests/integration/iris/test_iris_integration.py::test_exec_in_container FAILED [ 71%]

=================================== FAILURES ===================================
____________________________ test_exec_in_container ____________________________
tests/integration/iris/test_iris_integration.py:265: in test_exec_in_container
response = integration_cluster.controller_client.exec_in_container(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
lib/iris/src/iris/rpc/cluster_connect.py:2044: in exec_in_container
return self.execute_unary(
.venv/lib/python3.11/site-packages/connectrpc/_client_sync.py:210: in execute_unary
return self._execute_unary(request, ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.venv/lib/python3.11/site-packages/connectrpc/_client_sync.py:334: in _send_request_unary
raise ConnectWireError.from_response(resp).to_exception()
E connectrpc.errors.ConnectError: Task /runner/itest-exec/0 not assigned to a worker
---------------------------- Captured stderr setup -----------------------------
I20260326 17:48:18 140501602625344 fray.v1.cluster.local_cluster Creating local cluster with spec: {}
I20260326 17:48:18 140501602625344 fray.v1.cluster.local_cluster Local cluster config: LocalClusterConfig(use_isolated_env=False)
------------------------------ Captured log setup ------------------------------
2026-03-26 17:48:18 INFO Creating local cluster with spec: {}
2026-03-26 17:48:18 INFO Local cluster config: LocalClusterConfig(use_isolated_env=False)

can you fix? also why do i see a "local cluster" here? we should be directly using the CW via the @coreweave-ci.yaml configuration

claude · 2026-03-26T19:19:33Z

Claude finished @rjpower's task in 7m 29s —— View job

Fixed `exec_in_container` for K8s direct provider

Investigate why "local cluster" is being created instead of using CW
Investigate exec_in_container failure
Implement fix in K8sTaskProvider and controller service
Run pre-commit and tests
Commit and push

Root cause: The K8s provider uses a "direct provider" model (K8sTaskProvider) — tasks run as K8s pods directly, with no worker daemons. The controller's exec_in_container always assumed worker daemons exist and checked task.worker_id, which is empty in direct provider mode. This is the same issue that profile_task already handles correctly.

Fix (2 files):

lib/iris/src/iris/cluster/providers/k8s/tasks.py — Added exec_in_container method to K8sTaskProvider that uses kubectl exec to run commands in the task pod's task container.
lib/iris/src/iris/cluster/controller/service.py — Updated the controller's exec_in_container to check has_direct_provider when worker_id is empty (same pattern as profile_task at line 1481) and delegate to the K8s provider instead of raising an error.

"Local cluster" log: This comes from fray.v1.cluster.local_cluster — the Fray library's test infrastructure, not Iris. It's triggered during pytest collection/import of other test files in the same directory. The actual tests do use the CW controller via --controller-url.

Testing: Ran cd lib/iris && uv run --group dev pytest tests/ -k "test_exec" — 14 passed, 1 skipped, including test_exec_in_container in the local e2e suite. Pre-commit passes cleanly.

The controller's exec_in_container RPC assumed worker daemons exist, but the K8s provider uses a direct provider model (no workers). Add exec_in_container to K8sTaskProvider using kubectl exec, and update the controller service to fall back to the direct provider when no worker_id is set — matching the existing pattern used by profile_task. Co-authored-by: Russell Power <rjpower@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The test was using a local tmpdir as MARIN_PREFIX, but Zephyr coordinator jobs run on the CW cluster and can't access the CI runner's filesystem. Switch to s3://marin-na/temp/ci/<run-id>, upload synth data to S3, and add R2 credentials to the CI test step.

Local CI runs don't have R2 credentials, so fall back to a local tmpdir when MARIN_CI_S3_PREFIX is not set. The CW workflow sets it.

The marin-on-iris integration test was running executor_main in-process on the CI runner. Child Zephyr jobs on remote pods had no S3 credentials because Iris env propagation only works parent→child, and there was no parent Iris job. Submit the whole executor as an Iris job with S3 env vars so Zephyr coordinator/workers inherit them automatically. Also enable live log output (-s --log-cli-level=INFO) for the local iris integration test workflow.

The local integration test runs in-process (local filesystem is accessible), while the CoreWeave test submits the executor as an Iris job so child Zephyr jobs inherit S3 credentials via auto-propagation.

The HuggingFace datasets library injects requote_redirect_url=False into client_kwargs via its CompressionFilesystem. This aiohttp kwarg gets forwarded to aiobotocore.session.create_client() which rejects it, breaking all S3-backed reads (e.g. on R2/CoreWeave CI). Use fsspec open_url directly for jsonl and pyarrow for parquet — no need for the datasets library here.

fs.makedirs("/tmp/...") on an S3FileSystem tries to CreateBucket, which fails with AccessDenied on R2. The /tmp path is always local.

The classifier code (train + inference) uses patterns that don't work in distributed S3 environments: fs.makedirs on local /tmp paths via S3FileSystem, HF datasets injecting aiohttp kwargs into botocore, and single-node FileLock assumptions. Remove these steps from the integration test so the CW CI can pass. See #4183 for the tracking issue.

yonromai

YESSSSS

Replace the pytest wrapper with a standalone script that streams logs in real time. The test takes ~10 minutes and pytest swallows all output until completion, making failures hard to diagnose. The script runs as its own workflow step with `stream_logs=True` on the Iris job handle, so executor and child job output appears immediately. Inlines create_steps rather than importing from tests.integration_test to avoid sys.path hacks and to allow independent evolution of the pipeline steps for this test.

Delete the ephemeral smoke-cw workflow, dev-restart workflow, multihost canary config and script, and smoke-cw cluster config. All replaced by the always-on coreweave-ci cluster added in #4174. Rewrite run_smoke_cw.sh as a thin kubectl port-forward wrapper over pytest tests/integration/iris/. Deleted: iris-cloud-smoke-cw.yaml, iris-coreweave-dev-restart.yaml, smoke-cw.yaml, coreweave-canary-multihost.yaml, coreweave_multihost.py.

- New workflow `.github/workflows/iris-coreweave-ci.yaml` runs the Iris integration test suite (`tests/integration/iris/`) against a persistent CoreWeave cluster on every PR touching `lib/iris/**` - New cluster config `lib/iris/examples/coreweave-ci.yaml` with CPU and GPU×8 scale groups pinned at min=max=1 slice for warm starts - Unlike the existing CW smoke/canary workflows, nodepools are **not** torn down between runs — only controller and worker pods are reset via `cluster controller restart` - Concurrency limit of 1 (shared warm cluster); triggered by PR paths, `/iris-ci-cw` comment, or manual dispatch

Delete the ephemeral smoke-cw workflow, dev-restart workflow, multihost canary config and script, and smoke-cw cluster config. All replaced by the always-on coreweave-ci cluster added in #4174. Rewrite run_smoke_cw.sh as a thin kubectl port-forward wrapper over pytest tests/integration/iris/. Deleted: iris-cloud-smoke-cw.yaml, iris-coreweave-dev-restart.yaml, smoke-cw.yaml, coreweave-canary-multihost.yaml, coreweave_multihost.py.

rjpower added 3 commits March 25, 2026 11:02

research: iris controller dry-run mode analysis

1a6bc2e

Analyze controller startup flow, scheduling loop, worker sync, autoscaler, and checkpoint systems. Identify all side-effectful operations that need gating for a --dry-run flag.

Simplify: fix health check timeout, remove dead env var, trim comments

87830a3

rjpower added the agent-generated Created by automation/agent label Mar 26, 2026

Remove analysis document from PR

4c66d88

Delete iris-controller-dry-run-analysis.md — research artifact that should not be in the PR.

chatgpt-codex-connector Bot reviewed Mar 26, 2026

View reviewed changes

rjpower and others added 3 commits March 26, 2026 09:45

rjpower and others added 2 commits March 26, 2026 09:59

fix: use /health not /healthz for controller health check

6e45031

The Iris controller exposes /health (not /healthz). The wrong endpoint caused the health poll to loop for 5 minutes then fail.

github-actions Bot and others added 7 commits March 26, 2026 19:26

fix: gate S3 usage on MARIN_CI_S3_PREFIX env var

eb2f779

Local CI runs don't have R2 credentials, so fall back to a local tmpdir when MARIN_CI_S3_PREFIX is not set. The CW workflow sets it.

fix: only submit executor as Iris job on remote clusters

78c6c61

The local integration test runs in-process (local filesystem is accessible), while the CoreWeave test submits the executor as an Iris job so child Zephyr jobs inherit S3 credentials via auto-propagation.

fix: use os.makedirs for local /tmp path in classifier model download

49201af

fs.makedirs("/tmp/...") on an S3FileSystem tries to CreateBucket, which fails with AccessDenied on R2. The /tmp path is always local.

rjpower mentioned this pull request Mar 26, 2026

Fasttext classifier code doesn't work in distributed/S3 environments #4183

Open

rjpower requested a review from yonromai March 27, 2026 00:22

yonromai approved these changes Mar 27, 2026

View reviewed changes

rjpower force-pushed the work/akYgKHAu branch from a38e590 to e34cc94 Compare March 27, 2026 02:00

rjpower force-pushed the work/akYgKHAu branch from e34cc94 to f586ed8 Compare March 27, 2026 02:14

rjpower merged commit c5572e1 into main Mar 27, 2026
41 checks passed

rjpower deleted the work/akYgKHAu branch March 27, 2026 02:25

rjpower mentioned this pull request Mar 27, 2026

[iris] Remove obsolete CoreWeave CI workflows and configs #4197

Merged

yonromai mentioned this pull request Apr 7, 2026

Migrate all supported Marin code off Ray #4453

Closed

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add CoreWeave CI workflow for Iris PRs#4174

Add CoreWeave CI workflow for Iris PRs#4174
rjpower merged 19 commits intomainfrom
work/akYgKHAu

rjpower commented Mar 26, 2026

Uh oh!

claude Bot commented Mar 26, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Mar 26, 2026

Uh oh!

chatgpt-codex-connector Bot Mar 26, 2026

Uh oh!

rjpower commented Mar 26, 2026

Uh oh!

rjpower commented Mar 26, 2026

Uh oh!

rjpower commented Mar 26, 2026

Uh oh!

claude Bot commented Mar 26, 2026 •

edited

Loading

Uh oh!

rjpower commented Mar 26, 2026

Uh oh!

claude Bot commented Mar 26, 2026 •

edited

Loading

Uh oh!

yonromai left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		github.event_name == 'pull_request' \|\|
		github.event_name == 'workflow_dispatch' \|\|

Conversation

rjpower commented Mar 26, 2026

Summary

Design

Test plan

Uh oh!

claude Bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: PR #4174 — Add CoreWeave CI workflow for Iris PRs

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

rjpower commented Mar 26, 2026

Uh oh!

rjpower commented Mar 26, 2026

Uh oh!

rjpower commented Mar 26, 2026

Uh oh!

claude Bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Removed test_port_allocation from integration tests

Uh oh!

rjpower commented Mar 26, 2026

Uh oh!

claude Bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fixed exec_in_container for K8s direct provider

Uh oh!

yonromai left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

claude Bot commented Mar 26, 2026 •

edited

Loading

claude Bot commented Mar 26, 2026 •

edited

Loading

Removed `test_port_allocation` from integration tests

claude Bot commented Mar 26, 2026 •

edited

Loading

Fixed `exec_in_container` for K8s direct provider