[codex] Add generic telemetry and custom benchmark support #43
Merged
ishandhanani merged 3 commits into main on Apr 18, 2026
- srtctl apply --json: emit one JSON line per submission on stdout (slurm_job_id,
job_name, output_dir, metadata_path, config_path, tags). Prose goes to stderr.
Errors emit {"status":"error","error":...} and exit non-zero. Module-level
console is restored on exit so direct library callers of submit_* don't see
a leaked stderr binding.
- srtctl/mock.py: MockInfra context manager swapping 16 external surfaces
(start_srun_process in 8 modules, hostname/IP resolution in 4, port and
model health in 3, status HTTP in 1) with local fakes. FakePopen drop-in
mimics subprocess.Popen. run_mock_sweep executes the full
SweepOrchestrator against those fakes and writes realistic artifacts
(status.json, status_events.jsonl, result.json, recipe.lock.yaml, logs).
- srtctl.cli.mock_worker: CLI entry (`python -m srtctl.cli.mock_worker ...`)
that wraps run_mock_sweep for spawning as a subprocess.
- srtctl apply --mock [--mock-tick-s T]: stubs sbatch inside the real submit
flow (submit_with_orchestrator still runs for real — config load, metadata
write, JSON submission emission), then detaches a mock_worker subprocess
that drives the full SweepOrchestrator against the output_dir.
- Tests: test_apply_json (3), test_apply_mock (2), test_mock_sweep (3).
- CI: new `mock-and-server` job explicitly runs the new test files plus
test_integration_status, and smoke-tests both `srtctl apply --mock --json`
and `python -m srtctl.cli.mock_worker` as real subprocesses.
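
The `--json` contract described above (one JSON line per submission on stdout, prose on stderr, an error object on failure) could be sketched roughly as follows. This is not the code from this PR: the helper names `emit_submission`, `emit_error`, and `parse_submissions` are hypothetical, and writing the error object to stdout is an assumption.

```python
import json
import sys


def emit_submission(record: dict, stream=sys.stdout) -> None:
    """Write one submission as a single JSON line (hypothetical helper)."""
    stream.write(json.dumps(record) + "\n")
    stream.flush()


def emit_error(message: str, stream=sys.stdout) -> int:
    """Write the error object and return a non-zero exit code.

    Assumption: the error line goes to the same stream as submissions.
    """
    stream.write(json.dumps({"status": "error", "error": message}) + "\n")
    stream.flush()
    return 1


def parse_submissions(stdout_text: str) -> list[dict]:
    """Consumer side: decode each non-empty stdout line as one JSON object."""
    return [json.loads(line) for line in stdout_text.splitlines() if line.strip()]
```

Keeping human-readable prose on stderr is what lets a caller treat every stdout line as a complete JSON document without filtering.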
af623f5 to c9856d7
On GitHub Actions runners RUNNER_NAME is auto-set, which causes get_job_name() to return the runner identity instead of the configured job name, tripping the job_name assertion.
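
One way a test can guard against this is to clear `RUNNER_NAME` for the duration of the assertion. The sketch below is illustrative only: `get_job_name_sketch` is a hypothetical stand-in that assumes the runner name takes precedence over the configured name, and the real `get_job_name()` may resolve names differently.

```python
import os
from unittest import mock


def get_job_name_sketch(configured_name: str) -> str:
    # Hypothetical stand-in: prefer RUNNER_NAME when it is set and non-empty.
    return os.environ.get("RUNNER_NAME") or configured_name


def job_name_without_runner(configured_name: str) -> str:
    # patch.dict snapshots os.environ and restores it on exit,
    # so removing RUNNER_NAME here does not leak into other tests.
    with mock.patch.dict(os.environ):
        os.environ.pop("RUNNER_NAME", None)
        return get_job_name_sketch(configured_name)
```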
What changed
This PR adds the `srt-slurm` pieces needed for benchmarking workflows without making `srt-slurm` itself user-facing.
- `custom` benchmark runner with optional benchmark-specific container and environment support
Why
Impact
Root cause for the durability fix
The postprocess container used `set -e` and ran parsing before `aws s3 sync`, so a parser failure prevented raw artifacts from being uploaded at all. This PR makes parsing best-effort and keeps upload as the priority path.
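
The ordering problem can be illustrated with a minimal Python sketch. The actual postprocess step is a shell script, so this only models the control flow: `postprocess`, `parse`, and `upload` are hypothetical names, and keeping the parse step first (rather than moving the upload ahead of it) is an assumption.

```python
import sys
from typing import Callable


def postprocess(parse: Callable[[], None], upload: Callable[[], None]) -> bool:
    """Run parsing as best-effort, then always attempt the upload.

    Returns True if parsing succeeded. Upload errors are not caught,
    so a failed upload still fails the step.
    """
    parsed_ok = True
    try:
        parse()
    except Exception as exc:
        # A parser failure is logged but must not block the upload.
        parsed_ok = False
        print(f"parse failed, continuing to upload: {exc}", file=sys.stderr)
    upload()
    return parsed_ok
```

Under `set -e`, the equivalent of the `try` block was missing, so the first failing command ended the script before the sync ran.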