ci: pin oasdiff version to avoid GitHub API rate-limit flake #3739
The "Install oasdiff" step runs upstream install.sh, which queries the GitHub API to resolve the latest oasdiff release. Unauthenticated API requests are capped at 60/hr per IP, so on shared GitHub-hosted runners the lookup intermittently fails with "Failed to get oasdiff version. This could be due to GitHub API rate limiting or network issues", failing the whole check before the breakage step runs. Pin the version so install.sh skips the API lookup (per the script's own error-message guidance) and downloads the release asset directly. This also makes the check deterministic and bumpable via a PR.
Python API breakage checks: ✅ PASSED
REST API breakage checks (OpenAPI): ✅ PASSED
✅ Review complete. This review was performed through OpenHands Cloud Automation.
all-hands-bot left a comment
Code Review — PR #3739
🟢 Good taste — Elegant, simple solution.
This is a textbook example of a minimal, well-documented fix for a real CI infrastructure problem.
Analysis
Problem Solved: The PR correctly identifies and addresses a genuine GitHub API rate-limit flake in the CI pipeline. The evidence provided (multiple failed runs with timestamps, root cause traced to get_latest_release()) is compelling and well-documented.
Why Version 1.19.1: The PR description correctly notes that version 1.19.1 was chosen to match "latest" rather than downgrading. This is the right call — pinning to a specific version should match the current behavior until the team intentionally wants to upgrade.
Change Quality:
- `-fsSL` flags are appropriate: `-f` fails fast on HTTP errors (better than piping error pages to shell), `-s` silences progress, `-S` shows errors.
- The inline comments explaining the "why" are justified here: the rate-limit problem is CI infrastructure knowledge that isn't obvious from reading the code alone.
Risk Assessment:
- [Overall PR] ⚠️ Risk Assessment: 🟢 LOW. This is a CI-only change affecting workflow reliability, with no impact on production code paths. It actually improves reliability by eliminating an intermittent failure mode.
VERDICT: ✅ Worth merging — Clean, focused fix with excellent documentation. No blocking issues.
KEY INSIGHT: Pinning tool versions in CI is a pragmatic pattern for avoiding external API dependencies. The -f flag addition is a nice hardening touch that prevents silent failures.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot left a comment
✅ QA Report: PASS
Verified the workflow’s pinned oasdiff install path installs oasdiff 1.19.1 and removes the unauthenticated GitHub releases/latest API lookup that caused the flake.
Does this PR achieve its stated goal?
Yes. The PR’s goal is to make the REST API breakage workflow’s Install oasdiff step deterministic by pinning oasdiff and avoiding the rate-limited latest-release API call. In a before/after execution of the install script, the old unpinned invocation called https://api.github.com/repos/oasdiff/oasdiff/releases/latest once, while the pinned version=1.19.1 invocation made 0 such calls, downloaded the v1.19.1 release asset directly, and produced oasdiff version 1.19.1. The exact PR workflow command also completed successfully locally.
| Phase | Result |
|---|---|
| Environment Setup | ✅ Checked out PR branch at 7393101; no project dependency install needed for this CI-only install-step change. |
| CI Status | 🟡 Relevant REST API (OpenAPI) check is green; some broader Agent Server / QA jobs were still in progress when checked. |
| Functional Verification | ✅ Exercised old and new oasdiff install flows, exact pinned workflow command, YAML parsing, and curl HTTP-error behavior. |
Functional Verification
Test 1: Baseline unpinned install performs the rate-limited lookup
Step 1 — Reproduce / establish baseline without the fix:
Ran an isolated equivalent of the old unpinned install command:
curl -L https://raw.githubusercontent.com/oasdiff/oasdiff/main/install.sh \
| INSTALL_DIR=/tmp/oasdiff-qa-z65aun/old-curlL sh -x
/tmp/oasdiff-qa-z65aun/old-curlL/oasdiff --version
Observed:
+ get_latest_release oasdiff/oasdiff
+ curl --silent -L https://api.github.com/repos/oasdiff/oasdiff/releases/latest
+ version=1.19.1
Downloading https://github.com/oasdiff/oasdiff/releases/download/v1.19.1/oasdiff_1.19.1_linux_amd64.tar.gz
Installed oasdiff to /tmp/oasdiff-qa-z65aun/old-curlL
oasdiff version 1.19.1
old-curlL: api.github.com latest-release occurrences = 1
This confirms the old behavior depended on the unauthenticated GitHub latest-release API endpoint before downloading the release asset.
Step 2 — Apply the PR's changes:
Used the PR’s pinned version=1.19.1 environment variable while keeping installation isolated to a temp directory.
Step 3 — Re-run with the fix in place:
Ran:
curl -fsSL https://raw.githubusercontent.com/oasdiff/oasdiff/main/install.sh \
| INSTALL_DIR=/tmp/oasdiff-qa-z65aun/new-env version=1.19.1 sh -x
/tmp/oasdiff-qa-z65aun/new-env/oasdiff --version
Observed:
+ [ -n 1.19.1 ]
+ version=1.19.1
Downloading https://github.com/oasdiff/oasdiff/releases/download/v1.19.1/oasdiff_1.19.1_linux_amd64.tar.gz
Installed oasdiff to /tmp/oasdiff-qa-z65aun/new-env
oasdiff version 1.19.1
new-env: api.github.com latest-release occurrences = 0
This shows the pin makes install.sh short-circuit the latest-release lookup and still installs the intended 1.19.1 binary.
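The short-circuit seen in the trace above can be sketched as the following shell logic. This is a simplified illustration under stated assumptions, not the upstream install.sh: `resolve_version` is a hypothetical helper name, and `get_latest_release` is stubbed to always fail so the sketch runs offline and mimics a rate-limited lookup.

```shell
#!/bin/sh
# Simplified sketch of the version-resolution branch; NOT the real install.sh.

# Stub standing in for the GitHub API lookup. It always fails here so the
# sketch runs offline and mimics a rate-limited response.
get_latest_release() {
    return 1
}

# Hypothetical helper: prints the version to install, or fails.
resolve_version() {
    if [ -n "$version" ]; then
        # Pinned: use the caller-supplied value, no API call is made.
        echo "$version"
        return 0
    fi
    # Unpinned: fall back to the "latest" lookup.
    get_latest_release "oasdiff/oasdiff" || {
        echo "Failed to get oasdiff version." >&2
        return 1
    }
}

# Pinned run succeeds even though the lookup stub fails.
pinned=$(version=1.19.1 resolve_version)
echo "pinned: $pinned"

# Unpinned run reaches the failing lookup.
if unpinned=$(version= resolve_version 2>/dev/null); then
    echo "unpinned: $unpinned"
else
    echo "unpinned: lookup failed"
fi
```

With the stub in place, the pinned call prints the supplied version while the unpinned call reports a failed lookup, which mirrors the two traces in this test.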
Test 2: Exact PR workflow command succeeds
Ran the exact install step command from the PR:
curl -fsSL https://raw.githubusercontent.com/oasdiff/oasdiff/main/install.sh \
| version=1.19.1 sh -s -- -b /usr/local/bin
oasdiff --version
Observed:
Downloading https://github.com/oasdiff/oasdiff/releases/download/v1.19.1/oasdiff_1.19.1_linux_amd64.tar.gz
Validating checksum
Extracting tar file
Installed oasdiff to /usr/local/bin
oasdiff version 1.19.1
This confirms the workflow command a runner would execute completes and leaves oasdiff available at the expected version.
Test 3: curl -fsSL fails fast on HTTP errors
Ran old vs new curl flags against a missing raw GitHub file under bash -o pipefail:
OLD curl -L: downloaded the 404 body, then sh failed with `sh: 1: 404:: not found` and rc=127.
NEW curl -fsSL: failed at curl with `curl: (22) The requested URL returned error: 404` and rc=22.
This confirms the new curl flags produce a clearer failure before piping an HTTP error body into sh.
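The `bash -o pipefail` setting used in this test matters for how a failed download surfaces. A minimal sketch of the difference, assuming `bash` is available and using `false` as a stand-in for a curl invocation that fails without writing any output:

```shell
#!/bin/sh
# `false` stands in for a download command that fails and emits nothing.

# Without pipefail the pipeline reports the status of the last command (sh),
# which reads an empty script from stdin and exits 0.
no_pipefail_rc=0
bash -c 'false | sh' || no_pipefail_rc=$?

# With pipefail the pipeline reports the failing producer's status instead.
pipefail_rc=0
bash -o pipefail -c 'false | sh' || pipefail_rc=$?

echo "without pipefail rc=$no_pipefail_rc, with pipefail rc=$pipefail_rc"
```

In other words, the exit code 22 reported above reaches the caller only because pipefail is on; without it, the same failed download would let the pipeline return success.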
Test 4: Workflow YAML still parses
Ran:
python - <<'PY'
import pathlib, yaml
path = pathlib.Path('.github/workflows/agent-server-rest-api-breakage.yml')
with path.open() as f:
    data = yaml.safe_load(f)
print(type(data).__name__)
print(data['name'])
print('jobs:', ', '.join(data['jobs'].keys()))
PY
Observed:
dict
REST API breakage checks
jobs: agent-server-rest-api
This confirms the changed workflow file remains valid YAML.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot left a comment
✅ QA Report: PASS
The workflow install step was exercised under a simulated GitHub API rate-limit failure; the pinned command avoided the failing API lookup, installed oasdiff 1.19.1, and the installed binary completed a real OpenAPI comparison.
Does this PR achieve its stated goal?
Yes. The stated goal is to remove the REST API breakage workflow flake caused by oasdiff install.sh resolving latest through unauthenticated api.github.com. In a simulated rate-limit scenario, the unpinned baseline failed with Failed to get oasdiff version after calling https://api.github.com/repos/oasdiff/oasdiff/releases/latest, while the PR command with version=1.19.1 completed successfully and made only release-asset/checksum requests.
| Phase | Result |
|---|---|
| Environment Setup | ✅ PR checkout confirmed at 7393101f65a438ef9b4290e8aacb0fd358ad5a50; exercised the CI shell step with real curl, sh, and oasdiff install behavior. |
| CI Status | 🟡 Observed 39 successful checks, 9 skipped checks, and 1 qa-changes check in progress at review time. |
| Functional Verification | ✅ Baseline failure reproduced under simulated API rate limit; pinned install succeeded and the installed CLI performed a real OpenAPI breaking-change check. |
Functional Verification
Test 1: Baseline unpinned install fails when the latest-release API lookup is rate-limited
Step 1 — Reproduce / establish baseline without the fix:
I ran the unpinned install command with a PATH-local curl wrapper that returned an API-rate-limit style JSON response only for https://api.github.com/repos/oasdiff/oasdiff/releases/latest and delegated all other URLs to real curl:
curl -fsSL https://raw.githubusercontent.com/oasdiff/oasdiff/main/install.sh \
| PATH="$QA_DIR/fakebin:$PATH" sh -s -- -b "$QA_DIR/base-bin"
Observed output:
exit_code=1
Error: Failed to get oasdiff version.
This could be due to GitHub API rate limiting or network issues.
internal curl calls:
--silent -L https://api.github.com/repos/oasdiff/oasdiff/releases/latest
This confirms the old behavior is vulnerable to the reported flake: without version, the installer depends on the unauthenticated GitHub latest-release API call before it can download oasdiff.
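The report does not include the wrapper script itself. One way such a PATH-local curl wrapper could be built is sketched below; it is a hypothetical reconstruction, and the log file name, the JSON body, and the use of `mktemp` for `QA_DIR` are all assumptions.

```shell
#!/bin/sh
# Hypothetical reconstruction of a PATH-local curl wrapper; the QA report
# does not show the actual script.
QA_DIR=$(mktemp -d)
mkdir -p "$QA_DIR/fakebin"

# Locate the real curl before PATH is changed (empty if curl is absent).
REAL_CURL=$(command -v curl || true)

cat > "$QA_DIR/fakebin/curl" <<EOF
#!/bin/sh
# Log every invocation so the calls can be inspected afterwards.
echo "\$*" >> "$QA_DIR/curl-calls.log"
for arg in "\$@"; do
    case "\$arg" in
        https://api.github.com/repos/oasdiff/oasdiff/releases/latest)
            # Illustrative rate-limit style body with no release tag in it.
            echo '{"message": "API rate limit exceeded"}'
            exit 0
            ;;
    esac
done
# Any other URL is delegated to the real curl.
exec "$REAL_CURL" "\$@"
EOF
chmod +x "$QA_DIR/fakebin/curl"

# A child shell started with the modified PATH resolves curl to the wrapper.
fake_body=$(PATH="$QA_DIR/fakebin:$PATH" sh -c \
    'curl --silent -L https://api.github.com/repos/oasdiff/oasdiff/releases/latest')
echo "$fake_body"
```

Because only the one API URL is intercepted, release-asset and checksum downloads still go through the real curl, and the log file gives the list of internal curl calls.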
Step 2 — Apply the PR's changes:
I reran the same installer path with the PR's version=1.19.1 environment variable while keeping the same simulated API-rate-limit wrapper active.
Step 3 — Re-run with the fix in place:
curl -fsSL https://raw.githubusercontent.com/oasdiff/oasdiff/main/install.sh \
| PATH="$QA_DIR/fakebin:$PATH" version=1.19.1 sh -s -- -b "$QA_DIR/pr-bin"
Observed output:
exit_code=0
Downloading https://github.com/oasdiff/oasdiff/releases/download/v1.19.1/oasdiff_1.19.1_linux_amd64.tar.gz
Validating checksum
Extracting tar file
Installed oasdiff to /usr/local/bin
internal curl calls:
-sL -O https://github.com/oasdiff/oasdiff/releases/download/v1.19.1/oasdiff_1.19.1_linux_amd64.tar.gz
-sL https://github.com/oasdiff/oasdiff/releases/download/v1.19.1/checksums.txt
api lookup present in PR path? no
This shows the PR path bypasses the releases/latest API lookup entirely and downloads the pinned release asset directly, so the simulated rate-limit condition no longer blocks installation.
Test 2: Installed oasdiff binary performs a real OpenAPI operation
After the pinned install, I invoked the installed CLI against two small OpenAPI specs where the revision adds a non-breaking endpoint:
oasdiff --version
oasdiff breaking "$QA_DIR/specs/base.yaml" "$QA_DIR/specs/revision.yaml"
Observed output:
oasdiff version 1.19.1
No breaking changes to report, but the specs are different.
Run 'oasdiff diff' to see structural differences.
exit_code=0
This verifies more than argument parsing: the pinned install produced an executable oasdiff 1.19.1 binary that can run the workflow-relevant OpenAPI comparison operation successfully.
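The two specs used in this test are not shown in the report. A pair of that shape could look like the sketch below; the API title, the paths `/items` and `/health`, and the `mktemp` directory are illustrative assumptions, and the comparison runs only if `oasdiff` is on the PATH.

```shell
#!/bin/sh
# Illustrative spec pair; the actual files used in the QA run are not shown.
QA_DIR=$(mktemp -d)
mkdir -p "$QA_DIR/specs"

# Base spec with a single endpoint.
cat > "$QA_DIR/specs/base.yaml" <<'EOF'
openapi: 3.0.0
info:
  title: Example API
  version: 1.0.0
paths:
  /items:
    get:
      responses:
        "200":
          description: OK
EOF

# Revision keeps /items unchanged and adds a new endpoint.
cat > "$QA_DIR/specs/revision.yaml" <<'EOF'
openapi: 3.0.0
info:
  title: Example API
  version: 1.0.0
paths:
  /items:
    get:
      responses:
        "200":
          description: OK
  /health:
    get:
      responses:
        "200":
          description: OK
EOF

# Run the same comparison as the test when the CLI is installed.
if command -v oasdiff >/dev/null 2>&1; then
    oasdiff breaking "$QA_DIR/specs/base.yaml" "$QA_DIR/specs/revision.yaml"
else
    echo "oasdiff not installed; specs written to $QA_DIR/specs"
fi
```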
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
HUMAN:
API breakage tests are sometimes flaky. This is an effort to make them more reliable.
AGENT:
Why
The REST API breakage checks workflow (`.github/workflows/agent-server-rest-api-breakage.yml`) is intermittently red in the `Install oasdiff` step with the error "Failed to get oasdiff version. This could be due to GitHub API rate limiting or network issues".
This is a CI infra flake, not a real API-breakage finding: the actual `Run agent server REST API breakage check` step never runs when the install fails.
Root cause (verified): the step pipes the upstream `install.sh` into `sh` with no version pin. With no `version` set, `install.sh` calls `get_latest_release()` → `https://api.github.com/repos/oasdiff/oasdiff/releases/latest` to resolve "latest". Unauthenticated GitHub API requests are capped at 60/hr per IP; on shared GitHub-hosted runners that limit is easily exhausted, so the lookup fails intermittently.
Evidence (all from run logs on 2026-06-15, gh CLI):
- `main` push (unrelated code), run `27550782864`: failed in `Install oasdiff` with the exact error above. A failure on `main` with unrelated code confirms it is environmental, not PR-specific.
- Run `27551171682`: failed in `Install oasdiff`.
- Run `27551356083`: now `success` on `run_attempt: 2`, i.e. a plain re-run cleared it.
Resolves #3738.
Summary
- Pin `version=1.19.1` in the `Install oasdiff` step so `install.sh` short-circuits before the GitHub API "latest release" lookup (`if [ -n "$version" ]`) and downloads the release asset directly, removing the rate-limited API call that caused the flake.
- Change `curl -L` → `curl -fsSL` so a transient HTTP error fails fast with a clear message instead of piping an error page into `sh`.
Note: the linked issue suggested `version=1.11.7`, but that is only the placeholder in `install.sh`'s error message. The workflow currently tracks "latest" (v1.19.1), so this PR pins to 1.19.1 to keep behavior identical rather than silently downgrading ~8 minor versions.
Issue Number
#3738
How to Test
This is a CI-only change (single workflow step). Verified the pinned install command locally on macOS into a temp dir, observing that it skips the API lookup and installs the pinned version.
The download comes from the `releases/download/...` asset URL (a CDN/release download, not the rate-limited API), so no `api.github.com/.../releases/latest` request is made. The workflow YAML was also validated with `yaml.safe_load`.
On CI, the `Install oasdiff` step should now print `oasdiff version 1.19.1` deterministically and never hit the rate-limit error.
Video/Screenshots
N/A: CI workflow change; install output and version captured above.
Type
Notes
`GITHUB_TOKEN` on the step (to raise the API limit to 5000/hr) is intentionally not added: with the version pinned, `install.sh` makes no GitHub API calls at all, so the token would be unused.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22-slim
- golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:7393101-python
Run
All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (e.g. 7393101-python) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. 7393101-python-amd64) are also available if needed