| 1 | +# IBM/mcp-cli#242 — fix(ping): use transport-level health check for SSE servers |
| 2 | + |
| 3 | +PR: https://github.com/IBM/mcp-cli/pull/242 (fixes #203) |
| 4 | +Author: kimjune01 (us). State: OPEN, mergeable, 0 reviews/comments. |
| 5 | +Question entering investigate: CI is red on all 16 test shards — is our fix broken, or is this a repo-wide gate? |
| 6 | + |
| 7 | +## H₀ — Our diff broke the tests |
| 8 | + |
| 9 | +- **Null:** Tests pass; failures are environmental/configurational. |
| 10 | +- **Perturbation:** Read the failed-test log for `test (tests/adapters)` (job 75147370968, a representative slice). |
| 11 | +- **Result:** |
| 12 | + ``` |
| 13 | + collected 133 items |
| 14 | + tests/adapters/test_*.py .................... [100%] |
| 15 | + ============================= 133 passed in 2.48s ============================== |
| 16 | + ERROR: Coverage failure: total of 12 is less than fail-under=60 |
| 17 | + FAIL Required test coverage of 60.0% not reached. Total coverage: 11.51% |
| 18 | + ##[error]Process completed with exit code 1. |
| 19 | + ``` |
| 20 | +- **Trajectory shape:** Divergent against. Tests pass; the job fails because of a coverage threshold gate, not a test assertion. |
| 21 | +- **Status:** KILLED. |
| 22 | +- **Edge:** What's enforcing the 60% gate per-shard, and does it fail every PR? |
| 23 | + |
| 24 | +## H₁ — Repo-wide CI gate fails every PR, not ours |
| 25 | + |
| 26 | +- **Null:** Only our PR fails; main is green via some skipped path. |
| 27 | +- **Perturbation:** `gh pr list ... --json statusCheckRollup` across the seven most recent PRs (in the range #233–#240, plus #242), plus `gh run list --branch main`. |
| 28 | +- **Result:** |
| 29 | + - Every open PR (#236 pyasn1 bump, #238 download-artifact bump, #239 cryptography security bump, #240 upload-artifact bump, #242 ours) fails the same 16 test shards. |
| 30 | + - #234 ("Code stuff") was **merged** with the same 16 shard failures present on the PR. |
| 31 | + - Recent main runs are all `success`, but those are Dependabot metadata updates, which don't trigger the test workflow; none of them is a run of the test workflow itself. |
| 32 | +- **Trajectory shape:** Divergent for. Every sampled PR that ran this CI configuration fails identically, and the maintainer has already merged one (#234) despite the red shards. |
| 33 | +- **Status:** CONFIRMED. |
| 34 | +- **Reasoning mode:** Induction (observed across population of PRs). |
| 35 | +- **Confidence:** 95%. |
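The comparison behind H₁ can be sketched in Python. This is a minimal illustration, not the command that was actually run: the sample PRs are hypothetical and abbreviated (two shard names instead of 16, made-up PR numbers), and the field names `number`, `statusCheckRollup`, `name`, and `conclusion` are assumed to match what `gh pr list --json number,statusCheckRollup` emits, so verify them against real output before relying on this.

```python
# Hypothetical, abbreviated stand-in for `gh pr list --json number,statusCheckRollup`.
# Only "test (tests/adapters)" is a check name taken from the log; the rest are made up.
SAMPLE_PRS = [
    {"number": 1, "statusCheckRollup": [
        {"name": "test (tests/adapters)", "conclusion": "FAILURE"},
        {"name": "test (tests/other)", "conclusion": "FAILURE"},
        {"name": "lint", "conclusion": "SUCCESS"},
    ]},
    {"number": 2, "statusCheckRollup": [
        {"name": "test (tests/adapters)", "conclusion": "FAILURE"},
        {"name": "test (tests/other)", "conclusion": "FAILURE"},
        {"name": "DCO", "conclusion": "ACTION_REQUIRED"},
    ]},
]


def failing_checks(pr):
    """Return the names of checks whose conclusion is FAILURE for one PR."""
    return frozenset(
        check.get("name", "")
        for check in pr.get("statusCheckRollup", [])
        if check.get("conclusion") == "FAILURE"
    )


def shared_failures(prs):
    """Return the checks that fail on every PR (empty set for an empty list)."""
    sets = [failing_checks(pr) for pr in prs]
    if not sets:
        return frozenset()
    return frozenset.intersection(*sets)


if __name__ == "__main__":
    print(sorted(shared_failures(SAMPLE_PRS)))
```

If the shared set equals the full list of test shards for every PR, the failure is a property of the CI configuration rather than of any one diff. Checks that fail on only one PR, such as the hypothetical `DCO` entry here, drop out of the intersection.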
| 36 | + |
| 37 | +## H₂ — The gate is per-shard `fail-under=60` against full `src/` |
| 38 | + |
| 39 | +- **Null:** Coverage is aggregated across shards before the gate runs. |
| 40 | +- **Perturbation:** Inspect CI invocation in the failing log. |
| 41 | +- **Result:** Each shard runs `uv run pytest --cov=src --cov-report= tests/<shard>`. The `report-coverage` job, which would aggregate, is `SKIPPED` because the upstream jobs fail. The `fail-under=60` threshold (set in `pyproject.toml` under `[tool.coverage.report]`) therefore fires per shard. `tests/adapters` exercising 11.51% of `src/` is structurally expected, since each shard covers only its own slice. |
| 42 | +- **Trajectory shape:** Divergent for. As configured, the gate is incoherent: it compares a single shard's coverage against a threshold meant for the whole of `src/`. |
| 43 | +- **Status:** CONFIRMED. |
| 44 | +- **Reasoning mode:** Deduction (read the invocation, traced the consequence). |
| 45 | +- **Confidence:** 97%. |
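The arithmetic behind H₂ can be sketched as follows. All line counts and every shard name other than `tests/adapters` are hypothetical, chosen only so that one shard lands near the 11.51% seen in the log; the repo's real numbers are not known here. The sketch treats coverage as covered lines over total measurable lines in `src/` and compares a per-shard gate with a gate on the union of all shards.

```python
FAIL_UNDER = 60.0   # threshold reported in the log ("fail-under=60")
TOTAL_LINES = 1000  # hypothetical size of src/, lines numbered 0..999

# Hypothetical shards: each set holds the src/ lines that shard executes.
SHARD_COVERED = {
    "tests/adapters": set(range(0, 115)),    # 11.5%, close to the log's 11.51%
    "tests/shard_b": set(range(100, 400)),   # hypothetical shard name
    "tests/shard_c": set(range(380, 700)),   # hypothetical shard name
    "tests/shard_d": set(range(650, 900)),   # hypothetical shard name
}


def percent(covered):
    """Coverage of src/ as a percentage of TOTAL_LINES."""
    return 100.0 * len(covered) / TOTAL_LINES


def per_shard_gate(shards):
    """Apply the threshold to each shard separately, as the CI currently does."""
    return {name: percent(lines) >= FAIL_UNDER for name, lines in shards.items()}


def aggregated_gate(shards):
    """Apply the threshold once, to the union of lines covered by all shards."""
    union = set().union(*shards.values())
    return percent(union) >= FAIL_UNDER


if __name__ == "__main__":
    for name, passed in per_shard_gate(SHARD_COVERED).items():
        print(name, percent(SHARD_COVERED[name]), "pass" if passed else "fail")
    print("aggregated:", "pass" if aggregated_gate(SHARD_COVERED) else "fail")
```

With these made-up numbers every shard fails on its own while the union clears the threshold, which is the shape of the failure observed in CI: the gate measures each slice against a bar meant for the whole.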
| 46 | + |
| 47 | +## Provenance check |
| 48 | + |
| 49 | +- Not our regression — `git blame` on the CI config would show this gate predates branch `fix-203-sse-ping`. Confirmed indirectly by #233 (merged) and Dependabot PRs failing identically. |
| 50 | +- Maintainer behavior reveals the truth: #234 merged with these failures. The red shards are treated as advisory, not blocking. |
| 51 | +- DCO check is `ACTION_REQUIRED` on #242 — that *is* on us (missing `Signed-off-by` trailer). Separate concern from the test shards. |
| 52 | + |
| 53 | +## Diagnosis |
| 54 | + |
| 55 | +Two findings, only one of which we own: |
| 56 | + |
| 57 | +1. **Test shards red (not ours).** Repo-wide pre-existing CI misconfiguration: per-shard coverage gate at 60% applied to full-`src/` coverage measured by a single shard. Structurally impossible to satisfy. Affects every PR including recently-merged #234. Not blocking — maintainer merges through it. |
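| 58 | +2. **DCO missing (ours).** `Signed-off-by` trailer absent on the commit. One-line fix: amend with `git commit --amend -s`, or rebase with `git rebase --signoff` (for `git rebase`, `-s` selects a merge strategy, not sign-off). |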
| 59 | + |
| 60 | +## Frontier |
| 61 | + |
| 62 | +- **Edge A (ship, do nothing on tests):** Land DCO sign-off; tests will stay "red" but maintainer's track record (#234) shows this isn't merge-blocking. |
| 63 | +- **Edge B (helpful side-quest):** Open a separate small PR proposing aggregated coverage. Either drop `fail-under` from `pyproject.toml` and add it only to the `report-coverage` job, or run a single non-sharded coverage step. Out-of-scope for #242 — flag for triage / `/drip`, don't fold in here. |
| 64 | +- **Edge C (review-side):** No reviewer feedback yet (0 reviews). When it arrives, re-enter the graph from that observation. |
| 65 | + |
| 66 | +## Reasoning mode table |
| 67 | + |
| 68 | +| Node | Mode | Confidence | |
| 69 | +|------|------|------------| |
| 70 | +| H₀ killed (tests pass) | Induction — read the log | 99% | |
| 71 | +| H₁ repo-wide gate | Induction — observed across 7 PRs + 1 merge | 95% | |
| 72 | +| H₂ per-shard gate against full src | Deduction — read invocation, traced consequence | 97% | |
| 73 | +| DCO action required | Deduction — read check status | 99% | |
| 74 | + |
| 75 | +## Action |
| 76 | + |
| 77 | +No code change to #242 for the test shard failures. Two follow-ups: |
| 78 | + |
| 79 | +1. Add DCO sign-off to the commit on `fix-203-sse-ping` (operator decision — `git commit --amend -s && git push --force-with-lease` requires explicit approval per project rules). |
| 80 | +2. Optionally surface the CI coverage misconfiguration as a separate triage candidate. |
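Before pushing, the sign-off can be checked mechanically. A minimal sketch, assuming the DCO check only needs a `Signed-off-by: Name <email>` line in the commit message whose email matches the commit author; the helper name and the sample identity are hypothetical, and the actual DCO bot may apply further rules.

```python
import re

# One trailer per line, in the form "Signed-off-by: Name <email>".
TRAILER = re.compile(
    r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>]+)>$",
    re.MULTILINE,
)


def has_signoff(message, author_email):
    """Return True if the message has a Signed-off-by trailer for author_email."""
    wanted = author_email.lower()
    return any(
        match.group("email").lower() == wanted
        for match in TRAILER.finditer(message)
    )


# Hypothetical commit messages using the PR title from this document.
UNSIGNED = "fix(ping): use transport-level health check for SSE servers\n"
SIGNED = UNSIGNED + "\nSigned-off-by: Jane Dev <jane@example.com>\n"

if __name__ == "__main__":
    print(has_signoff(UNSIGNED, "jane@example.com"))
    print(has_signoff(SIGNED, "jane@example.com"))
```

The message to check can be read with `git log -1 --format=%B` on `fix-203-sse-ping`. A trailer signed by a different address than the author does not count under this sketch's assumption.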
| 81 | + |
| 82 | +Frontier closes here unless reviewer feedback arrives. |