Skip to content

Commit 448c3a6

Browse files
fix: exclude run-*-tests.sh aggregators from bash runner discovery (c4d6-61f7)
The bash runner in test-batched.sh discovered run-*-tests.sh suite aggregators alongside individual test-*.sh files. Since aggregators internally run all test-*.sh files, the batched runner treated the entire suite as a single test item that exceeded the time budget on every invocation, preventing per-file resume from working. Remove run-*-tests.sh from the find pattern in bash-runner.sh. Update docs (CLAUDE.md, COMMIT-WORKFLOW.md) to remove the >60s qualifier for test-batched.sh usage — it should be the default test runner, not a fallback for long suites. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent f4e6696 commit 448c3a6

File tree

6 files changed

+166
-21
lines changed

6 files changed

+166
-21
lines changed

.test-index

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -157,3 +157,5 @@ plugins/dso/scripts/acli-version-resolver.sh: tests/scripts/test-acli-version-re
157157
plugins/dso/scripts/gh-availability-check.sh: tests/scripts/test-gh-availability-check.sh
158158
plugins/dso/scripts/purge-non-project-tickets.sh: tests/scripts/test-purge-non-project-tickets.sh
159159
plugins/dso/scripts/sprint-next-batch.sh: tests/scripts/test-sprint-next-batch.sh
160+
plugins/dso/scripts/runners/bash-runner.sh: tests/scripts/test-bash-runner-discovery.sh
161+
plugins/dso/scripts/test-batched.sh: tests/scripts/test-batched-state-integrity.sh, tests/scripts/test-bash-runner-discovery.sh

CLAUDE.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -127,7 +127,7 @@ These rules protect core structural boundaries. Violating them causes subtle bug
127127
9. **When fixing a bug, search for the same anti-pattern elsewhere.** After fixing a bug, search the codebase for other code that follows the same anti-pattern you just fixed. Create a bug ticket (`.claude/scripts/dso ticket create bug "<title>"`) for each occurrence found so they can be tracked and fixed systematically.
128128
10. **Write a failing test to verify your CI/staging bug hypothesis before fixing.** When diagnosing a CI or staging failure, write a unit or integration test that reproduces the suspected root cause FIRST. Run it to confirm it fails (RED). Only then implement the fix and verify the test passes (GREEN). This prevents fixing symptoms instead of causes and guards against the fix being wrong.
129129
11. **Always set `timeout: 600000` on Bash tool calls for commands expected to exceed 30 seconds, AND on all Bash calls during commit/review workflows.** Claude Code's hard timeout ceiling is ~73s even with max timeout. Without `timeout: 600000`, the ceiling drops to ~48s. Commands known to exceed 30s: `validate.sh --ci`, `make test`, `.claude/scripts/dso ticket sync`, ticket write commands in worktrees with many tickets. Additionally, set `timeout: 600000` on ALL Bash tool calls during COMMIT-WORKFLOW.md and REVIEW-WORKFLOW.md execution — even fast commands like `ruff check` can receive SIGURG (exit 144) from tool-call cancellation during internal event processing (see INC-016 scenario 4).
130-
12. **Use `test-batched.sh` for test commands expected to exceed 60 seconds.** The script supports runner drivers that decompose test suites into individual items for per-test resume. **Prefer `--runner=bash --test-dir=<dir>` for bash test suites** — this discovers `test-*.sh` and `run-*-tests.sh` files and runs each as a separate item, enabling per-script resume on timeout. Example: `$(git rev-parse --show-toplevel)/plugins/dso/scripts/test-batched.sh --timeout=50 --runner=bash --test-dir=tests/scripts`. The generic fallback (`--timeout=50 "command"`) treats the entire command as a single item — use it only when no runner driver applies. Available runners: `bash` (test-*.sh files), `node` (*.test.js files), `pytest` (pytest collection). Run the printed `RUN:` command in subsequent Bash tool calls until the summary appears. Do NOT use `while` polling loops — they get killed by the ~73s tool timeout ceiling, producing spurious exit 144. For non-test long-running commands (e.g., `.claude/scripts/dso ticket sync`), see INC-016 in KNOWN-ISSUES.md for the managed launch/poll script pattern.
130+
12. **Use `test-batched.sh` for running tests.** The script supports runner drivers that decompose test suites into individual items for per-test resume. **Prefer `--runner=bash --test-dir=<dir>` for bash test suites** — this discovers `test-*.sh` files and runs each as a separate item, enabling per-script resume on timeout. Example: `$(git rev-parse --show-toplevel)/plugins/dso/scripts/test-batched.sh --timeout=50 --runner=bash --test-dir=tests/scripts`. The generic fallback (`--timeout=50 "command"`) treats the entire command as a single item — use it only when no runner driver applies. Available runners: `bash` (test-*.sh files), `node` (*.test.js files), `pytest` (pytest collection). Run the printed `RUN:` command in subsequent Bash tool calls until the summary appears. Do NOT use `while` polling loops — they get killed by the ~73s tool timeout ceiling, producing spurious exit 144. For non-test long-running commands (e.g., `.claude/scripts/dso ticket sync`), see INC-016 in KNOWN-ISSUES.md for the managed launch/poll script pattern.
131131

132132
## Task Start Workflow
133133

plugins/dso/docs/workflows/COMMIT-WORKFLOW.md

Lines changed: 4 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -91,30 +91,18 @@ Run unit tests to catch breakage before investing in review.
9191

9292
> **Timeout rule**: Always set `timeout: 600000` on ALL Bash tool calls in this workflow — including fast commands like `ruff check`. Claude Code's hard ceiling is ~73s without the explicit timeout parameter (drops to ~48s), and even short commands can receive SIGURG (exit 144) during internal event processing. See CLAUDE.md rule 11 (Always Do These).
9393
94-
> **Long-running test suites (>60s)**: If the project's test command is expected to exceed 60 seconds (e.g., `bash tests/run-all.sh`), wrap it with `test-batched.sh` instead of invoking it bare. A bare invocation will be killed by the ~73s tool timeout ceiling (exit 144), producing spurious failures. See CLAUDE.md rule 12 (Always Do These).
95-
96-
Resolve the test command from config, then run it:
94+
Resolve the test command from config, then run it using `test-batched.sh` for per-test resume on timeout. **Prefer `--runner=bash --test-dir=<dir>` for bash test suites** — this discovers `test-*.sh` files and runs each as a separate item, so completed tests are skipped on resume:
9795

9896
```bash
9997
REPO_ROOT=$(git rev-parse --show-toplevel)
100-
TEST_CMD="$(".claude/scripts/dso read-config.sh" commands.test_unit 2>/dev/null || echo "make test-unit-only")"
101-
```
102-
103-
**If the test command is expected to complete in under 60s**, run it directly (with `timeout: 600000` on the Bash tool call):
104-
105-
```bash
106-
cd app && $TEST_CMD 2>&1 | tail -5
107-
```
108-
109-
**If the test command is expected to exceed 60s** (e.g., `bash tests/run-all.sh`), use `test-batched.sh` with a runner driver for per-test resume. **Prefer `--runner=bash --test-dir=<dir>` for bash test suites** — this discovers `test-*.sh` and `run-*-tests.sh` files and runs each as a separate item, so completed tests are skipped on resume:
110-
111-
```bash
11298
.claude/scripts/dso test-batched.sh --timeout=50 --runner=bash --test-dir="$REPO_ROOT/tests/scripts"
11399
```
114100

115-
If no runner driver applies (the test command is not a directory of scripts), fall back to the generic runner which wraps the entire command as a single item (no sub-test resume):
101+
If no runner driver applies (the test command is not a directory of scripts), resolve the test command from config and fall back to the generic runner which wraps the entire command as a single item (no sub-test resume):
116102

117103
```bash
104+
REPO_ROOT=$(git rev-parse --show-toplevel)
105+
TEST_CMD="$(".claude/scripts/dso read-config.sh" commands.test_unit 2>/dev/null || echo "make test-unit-only")"
118106
.claude/scripts/dso test-batched.sh --timeout=50 "$TEST_CMD"
119107
```
120108

plugins/dso/scripts/runners/bash-runner.sh

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -24,13 +24,17 @@
2424

2525
# _bash_discover_files <dir>
2626
# Prints one file path per line for test-*.sh files; returns non-zero if none found.
27+
# Excludes run-*-tests.sh aggregator scripts — these are suite orchestrators that
28+
# run all test-*.sh files internally. Including them causes the batched runner to
29+
# treat the entire suite as a single test item, which gets killed by the time budget
30+
# and prevents per-file resume from working.
2731
_bash_discover_files() {
2832
local dir="$1"
2933
local found=0
3034
# Use a while loop with sorted glob expansion for portability (no find -print0)
3135
while IFS= read -r f; do
3236
[ -f "$f" ] && [ -x "$f" ] && { echo "$f"; found=1; }
33-
done < <(find "$dir" -maxdepth 1 \( -name 'test-*.sh' -o -name 'run-*-tests.sh' \) -print 2>/dev/null | sort)
37+
done < <(find "$dir" -maxdepth 1 -name 'test-*.sh' -print 2>/dev/null | sort)
3438
[ "$found" -eq 1 ]
3539
}
3640

@@ -48,7 +52,7 @@ if [ "$RUNNER" = "bash" ]; then
4852
done < <(_bash_discover_files "$TEST_DIR" 2>/dev/null || true)
4953

5054
if [ "${#BASH_FILES[@]}" -eq 0 ]; then
51-
echo "WARNING: --runner=bash: no test-*.sh or run-*-tests.sh files found under $TEST_DIR; falling back to generic runner." >&2
55+
echo "WARNING: --runner=bash: no test-*.sh files found under $TEST_DIR; falling back to generic runner." >&2
5256
else
5357
USE_BASH_RUNNER=1
5458
fi

plugins/dso/scripts/test-batched.sh

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -28,9 +28,9 @@ set -uo pipefail
2828
# files exist under --test-dir.
2929
# Falls back to generic when: pytest not installed, no test files found,
3030
# collection fails, or collection yields no test IDs.
31-
# bash Discovers test-*.sh and run-*-tests.sh files under --test-dir
31+
# bash Discovers test-*.sh files under --test-dir
3232
# and runs each via: bash <file>
33-
# Auto-detected when: test-*.sh or run-*-tests.sh files exist
33+
# Auto-detected when: test-*.sh files exist
3434
# under --test-dir (after node and pytest auto-detect).
3535
# Falls back to generic when: no matching files found.
3636
# generic (default) Runs <command> as a single test item.
Lines changed: 151 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,151 @@
1+
#!/usr/bin/env bash
2+
# tests/scripts/test-bash-runner-discovery.sh
3+
# Tests for plugins/dso/scripts/runners/bash-runner.sh discovery behavior.
4+
#
5+
# Verifies that the bash runner discovers test-*.sh files but excludes
6+
# run-*-tests.sh aggregator scripts (which are suite orchestrators, not
7+
# individual test items).
8+
#
9+
# Usage: bash tests/scripts/test-bash-runner-discovery.sh
10+
# Returns: exit 0 if all tests pass, exit 1 if any fail
11+
12+
set -uo pipefail
13+
14+
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
15+
PLUGIN_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
16+
DSO_PLUGIN_DIR="$PLUGIN_ROOT/plugins/dso"
17+
REPO_ROOT="$(cd "$SCRIPT_DIR" && git rev-parse --show-toplevel)"
18+
BASH_RUNNER="$DSO_PLUGIN_DIR/scripts/runners/bash-runner.sh"
19+
TEST_BATCHED="$DSO_PLUGIN_DIR/scripts/test-batched.sh"
20+
21+
source "$SCRIPT_DIR/../lib/run_test.sh"
22+
23+
echo "=== test-bash-runner-discovery.sh ==="
24+
25+
# ── Helpers ──────────────────────────────────────────────────────────────────
26+
27+
_CLEANUP_DIRS=()
28+
_cleanup() { for d in "${_CLEANUP_DIRS[@]}"; do rm -rf "$d"; done; }
29+
trap _cleanup EXIT
30+
31+
# ── Test 1: bash-runner.sh exists and is executable ──────────────────────────
32+
echo "Test 1: bash-runner.sh exists and is executable"
33+
if [ -f "$BASH_RUNNER" ] && [ -r "$BASH_RUNNER" ]; then
34+
echo " PASS: bash-runner.sh exists"
35+
(( PASS++ ))
36+
else
37+
echo " FAIL: bash-runner.sh not found at $BASH_RUNNER" >&2
38+
(( FAIL++ ))
39+
fi
40+
41+
# ── Test 2: Discovery does NOT include run-*-tests.sh files ──────────────────
42+
echo "Test 2: test_discovery_excludes_aggregators — run-*-tests.sh not discovered"
43+
test_discovery_excludes_aggregators() {
44+
local tmpdir
45+
tmpdir=$(mktemp -d)
46+
_CLEANUP_DIRS+=("$tmpdir")
47+
48+
# Create test-*.sh files (should be discovered)
49+
cat > "$tmpdir/test-alpha.sh" << 'EOF'
50+
#!/usr/bin/env bash
51+
echo "PASSED: 1 FAILED: 0"
52+
exit 0
53+
EOF
54+
chmod +x "$tmpdir/test-alpha.sh"
55+
56+
cat > "$tmpdir/test-beta.sh" << 'EOF'
57+
#!/usr/bin/env bash
58+
echo "PASSED: 1 FAILED: 0"
59+
exit 0
60+
EOF
61+
chmod +x "$tmpdir/test-beta.sh"
62+
63+
# Create run-*-tests.sh aggregator (should NOT be discovered)
64+
cat > "$tmpdir/run-all-tests.sh" << 'EOF'
65+
#!/usr/bin/env bash
66+
echo "I am an aggregator — I should not be run as an individual test item"
67+
exit 0
68+
EOF
69+
chmod +x "$tmpdir/run-all-tests.sh"
70+
71+
# Run batched runner and check output for the aggregator filename
72+
local output exit_code=0
73+
output=$(TEST_BATCHED_STATE_FILE="$tmpdir/state.json" \
74+
bash "$TEST_BATCHED" --timeout=30 --runner=bash --test-dir="$tmpdir" 2>&1) || exit_code=$?
75+
76+
# The aggregator must NOT appear in the run output
77+
if echo "$output" | grep -q "run-all-tests.sh"; then
78+
echo " DEBUG: aggregator was discovered and run" >&2
79+
return 1
80+
fi
81+
82+
# test-alpha.sh and test-beta.sh must appear
83+
echo "$output" | grep -q "test-alpha.sh" || return 1
84+
echo "$output" | grep -q "test-beta.sh" || return 1
85+
}
86+
if test_discovery_excludes_aggregators; then
87+
echo " PASS: run-*-tests.sh excluded from discovery"
88+
(( PASS++ ))
89+
else
90+
echo " FAIL: run-*-tests.sh was discovered as a test item" >&2
91+
(( FAIL++ ))
92+
fi
93+
94+
# ── Test 3: Discovery DOES include test-*.sh files ───────────────────────────
95+
echo "Test 3: test_discovery_includes_test_files — test-*.sh files discovered"
96+
test_discovery_includes_test_files() {
97+
local tmpdir
98+
tmpdir=$(mktemp -d)
99+
_CLEANUP_DIRS+=("$tmpdir")
100+
101+
cat > "$tmpdir/test-gamma.sh" << 'EOF'
102+
#!/usr/bin/env bash
103+
echo "PASSED: 1 FAILED: 0"
104+
exit 0
105+
EOF
106+
chmod +x "$tmpdir/test-gamma.sh"
107+
108+
local output exit_code=0
109+
output=$(TEST_BATCHED_STATE_FILE="$tmpdir/state.json" \
110+
bash "$TEST_BATCHED" --timeout=30 --runner=bash --test-dir="$tmpdir" 2>&1) || exit_code=$?
111+
112+
echo "$output" | grep -q "test-gamma.sh"
113+
}
114+
if test_discovery_includes_test_files; then
115+
echo " PASS: test-*.sh files discovered"
116+
(( PASS++ ))
117+
else
118+
echo " FAIL: test-*.sh files not discovered" >&2
119+
(( FAIL++ ))
120+
fi
121+
122+
# ── Test 4: Warning message does not mention run-*-tests.sh ──────────────────
123+
echo "Test 4: test_warning_message_no_aggregator_mention — fallback warning updated"
124+
test_warning_message_no_aggregator_mention() {
125+
# When no test files exist, the warning should only mention test-*.sh
126+
local tmpdir
127+
tmpdir=$(mktemp -d)
128+
_CLEANUP_DIRS+=("$tmpdir")
129+
mkdir -p "$tmpdir/empty-dir"
130+
131+
local output exit_code=0
132+
output=$(TEST_BATCHED_STATE_FILE="$tmpdir/state.json" \
133+
bash "$TEST_BATCHED" --timeout=30 --runner=bash --test-dir="$tmpdir/empty-dir" 2>&1) || exit_code=$?
134+
135+
# Warning should NOT mention run-*-tests.sh
136+
if echo "$output" | grep -q "run-\*-tests.sh"; then
137+
return 1
138+
fi
139+
return 0
140+
}
141+
if test_warning_message_no_aggregator_mention; then
142+
echo " PASS: warning message does not mention run-*-tests.sh"
143+
(( PASS++ ))
144+
else
145+
echo " FAIL: warning message still mentions run-*-tests.sh" >&2
146+
(( FAIL++ ))
147+
fi
148+
149+
echo ""
150+
echo "Results: $PASS passed, $FAIL failed"
151+
[ "$FAIL" -eq 0 ]

0 commit comments

Comments
 (0)