Skip to content

Commit f286710

Browse files
authored
Merge pull request #52 from frankbria/feature/issue-10-cli-parsing-tests
test(cli): add comprehensive CLI argument parsing tests
2 parents f476022 + 3e76f80 commit f286710

3 files changed

Lines changed: 702 additions & 0 deletions

File tree

.github/workflows/test.yml

Lines changed: 146 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,12 @@ on:
66
pull_request:
77
branches: [ main ]
88

9+
env:
10+
# Coverage threshold - configurable, not hardcoded
11+
# Set to 0 to disable threshold enforcement
12+
COVERAGE_THRESHOLD: 70
13+
KCOV_VERSION: "42"
14+
915
jobs:
1016
test:
1117
runs-on: ubuntu-latest
@@ -37,3 +43,143 @@ jobs:
3743
run: |
3844
echo "## Test Results" >> $GITHUB_STEP_SUMMARY
3945
echo "✅ Unit tests passed" >> $GITHUB_STEP_SUMMARY
46+
47+
coverage:
48+
runs-on: ubuntu-latest
49+
needs: test
50+
51+
steps:
52+
- uses: actions/checkout@v3
53+
54+
- name: Setup Node.js
55+
uses: actions/setup-node@v3
56+
with:
57+
node-version: '18'
58+
59+
- name: Install dependencies
60+
run: |
61+
npm install
62+
sudo apt-get update
63+
sudo apt-get install -y jq
64+
65+
- name: Build and install kcov from source
66+
run: |
67+
# Install kcov build dependencies
68+
sudo apt-get install -y \
69+
cmake \
70+
g++ \
71+
binutils-dev \
72+
libcurl4-openssl-dev \
73+
libdw-dev \
74+
libiberty-dev \
75+
zlib1g-dev \
76+
libssl-dev
77+
78+
# Clone and build kcov
79+
git clone --depth 1 --branch v${KCOV_VERSION} https://github.com/SimonKagstrom/kcov.git /tmp/kcov-src
80+
cd /tmp/kcov-src
81+
mkdir build && cd build
82+
cmake -DCMAKE_INSTALL_PREFIX=/usr/local ..
83+
make -j$(nproc)
84+
sudo make install
85+
86+
# Verify installation
87+
/usr/local/bin/kcov --version
88+
89+
- name: Verify kcov installation
90+
run: |
91+
which kcov
92+
kcov --version
93+
94+
- name: Run tests with coverage
95+
run: |
96+
mkdir -p coverage
97+
98+
# Run CLI parsing tests under kcov
99+
kcov --include-path="$(pwd)/ralph_loop.sh,$(pwd)/lib" \
100+
--exclude-pattern=tests/,node_modules/ \
101+
coverage/cli-parsing \
102+
bash -c "bats tests/unit/test_cli_parsing.bats" || true
103+
104+
# Run all unit tests under kcov for comprehensive coverage
105+
kcov --include-path="$(pwd)/ralph_loop.sh,$(pwd)/lib" \
106+
--exclude-pattern=tests/,node_modules/ \
107+
coverage/all-unit \
108+
bash -c "bats tests/unit/" || true
109+
110+
- name: Parse coverage results
111+
id: coverage
112+
run: |
113+
# Extract coverage percentage from kcov JSON output
114+
COVERAGE_FILE="coverage/all-unit/kcov-merged/coverage.json"
115+
116+
if [[ -f "$COVERAGE_FILE" ]]; then
117+
COVERAGE_PCT=$(jq -r '.percent_covered // "0"' "$COVERAGE_FILE" | cut -d'.' -f1)
118+
echo "coverage_percent=$COVERAGE_PCT" >> $GITHUB_OUTPUT
119+
echo "Coverage: ${COVERAGE_PCT}%"
120+
else
121+
# Fallback: try to find any coverage.json
122+
COVERAGE_FILE=$(find coverage -name "coverage.json" -type f 2>/dev/null | head -1)
123+
if [[ -n "$COVERAGE_FILE" && -f "$COVERAGE_FILE" ]]; then
124+
COVERAGE_PCT=$(jq -r '.percent_covered // "0"' "$COVERAGE_FILE" | cut -d'.' -f1)
125+
echo "coverage_percent=$COVERAGE_PCT" >> $GITHUB_OUTPUT
126+
echo "Coverage (from $COVERAGE_FILE): ${COVERAGE_PCT}%"
127+
else
128+
echo "coverage_percent=0" >> $GITHUB_OUTPUT
129+
echo "Warning: Could not find coverage results"
130+
# List what we do have for debugging
131+
find coverage -type f -name "*.json" 2>/dev/null || echo "No JSON files found"
132+
ls -laR coverage/ 2>/dev/null || echo "Coverage directory empty or not found"
133+
fi
134+
fi
135+
136+
- name: Check coverage threshold
137+
run: |
138+
COVERAGE=${{ steps.coverage.outputs.coverage_percent }}
139+
THRESHOLD=${{ env.COVERAGE_THRESHOLD }}
140+
141+
echo "## Coverage Report" >> $GITHUB_STEP_SUMMARY
142+
echo "" >> $GITHUB_STEP_SUMMARY
143+
echo "| Metric | Value |" >> $GITHUB_STEP_SUMMARY
144+
echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY
145+
echo "| Coverage | ${COVERAGE}% |" >> $GITHUB_STEP_SUMMARY
146+
echo "| Threshold | ${THRESHOLD}% |" >> $GITHUB_STEP_SUMMARY
147+
echo "" >> $GITHUB_STEP_SUMMARY
148+
149+
if [[ "$THRESHOLD" -eq 0 ]]; then
150+
echo "✅ Coverage threshold enforcement disabled" >> $GITHUB_STEP_SUMMARY
151+
echo "Coverage threshold enforcement disabled (COVERAGE_THRESHOLD=0)"
152+
exit 0
153+
fi
154+
155+
if [[ -z "$COVERAGE" || "$COVERAGE" == "0" ]]; then
156+
echo "⚠️ Coverage measurement failed - skipping threshold check" >> $GITHUB_STEP_SUMMARY
157+
echo "Coverage measurement failed - skipping threshold check"
158+
exit 0
159+
fi
160+
161+
if [[ "$COVERAGE" -lt "$THRESHOLD" ]]; then
162+
echo "❌ Coverage ${COVERAGE}% is below threshold ${THRESHOLD}%" >> $GITHUB_STEP_SUMMARY
163+
echo "::error::Coverage ${COVERAGE}% is below threshold ${THRESHOLD}%"
164+
exit 1
165+
else
166+
echo "✅ Coverage ${COVERAGE}% meets threshold ${THRESHOLD}%" >> $GITHUB_STEP_SUMMARY
167+
echo "Coverage ${COVERAGE}% meets threshold ${THRESHOLD}%"
168+
fi
169+
170+
- name: Upload coverage artifacts
171+
uses: actions/upload-artifact@v4
172+
if: always()
173+
with:
174+
name: coverage-report
175+
path: coverage/
176+
retention-days: 7
177+
178+
- name: Upload coverage to Codecov (optional)
179+
uses: codecov/codecov-action@v4
180+
if: always()
181+
continue-on-error: true
182+
with:
183+
directory: coverage/all-unit
184+
fail_ci_if_error: false
185+
verbose: true
Lines changed: 195 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,195 @@
1+
# Code Review Report: CLI Parsing Tests
2+
3+
**Date:** 2026-01-08
4+
**Reviewer:** Code Review Agent
5+
**Component:** CLI Argument Parsing Unit Tests
6+
**Files Reviewed:** `tests/unit/test_cli_parsing.bats`
7+
**Ready for Production:** Yes
8+
9+
## Executive Summary
10+
11+
The CLI parsing test file is well-structured and provides comprehensive coverage of all 12 CLI flags in `ralph_loop.sh`. The tests follow BATS best practices with proper isolation, setup/teardown, and clear organization. One minor enhancement opportunity identified.
12+
13+
**Critical Issues:** 0
14+
**Major Issues:** 0
15+
**Minor Issues:** 1
16+
**Positive Findings:** 6
17+
18+
---
19+
20+
## Review Context
21+
22+
**Code Type:** Test Infrastructure (BATS unit tests)
23+
**Risk Level:** Low
24+
**Business Constraints:** Test reliability and maintainability
25+
26+
### Review Focus Areas
27+
28+
The review focused on the following areas based on context analysis:
29+
- ✅ Test Quality and Coverage - Primary concern for test code
30+
- ✅ Test Isolation and Cleanup - Prevent flaky tests
31+
- ✅ Resource Management - Temp directory handling
32+
- ✅ Code Maintainability - Long-term test maintenance
33+
- ❌ OWASP Web Security - Not applicable to test infrastructure
34+
- ❌ OWASP LLM/ML Security - Not applicable
35+
36+
---
37+
38+
## Priority 1 Issues - Critical
39+
40+
**None identified.**
41+
42+
---
43+
44+
## Priority 2 Issues - Major
45+
46+
**None identified.**
47+
48+
---
49+
50+
## Priority 3 Issues - Minor
51+
52+
### Missing dedicated test for `--allowed-tools` validation
53+
54+
**Location:** `tests/unit/test_cli_parsing.bats`
55+
**Severity:** Minor
56+
**Category:** Test Coverage
57+
58+
**Problem:**
59+
The `--allowed-tools` flag is tested in the "All flags combined" test (line 276) but lacks a dedicated test for its validation behavior. The implementation in `ralph_loop.sh:976-981` calls `validate_allowed_tools()` which should be tested independently.
60+
61+
**Recommendation:**
62+
Add a dedicated test for `--allowed-tools` validation to match the pattern used for other validated flags like `--timeout` and `--output-format`.
63+
64+
**Suggested Approach:**
65+
```bash
66+
@test "--allowed-tools flag accepts valid tool list" {
67+
run bash "$RALPH_SCRIPT" --allowed-tools "Write,Read,Bash" --help
68+
69+
assert_success
70+
[[ "$output" == *"Usage:"* ]]
71+
}
72+
```
73+
74+
**Note:** This is low priority since the flag is covered in combination tests and the validation function may have its own tests elsewhere.
75+
76+
---
77+
78+
## Positive Findings
79+
80+
### Excellent Practices
81+
82+
- **Comprehensive Flag Coverage:** All 12 CLI flags are tested including both long and short forms
83+
- **Boundary Testing:** The `--timeout` test validates edge cases (0, 1, 120, 121, -5, "abc")
84+
- **Clear Organization:** Well-structured sections with descriptive headers make tests easy to navigate
85+
- **Early Exit Pattern:** Clever use of `--help` as escape hatch to test flag parsing without triggering main loop
86+
87+
### Good Architectural Decisions
88+
89+
- **Test Isolation:** Each test creates its own temp directory with proper cleanup in teardown
90+
- **Minimal Stubs:** Only creates stub libraries actually needed by CLI parsing, not the entire system
91+
- **Git Initialization:** Proper setup of git repo required by some flags
92+
93+
### Testing Wins
94+
95+
- **Short Flag Equivalence:** Bonus tests verify `-c`, `-p`, `-s`, `-m`, `-v`, `-t` work identically to long forms
96+
- **Multiple Flag Combinations:** Tests verify flags work together and are order-independent
97+
- **Error Message Validation:** Tests check for specific error messages, not just failure status
98+
99+
---
100+
101+
## Team Collaboration Needed
102+
103+
### Handoffs to Other Agents
104+
105+
**Architecture Agent:**
106+
- No issues identified
107+
108+
**UX Designer Agent:**
109+
- Not applicable for CLI tests
110+
111+
**DevOps Agent:**
112+
- Tests integrate well with existing CI/CD via `bats tests/unit/`
113+
114+
---
115+
116+
## Testing Recommendations
117+
118+
### Unit Tests Needed
119+
- [x] Help flag tests (2) - Implemented
120+
- [x] Flag value setting tests (6) - Implemented
121+
- [x] Status flag tests (2) - Implemented
122+
- [x] Circuit breaker tests (2) - Implemented
123+
- [x] Invalid input tests (3) - Implemented
124+
- [x] Multiple flags tests (3) - Implemented
125+
- [x] Flag order tests (2) - Implemented
126+
- [x] Short flag equivalence tests (6) - Implemented (bonus)
127+
- [ ] Dedicated `--allowed-tools` validation test - Optional enhancement
128+
129+
### Integration Tests
130+
- Existing integration tests in `tests/integration/` cover full loop execution
131+
132+
---
133+
134+
## Future Considerations
135+
136+
### Patterns for Project Evolution
137+
- If new CLI flags are added, this test file provides a clear template
138+
- Consider extracting flag validation functions for easier unit testing
139+
140+
### Technical Debt Items
141+
- Minor: Could add `--allowed-tools` dedicated test (non-blocking)
142+
143+
---
144+
145+
## Compliance & Best Practices
146+
147+
### Testing Standards Met
148+
- ✅ BATS framework used consistently
149+
- ✅ Setup/teardown isolation pattern
150+
- ✅ Clear test naming conventions
151+
- ✅ Both positive and negative test cases
152+
- ✅ Boundary value testing
153+
154+
### Enterprise Best Practices
155+
- Test file follows project conventions from `test_helper.bash`
156+
- Uses fixtures helper for consistency
157+
- Proper temp directory cleanup prevents resource leaks
158+
159+
---
160+
161+
## Action Items Summary
162+
163+
### Immediate (Before Production)
164+
None - code is ready for merge
165+
166+
### Short-term (Next Sprint)
167+
1. Consider adding dedicated `--allowed-tools` validation test (optional)
168+
169+
### Long-term (Backlog)
170+
None identified
171+
172+
---
173+
174+
## Conclusion
175+
176+
The CLI parsing test file is production-ready with excellent coverage of all CLI flags. The test design is sound, using the `--help` escape hatch pattern to validate argument parsing without triggering the main execution loop. Tests are well-isolated with proper resource cleanup.
177+
178+
**Recommendation:** Approve for merge. The one minor issue (missing dedicated `--allowed-tools` test) is non-blocking since the flag is tested in combination with other flags.
179+
180+
---
181+
182+
## Appendix
183+
184+
### Tools Used for Review
185+
- Manual code review
186+
- BATS test execution
187+
188+
### References
189+
- BATS documentation
190+
- Project CLAUDE.md testing standards
191+
192+
### Metrics
193+
- **Lines of Code Reviewed:** 354
194+
- **Test Cases Reviewed:** 26
195+
- **CLI Flags Covered:** 12/12 (100%)

0 commit comments

Comments
 (0)