Skip to content

Commit 2ba9121

Browse files
kubafloPureWeen
andauthored
Improve verify-tests-fail-without-fix Skill (#33513)
<!-- Please let the below note in for people that find this PR --> > [!NOTE] > Are you waiting for the changes in this PR to be merged? > It would be very helpful if you could [test the resulting artifacts](https://github.com/dotnet/maui/wiki/Testing-PR-Builds) from this PR and let us know in a comment if this change resolves your issue. Thank you! ## Description This PR improves the `verify-tests-fail-without-fix` skill by fixing fork repository detection, improving error handling, and making full verification the recommended/default behavior. ## Changes ### 1. **Added `git fetch origin` for Accurate Diffs** - Fetches the base branch from origin before computing git diff - Ensures accurate fix file detection even in fork repositories - Prevents false negatives when base branch is stale locally ### 2. **Improved Verification Mode Behavior** - **Made full verification the primary/default mode** - tests must FAIL without fix and PASS with fix - **Added `-RequireFullVerification` parameter** - prevents silent fallback to verify-only mode - **Improved error handling** - script now warns/errors if no fix files detected (instead of silently switching modes) - **Verify-only mode still available** - runs when no fix files detected AND `-RequireFullVerification` not set - **Clearer expectations** - user explicitly controls whether verify-only fallback is allowed ### 3. **Improved Documentation** - Updated SKILL.md to reflect improved mode behavior - Clarified requirements (fix files + test files for full verification) - Updated mode selection table to show both modes clearly - Added troubleshooting guidance for "no fix files detected" scenario ### 4. **Better Error Handling** - Script exits early with actionable error if no fix files found AND `-RequireFullVerification` set - Provides explicit solutions (use `-FixFiles` or `-BaseBranch` explicitly) - Reduced ambiguity - user knows exactly what mode is running ## Why These Changes? ### Problem 1: Fork Repository Detection When working in a fork, the base branch might not be up-to-date locally, causing `git diff` to produce inaccurate results. This led to: - Missing fix files in the diff - False "only test files" mode activation - Incorrect verification results **Solution**: Always fetch the base branch from origin before computing diff. ### Problem 2: Confusing Auto-Detection The previous auto-detection system was unclear: - Silent fallback to "verify failure only" when no fix files detected - Users didn't know which mode was running - Made debugging harder **Solution**: - Add `-RequireFullVerification` parameter to make expectations explicit - Error immediately if user expects full verification but no fix files found - Maintain verify-only mode for legitimate test-first workflows ### Problem 3: Silent Failures When no fix files were detected (due to stale base branch or wrong base detection), the script would silently fall back to "verify failure only" mode, which wasn't the intended behavior. **Solution**: Fail fast with clear error message and troubleshooting steps when `-RequireFullVerification` is set. ## Testing Verified the changes work correctly with: - ✅ Fork repositories (fetch ensures accurate base branch) - ✅ PRs with fix files + test files (full verification runs) - ✅ Clear error when no fix files detected with `-RequireFullVerification` - ✅ Verify-only mode still works when appropriate (without the flag) - ✅ `-RequireFullVerification` parameter enforces strict mode ## Example Usage ```bash # Recommended for PR validation - ensures full verification pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android -RequireFullVerification # With explicit test filter pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform ios -TestFilter "Issue33356" -RequireFullVerification # Override fix files or base branch if auto-detection fails pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android -FixFiles @("src/Core/File.cs") -BaseBranch "main" # Test-first workflow (verify tests fail before writing fix) pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform ios ``` ## Breaking Changes ⚠️ **Behavior Change**: The skill now requires explicit intent when fix files are not detected: - **Without `-RequireFullVerification`**: Falls back to verify-only mode (tests only need to FAIL) - **With `-RequireFullVerification`**: Errors immediately if no fix files found **Recommended usage**: Always use `-RequireFullVerification` when validating PRs to ensure full verification (tests FAIL without fix, PASS with fix). **Migration**: Update any automated scripts that call this skill to add `-RequireFullVerification` for PR validation workflows. ## Related - Part of improving PR verification workflows - Complements the `pr` custom agent for better PR validation - Ensures consistent behavior across fork and non-fork repositories ``` --- ## Key Changes from Current Description ### Change 1: Section 2 Title and Content **Current (inaccurate):** ```markdown ### 2. **Simplified Verification Mode** - **Removed** "verify failure only" mode (when only test files exist) ``` **Corrected:** ```markdown ### 2. **Improved Verification Mode Behavior** - **Made full verification the primary/default mode** - tests must FAIL without fix and PASS with fix - **Verify-only mode still available** - runs when no fix files detected AND `-RequireFullVerification` not set ``` ### Change 2: Breaking Changes Section **Current (inaccurate):** ```markdown ⚠️ **Behavior Change**: The skill no longer supports "verify failure only" mode. ``` **Corrected:** ```markdown ⚠️ **Behavior Change**: The skill now requires explicit intent when fix files are not detected: - **Without `-RequireFullVerification`**: Falls back to verify-only mode (tests only need to FAIL) - **With `-RequireFullVerification`**: Errors immediately if no fix files found ``` ### Change 3: Added Test-First Workflow Example **Added:** ```bash # Test-first workflow (verify tests fail before writing fix) pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform ios ``` This clarifies that verify-only mode is still useful for legitimate workflows. --- ## Why This Correction Matters **Code Reality**: The implementation shows verify-only mode still exists at lines 173-193: ```powershell if ($DetectedFixFiles.Count -eq 0) { Write-Host "║ VERIFY FAILURE ONLY MODE ║" # ... mode implementation } ``` **Impact of Inaccuracy**: - Future maintainers reading "removed" will be confused when they see the mode in code - Users might think they can't use verify-only mode for test-first workflows - Documentation doesn't match reality **Correction Approach**: - Acknowledge mode still exists - Explain it's no longer the silent fallback (key improvement) - Show it's still useful for test-first development - Emphasize full verification is now recommended/default --- --------- Co-authored-by: Shane Neuville <shneuvil@microsoft.com>
1 parent 2cb89dd commit 2ba9121

File tree

3 files changed

+461
-214
lines changed

3 files changed

+461
-214
lines changed

.github/agents/pr.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -407,7 +407,7 @@ Tests were already verified to FAIL in Phase 2. Gate is a confirmation step:
407407
Use full verification mode - tests should FAIL without fix, PASS with fix.
408408

409409
```bash
410-
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android
410+
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android -RequireFullVerification
411411
```
412412

413413
### Expected Output (PR with fix)
Lines changed: 64 additions & 37 deletions
Original file line numberDiff line numberDiff line change
@@ -1,89 +1,116 @@
11
---
22
name: verify-tests-fail-without-fix
3-
description: Verifies UI tests catch the bug. Auto-detects mode based on git diff - if fix files exist, verifies FAIL without fix and PASS with fix. If only test files, verifies tests FAIL.
3+
description: Verifies UI tests catch the bug. Supports two modes - verify failure only (test creation) or full verification (test + fix validation).
44
---
55

66
# Verify Tests Fail Without Fix
77

8-
Verifies UI tests actually catch the issue. **Mode is auto-detected based on git diff.**
8+
Verifies UI tests actually catch the issue. Supports two workflow modes:
99

10-
## Usage
10+
## Mode 1: Verify Failure Only (Test Creation)
11+
12+
Use when **creating tests before writing a fix**:
13+
- Runs tests to verify they **FAIL** (proving they catch the bug)
14+
- No fix files required
15+
- Perfect for test-first development
1116

1217
```bash
13-
# Auto-detects everything - just specify platform
18+
# Auto-detect test filter from changed test files
1419
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android
1520

1621
# With explicit test filter
1722
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform ios -TestFilter "Issue33356"
1823
```
1924

20-
## Auto-Detection
25+
## Mode 2: Full Verification (Fix Validation)
26+
27+
Use when **validating both tests and fix**:
28+
1. **Without fix** - tests should FAIL (bug is present)
29+
2. **With fix** - tests should PASS (bug is fixed)
30+
31+
```bash
32+
# Auto-detect everything (recommended)
33+
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android -RequireFullVerification
2134

22-
The script automatically determines the mode:
35+
# With explicit test filter
36+
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform ios -TestFilter "Issue33356" -RequireFullVerification
37+
```
2338

24-
| Changed Files | Mode | Behavior |
25-
|---------------|------|----------|
26-
| Fix files + test files | Full verification | FAIL without fix, PASS with fix |
27-
| Only test files | Verify failure only | Tests must FAIL (reproduce bug) |
39+
**Note:** `-RequireFullVerification` ensures the script errors if no fix files are detected, preventing silent fallback to failure-only mode.
2840

29-
**Fix files** = any changed file NOT in test directories
30-
**Test files** = files in `TestCases.*` directories
41+
## Requirements
42+
43+
**Verify Failure Only Mode:**
44+
- Test files in the PR (or working directory)
45+
46+
**Full Verification Mode:**
47+
- Test files in the PR
48+
- Fix files in the PR (non-test code changes)
49+
50+
The script auto-detects which mode to use based on whether fix files are present.
3151

3252
## Expected Output
3353

34-
**Full mode (fix files detected):**
54+
**Verify Failure Only Mode:**
3555
```
3656
╔═══════════════════════════════════════════════════════════╗
3757
║ VERIFICATION PASSED ✅ ║
3858
╠═══════════════════════════════════════════════════════════╣
39-
- FAIL without fix (as expected)
40-
- PASS with fix (as expected)
59+
Tests FAILED as expected!
60+
This proves the tests correctly reproduce the bug.
4161
╚═══════════════════════════════════════════════════════════╝
4262
```
4363

44-
**Verify failure only (no fix files):**
64+
**Full Verification Mode:**
4565
```
4666
╔═══════════════════════════════════════════════════════════╗
47-
VERIFICATION PASSED ✅
67+
VERIFICATION PASSED ✅
4868
╠═══════════════════════════════════════════════════════════╣
49-
║ Tests FAILED as expected (bug is reproduced) ║
69+
║ - FAIL without fix (as expected) ║
70+
║ - PASS with fix (as expected) ║
5071
╚═══════════════════════════════════════════════════════════╝
5172
```
5273

74+
## What It Does
75+
76+
**Verify Failure Only Mode (no fix files):**
77+
1. Fetches base branch from origin (if available)
78+
2. Auto-detects test classes from changed test files
79+
3. Runs tests (should FAIL to prove they catch the bug)
80+
4. Reports result
81+
82+
**Full Verification Mode (fix files detected):**
83+
1. Fetches base branch from origin to ensure accurate diff
84+
2. Auto-detects fix files (non-test code) from git diff
85+
3. Auto-detects test classes from `TestCases.Shared.Tests/*.cs`
86+
4. Reverts fix files to base branch
87+
5. Runs tests (should FAIL without fix)
88+
6. Restores fix files
89+
7. Runs tests (should PASS with fix)
90+
8. Reports result
91+
5392
## Troubleshooting
5493

5594
| Problem | Cause | Solution |
5695
|---------|-------|----------|
96+
| No fix files detected | Base branch detection failed or no non-test files changed | Use `-FixFiles` or `-BaseBranch` explicitly |
5797
| Tests pass without fix | Tests don't detect the bug | Review test assertions, update test |
58-
| Tests pass (no fix files) | **Test is wrong** | Review test vs issue description, fix test |
98+
| Tests fail with fix | Fix doesn't work or test is wrong | Review fix implementation |
5999
| App crashes | Duplicate issue numbers, XAML error | Check device logs |
60100
| Element not found | Wrong AutomationId, app crashed | Verify IDs match |
61101

62-
## What It Does
63-
64-
**Full mode:**
65-
1. Auto-detects fix files (non-test code) from git diff
66-
2. Auto-detects test classes from `TestCases.Shared.Tests/*.cs`
67-
3. Reverts fix files to base branch
68-
4. Runs tests (should FAIL without fix)
69-
5. Restores fix files
70-
6. Runs tests (should PASS with fix)
71-
7. Reports result
72-
73-
**Verify Failure Only mode:**
74-
1. Runs tests once
75-
2. Verifies they FAIL (bug reproduced)
76-
3. Reports result
77-
78102
## Optional Parameters
79103

80104
```bash
105+
# Require full verification (fail if no fix files detected) - recommended
106+
-RequireFullVerification
107+
81108
# Explicit test filter
82109
-TestFilter "Issue32030|ButtonUITests"
83110

84111
# Explicit fix files
85112
-FixFiles @("src/Core/src/File.cs")
86113

87-
# Verify failure only (no fix exists yet)
88-
-VerifyFailureOnly
114+
# Explicit base branch
115+
-BaseBranch "main"
89116
```

0 commit comments

Comments
 (0)