Report unit test files with no result #3105
Conversation
/bot run
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.

Use the following commands to manage reviews, or use the checkboxes below for quick actions:
📝 Walkthrough

Adds deterministic per-test JUnit XML path computation, centralizes recording of failed and "no result" tests, updates the sequential and parallel runners to treat missing JUnit artifacts as "NO RESULT", and updates the final summary to report and list no-result tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Runner
    participant PyTest
    participant FS as Filesystem
    participant Collator
    Runner->>FS: compute junit_file_for_test(test.py)
    Runner->>FS: rm -f junit_file
    Runner->>PyTest: run pytest --junitxml=junit_file
    PyTest-->>FS: write junit_file (or not)
    PyTest-->>Runner: return exit status
    Runner->>Collator: submit job metadata (status, junit_file)
    Collator->>FS: check junit_file and marker files
    alt junit exists
        Collator->>Collator: increment PASSED_TESTS / FAILED_TESTS based on junit contents
    else missing junit
        Collator->>Collator: record_no_result_test(test.py)
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Code Review
This pull request enhances the test utility script by introducing tracking for tests that fail to produce result artifacts, categorized as "No result." It adds helper functions for recording test outcomes and describing missing artifacts, while updating the execution summary to include these new metrics. Review feedback suggests using printf for joining array elements to avoid Bash IFS limitations and recommends deduplicating the failed_count calculation for better maintainability.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/test_utils.sh (1)
Lines 655-669: ⚠️ Potential issue | 🔴 Critical

`read` in the sort loop drops `junit_file` into `file_index`, corrupting the pid extraction.
`test_result_files[$pid]` now stores 5 colon-separated fields (`result_file:test_file:log_file:file_index:junit_file`, set at line 632), but this loop still reads only 4 variables:

```bash
IFS=':' read -r result_file test_file log_file file_index <<< "${test_result_files[$pid]}"
```

With 4 target vars, Bash packs the remaining input (including the separator) into the last variable, so `file_index` becomes `"<idx>:<abs/junit/path>.xml"`. Then:

```bash
sorted_pids+=("$file_index:$pid")  # e.g. "1:/abs/junit/tests_foo.xml:12345"
...
local pid="${entry#*:}"  # strips only "1:" → "/abs/junit/tests_foo.xml:12345"
```

So `pid` is no longer a real pid, the subsequent `test_result_files[$pid]` lookup returns empty, the `[ -f "$result_file" ]` check at line 678 takes the `else` branch, and every parallel test result is recorded as NO RESULT, regardless of actual pass/fail. `EXIT_CODE` is also forced to 1 unconditionally.

This silently inverts the intent of the PR on the parallel path (which `task_run_unit_tests.sh` enables by default via `PARALLEL_TESTS=true`).

🐛 Proposed fix

```diff
 for pid in "${!test_result_files[@]}"; do
-    local result_file test_file log_file file_index
-    IFS=':' read -r result_file test_file log_file file_index <<< "${test_result_files[$pid]}"
+    local result_file test_file log_file file_index junit_file
+    IFS=':' read -r result_file test_file log_file file_index junit_file <<< "${test_result_files[$pid]}"
     sorted_pids+=("$file_index:$pid")
 done
```

Alternatively, keep `${entry#*:}` robust by using longest-match when extracting the pid: `local pid="${entry##*:}"`. The `read` fix above is still required so downstream consumers see a correct `junit_file`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/test_utils.sh` around lines 655 - 669, The loop that builds and then processes sorted_pids corrupts pid extraction because test_result_files stores five colon-separated fields but the first read (inside the sort loop) only reads four variables, causing file_index to capture the junit_file and later making local pid="${entry#*:}" produce a non-pid; fix by reading all five fields there (use IFS=':' read -r result_file test_file log_file file_index junit_file <<< "${test_result_files[$pid]}") so sorted_pids gets a clean numeric file_index, and make the pid extraction robust when processing entries by using longest-match extraction (replace local pid="${entry#*:}" with local pid="${entry##*:}") to ensure the actual pid is recovered from "file_index:pid" entries; these changes touch the variables/test_result_files handling around sorted_pids, the read lines, and the pid extraction in the processing loop.
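The field-packing behavior of `read` that this bug hinges on can be reproduced in isolation. The record below is a minimal sketch with hypothetical values, not the script's actual data:

```shell
# Bash packs any leftover input (separators included) into the LAST
# variable named in `read`. Hypothetical 5-field record:
entry="result.txt:tests/test_foo.py:run.log:1:/abs/junit/tests_foo.xml"

# Four target variables: the 5th field is folded into file_index.
IFS=':' read -r result_file test_file log_file file_index <<< "$entry"
echo "$file_index"    # -> 1:/abs/junit/tests_foo.xml

# Five target variables: file_index stays a clean numeric index.
IFS=':' read -r result_file test_file log_file file_index junit_file <<< "$entry"
echo "$file_index"    # -> 1
echo "$junit_file"    # -> /abs/junit/tests_foo.xml
```

This is why the one-variable-short `read` corrupts every downstream consumer of `file_index` rather than failing loudly.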
🧹 Nitpick comments (2)
scripts/test_utils.sh (2)
Lines 197-212: Minor: `local IFS=', '` joins with only the first character.

`${missing[*]}` expansion uses the first character of `IFS` as the separator, so the resulting string is comma-separated (the space is ignored for joining; it just happens to appear in the literal JUnit path). The current output is correct, but the `' '` in `IFS=', '` is misleading — consider `local IFS=','` and include an explicit space in a join if desired, e.g. a small loop or `printf`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/test_utils.sh` around lines 197 - 212, The IFS assignment in describe_missing_artifacts is misleading because IFS=', ' only uses the first character as the separator; change the join to use local IFS=',' and then explicitly add a space when printing (e.g. expand missing with separator and insert spaces) or implement a small loop/printf to join with ", " so the output reliably uses a comma+space separator when echoing "${missing[*]}"; update the IFS line and the final echo to use the chosen join method while keeping the missing array logic unchanged.
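The first-character-only join is easy to see in a standalone sketch; the array contents here are illustrative, not the script's:

```shell
missing=("result marker" "JUnit XML")

# "${missing[*]}" joins on the FIRST character of IFS only:
joined=$(IFS=', '; echo "${missing[*]}")
echo "$joined"    # -> result marker,JUnit XML  (no space after the comma)

# An explicit printf join produces the intended ", " separator:
joined2=$(printf '%s' "${missing[0]}"; printf ', %s' "${missing[@]:1}")
echo "$joined2"   # -> result marker, JUnit XML
```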
515-590: Verify: job-info field parsing assumes colon-free paths.
`job_info` is packed as `pid:test_file:result_file:log_file:file_index:junit_file` and parsed with `IFS=':' read -r` into 6 variables at line 631. Since `junit_file` is last, trailing colons in its path would be tolerated, but a colon anywhere in `JUNIT_DIR` (from `realpath ./junit`) or in the working directory path would shift earlier fields and misparse. In typical CI this is fine, but if `JUNIT_DIR` is ever overridden to a path containing `:`, downstream logic (including the sort loop at line 658) will silently break.

Consider using a different delimiter (e.g. tab or `$'\x1f'`) or emitting structured lines (one field per line) to eliminate this fragility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/test_utils.sh` around lines 515 - 590, The job-info string emitted by run_single_test_background is split on ':' (job_info -> pid:test_file:result_file:log_file:file_index:junit_file) which breaks when any path contains ':'; change the delimiter to a safe separator (e.g. $'\x1f' / ASCII unit separator or '\t' or NUL) when echoing the job_info line and update the parser that does IFS=':' read -r ... to split on the same new separator (or use read -d for NUL-terminated fields), ensuring symbols to update include run_single_test_background (the echo of "$pid:...:$junit_file") and the consumer that reads job_info (the IFS=':' read -r ... handling and the sort/loop that iterates over those lines).
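The safer-delimiter suggestion can be sketched as follows; the field values are hypothetical and the real script's packing differs:

```shell
sep=$'\x1f'   # ASCII unit separator: effectively never appears in paths

pid=12345
test_file="tests/a:b/test_x.py"      # colon in the path on purpose
junit_file="/tmp/j:unit/x.xml"
job_info="${pid}${sep}${test_file}${sep}${junit_file}"

# Splitting on the same separator recovers the fields intact:
IFS=$sep read -r got_pid got_test got_junit <<< "$job_info"
echo "$got_pid"    # -> 12345
echo "$got_test"   # -> tests/a:b/test_x.py
echo "$got_junit"  # -> /tmp/j:unit/x.xml
```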
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/test_utils.sh (1)
Lines 687-710: ⚠️ Potential issue | 🟡 Minor

Preserve sanity case counts before the no-result early exit.

Line 687 can `continue` before consuming the `PASSED:total:sampled` / `FAILED:total:sampled` metadata. In parallel sanity mode, a no-result file with a result marker will be counted as executed but omitted from coverage totals.

Proposed fix
```diff
     TOTAL_TESTS=$((TOTAL_TESTS + 1))
+    if [ "$mode" = "sanity" ] && [[ "$result" == PASSED* || "$result" == FAILED* ]]; then
+        local total_in_file sampled_in_file
+        # shellcheck disable=SC2034  # status is part of the read but unused
+        IFS=':' read -r _ total_in_file sampled_in_file <<< "$result"
+        TOTAL_TEST_CASES=$((TOTAL_TEST_CASES + total_in_file))
+        SAMPLED_TEST_CASES=$((SAMPLED_TEST_CASES + sampled_in_file))
+    fi
+
     if [ ! -f "$junit_file" ]; then
         echo "⚠️ NO RESULT: $test_file (missing JUnit XML: $junit_file)"
         record_no_result_test "$test_file"
         continue
     fi
     if [[ "$result" == PASSED* ]]; then
         PASSED_TESTS=$((PASSED_TESTS + 1))
-        if [ "$mode" = "sanity" ]; then
-            local total_in_file sampled_in_file
-            # shellcheck disable=SC2034  # status is part of the read but unused
-            IFS=':' read -r _ total_in_file sampled_in_file <<< "$result"
-            TOTAL_TEST_CASES=$((TOTAL_TEST_CASES + total_in_file))
-            SAMPLED_TEST_CASES=$((SAMPLED_TEST_CASES + sampled_in_file))
-        fi
     elif [[ "$result" == FAILED* ]]; then
         record_failed_test "$test_file"
-        if [ "$mode" = "sanity" ]; then
-            local total_in_file sampled_in_file
-            # shellcheck disable=SC2034  # status is part of the read but unused
-            IFS=':' read -r _ total_in_file sampled_in_file <<< "$result"
-            TOTAL_TEST_CASES=$((TOTAL_TEST_CASES + total_in_file))
-            SAMPLED_TEST_CASES=$((SAMPLED_TEST_CASES + sampled_in_file))
-        fi
     fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/test_utils.sh` around lines 687 - 710, The no-result early continue skips updating sanity-mode counters, so parse and apply the PASSED:total:sampled or FAILED:total:sampled metadata before the missing-JUnit early exit: in the block around record_no_result_test and the junit_file check, if mode == "sanity" inspect the existing result variable (same format handled later) and increment TOTAL_TEST_CASES and SAMPLED_TEST_CASES accordingly before calling record_no_result_test and continue; update the same parsing logic used in the PASSED*/FAILED* branches (use IFS=':' read -r _ total_in_file sampled_in_file) so counts remain consistent for both passed/failed and no-result cases while leaving record_no_result_test and PASSED_TESTS/record_failed_test behavior unchanged.
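The marker parsing this fix relies on is just a three-field split; a quick sketch with a made-up marker value:

```shell
# A result marker of the form STATUS:total:sampled, as described in the
# comment above; the concrete value is hypothetical:
result="PASSED:12:4"

# The leading status field is discarded into `_`:
IFS=':' read -r _ total_in_file sampled_in_file <<< "$result"
echo "$total_in_file"    # -> 12
echo "$sampled_in_file"  # -> 4
```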
🧹 Nitpick comments (1)
scripts/test_utils.sh (1)
Lines 174-178: Consider making JUnit paths collision-resistant for robustness.

Line 177's `/` → `_` mapping is theoretically lossy. While the current test file set has no collisions, distinct test files could map to the same XML name if naming patterns overlap (e.g., `tests/foo_bar/test_x.py` and `tests/foo/bar_test_x.py`). With the new `rm -f` in parallel jobs, collisions could cause one job to delete another's JUnit artifact, corrupting no-result reporting. Adding a checksum suffix would make the mapping collision-resistant for future-proofing.

Suggested direction
```diff
 junit_file_for_test() {
     local test_file=$1
-    echo "${JUNIT_DIR}/${test_file//\//_}.xml"
+    local safe_name
+    local suffix
+    safe_name=${test_file//\//_}
+    suffix=$(printf '%s' "$test_file" | cksum | awk '{print $1}')
+    echo "${JUNIT_DIR}/${safe_name}.${suffix}.xml"
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/test_utils.sh` around lines 174 - 178, The junit_file_for_test function currently maps slashes to underscores which can collide (e.g., tests/foo_bar/test_x.py vs tests/foo/bar_test_x.py); change junit_file_for_test to append a short checksum of the original test_file to the generated name to make it collision-resistant: keep the existing sanitized name (${JUNIT_DIR}/${test_file//\//_}.xml) but add a suffix like -<hexsum> before .xml computed from the raw test_file (use md5sum/sha1 or shasum -a1 and truncate to e.g. 8-12 chars) so the function junit_file_for_test and variable JUNIT_DIR produce unique, deterministic JUnit paths across parallel jobs.
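A self-contained sketch of the suggested direction, with `JUNIT_DIR` defaulted locally and the checksum choice purely illustrative:

```shell
JUNIT_DIR=${JUNIT_DIR:-./junit}

junit_file_for_test() {
    local test_file=$1 safe_name suffix
    safe_name=${test_file//\//_}
    # cksum's CRC keeps the name deterministic while disambiguating
    # files that flatten to the same underscore form.
    suffix=$(printf '%s' "$test_file" | cksum | awk '{print $1}')
    echo "${JUNIT_DIR}/${safe_name}.${suffix}.xml"
}

# These two paths flatten to the same name; the suffix tells them apart:
junit_file_for_test "tests/foo_bar/test_x.py"
junit_file_for_test "tests/foo/bar_test_x.py"
```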
/bot run
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/test_utils.sh (1)
Lines 681-709: ⚠️ Potential issue | 🟠 Major

Handle malformed result markers so CI cannot exit green.

If `$result_file` exists but is empty/corrupt and `$junit_file` exists, this block increments `TOTAL_TESTS` but records neither failed nor no-result, leaving `EXIT_CODE` unchanged even though the summary derives a failure. Since `scripts/task_run_unit_tests.sh` exits with `EXIT_CODE`, classify unknown markers explicitly.

🐛 Proposed fix
```diff
-    if [[ "$result" == PASSED* ]]; then
-        PASSED_TESTS=$((PASSED_TESTS + 1))
-    elif [[ "$result" == FAILED* ]]; then
-        record_failed_test "$test_file"
-    fi
+    case "$result" in
+        PASSED|PASSED:*)
+            PASSED_TESTS=$((PASSED_TESTS + 1))
+            ;;
+        FAILED|FAILED:*)
+            record_failed_test "$test_file"
+            ;;
+        *)
+            echo "⚠️ NO RESULT: $test_file (unrecognized result marker: $result_file)"
+            record_no_result_test "$test_file"
+            ;;
+    esac
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/test_utils.sh` around lines 681 - 709, The test-result handling loop increments TOTAL_TESTS but currently ignores malformed/unknown $result markers (empty or corrupt) and so neither records a failure nor a no-result; update the branch after checking for JUnit XML and result prefixes in the block using TOTAL_TESTS, PASSED_TESTS, record_failed_test and record_no_result_test so that any $result that does not match the expected "PASSED:*", "FAILED:*" or "SKIPPED*" patterns is treated explicitly (e.g., call record_failed_test or record_no_result_test) to ensure malformed markers contribute to EXIT_CODE rather than leaving them unaccounted for.
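The case-based classification can be exercised standalone; the `classify_result` helper below is a hypothetical stand-in for the counter updates, not the script's code:

```shell
classify_result() {
    local result=$1
    case "$result" in
        PASSED|PASSED:*) echo "passed" ;;      # would bump PASSED_TESTS
        FAILED|FAILED:*) echo "failed" ;;      # would call record_failed_test
        *)               echo "no-result" ;;   # empty/corrupt marker no longer slips through
    esac
}

classify_result "PASSED:12:4"   # -> passed
classify_result ""              # -> no-result (previously fell through silently)
```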
It means that the test ran but did not complete. Most likely due to OOM but we don’t have enough info beyond that.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/test_utils.sh`:
- Around line 210-214: The record_no_result_test function updates
NO_RESULT_TESTS and NO_RESULT_COUNT but doesn't mark the overall run as failed;
modify record_no_result_test to set a non-zero EXIT_CODE (e.g., EXIT_CODE=1)
when invoked so suites that only have missing JUnit/result artifacts are treated
as failures; update the function (record_no_result_test) to assign EXIT_CODE=1
after incrementing NO_RESULT_COUNT and before returning.
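A minimal reconstruction of what the suggested `record_no_result_test` would look like; the surrounding globals are assumed from the prompt above, not copied from the script:

```shell
EXIT_CODE=0
NO_RESULT_COUNT=0
NO_RESULT_TESTS=()

record_no_result_test() {
    NO_RESULT_TESTS+=("$1")
    NO_RESULT_COUNT=$((NO_RESULT_COUNT + 1))
    EXIT_CODE=1   # per the review: a missing artifact must fail the run
}

record_no_result_test "tests/test_oom.py"   # hypothetical no-result test
echo "$NO_RESULT_COUNT $EXIT_CODE"          # -> 1 1
```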
📌 Description
Currently, we have a handful of tests during parallel runs that are being OOM killed due to high memory usage. I previously attempted to resolve it in #2961, but some of the tests will need a little more pruning before they can be added in solo as they take too long in addition to consuming tons of memory.
🔍 Related Issues
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
- Installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- Installed the hooks with `pre-commit install`.
- Ran `pre-commit run --all-files` and fixed any reported issues.

🧪 Tests

- Tests have been added or updated as needed (`unittest`, etc.).

Reviewer Notes
Summary by CodeRabbit
Bug Fixes
Chores