Fix speccpu2017 processor to find CSVs in nested result directories #22

Merged
grdumas merged 1 commit into redhat-performance:main from sayalibhavsar:fix/speccpu2017-nested-csv-discovery
Jun 14, 2026
@sayalibhavsar sayalibhavsar commented Jun 10, 2026

Contributor

Summary

  • The speccpu2017-wrapper v2.6 produces archives where CSV results are nested inside results_speccpu_*/result/ rather than directly in result/
  • The processor searched only one level deep with glob("*.csv"), so all SPEC CPU 2017 results came back silently empty: an empty document with no suite data was indexed to OpenSearch
  • Switched from glob() to rglob() so the processor discovers CSVs regardless of directory nesting depth

Archive structure produced by wrapper v2.6

results_speccpu2017.zip
  └── results_speccpu2017_virtual-guest.tar
        └── speccpu2017_{timestamp}/                          ← extracted_path
              ├── meta_data.yml
              ├── version
              └── results_speccpu_virtual-guest_{timestamp}/  ← CSVs are here
                    └── result/
                          ├── CPU2017.003.intrate.refrate.results.csv
                          └── CPU2017.004.fprate.refrate.results.csv

The processor searched extracted_path/ and extracted_path/result/ but never extracted_path/results_speccpu_*/result/.
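The difference between the two search strategies can be reproduced outside the processor. The following is a minimal sketch, not the processor's actual code: it rebuilds the layout shown above in a temporary directory (the timestamp 20260610 and the helper name build_layout are made up for illustration) and compares pathlib's glob() with rglob() from extracted_path.

```python
import tempfile
from pathlib import Path

CSV_NAMES = (
    "CPU2017.003.intrate.refrate.results.csv",
    "CPU2017.004.fprate.refrate.results.csv",
)


def build_layout(root: Path) -> Path:
    """Recreate the wrapper v2.6 layout; the timestamp is a placeholder."""
    extracted = root / "speccpu2017_20260610"
    result = extracted / "results_speccpu_virtual-guest_20260610" / "result"
    result.mkdir(parents=True)
    (extracted / "meta_data.yml").touch()
    (extracted / "version").touch()
    for name in CSV_NAMES:
        (result / name).touch()
    return extracted


with tempfile.TemporaryDirectory() as tmp:
    extracted_path = build_layout(Path(tmp))
    # glob() only matches direct children of extracted_path.
    shallow = sorted(p.name for p in extracted_path.glob("*.csv"))
    # rglob() walks every subdirectory, so nesting depth no longer matters.
    deep = sorted(p.name for p in extracted_path.rglob("*.csv"))

print("glob :", shallow)
print("rglob:", deep)
```

Here glob() finds nothing because the CSVs sit two directories below extracted_path, while rglob() returns both files. One trade-off worth noting is that rglob() will also pick up any unrelated CSVs elsewhere under the search root.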

Test plan

The speccpu2017-wrapper v2.6 produces archives where CSV results are
nested inside results_speccpu_*/result/ rather than directly in
result/. The processor only searched one level deep with glob(),
causing all SPEC CPU 2017 results to be silently empty.

Switch from glob() to rglob() so the processor discovers CSVs
regardless of directory depth.

coderabbitai Bot commented Jun 10, 2026


Walkthrough

This PR modifies SpecCPU2017Processor to discover SPEC CPU 2017 suite CSV files using recursive directory traversal instead of a non-recursive scan, so the processor can find timestamp CSVs in nested result subdirectories. A new test validates the behavior against the wrapper v2.6+ directory layout.

Changes

Recursive CSV Discovery in Nested Results

| Layer / File(s) | Summary |
|---|---|
| Recursive CSV file discovery in fallback path (src/chronicler/processors/speccpu2017_processor.py) | SpecCPU2017Processor.parse_runs replaces glob("*.csv") with rglob("*.csv") to recursively search the chosen result directory for intrate and fprate timestamp CSVs. |
| Test nested CSV discovery (tests/test_speccpu2017_timestamps.py) | New test test_speccpu2017_csv_in_nested_results_subdir verifies the processor discovers and parses SPEC CPU 2017 intrate and fprate CSVs located under nested results_speccpu_*/result/ subdirectories. |
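As a rough illustration of what such a test exercises, here is a hedged sketch. It does not use the real SpecCPU2017Processor or the actual test from the PR: find_suite_csvs and check_nested_discovery are hypothetical names, and matching on the suite keyword in the filename is an assumption about how intrate and fprate files are told apart.

```python
import tempfile
from pathlib import Path


def find_suite_csvs(search_root: Path) -> dict[str, list[Path]]:
    """Hypothetical helper: group CSVs under search_root by suite keyword."""
    found: dict[str, list[Path]] = {"intrate": [], "fprate": []}
    for csv_path in sorted(search_root.rglob("*.csv")):
        for suite in found:
            # Assumed naming convention: CPU2017.NNN.<suite>.refrate.results.csv
            if f".{suite}." in csv_path.name:
                found[suite].append(csv_path)
    return found


def check_nested_discovery() -> dict[str, list[str]]:
    """Build a nested results_speccpu_*/result/ tree and search it from the top."""
    with tempfile.TemporaryDirectory() as tmp:
        extracted = Path(tmp) / "speccpu2017_ts"
        result_dir = extracted / "results_speccpu_virtual-guest_ts" / "result"
        result_dir.mkdir(parents=True)
        (result_dir / "CPU2017.003.intrate.refrate.results.csv").touch()
        (result_dir / "CPU2017.004.fprate.refrate.results.csv").touch()
        found = find_suite_csvs(extracted)
        return {suite: [p.name for p in paths] for suite, paths in found.items()}


nested = check_nested_discovery()
print(nested)
```

A test along these lines would fail with a non-recursive glob("*.csv") and pass once the search is recursive, which is the regression the PR guards against.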

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Fix speccpu2017 processor to find CSVs in nested result directories' clearly and concisely summarizes the main change: switching from glob() to rglob() to discover CSV files at any nesting depth. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description clearly explains the issue with speccpu2017-wrapper v2.6 producing nested CSV directories, the specific problem with glob() vs rglob(), and includes the exact archive structure and test coverage. |

@grdumas grdumas left a comment

Collaborator

LGTM

@grdumas grdumas merged commit 91727d5 into redhat-performance:main Jun 14, 2026
2 checks passed
@grdumas grdumas added the bug Something isn't working label Jun 14, 2026

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

speccpu2017_processor fails to find CSV results - incorrect directory search depth