RPOPC-1317: Extract STREAMS benchmark version from CSV metadata#49

Merged
grdumas merged 4 commits into main from fix/RPOPC-1317-streams-version-extraction
Jun 16, 2026

Conversation

@grdumas grdumas commented Jun 16, 2026

Collaborator

Summary

Fix StreamsProcessor to extract the STREAMS benchmark version from CSV metadata comments instead of incorrectly using the wrapper version.

Problem

StreamsProcessor was using the wrapper version (e.g., "v2.8") for test.version instead of the actual STREAMS benchmark version (e.g., "5.10") found in CSV metadata.

Acceptance Criteria

  • Extract benchmark version from CSV metadata comments during _parse_streams_csv()
  • Use regex pattern to match streams_version_# followed by version number
  • Store extracted version in self._benchmark_version
  • Override build_test_info() to use the extracted benchmark version instead of wrapper version
  • test.version contains the benchmark version (e.g., "5.10") not the wrapper version (e.g., "v2.8")
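
The extraction step in these criteria can be sketched roughly as below. This is a minimal illustration, not the code from the PR: the regex is the one quoted later in the review, while the function name extract_streams_version, the restriction to lines starting with "#", and the sample CSV rows are assumptions made for the example.

```python
import re
from typing import Iterable, Optional

# Pattern quoted in the review of this PR. Applying it only to '#' comment
# lines is an assumption of this sketch.
_VERSION_RE = re.compile(r"streams_version_#\s+(\S+)")


def extract_streams_version(lines: Iterable[str]) -> Optional[str]:
    """Return the first STREAMS version found in comment lines, else None."""
    for line in lines:
        stripped = line.strip()
        if not stripped.startswith("#"):
            continue  # skip header and data rows
        match = _VERSION_RE.search(stripped)
        if match:
            return match.group(1)  # first occurrence wins
    return None


# Hypothetical CSV content; the real results_streams.csv layout may differ.
sample = [
    "# streams_version_# 5.10",
    "# streams_version_# 5.11",
    "Copy,Scale,Add,Triad",
    "100.0,101.0,102.0,103.0",
]
found = extract_streams_version(sample)
missing = extract_streams_version(["Copy,Scale", "1.0,2.0"])
```

Returning None when no comment matches leaves the fallback decision to the caller, which is where the wrapper version is known.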

Changes

  • Added _benchmark_version instance variable to StreamsProcessor
  • Extract version from # streams_version_# X.Y comments in _parse_streams_csv()
  • Use first occurrence if multiple version comments present
  • Override build_test_info() to prioritize benchmark version over wrapper version
  • Fall back to wrapper version when no benchmark version found
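
The override and fallback described in this list might look something like the following. The classes here are hypothetical stand-ins (SketchTestInfo, BaseProcessorSketch, StreamsProcessorSketch), since the real base processor and TestInfo definitions are not shown in this PR; only the fallback expression self._benchmark_version or base_info.version is taken from the review.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchTestInfo:
    """Hypothetical stand-in for the project's TestInfo."""
    name: str
    version: str
    wrapper_version: str


class BaseProcessorSketch:
    """Hypothetical base class: reports the wrapper version as test.version."""

    def __init__(self, wrapper_version: str) -> None:
        self.wrapper_version = wrapper_version

    def build_test_info(self) -> SketchTestInfo:
        return SketchTestInfo(
            name="streams",
            version=self.wrapper_version,
            wrapper_version=self.wrapper_version,
        )


class StreamsProcessorSketch(BaseProcessorSketch):
    def __init__(self, wrapper_version: str) -> None:
        super().__init__(wrapper_version)
        self._benchmark_version: Optional[str] = None  # set during CSV parsing

    def build_test_info(self) -> SketchTestInfo:
        base_info = super().build_test_info()
        # Prefer the extracted benchmark version; otherwise keep the base value.
        version = self._benchmark_version or base_info.version
        return replace(base_info, version=version)


proc = StreamsProcessorSketch("v2.8")
fallback_info = proc.build_test_info()
proc._benchmark_version = "5.10"  # simulate a successful extraction
extracted_info = proc.build_test_info()
```

Because only the version field is substituted, wrapper_version continues to come from the parent implementation in both cases.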

Testing

  • Unit tests added for version extraction (5 new tests)
  • All existing tests passing (273 total tests)
  • Edge cases covered: whitespace variations, missing version, multiple versions, different formats
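
As a rough illustration of the edge cases listed above, the regex quoted in the review can be checked against a handful of comment lines. The input strings are invented for this sketch and are not taken from the project's test module.

```python
import re
from typing import Optional

VERSION_RE = re.compile(r"streams_version_#\s+(\S+)")


def extract(line: str) -> Optional[str]:
    """Return the version token from a single comment line, or None."""
    match = VERSION_RE.search(line)
    return match.group(1) if match else None


# Hypothetical inputs mapped to the value the pattern should capture.
cases = {
    "# streams_version_# 5.10": "5.10",           # basic x.y
    "#   streams_version_#    5.10   ": "5.10",   # extra spaces
    "# streams_version_#\t5.10.1": "5.10.1",      # tab separator, x.y.z
    "# streams_version_# v5.10": "v5.10",         # vX.Y
    "# streams_version_# 2026.1": "2026.1",       # YYYY.X
    "# some other comment": None,                 # no version present
}

results = {line: extract(line) for line in cases}
```

Note that \s+ requires at least one whitespace character after streams_version_#, so a line such as "# streams_version_#5.10" would not match this pattern.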

Related

Agent VM and others added 2 commits June 15, 2026 22:55
- Add test for extracting version from CSV comment
- Add test for whitespace variations
- Add test for fallback to wrapper version when missing
- Add test for using first occurrence when multiple versions
- Add test for different version number formats

Part of RPOPC-1317. Tests currently fail (RED phase).

- Add _benchmark_version instance variable to store extracted version
- Extract version from '# streams_version_# X.Y' CSV comment in _parse_streams_csv()
- Use first occurrence if multiple version comments present
- Override build_test_info() to use benchmark version for test.version
- Fall back to wrapper version when benchmark version not found

Implements RPOPC-1317. All tests passing (GREEN phase).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
coderabbitai Bot commented Jun 16, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 2ced592c-1521-4fc1-9fae-a1e319afcff3

📥 Commits

Reviewing files that changed from the base of the PR and between 2a64221 and 6d78954.

📒 Files selected for processing (2)
  • src/chronicler/processors/streams_processor.py
  • tests/test_streams_version_extraction.py
📝 Walkthrough

Walkthrough

StreamsProcessor gains an __init__ that initializes _benchmark_version and a build_test_info() override that populates TestInfo.version with a version string parsed from # streams_version_# <version> comment lines in results_streams.csv. A new test module covers extraction, whitespace tolerance, fallback, first-occurrence, and multiple version string formats.

Changes

STREAMS benchmark version extraction

| Layer / File(s) | Summary |
| --- | --- |
| Version extraction and build_test_info override (src/chronicler/processors/streams_processor.py) | __init__ initializes _benchmark_version = None; the CSV parsing loop detects # streams_version_# ... comment lines via regex and sets _benchmark_version on the first match only; build_test_info() calls the base implementation and substitutes TestInfo.version with the extracted version while keeping wrapper_version from the parent. |
| Unit tests for version extraction (tests/test_streams_version_extraction.py) | Adds _write_csv helper and five test functions covering: basic comment extraction, whitespace variations, fallback to wrapper version, first-occurrence-only behavior, and acceptance of x.y, x.y.z, vX.Y, and YYYY.X version formats. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 72.73% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: extracting STREAMS benchmark version from CSV metadata, which is the primary focus of the changeset. |
| Description check | ✅ Passed | The description is directly related to the changeset, providing a comprehensive summary of the problem, solution, acceptance criteria, and testing approach for extracting benchmark versions. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@grdumas grdumas self-assigned this Jun 16, 2026
@grdumas grdumas added the bug Something isn't working label Jun 16, 2026

@grdumas grdumas left a comment

Collaborator Author

PR Review: RPOPC-1317: Extract STREAMS benchmark version from CSV metadata

Summary

This PR successfully addresses the version conflation issue for the STREAMS benchmark by correctly parsing its version from the CSV metadata comments and utilizing the newly established build_test_info() override pattern.

Critical Issues (MUST FIX)

None found.

Security Delta

None found. No security-sensitive code was removed or weakened.

Major Issues (SHOULD FIX)

None found.

Minor Issues (NICE TO HAVE)

None found.

Nitpicks (OPTIONAL)

None found.

Positive Notes

  • Pattern Adoption: Excellent use of the build_test_info() override pattern established in the base processor. The fallback logic self._benchmark_version or base_info.version is clean and defensive.
  • Edge Case Handling: The regex r'streams_version_#\s+(\S+)' correctly handles arbitrary whitespace variations, and the condition if self._benchmark_version is None: safely ensures that only the first version comment in the file is captured.
  • Testing: The 5 new unit tests are comprehensive, covering variations in whitespace, missing versions, multiple version strings, and different version formats.

Overall Assessment

  • Status: APPROVE
  • Reasoning: The code cleanly extracts the correct benchmark version without impacting the existing wrapper version assignment. It handles the parsing robustly and includes a strong suite of tests.
  • Next Steps: Ready to merge.

Reviewed by: Gemini Pro via automated code review

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_streams_version_extraction.py (1)

23-109: ⚡ Quick win

Add a regression test for processor reuse across multiple parses.

Current tests don’t exercise calling parse_runs() twice on the same StreamsProcessor instance (first CSV has streams_version_#, second CSV omits it). That scenario should assert test.version falls back correctly on the second parse and does not retain stale benchmark version.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_streams_version_extraction.py` around lines 23 - 109, Add a new
regression test function to verify that the StreamsProcessor correctly handles
multiple sequential parse_runs() calls without retaining stale state. Create a
test that instantiates a StreamsProcessor once, calls parse_runs with a CSV
containing a streams_version_# comment (e.g., "5.10"), then calls parse_runs
again with a different CSV that omits the version comment, and verifies that
build_test_info() returns the fallback wrapper version (not the stale "5.10"
from the first parse). This ensures that the processor resets its benchmark
version appropriately on subsequent parses when the comment is absent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/chronicler/processors/streams_processor.py`:
- Around line 223-227: The `_benchmark_version` attribute in the
`StreamsProcessor` class is captured only once per processor instance and never
reset before a new parse, causing stale benchmark version metadata to persist
across multiple input parses when the same processor instance is reused. Reset
`_benchmark_version` to None at the beginning of the parse method (or whichever
method initiates a new parse operation) to ensure each parse starts with a clean
state and captures the correct version for the current input.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 8a31cf16-7453-44eb-902e-ebee28b4d7dd

📥 Commits

Reviewing files that changed from the base of the PR and between fe9c9db and 2a64221.

📒 Files selected for processing (2)
  • src/chronicler/processors/streams_processor.py
  • tests/test_streams_version_extraction.py

Comment thread src/chronicler/processors/streams_processor.py
Agent VM and others added 2 commits June 15, 2026 23:03
Add test to verify _benchmark_version doesn't leak between
parse_runs() calls when reusing the same processor instance.

Currently fails (RED phase) - demonstrates the bug where
second parse incorrectly retains first parse's version.

Addresses review feedback on PR #49.

Reset self._benchmark_version to None at the start of
parse_runs() to prevent state leakage when the same processor
instance is reused across multiple parses.

Without this fix, if a processor parsed CSV1 (with version)
then CSV2 (without version), CSV2 would incorrectly inherit
CSV1's benchmark version instead of falling back to wrapper
version.

Addresses review feedback on PR #49.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
grdumas commented Jun 16, 2026

Collaborator Author

PR Update: Addressed Review Feedback

What was done

  1. Added regression test for processor state leakage (commit 10ae8d3)

    • Test verifies _benchmark_version doesn't leak between parse_runs() calls when reusing the same processor instance
    • First parse with version "5.10", second parse without version should fall back to wrapper "v2.8"
    • Test initially failed (RED phase), confirming the bug
  2. Reset benchmark version state at start of each parse (commit 6d78954)

    • Added self._benchmark_version = None at the beginning of parse_runs()
    • Ensures clean state for each parse operation
    • Prevents stale version metadata from leaking across multiple parses

What was not done

None - all requested changes were implemented.

Why this approach

The state leakage bug was subtle because all existing tests created a fresh StreamsProcessor instance for each test case and never exercised the reuse scenario. In production, if the same processor instance processed multiple result files, the second file would incorrectly inherit the first file's benchmark version instead of extracting its own or falling back to the wrapper version, because the first-occurrence guard (if self._benchmark_version is None) only assigns while the attribute is still unset.

Resetting _benchmark_version at the start of parse_runs() is the minimal fix that:

  • Ensures each parse starts with clean state
  • Follows the principle that parse_runs() is the entry point for processing a new result
  • Has zero performance impact
  • Maintains backward compatibility
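
The leak and the reset can be reproduced with a small stand-in class. This is a sketch under stated assumptions: ReusableProcessorSketch, its reset_on_parse flag, the parse_runs(csv_lines) signature, and the sample rows are all hypothetical and exist only to contrast the pre-fix and post-fix behavior described above.

```python
import re
from typing import List, Optional, Tuple

_VERSION_RE = re.compile(r"streams_version_#\s+(\S+)")


class ReusableProcessorSketch:
    """Hypothetical stand-in for StreamsProcessor; not the project's code."""

    def __init__(self, wrapper_version: str, reset_on_parse: bool = True) -> None:
        self.wrapper_version = wrapper_version
        self.reset_on_parse = reset_on_parse  # False mimics the pre-fix behavior
        self._benchmark_version: Optional[str] = None

    def parse_runs(self, csv_lines: List[str]) -> None:
        if self.reset_on_parse:
            self._benchmark_version = None  # start every parse from clean state
        for line in csv_lines:
            match = _VERSION_RE.search(line)
            # First-occurrence guard: only assign while still unset.
            if match and self._benchmark_version is None:
                self._benchmark_version = match.group(1)

    def version(self) -> str:
        return self._benchmark_version or self.wrapper_version


# Hypothetical inputs: the first file carries a version comment, the second does not.
WITH_VERSION = ["# streams_version_# 5.10", "Copy,100.0"]
WITHOUT_VERSION = ["Copy,101.0"]


def versions_after_two_parses(reset_on_parse: bool) -> Tuple[str, str]:
    """Parse both inputs on one instance and report the version after each."""
    proc = ReusableProcessorSketch("v2.8", reset_on_parse=reset_on_parse)
    proc.parse_runs(WITH_VERSION)
    first = proc.version()
    proc.parse_runs(WITHOUT_VERSION)
    return first, proc.version()


before_fix = versions_after_two_parses(reset_on_parse=False)
after_fix = versions_after_two_parses(reset_on_parse=True)
```

Without the reset, the second parse reports the stale "5.10"; with it, the second parse falls back to "v2.8", which matches the failing and passing assertions shown under Verification.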

Verification

All 274 tests pass (6 version extraction tests + 7 other STREAMS tests + 261 other tests).

Before fix:

test_streams_version_resets_between_parses FAILED
AssertionError: Second parse should fall back to wrapper version, not retain stale '5.10'
assert '5.10' == 'v2.8'

After fix:

tests/test_streams_version_extraction.py::test_streams_version_resets_between_parses PASSED

The PR is now ready for re-review.


Responded by: Claude Sonnet 4.5 via automated workflow

@grdumas grdumas left a comment

Collaborator Author

PR Review: RPOPC-1317: Extract STREAMS benchmark version from CSV metadata

Summary

This update cleanly addresses the potential issue of state leakage when a single StreamsProcessor instance processes multiple distinct test runs sequentially.

Critical Issues (MUST FIX)

None found.

Major Issues (SHOULD FIX)

None found.

Minor Issues (NICE TO HAVE)

None found.

Nitpicks (OPTIONAL)

None found.

Positive Notes

  • State Management: Resetting self._benchmark_version = None at the top of parse_runs() is a great catch and an excellent practice for defensive programming, ensuring no stale data leaks between processing tasks.
  • Testing: The new test, test_streams_version_resets_between_parses, confirms that the stale-version regression no longer occurs. Testing stateful behavior like this is critical for long-running batch systems.

Overall Assessment

  • Status: APPROVE
  • Reasoning: The core implementation remains robust, and the state-leakage fix ensures safety during batch processing. Test coverage remains excellent and all 274 tests pass.
  • Next Steps: Ready to merge.

Reviewed by: Gemini Pro via automated code review

@grdumas grdumas left a comment

Collaborator Author

LGTM

@grdumas grdumas merged commit 1614097 into main Jun 16, 2026
2 checks passed
@grdumas grdumas deleted the fix/RPOPC-1317-streams-version-extraction branch June 16, 2026 03:08