RPOPC-1347: Add UUID tracking fields to Chronicler schema by grdumas · Pull Request #58 · redhat-performance/chronicler

grdumas · 2026-06-29T23:00:45Z

Summary

Add three UUID fields (project_uuid, run_uuid, result_uuid) to the Metadata dataclass in Chronicler schema to support Zathras UUID-based tracking for better result correlation across different indices and systems.

Acceptance Criteria

Add three UUID fields to the Metadata dataclass: project_uuid, run_uuid, result_uuid
All UUID fields are Optional[str] for backward compatibility
UUIDs are included in to_dict() method output when present
UUIDs are excluded from calculate_content_hash() (they're identifiers, not content)
Verify schema serialization/deserialization with UUIDs present and absent
Confirm backward compatibility with documents lacking UUID fields
Update test_schema.py with UUID test cases

Changes

Added project_uuid, run_uuid, result_uuid fields to Metadata dataclass (all Optional[str])
Updated calculate_content_hash() to exclude UUIDs from hash calculation
Added comprehensive unit tests for UUID serialization and backward compatibility
Updated docstring to document UUID exclusion from content hash

Testing

Unit tests added for UUID serialization (present and absent)
Unit tests added for UUID exclusion from content hash
All 43 schema tests passing
Backward compatibility verified (existing tests still pass)

Technical Notes

UUIDs use Optional[str] type for backward compatibility
Existing to_dict() method already handles Optional fields correctly
UUIDs are excluded from content hash because they are identifiers for tracking/linking, not part of the actual test result content
This ensures duplicate detection is based on test data, not tracking metadata

- Add failing tests for project_uuid, run_uuid, result_uuid fields - Implement three optional UUID fields in Metadata dataclass - Verify serialization includes UUIDs when present - Verify backward compatibility when UUIDs are None Part of RPOPC-1347. Supports Zathras UUID-based tracking for better result correlation across different indices and systems. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Add failing test for UUID exclusion from content hash - Implement UUID exclusion in calculate_content_hash() - Update docstring to document UUID exclusion UUIDs are identifiers for tracking/linking, not part of the actual test result content. Excluding them ensures duplicate detection is based on test data, not tracking metadata. Part of RPOPC-1347. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

coderabbitai · 2026-06-29T23:00:57Z

Warning

Review limit reached

@grdumas, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 42 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: c53d6d56-c9a6-4612-a774-2ad29afe42d7

📥 Commits

Reviewing files that changed from the base of the PR and between a823b10 and 122482a.

📒 Files selected for processing (2)

src/chronicler/schema.py
tests/test_schema.py

📝 Walkthrough

Walkthrough

Three optional UUID fields (project_uuid, run_uuid, result_uuid) are added to Metadata. ZathrasDocument.calculate_content_hash is updated to strip these fields before SHA256 computation. Tests verify serialization presence/absence and hash stability across differing UUIDs.

Changes

UUID Fields and Hash Exclusion

Layer / File(s)	Summary
Metadata UUID fields and hash exclusion logic `src/chronicler/schema.py`	Adds `project_uuid`, `run_uuid`, `result_uuid` as optional fields on `Metadata`; updates `calculate_content_hash` docstring and removes these fields from `doc_dict['metadata']` before JSON serialization and SHA256 hashing.
Tests for UUID serialization and hash stability `tests/test_schema.py`	Adds `to_dict()` tests asserting UUID fields serialize when set and are omitted when absent; adds `calculate_content_hash` test confirming differing UUID values produce identical hashes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly matches the main change: adding UUID tracking fields to the Chronicler schema.
Description check	✅ Passed	The description is directly about the schema UUID fields and hash behavior changed in this PR.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/chronicler/schema.py`:
- Line 397: The hash contract in calculate_content_hash and the related schema
logic is inconsistent because UUIDs are only removed when
exclude_processing_timestamp is true. Update the UUID exclusion path in
src/chronicler/schema.py so project_uuid, run_uuid, and result_uuid are always
omitted from the hashed content, regardless of the timestamp flag, and keep the
timestamp handling separate from UUID filtering. Verify the normalization used
by calculate_content_hash matches the documented behavior and the constants or
helpers that build the content payload.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: b4135823-2f91-4250-86f9-1aa908b5bb7e

📥 Commits

Reviewing files that changed from the base of the PR and between 1e1ae48 and a823b10.

📒 Files selected for processing (2)

src/chronicler/schema.py
tests/test_schema.py

grdumas

PR Review: RPOPC-1347: Add UUID tracking fields to Chronicler schema

Summary

PR adds three optional UUID fields (project_uuid, run_uuid, result_uuid) to the Metadata dataclass for Zathras integration. Implementation is correct with proper backward compatibility and content hash exclusion. No blocking issues found.

Critical Issues (MUST FIX)

None found.

Major Issues (SHOULD FIX)

None found.

Minor Issues (NICE TO HAVE)

None found.

Nitpicks (OPTIONAL)

None found.

Positive Notes

Proper use of Optional[str] for backward compatibility with existing documents
Correct exclusion of UUIDs from content hash calculation (identifiers vs content)
Comprehensive test coverage including serialization, backward compatibility, and hash exclusion
Clear documentation in docstring explaining why UUIDs are excluded from hash
Test assertions verify both presence and absence of UUID fields
Consistent with existing field patterns (pcp_archive_url follows same Optional approach)

Overall Assessment

Status: APPROVE
Reasoning: Clean implementation with appropriate type safety, backward compatibility, and test coverage. UUIDs are correctly treated as opaque identifiers rather than validated content, which is appropriate for external tracking metadata from Zathras.
Next Steps: Ready to merge

Reviewed by: Claude Sonnet 4.5 via automated code review

Restructured calculate_content_hash() to always exclude UUIDs, document_id, and pcp_archive_url regardless of the exclude_processing_timestamp parameter. This ensures the implementation matches the documented contract. Previously, UUIDs were only excluded when exclude_processing_timestamp was True, which violated the docstring promise that "UUIDs are always excluded" from the hash. - Moved UUID/document_id/pcp_archive_url removal outside conditional - Kept timestamp removal conditional as intended - Added test verifying UUIDs excluded with timestamp flag False Addresses review feedback on PR #58. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

grdumas · 2026-06-29T23:18:20Z

PR Update: Addressed Review Feedback

What was done

Fixed UUID exclusion logic (commit 122482a)
- Restructured calculate_content_hash() to separate timestamp handling from UUID/document_id/pcp_archive_url exclusion
- UUIDs, document_id, and pcp_archive_url are now always excluded from hash calculation, regardless of the exclude_processing_timestamp parameter
- Previously these fields were only excluded when exclude_processing_timestamp=True, which violated the documented contract

Why this approach

The original implementation had an inconsistency: the docstring promised "UUIDs are always excluded" from the hash, but the code only excluded them conditionally. This created a contract violation where calling calculate_content_hash(exclude_processing_timestamp=False) would include UUIDs in the hash.

The fix separates two distinct concerns:

Timestamp exclusion: Conditional (controlled by the exclude_processing_timestamp flag)
Identifier exclusion: Unconditional (document_id, pcp_archive_url, and UUIDs are always omitted)

This makes the code match its documentation and provides more predictable behavior - identifiers never affect the content hash, while timestamps can optionally be included.

Verification

Added test test_calculate_content_hash_excludes_uuids_regardless_of_timestamp_flag to verify UUIDs are excluded even when exclude_processing_timestamp=False:

# Same content, different UUIDs, timestamp flag False - UUIDs still excluded
hash1 = doc1.calculate_content_hash(exclude_processing_timestamp=False)
hash2 = doc2.calculate_content_hash(exclude_processing_timestamp=False)
assert hash1 == hash2  # ✅ Passes

All 44 schema tests pass.

The PR is now ready for re-review.

Responded by: Claude Sonnet 4.5 via automated workflow

grdumas

PR Review: RPOPC-1347: Add UUID tracking fields to Chronicler schema (Updated)

Summary

Updated PR adds three optional UUID fields (project_uuid, run_uuid, result_uuid) to the Metadata dataclass for Zathras integration. Implementation includes a correctness fix for calculate_content_hash() that makes document_id and pcp_archive_url exclusion independent of the timestamp flag, matching the documented behavior. No blocking issues found.

Critical Issues (MUST FIX)

None found.

Major Issues (SHOULD FIX)

None found.

Minor Issues (NICE TO HAVE)

File: src/chronicler/schema.py:412-425
Consideration: The refactored hash calculation logic is a behavior change for callers using exclude_processing_timestamp=False. Previously, document_id and pcp_archive_url were only excluded when the flag was True. Now they're always excluded.
Impact: This is actually a bug fix (implementation now matches docstring), but it's a breaking change for the edge case where someone passes False. The only known caller (base_processor.py:143) uses the default True, so production impact is minimal.
Suggestion: Consider documenting this behavior change in the PR description or commit message for future reference. The existing commit message for 122482a already does this well.

Nitpicks (OPTIONAL)

None found.

Positive Notes

Excellent test coverage including the edge case with exclude_processing_timestamp=False
Correct identification and fix of the document_id/pcp_archive_url exclusion bug
Clear commit message in 122482a explaining the rationale for the refactoring
Comments in the code clearly explain the "always remove" vs "conditional remove" distinction
Test name "test_calculate_content_hash_excludes_uuids_regardless_of_timestamp_flag" is descriptive
Implementation now correctly matches the documented contract in the docstring

Overall Assessment

Status: APPROVE
Reasoning: Clean implementation with appropriate bug fixes. The hash calculation refactoring corrects a long-standing bug where the implementation didn't match the documented behavior. UUID fields are properly added with backward compatibility, comprehensive test coverage, and correct hash exclusion logic. The behavioral change for exclude_processing_timestamp=False is actually a correctness improvement.
Next Steps: Ready to merge

Reviewed by: Claude Sonnet 4.5 via automated code review

Agent VM and others added 2 commits June 29, 2026 18:56

coderabbitai Bot reviewed Jun 29, 2026

View reviewed changes

Comment thread src/chronicler/schema.py

grdumas commented Jun 29, 2026

View reviewed changes

grdumas self-assigned this Jun 29, 2026

grdumas added the enhancement New feature or request label Jun 29, 2026

grdumas commented Jun 29, 2026

View reviewed changes

grdumas merged commit e2b25ca into main Jun 29, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

RPOPC-1347: Add UUID tracking fields to Chronicler schema#58

RPOPC-1347: Add UUID tracking fields to Chronicler schema#58
grdumas merged 3 commits into
mainfrom
feat/RPOPC-1347-uuid-fields

grdumas commented Jun 29, 2026

Uh oh!

coderabbitai Bot commented Jun 29, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

grdumas left a comment

Uh oh!

grdumas commented Jun 29, 2026

Uh oh!

grdumas left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

grdumas commented Jun 29, 2026

Summary

Acceptance Criteria

Changes

Testing

Technical Notes

Related

Uh oh!

coderabbitai Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

grdumas left a comment

Choose a reason for hiding this comment

PR Review: RPOPC-1347: Add UUID tracking fields to Chronicler schema

Summary

Critical Issues (MUST FIX)

Major Issues (SHOULD FIX)

Minor Issues (NICE TO HAVE)

Nitpicks (OPTIONAL)

Positive Notes

Overall Assessment

Uh oh!

grdumas commented Jun 29, 2026

PR Update: Addressed Review Feedback

What was done

Why this approach

Verification

Uh oh!

grdumas left a comment

Choose a reason for hiding this comment

PR Review: RPOPC-1347: Add UUID tracking fields to Chronicler schema (Updated)

Summary

Critical Issues (MUST FIX)

Major Issues (SHOULD FIX)

Minor Issues (NICE TO HAVE)

Nitpicks (OPTIONAL)

Positive Notes

Overall Assessment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Jun 29, 2026 •

edited

Loading