@rockythorn
Collaborator

Fix CSAF parser for modular packages and add EUS filtering

Summary

This PR fixes the CSAF (Common Security Advisory Framework) parser to correctly extract modular package information and adds filtering to exclude EUS (Extended Update Support) products from advisory processing. It also fixes a bug in the CSV merge logic that was preventing advisory updates from being detected.

Changes

Modular Package Extraction Fix:

  • Rewrote package extraction to use product_tree structure instead of vulnerability product_status
  • New _extract_packages_from_product_tree() function traverses the product tree to extract NEVRAs directly from product_id fields
  • Correctly handles both regular and modular packages by extracting from product_version entries
  • Strips modular suffix (::module:stream) from NEVRAs to get clean package identifiers
  • Filters out non-RPM artifacts (containers, etc.) using PURL validation

EUS Product Filtering:

  • Added _is_eus_product() function to detect EUS/E4S/AUS/TUS products
  • Checks both CPE product field and product name keywords for reliable detection
  • Filters EUS products at multiple levels:
    • During affected product extraction
    • During package extraction from product tree
  • Skips advisories that only affect EUS products
  • Prevents EUS-only packages from contaminating regular release repositories

CSV Merge Fix:

  • Fixed merge order to prioritize changes.csv over releases.csv
  • changes.csv contains the most recent modification timestamps for advisories
  • releases.csv contains original publication dates
  • Previous order was causing updated advisories to be skipped if they were already in releases.csv

Web UI Addition:

  • Added admin interface for managing CSAF index timestamp
  • Allows administrators to manually adjust the "last indexed" timestamp
  • Useful for reprocessing advisories or troubleshooting ingestion issues

Technical Details

The previous implementation extracted packages from vulnerability.product_status.fixed, which only contained regular packages. Modular packages are stored in the product_tree structure under product_version entries with format "package-nevra::module:stream". The new implementation correctly traverses this structure and extracts both package types.
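
As a rough illustration of the suffix handling described above, here is a minimal sketch. The helper name `strip_module_suffix` and the sample product IDs are made up for this example and are not taken from the Apollo code; the only thing assumed from this PR is the "package-nevra::module:stream" format.

```python
def strip_module_suffix(product_id: str) -> str:
    """Return the NEVRA portion of a product_version product_id.

    Hypothetical helper: assumes the modular suffix, when present,
    starts at the first "::" in the string.
    """
    # Regular packages contain no "::", so split() returns them unchanged.
    return product_id.split("::", 1)[0]


# Illustrative values only, not real advisory data.
regular_id = "bash-0:5.1.0-1.el9.x86_64"
modular_id = "redis-0:7.0.0-1.module+el9.x86_64::redis:7"

print(strip_module_suffix(regular_id))
print(strip_module_suffix(modular_id))
```

Splitting on the double colon leaves the single colon after the epoch untouched, so the full NEVRA survives.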

EUS products are detected using CPE strings (e.g., cpe:/a:redhat:rhel_e4s:9.0) and product name keywords. These products provide extended support for specific minor versions and should not be mixed with regular releases.
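
A minimal sketch of the two-step check, assuming the CPE product sits in the fourth colon-separated field. The constant names match the ones introduced in this PR, but their contents here are guesses (only `rhel_e4s` appears in the description above), and `is_eus_product` is a simplified stand-in for `_is_eus_product()`, not its actual body.

```python
# Assumed values: only "rhel_e4s" is confirmed by the PR description.
EUS_CPE_PRODUCTS = frozenset({"rhel_eus", "rhel_e4s", "rhel_aus", "rhel_tus"})

# Assumed keywords, matched case-insensitively against the product name.
EUS_PRODUCT_NAME_KEYWORDS = frozenset({
    "extended update support",
    "advanced update support",
    "telecommunications update service",
    "update services for sap",
})


def is_eus_product(product_name: str, cpe: str) -> bool:
    """Return True if either the CPE or the product name looks like EUS."""
    # cpe:/a:redhat:rhel_e4s:9.0 -> ["cpe", "/a", "redhat", "rhel_e4s", "9.0"]
    parts = cpe.split(":")
    if len(parts) > 3 and parts[3] in EUS_CPE_PRODUCTS:
        return True
    name = product_name.lower()
    return any(keyword in name for keyword in EUS_PRODUCT_NAME_KEYWORDS)
```

Checking the CPE first keeps the common case cheap; the name keywords act as a fallback when the CPE is missing or uses an unexpected product field.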

@rockythorn rockythorn force-pushed the bugfix/modular-package-extraction branch from 9e85e56 to 5ad84c9 Compare November 10, 2025 19:57
This commit enhances the generate_rocky_config.py script with two key improvements:

1. Flexible version matching for RHEL 8/9/10+ compatibility:
   - Major-only filtering (e.g., --version 9): Matches any minor version within
     that major version (9.0, 9.1, 9.2, 9.6, etc.)
   - Full version filtering (e.g., --version 9.6): Requires exact match to the
     specified major.minor version

   This addresses differences in Red Hat's advisory format across RHEL versions:
   - RHEL 8 & 9: Advisories typically don't include minor versions
   - RHEL 10+: Advisories now include minor versions (e.g., "RHEL 10.2")

   The flexible matching ensures that repository configurations can be generated
   with appropriate version matching rules (NULL match_minor_version for RHEL 8/9,
   specific match_minor_version for RHEL 10+).
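
   The two matching modes can be sketched roughly as follows. The function
   name `version_matches` is hypothetical and the real script may structure
   this differently; the sketch only mirrors the rules stated above.

```python
def version_matches(filter_version: str, candidate: str) -> bool:
    """Compare a --version filter against a candidate version string.

    Hypothetical helper: a major-only filter ("9") matches any minor
    version, while a full filter ("9.6") requires an exact major.minor.
    """
    wanted = filter_version.split(".")
    found = candidate.split(".")
    if len(wanted) == 1:
        # Major-only filter: ignore the candidate's minor version.
        return found[0] == wanted[0]
    # Full filter: major and minor must both match.
    return found[:2] == wanted[:2]


print(version_matches("9", "9.6"))    # major-only filter
print(version_matches("9.6", "9.5"))  # full filter, different minor
```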

2. Custom mirror naming with --mirror-name-base option:
   - Allows specifying a custom base name for generated mirror configurations
   - Example: --mirror-name-base "Rocky Linux 9" generates "Rocky Linux 9 x86_64"
     instead of "Rocky Linux 9.6 x86_64"
   - Useful for creating legacy product entries or custom naming schemes
   - Works in combination with --name-suffix for additional flexibility

These changes improve Apollo's ability to generate configurations that align
with Red Hat's advisory matching requirements across different major versions.
- Remove redundant None and empty string checks in mirror name building
- Consolidate version filtering logic into single condition block
- Eliminate unnecessary ternary operator in version parsing
Any advisory that addresses at least one CVE should be considered a
Security Advisory and should be returned by the OSV API. Instead of
filtering strictly on the advisory "kind" (e.g., Security, Bug Fix,
Enhancement), we should filter based on whether the advisory has
associated CVEs.
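
A sketch of the intended filter, using a stand-in `Advisory` dataclass rather than Apollo's actual model. The field names and the sample identifiers are placeholders for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Advisory:
    """Stand-in for the real advisory model (assumed fields)."""
    name: str
    kind: str
    cves: list[str] = field(default_factory=list)


def osv_candidates(advisories: list[Advisory]) -> list[Advisory]:
    """Keep advisories that reference at least one CVE, whatever their kind."""
    return [advisory for advisory in advisories if advisory.cves]


# Placeholder identifiers, not real advisories or CVEs.
advisories = [
    Advisory("ADV-0001", "Security", ["CVE-0000-0001"]),
    Advisory("ADV-0002", "Bug Fix", ["CVE-0000-0002"]),
    Advisory("ADV-0003", "Enhancement"),
]

print([advisory.name for advisory in osv_candidates(advisories)])
```

Under a kind-based filter the second advisory would be dropped even though it fixes a CVE; filtering on the CVE list keeps it.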
Remove self-explanatory comments that restate what the code does:
- Removed obvious filter condition comments
- Removed type conversion comment
- Removed severity calculation comment
This commit refactors the Red Hat CSAF parser to fix two major issues:

1. Modular Package Extraction Bug
   - Old code failed to extract modular packages due to ::module:stream suffix
   - New code extracts NEVRA directly from product_tree product_id field
   - Strips ::module:stream suffix while preserving full NEVRA with epoch
   - Fixes 12+ affected advisories (e.g., RHSA-2025:12008 for redis:7)

2. EUS Advisory Filtering
   - Detects EUS/E4S/AUS/TUS products via CPE and product name
   - Filters out EUS-only advisories during ingestion
   - Reduces processed advisories by ~50%
   - Skips advisories where all products are EUS-related

Changes:
- apollo/rhcsaf/__init__.py:
  - Added _is_eus_product() helper for EUS detection
  - Added _extract_packages_from_product_tree() for product_tree parsing
  - Updated extract_rhel_affected_products_for_db() to filter EUS products
  - Updated red_hat_advisory_scraper() to use new extraction and skip EUS-only

- apollo/tests/test_rhcsaf.py:
  - Updated test data to include product_version entries
  - Added TestEUSDetection class (3 tests)
  - Added TestModularPackages class (1 test)
  - Added TestEUSAdvisoryFiltering class (1 test)

Validation:
- Standalone testing in temp/modular_package_fix/ confirmed:
  - 18 modular packages extracted (was 0)
  - Regular packages work identically (no regression)
  - EUS advisories correctly filtered
  - All data fields preserved (CVEs, Bugzillas, metadata)
The previous code incorrectly let releases.csv overwrite changes.csv timestamps.
This caused the workflow to miss advisory updates, as changes.csv contains the
most recent modification times while releases.csv contains original publication
dates.

With this fix, when Red Hat updates advisories (like the mass update on
2025-11-07), the workflow will correctly detect and reprocess them.

Changes:
- Reversed merge order: {**releases, **changes} so changes.csv takes precedence
- Updated comment to clarify the intended behavior
- Ensures updated advisories are reprocessed to catch corrections/additions
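
A small sketch of why the merge order matters, using made-up file names and timestamps. In a dict unpacking merge, later entries win on key collisions.

```python
# Illustrative data only: advisory path -> timestamp string.
releases = {
    "advisory-a.json": "2025-01-15T00:00:00+00:00",  # original publication
    "advisory-b.json": "2025-02-01T00:00:00+00:00",
}
changes = {
    "advisory-a.json": "2025-11-07T00:00:00+00:00",  # later modification
}

# Old order: releases.csv overwrites the newer changes.csv timestamp.
old_merge = {**changes, **releases}

# Fixed order: changes.csv takes precedence on shared keys.
new_merge = {**releases, **changes}

print(old_merge["advisory-a.json"])
print(new_merge["advisory-a.json"])
```

Entries that appear only in releases.csv are still present after the merge, so newly published advisories are unaffected by the reordering.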
Add admin interface to view and update the last_indexed_at timestamp
that controls which CSAF advisories are processed by the Poll RHCSAF
workflow.

Changes:
- Add DatabaseService methods for getting and updating last_indexed_at
- Add admin route handlers for timestamp management
- Add UI section with date picker and automatic ISO 8601 conversion
- Remove duplicate timestamp display from Poll RHCSAF section
- Fix preview results text readability
- Add comprehensive unit tests for DatabaseService
- Update BUILD.bazel and CI workflow to include new tests
This commit fixes multiple issues in test_csaf_processing.py that caused
CI failures:

1. Missing unittest.main() call
   - Added 'if __name__ == "__main__": unittest.main()' block
   - Without this, Bazel's py_test runs the file as a script but never
     executes the tests, causing false positives
   - pytest doesn't need this (auto-discovers tests), but Bazel does

2. Fixed async test lifecycle methods
   - Changed 'async def tearDown' to 'async def asyncTearDown'
   - Removed incorrect @classmethod decorators from asyncSetUp/asyncTearDown
   - These must be instance methods in unittest.IsolatedAsyncioTestCase
   - Consolidated setUp logic into asyncSetUp
   - Added close_test_db() call to asyncTearDown for proper cleanup

3. Updated test CSAF data structure
   - Added product_version entries in product_tree (required by refactored parser)
   - Changed from EUS to MAIN product variant (EUS products are filtered out)
   - Added proper product_id, purl, and CPE format
   - The refactored CSAF parser (commit ccb297e) extracts packages from
     product_tree instead of vulnerabilities.product_status.fixed

4. Fixed test assertions
   - Changed minor_version expectation from 4 to None (CPE has no minor version)
   - Fixed test_no_fixed_packages to remove product_tree entries instead of
     just clearing the fixed array

Root cause analysis:
- Bazel tests were never actually running (missing unittest.main())
- GitHub Actions tests were running via pytest in Integration Tests step
- pytest auto-discovers unittest tests without needing __main__ block
- This is why CI showed failures while local Bazel tests appeared to pass

All tests now pass in both Bazel and pytest environments.
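
For reference, a minimal sketch of the async lifecycle shape described in item 2. `FakeDb` and the test class are invented for this example and stand in for the real test database helpers. The suite is run programmatically here so the sketch is self-contained; in a file executed by Bazel's py_test, the `if __name__ == "__main__": unittest.main()` block from item 1 would serve as the entry point instead.

```python
import unittest


class FakeDb:
    """Invented stand-in for a test database connection."""

    def __init__(self) -> None:
        self.is_open = False

    async def connect(self) -> None:
        self.is_open = True

    async def close(self) -> None:
        self.is_open = False


class ExampleAsyncTest(unittest.IsolatedAsyncioTestCase):
    # Instance methods, no @classmethod: the framework awaits these
    # around every test method.
    async def asyncSetUp(self) -> None:
        self.db = FakeDb()
        await self.db.connect()

    async def asyncTearDown(self) -> None:
        await self.db.close()

    async def test_connection_is_open(self) -> None:
        self.assertTrue(self.db.is_open)


def run_suite() -> unittest.TestResult:
    """Load and run the example test case, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleAsyncTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A method named `async def tearDown` is not awaited by the async lifecycle hooks, so cleanup placed there does not run as intended; `asyncTearDown` is the hook that `IsolatedAsyncioTestCase` awaits.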
Extracted magic constants from _is_eus_product() function to improve
maintainability and readability:

- EUS_CPE_PRODUCTS: CPE product identifiers for EUS variants
- EUS_PRODUCT_NAME_KEYWORDS: Keywords for identifying EUS products

Both constants are frozensets, which keeps them immutable and gives
constant-time membership checks.
- Move product_name and cpe declarations closer to usage
- Simplify modular package NEVRA extraction using split directly
- Remove redundant nevra variable and empty string check
Replace explicit length comparison with truthiness check for
red_hat_affected_products set.
Removed comments that simply restated what the code clearly does.
Kept only comments that provide non-obvious context such as:
- CPE format examples
- Product ID format variations
- Business logic explanations
- Remove redundant str() calls in f-strings
- Use 'raise ... from e' to preserve exception chain
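
A short sketch of the exception chaining pattern. `AdvisoryParseError` and `parse_minor_version` are hypothetical names chosen for the example, not identifiers from the parser.

```python
class AdvisoryParseError(Exception):
    """Hypothetical domain error raised when advisory data is malformed."""


def parse_minor_version(raw: str) -> int:
    """Convert a minor version string to int, chaining the original error."""
    try:
        return int(raw)
    except ValueError as e:
        # "from e" stores the ValueError on __cause__, so the traceback
        # shows both the domain error and the underlying failure.
        raise AdvisoryParseError(f"invalid minor version: {raw!r}") from e
```

Note the f-string uses `{raw!r}` directly; wrapping the value in `str()` first would be redundant.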
Converted nested helper functions to standalone pure functions:
- _traverse_for_eus: Now takes and returns product_eus_map explicitly
- _extract_packages_from_branches: Now takes and returns packages explicitly

This makes the code more testable, readable, and eliminates hidden
state mutations from closure variables.
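
The shape of that refactor, sketched with a hypothetical `collect_product_ids` function. It assumes the CSAF branch layout (nested `branches`, each with a `category` and an optional `product` carrying a `product_id`) and is not the actual body of `_extract_packages_from_branches`.

```python
def collect_product_ids(branches: list[dict], packages: frozenset[str]) -> frozenset[str]:
    """Walk a product_tree branch list and return the accumulated product IDs.

    The accumulator is passed in and returned explicitly instead of being
    mutated through a closure, so the function can be tested in isolation.
    """
    for branch in branches:
        if branch.get("category") == "product_version":
            product_id = branch.get("product", {}).get("product_id")
            if product_id:
                packages = packages | {product_id}
        # Recurse into nested branches, threading the accumulator through.
        packages = collect_product_ids(branch.get("branches", []), packages)
    return packages


# Illustrative tree, not real advisory data.
tree = [
    {
        "category": "vendor",
        "branches": [
            {
                "category": "product_version",
                "product": {"product_id": "pkg-a-0:1.0-1.el9.x86_64"},
            },
            {
                "category": "product_name",
                "product": {"product_id": "ignored-product"},
            },
        ],
    }
]

print(sorted(collect_product_ids(tree, frozenset())))
```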
Check if advisory only affects EUS products immediately after
verifying vulnerabilities exist, before extracting packages,
CVEs, and other data. This saves processing time for advisories
that will be skipped anyway.

Also cleaned up redundant product_full_name variable.
@rockythorn rockythorn force-pushed the bugfix/modular-package-extraction branch from bb80ab9 to d391c3d Compare November 13, 2025 20:52
@rockythorn rockythorn closed this Nov 17, 2025