Skip to content

Conversation

@jpramos123
Copy link
Contributor

@jpramos123 jpramos123 commented Dec 9, 2025

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED

Overview

This PR is being created to address RHINENG-22144.
This PR make changes to the MQ and notifications so it uses the new serialization and host_type fields

PR Checklist

  • Keep PR title short, ideally under 72 characters
  • Descriptive comments provided in complex code blocks
  • Include raw query examples in the PR description, if adding/modifying SQL query
  • Tests: validate optimal/expected output
  • Tests: validate exceptions and failure scenarios
  • Tests: edge cases
  • Recovers or fails gracefully during potential resource outages (e.g. DB, Kafka)
  • Uses type hinting, if convenient
  • Documentation, if this PR changes the way other services interact with host inventory
  • Links to related PRs

Secure Coding Practices Documentation Reference

You can find documentation on this checklist here.

Secure Coding Checklist

  • Input Validation
  • Output Encoding
  • Authentication and Password Management
  • Session Management
  • Access Control
  • Cryptographic Practices
  • Error Handling and Logging
  • Data Protection
  • Communication Security
  • System Configuration
  • Database Security
  • File Management
  • Memory Management
  • General Coding Practices

Summary by Sourcery

Update host MQ serialization and notifications to use normalized host fields and new system profile storage while preserving backward compatibility for downstream consumers.

New Features:

  • Include additional canonical fact fields like insights_id, subscription_manager_id, satellite_id, fqdn, and network identifiers in host serialization and delete events.
  • Build system_profile objects from normalized static and dynamic system profile tables instead of system_profile_facts.

Bug Fixes:

  • Avoid KeyError when removing null canonical facts by checking field presence before deletion.

Enhancements:

  • Introduce a reusable UUID serialization helper and apply it across host, group, staleness, and outbox payload serialization.
  • Derive event header metadata such as host_type, operating system name, and bootc status from static system profile records.
  • Align host_type handling for MQ events with downstream expectations via a backward-compatibility mapping.
  • Use normalized host fields (e.g., insights_id, subscription_manager_id, static_system_profile.owner_id) for cache invalidation and delete event processing.
  • Simplify notification canonical facts handling by always deserializing from the host payload.

@jpramos123 jpramos123 requested a review from a team as a code owner December 9, 2025 18:58
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Dec 9, 2025

Reviewer's Guide

Refactors host serialization, MQ event headers, and notifications to use the new normalized host fields and system profile tables instead of legacy canonical_facts/system_profile_facts, while adding host_type backward-compat behavior and centralizing UUID/datetime handling.

Sequence diagram for producing host update MQ event with normalized fields

sequenceDiagram
    actor Caller
    participant HostSynchronizeModule as HostSynchronize
    participant SerializationModule as Serialization
    participant Host as HostModel
    participant StaticSystemProfile as StaticProfile
    participant DynamicSystemProfile as DynamicProfile
    participant EventsModule as Events
    participant EventProducer as MQ

    Caller->>HostSynchronize: _synchronize_hosts_for_org(org_hosts_query, custom_staleness_dict, event_producer)
    loop for each host
        HostSynchronize->>Serialization: serialize_host(host, timestamps, staleness, fields, for_mq, system_profile_fields)
        activate Serialization
        Serialization->>HostModel: read basic fields (id, account, org_id, reporter, host_type, etc.)
        Serialization->>Serialization: serialize_uuid(host.id, host.insights_id, host.subscription_manager_id, host.openshift_cluster_id)
        Serialization->>Serialization: _build_system_profile_from_normalized(host, system_profile_fields)
        activate Serialization
        Serialization->>StaticProfile: iterate columns, filter, convert UUID fields
        Serialization-->>Serialization: partial system_profile (static)
        Serialization->>DynamicProfile: iterate columns, filter, convert datetime fields
        Serialization-->>Serialization: merged system_profile (static + dynamic)
        deactivate Serialization
        Serialization-->>HostSynchronize: serialized_host
        deactivate Serialization

        HostSynchronize->>Events: extract_system_profile_fields_for_headers(host.static_system_profile)
        Events-->>HostSynchronize: host_type, os_name, bootc_booted

        HostSynchronize->>Serialization: serialize_uuid(host.insights_id)
        Serialization-->>HostSynchronize: insights_id_str

        HostSynchronize->>Events: message_headers(EventType.updated, insights_id_str, host.reporter, host_type, os_name, bootc_booted)
        Events-->>HostSynchronize: headers

        HostSynchronize->>MQ: write_event(event, serialized_host.id, headers, wait=True)
    end
    HostSynchronize-->>Caller: done
Loading

Class diagram for updated host serialization and event helper modules

classDiagram
    class Host {
        +uuid id
        +string account
        +string org_id
        +string display_name
        +string ansible_host
        +uuid insights_id
        +uuid subscription_manager_id
        +string satellite_id
        +string fqdn
        +string bios_uuid
        +list~string~ ip_addresses
        +list~string~ mac_addresses
        +string provider_id
        +string provider_type
        +string reporter
        +string host_type
        +StaticSystemProfile static_system_profile
        +DynamicSystemProfile dynamic_system_profile
    }

    class StaticSystemProfile {
        +uuid org_id
        +uuid host_id
        +uuid owner_id
        +string host_type
        +string os_release
        +map operating_system
        +map bootc_status
    }

    class DynamicSystemProfile {
        +uuid org_id
        +uuid host_id
        +datetime captured_date
        +datetime last_boot_time
        +map other_dynamic_fields
    }

    class SerializationModule {
        +tuple _CANONICAL_FACTS_FIELDS
        +tuple ADDITIONAL_HOST_MQ_FIELDS
        +dict serialize_host(host, staleness_timestamps, staleness, fields, for_mq, system_profile_fields)
        +dict _build_system_profile_from_normalized(host, system_profile_fields)
        +dict serialize_host_for_export_svc(host, staleness_timestamps, staleness)
        +dict serialize_group_without_host_count(group)
        +dict serialize_group_with_host_count(group, host_count)
        +dict serialize_host_system_profile(host)
        +string serialize_uuid(u)
        +dict serialize_staleness_response(staleness)
        +dict deserialize_canonical_facts(raw_data, all)
        +void remove_null_canonical_facts(serialized_host)
        +string _map_host_type_for_backward_compatibility(host_type)
    }

    class EventsModule {
        +tuple EventType
        +tuple message_headers(event_type, insights_id, reporter, host_type, os_name, bootc_booted)
        +tuple host_delete_event(event_type, host, initiated_by_frontend, platform_metadata)
        +tuple extract_system_profile_fields_for_headers(static_system_profile)
    }

    class HostMQModule {
        +Host _set_owner(host, identity)
        +void sync_event_message(message, session, event_producer)
        +void write_delete_event_message(event_producer, result, initiated_by_frontend)
        +void write_add_update_event_message(event_producer, result, initiated_by_frontend)
    }

    class GroupRepositoryModule {
        +void _produce_host_update_events(event_producer, serialized_groups, host_list, identity)
        +void _invalidate_system_cache(host_list, identity)
    }

    class HostSynchronizeModule {
        +void _synchronize_hosts_for_org(org_hosts_query, custom_staleness_dict, event_producer)
    }

    class OutboxRepositoryModule {
        +dict _create_update_event_payload(host)
    }

    class NotificationsModule {
        +dict build_base_notification_obj(notification_type, host)
        +dict populate_events(base_notification_obj, host_list, extra_fields)
    }

    SerializationModule --> Host : uses
    SerializationModule --> StaticSystemProfile : builds_system_profile
    SerializationModule --> DynamicSystemProfile : builds_system_profile

    EventsModule --> StaticSystemProfile : reads_for_headers

    HostMQModule --> SerializationModule : serialize_host
    HostMQModule --> EventsModule : message_headers

    GroupRepositoryModule --> SerializationModule : serialize_host
    GroupRepositoryModule --> EventsModule : message_headers

    HostSynchronizeModule --> SerializationModule : serialize_host
    HostSynchronizeModule --> EventsModule : message_headers

    OutboxRepositoryModule --> SerializationModule : serialize_uuid

    NotificationsModule --> SerializationModule : deserialize_canonical_facts
Loading

File-Level Changes

Change Details Files
Extend canonical facts configuration and harden removal of null canonical facts.
  • Add new canonical fact field names (insights_id, subscription_manager_id, satellite_id, fqdn, bios_uuid, ip_addresses, mac_addresses, provider_id, provider_type) to the canonical facts field list.
  • Guard remove_null_canonical_facts against missing keys by checking key presence before accessing serialized_host entries.
app/serialization.py
Introduce normalized system profile serialization and update host serialization to use direct model fields and shared UUID serializer.
  • Add _build_system_profile_from_normalized helper that constructs system_profile from static_system_profile and dynamic_system_profile tables, with UUID and datetime normalization and exclusion of metadata fields.
  • Change serialize_host to build its base dict manually (instead of serialize_canonical_facts) and to pull identity/infra-related fields directly from host model attributes, using serialize_uuid for UUIDs.
  • Change os_release to be read from static_system_profile.os_release and openshift_cluster_id to use serialize_uuid.
  • Adjust system_profile handling to use _build_system_profile_from_normalized and to map system_profile.host_type for MQ events via a new _map_host_type_for_backward_compatibility helper.
  • Rename _serialize_uuid to serialize_uuid and update all callers, including group serialization, staleness serialization, export serialization, and system profile serialization helpers.
app/serialization.py
Align owner resolution with new subscription_manager_id field and normalized system profile for MQ owner handling and cache invalidation.
  • Update _set_owner to use host.subscription_manager_id and static_system_profile.owner_id instead of canonical_facts["subscription_manager_id"] and system_profile_facts["owner_id"].
  • Adjust delete-event and update-event flows to read insights_id from host.insights_id and owner_id from static_system_profile.owner_id when invalidating cached system keys.
app/queue/host_mq.py
Switch MQ event headers and delete-event payloads to use normalized fields and add reusable header field extraction from static system profiles.
  • Remove use of serialize_canonical_facts in host_delete_event and explicitly surface each canonical field from host model attributes in the event payload.
  • Introduce extract_system_profile_fields_for_headers to derive host_type, operating system name, and bootc booted flag from static_system_profile, and use it when building headers for delete and update events.
  • Update header construction across delete, update, group, and sync flows to use serialize_uuid(host.insights_id) and the new header extractor instead of system_profile_facts-based lookups.
app/queue/host_mq.py
app/queue/events.py
lib/group_repository.py
lib/host_synchronize.py
Normalize UUID usage in outbox reporter payloads and simplify notification canonical facts derivation.
  • Update outbox_repository reporter field construction to use serialize_uuid for satellite_id, subscription_manager_id, and insights_id.
  • Change notifications to always derive canonical_facts from the host dict via deserialize_canonical_facts instead of preferring embedded canonical_facts or system_profile_facts, ensuring consistency with new serialization.
  • Adjust event population in notifications to use the same canonical fact deserialization path for each host.
lib/outbox_repository.py
app/queue/notifications.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • In _build_system_profile_from_normalized, the UUID_FIELDS, DATETIME_FIELDS, and EXCLUDE_FIELDS sets are recreated on every call; consider lifting these to module-level constants to avoid unnecessary allocations on hot paths like host serialization.
  • _build_system_profile_from_normalized silently swallows DetachedInstanceError for both static and dynamic system profile relationships; consider at least logging at debug level so unexpected lazy-loading issues don’t get completely hidden.
  • _map_host_type_for_backward_compatibilitymapsNone and any non-edgevalue to an empty string; if the intent is to distinguish between “unknown” and “non-edge” host types in headers, consider documenting this explicitly or adjusting the mapping to preserveNone` where appropriate.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `_build_system_profile_from_normalized`, the `UUID_FIELDS`, `DATETIME_FIELDS`, and `EXCLUDE_FIELDS` sets are recreated on every call; consider lifting these to module-level constants to avoid unnecessary allocations on hot paths like host serialization.
- `_build_system_profile_from_normalized` silently swallows `DetachedInstanceError` for both static and dynamic system profile relationships; consider at least logging at debug level so unexpected lazy-loading issues don’t get completely hidden.
- _map_host_type_for_backward_compatibility` maps `None` and any non-`edge` value to an empty string; if the intent is to distinguish between “unknown” and “non-edge” host types in headers, consider documenting this explicitly or adjusting the mapping to preserve `None` where appropriate.

## Individual Comments

### Comment 1
<location> `app/queue/events.py:146-154` </location>
<code_context>
         "type": event_type.name,
         "id": host.id,
-        **serialize_canonical_facts(host.canonical_facts),
+        "insights_id": host.insights_id,
+        "subscription_manager_id": host.subscription_manager_id,
+        "satellite_id": host.satellite_id,
+        "fqdn": host.fqdn,
+        "bios_uuid": host.bios_uuid,
+        "ip_addresses": host.ip_addresses,
+        "mac_addresses": host.mac_addresses,
+        "provider_id": host.provider_id,
+        "provider_type": host.provider_type,
         "org_id": host.org_id,
         "account": host.account,
</code_context>

<issue_to_address>
**issue (bug_risk):** Canonical fact UUIDs in delete events are no longer serialized to strings

`host_delete_event` used to rely on `serialize_canonical_facts`, which converted UUID-like canonical facts to strings. With this change, `insights_id`, `subscription_manager_id`, and possibly `provider_id` may now be `uuid.UUID` instances, which can break JSON serialization or consumers that expect strings.

To keep behavior consistent with `serialize_host` and `serialize_uuid`, wrap any UUID-typed canonical facts in `serialize_uuid(...)` before adding them to the event payload.
</issue_to_address>

### Comment 2
<location> `app/serialization.py:144` </location>
<code_context>
         del serialized_host[field_name]


+def _build_system_profile_from_normalized(host, system_profile_fields=None) -> dict:
+    """
+    Build system profile dict from static and dynamic tables.
</code_context>

<issue_to_address>
**issue (complexity):** Consider refactoring `_build_system_profile_from_normalized` to use shared helpers and module-level constants to avoid duplicated loops and error handling.

You can keep all the new behavior but significantly reduce duplication in `_build_system_profile_from_normalized` by:

1. Lifting the per-call constants to module scope.
2. Using a single traversal helper with per-field transforms.
3. Sharing a single `try/except DetachedInstanceError` per relationship via that helper.

### Example refactor

```python
# module-level constants
UUID_FIELDS = {"owner_id", "rhc_client_id", "rhc_config_state", "virtual_host_uuid"}
DATETIME_FIELDS = {"captured_date", "last_boot_time"}
EXCLUDE_PROFILE_FIELDS = {"org_id", "host_id"}


def _add_profile_fields(model, system_profile, system_profile_fields, transforms) -> None:
    if not model:
        return

    for column in model.__table__.columns:
        name = column.name

        if name in EXCLUDE_PROFILE_FIELDS:
            continue
        if system_profile_fields and name not in system_profile_fields:
            continue

        value = getattr(model, name, None)
        if value is None:
            continue

        transform = transforms.get(name)
        system_profile[name] = transform(value) if transform else value
```

Then `_build_system_profile_from_normalized` becomes:

```python
def _build_system_profile_from_normalized(host, system_profile_fields=None) -> dict:
    system_profile: dict = {}

    static_transforms = {name: serialize_uuid for name in UUID_FIELDS}
    dynamic_transforms = {name: _serialize_datetime for name in DATETIME_FIELDS}

    try:
        _add_profile_fields(host.static_system_profile, system_profile, system_profile_fields, static_transforms)
    except DetachedInstanceError:
        pass  # relationship not loaded - skip

    try:
        _add_profile_fields(host.dynamic_system_profile, system_profile, system_profile_fields, dynamic_transforms)
    except DetachedInstanceError:
        pass  # relationship not loaded - skip

    return system_profile
```

Benefits:

- Removes the two duplicated loops and duplicated `DetachedInstanceError` handling.
- Constants are created once and are clearly visible at module scope.
- UUID/datetime handling is explicit and centralized via the `transforms` dict while preserving the current field-based behavior.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +146 to +154
"insights_id": host.insights_id,
"subscription_manager_id": host.subscription_manager_id,
"satellite_id": host.satellite_id,
"fqdn": host.fqdn,
"bios_uuid": host.bios_uuid,
"ip_addresses": host.ip_addresses,
"mac_addresses": host.mac_addresses,
"provider_id": host.provider_id,
"provider_type": host.provider_type,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Canonical fact UUIDs in delete events are no longer serialized to strings

host_delete_event used to rely on serialize_canonical_facts, which converted UUID-like canonical facts to strings. With this change, insights_id, subscription_manager_id, and possibly provider_id may now be uuid.UUID instances, which can break JSON serialization or consumers that expect strings.

To keep behavior consistent with serialize_host and serialize_uuid, wrap any UUID-typed canonical facts in serialize_uuid(...) before adding them to the event payload.

del serialized_host[field_name]


def _build_system_profile_from_normalized(host, system_profile_fields=None) -> dict:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider refactoring _build_system_profile_from_normalized to use shared helpers and module-level constants to avoid duplicated loops and error handling.

You can keep all the new behavior but significantly reduce duplication in _build_system_profile_from_normalized by:

  1. Lifting the per-call constants to module scope.
  2. Using a single traversal helper with per-field transforms.
  3. Sharing a single try/except DetachedInstanceError per relationship via that helper.

Example refactor

# module-level constants
UUID_FIELDS = {"owner_id", "rhc_client_id", "rhc_config_state", "virtual_host_uuid"}
DATETIME_FIELDS = {"captured_date", "last_boot_time"}
EXCLUDE_PROFILE_FIELDS = {"org_id", "host_id"}


def _add_profile_fields(model, system_profile, system_profile_fields, transforms) -> None:
    if not model:
        return

    for column in model.__table__.columns:
        name = column.name

        if name in EXCLUDE_PROFILE_FIELDS:
            continue
        if system_profile_fields and name not in system_profile_fields:
            continue

        value = getattr(model, name, None)
        if value is None:
            continue

        transform = transforms.get(name)
        system_profile[name] = transform(value) if transform else value

Then _build_system_profile_from_normalized becomes:

def _build_system_profile_from_normalized(host, system_profile_fields=None) -> dict:
    system_profile: dict = {}

    static_transforms = {name: serialize_uuid for name in UUID_FIELDS}
    dynamic_transforms = {name: _serialize_datetime for name in DATETIME_FIELDS}

    try:
        _add_profile_fields(host.static_system_profile, system_profile, system_profile_fields, static_transforms)
    except DetachedInstanceError:
        pass  # relationship not loaded - skip

    try:
        _add_profile_fields(host.dynamic_system_profile, system_profile, system_profile_fields, dynamic_transforms)
    except DetachedInstanceError:
        pass  # relationship not loaded - skip

    return system_profile

Benefits:

  • Removes the two duplicated loops and duplicated DetachedInstanceError handling.
  • Constants are created once and are clearly visible at module scope.
  • UUID/datetime handling is explicit and centralized via the transforms dict while preserving the current field-based behavior.

@jpramos123
Copy link
Contributor Author

/retest

@jpramos123 jpramos123 marked this pull request as draft December 12, 2025 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant