
Conversation

@gustavobtflores (Contributor) commented Nov 13, 2025

Objective

Implement hardware status denormalization to improve query performance for hardware listing data

Problem Solved

The current hardware listing endpoint suffers from performance issues due to complex joins across multiple tables (checkouts, tests, builds). This PR introduces a denormalized approach to aggregate hardware status data.

Key Changes

  • Added 4 new models (HardwareStatus, LatestCheckout, PendingTest, ProcessedHardwareStatus) with appropriate indexes
  • Integrated aggregation logic into the kcidbng_ingester to populate the checkouts and pending tables during data ingestion
  • Created process_pending_aggregations management command for batch processing of pending tests with configurable batch size and loop mode
  • Added get_hardware_listing_data_from_status_table function to query the denormalized data instead of using complex joins (see the sketch after this list)
  • Modified hardwareView to leverage the new denormalized table for improved response times, behind an optional "feature flag"
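For illustration only, here is a minimal sketch of what a query against the denormalized table can look like; the import path and the pre-aggregated counter column (pass_tests) are assumptions, not the actual implementation of get_hardware_listing_data_from_status_table:

from django.db.models import Count, Sum

from kernelCI_app.models import HardwareStatus  # assumed import path


def hardware_listing_sketch(start_time, end_time):
    """Aggregate the pre-computed hardware_status rows per (origin, platform)."""
    return (
        HardwareStatus.objects.filter(start_time__range=(start_time, end_time))
        .values("origin", "platform")  # grouping key
        .annotate(
            checkouts=Count("checkout_id", distinct=True),
            total_pass=Sum("pass_tests"),  # hypothetical pre-aggregated counter
        )
    )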

How to Test

1. Run Database Migration

cd backend
poetry run python manage.py migrate

2. Process Test Submissions (Optional - populate test data)

Extract submissions.zip to a folder and run the ingester:

poetry run python manage.py monitor_submissions --spool-dir ./{folder-name}

3. Test Backlog Processing Command

Single batch processing:

poetry run python manage.py process_pending_aggregations --batch-size 1000

Continuous loop mode:

poetry run python manage.py process_pending_aggregations --loop --interval 60 --batch-size 1000

4. Verify Hardware Status Data

Check that the hardware_status table is populated in PostgreSQL, using psql or any other client that can connect to the database.
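Alternatively, a quick sanity check from a Django shell (poetry run python manage.py shell); the models' import path below is an assumption:

from kernelCI_app.models import HardwareStatus, PendingTest  # assumed import path

print("hardware_status rows:", HardwareStatus.objects.count())
print("pending tests remaining:", PendingTest.objects.count())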

5. Test Hardware Listing Endpoint

Query the hardware listing endpoint and verify improved performance and correct data aggregation.

@gustavobtflores force-pushed the feat/hardware-denormalization branch from ae2fc62 to 0175d2c on November 17, 2025 at 21:16
origin = models.CharField(max_length=100)
platform = models.CharField(max_length=100)
compatibles = ArrayField(models.TextField(), null=True)
start_time = models.IntegerField()


since we don't have buckets anymore, keep this as a datetime field


class Meta:
    db_table = "hardware_status"
    unique_together = ("checkout_id", "origin", "platform")


I'd have origin, platform first so index lookups that only reach them will work.

also, we need another index on origin, platform, start_time... since we also query based on those.
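A hedged sketch of how the quoted model could look with the suggestions above applied (datetime start_time, origin/platform leading the unique constraint, and the extra index); the status counter columns are omitted and the checkout_id field type is assumed:

from django.contrib.postgres.fields import ArrayField
from django.db import models


class HardwareStatus(models.Model):
    origin = models.CharField(max_length=100)
    platform = models.CharField(max_length=100)
    compatibles = ArrayField(models.TextField(), null=True)
    start_time = models.DateTimeField()  # datetime instead of integer buckets
    checkout_id = models.TextField()  # assumed type

    class Meta:
        db_table = "hardware_status"
        # origin/platform first so lookups that only touch them can use the index
        unique_together = ("origin", "platform", "checkout_id")
        indexes = [
            # covers queries filtered by origin/platform and a start_time range
            models.Index(fields=["origin", "platform", "start_time"]),
        ]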



class LatestCheckout(models.Model):
    checkout_id = models.TextField(primary_key=True)


nit: group the unique together fields, then the other fields (checkout_id and start_time)


also not likely, but checkout_id is not a PK, the same commit can be in multiple trees...

Collaborator

checkout_id is not a PK, the same commit can be in multiple trees

checkout_id is always a single row in the checkouts table, so the same checkout_id can't come from multiple trees. We can, however, have multiple checkouts for the same tree, and that is the point of the LatestCheckout table: to only store the latest checkout for each tree
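For context, a sketch of the one-row-per-tree shape described above; the field names are assumed from the commit message ("track latest checkout per origin/tree/repository") and are not necessarily the actual model:

from django.db import models


class LatestCheckout(models.Model):
    origin = models.CharField(max_length=100)  # assumed field names/types
    tree_name = models.TextField()
    git_repository_url = models.TextField()
    checkout_id = models.TextField()  # latest checkout seen for this tree
    start_time = models.DateTimeField()

    class Meta:
        db_table = "latest_checkout"  # assumed table name
        unique_together = ("origin", "tree_name", "git_repository_url")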



class PendingBuild(models.Model):
    id = models.TextField(primary_key=True)


what's this id? i guess it's build_id... so name it accordingly to make it easier to correlate in the code.

I'm also wondering whether we have guarantees this won't conflict across trees/origins...



class PendingTest(models.Model):
    id = models.TextField(primary_key=True)


ditto, i guess it's test_id

id = models.TextField(primary_key=True)
checkout_id = models.TextField()
status = models.CharField(
    max_length=10, choices=StatusChoices.choices, blank=True, null=True


can this really be null or blank?

also, maybe we convert this into the subset we care about (pass/fail/inc). It doesn't need to be the same choices, we can use a reduced subset with single-letter DB storage (1 byte instead of a string)
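A sketch of what the reduced, single-character choice set could look like (the stored values and labels here are illustrative; the PR's actual SimplifiedStatusChoices may differ):

from django.db import models


class SimplifiedStatusChoices(models.TextChoices):
    # Single-character DB values instead of the full status strings.
    PASS = "P", "Pass"
    FAIL = "F", "Fail"
    INCONCLUSIVE = "I", "Inconclusive"


# On the model, the field would then shrink to:
# status = models.CharField(max_length=1, choices=SimplifiedStatusChoices.choices)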

compatible = ArrayField(models.TextField(), null=True)
build_id = models.TextField()
status = models.CharField(
    max_length=10, choices=StatusChoices.choices, blank=True, null=True


ditto

@MarceloRobert added the enhancement (New feature or request) and Database (Issue that alters only configs of a database itself) labels Nov 18, 2025
@gustavobtflores force-pushed the feat/hardware-denormalization branch 7 times, most recently from 547b86d to c36d84d on November 21, 2025 at 13:57
@gustavobtflores changed the title from "wip: hardware denormalization for hardware listing" to "feat: hardware denormalization for hardware listing" Nov 21, 2025
@gustavobtflores marked this pull request as ready for review on November 21, 2025 at 14:06
@gustavobtflores force-pushed the feat/hardware-denormalization branch from c36d84d to f428a90 on November 21, 2025 at 14:10
@gustavobtflores force-pushed the feat/hardware-denormalization branch 2 times, most recently from 63eb9f7 to aee6f18 on November 21, 2025 at 16:52
Comment on lines 38 to 45
valid_checkout_ids = set(
    LatestCheckout.objects.values_list("checkout_id", flat=True)
)

orphaned_entries = HardwareStatus.objects.exclude(
    checkout_id__in=valid_checkout_ids
)
orphaned_count = orphaned_entries.count()


this must be inside the transaction

)

total_deleted = 0
while True:


why?


ah, because of the slice below

"hardware_key",
"entity_id",
"entity_type",
blank=True,


blank=true?

Contributor Author

the default for a composite primary key is blank=True and apparently it cannot be changed to False; I haven't found out exactly why yet

https://github.com/django/django/blob/main/django/db/models/fields/composite.py#L53-L73


answering myself:

https://github.com/django/django/blob/stable/5.2.x/django/db/models/fields/composite.py#L50

        if not kwargs.setdefault("blank", True):
            raise ValueError("CompositePrimaryKey must be blank.")

Comment on lines 54 to 56
if status not in simplified_status_to_count:
    return simplified_status_to_count[SimplifiedStatusChoices.INCONCLUSIVE]
return simplified_status_to_count[status]


how, if all statuses are in there? But anyway, it's better to use .get(status, SimplifiedStatusChoices.INCONCLUSIVE) than to do a double lookup
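i.e., something like the following, keeping in mind the default still has to be the mapped counter rather than the enum member itself (a sketch based on the quoted lines):

return simplified_status_to_count.get(
    status, simplified_status_to_count[SimplifiedStatusChoices.INCONCLUSIVE]
)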

Comment on lines 125 to 127
is_already_processed = to_process in existing_processed

if is_already_processed:


why assign to variable?

Contributor Author

added just for readability, but yeah, to_process in existing_processed already sounds readable enough


while True:
    with transaction.atomic():
        batch_ids = list(
            orphaned_entries.values_list("checkout_id", flat=True)[:batch_size]


you can specify the values_list("checkout_id", flat=True) when you create orphaned_entries; here you would just slice it
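A sketch of that restructuring, with variable names adapted from the quoted code (the deletion itself is elided):

# Build the id-only queryset once; each loop iteration only slices it.
orphaned_checkout_ids = HardwareStatus.objects.exclude(
    checkout_id__in=valid_checkout_ids
).values_list("checkout_id", flat=True)

while True:
    with transaction.atomic():
        batch_ids = list(orphaned_checkout_ids[:batch_size])
        if not batch_ids:
            break
        # ...delete the HardwareStatus rows for batch_ids here...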

test, h_key, status_record, existing_processed, new_processed_entries
)

_process_build_status(


only process the build if the test was processed (return a boolean from _process_test_status and check it before this line)
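A sketch of the suggested control flow (the argument lists are taken from the surrounding quotes and partially assumed):

# _process_test_status returns True only when the test was actually processed.
test_was_processed = _process_test_status(
    test, h_key, status_record, existing_processed, new_processed_entries
)

if test_was_processed:
    _process_build_status(
        build, h_key, status_record, existing_processed, new_processed_entries
    )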

Comment on lines 212 to 213
build,
h_key,


note that if we include test.id, type=TEST in the hw_key above, we'll need to compute a new hw_key for the build, ok?

Collaborator

Please create an issue for adding unit tests for this and the other functions added in this PR just so that we don't forget




Comment on lines +81 to +86
if len(result.hardware) < 1:
    return create_api_error_response(
        error_message=ClientStrings.NO_HARDWARE_FOUND,
        status_code=HTTPStatus.OK,
    )
Collaborator

why don't you check for this len < 1 (btw, why not len == 0?) on hardwares_raw instead of letting them go through the model validation? If there is no hardware, I see that there won't be any item to validate, but it seems out of order to me

Contributor Author

just followed the current approach on hardwareView, but you're right, we can check it earlier

@gustavobtflores force-pushed the feat/hardware-denormalization branch from acd80d1 to eb0fbff on November 26, 2025 at 22:49
Comment on lines 98 to 99
h_key = get_hardware_key(test.origin, test.platform, checkout.id)
contexts.append((test, build, checkout, h_key))


when we move to including entity_id/type in get_hardware_key(), we must generate 2 h_keys, one for the test and another for the build, and report both in contexts and keys_to_check
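A sketch of what that could look like, assuming get_hardware_key() grows entity_id/entity_type parameters (see the sketch near the end of this review) and that single-character entity types are used; both assumptions are illustrative:

test_h_key = get_hardware_key(test.origin, test.platform, checkout.id, test.test_id, "T")
build_h_key = get_hardware_key(test.origin, test.platform, checkout.id, build.id, "B")
contexts.append((test, build, checkout, test_h_key, build_h_key))
keys_to_check.update((test_h_key, build_h_key))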


def _get_existing_processed(keys_to_check: set[bytes]) -> set[ProcessedHardwareStatus]:
    """Fetch existing processed entries from the database."""
    return set(ProcessedHardwareStatus.objects.filter(hardware_key__in=keys_to_check))


when we move hardware_key to include entity_id/type it will be a PK, so we can change this line as well and use .values_list("hardware_key", flat=True)

Comment on lines 118 to 127
to_process = ProcessedHardwareStatus(
hardware_key=h_key,
entity_id=test.test_id,
entity_type=HardwareStatusEntityType.TEST,
)


when we move hardware_key to include entity_id/type, then test_h_key is already what we'll check in existing_processed; no need to create the model instance here.

Comment on lines 160 to 171
to_process = ProcessedHardwareStatus(
    hardware_key=h_key,
    entity_id=build.id,
    entity_type=HardwareStatusEntityType.BUILD,
)

if to_process in existing_processed:
    return


when we move hardware_key to include entity_id/type, then build_h_key is already what we'll check in existing_processed; no need to create the model instance here.


existing_processed = _get_existing_processed(keys_to_check)

for test, build, checkout, h_key in contexts:


here you will get test_h_key and build_h_key, then be careful when calling _process_*_status()

def process_pending_batch(self, batch_size: int) -> int:
    last_processed_test_id = None

    while True:


i think we should hold a transaction inside the loop... or outside (more correct), but that may hold the lock for too long, need to check how it behaves with real data.

Contributor Author

i think it is safer to keep the transaction inside the loop; with that we don't lose time reprocessing the previous batches if one of them fails to commit
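A sketch of that per-batch transaction shape as a method; _process_one_batch is a hypothetical helper standing in for the existing batch logic:

from django.db import transaction


def process_pending_batch(self, batch_size: int) -> int:
    total_processed = 0
    while True:
        # One transaction per batch: if a batch fails to commit, the
        # previously committed batches are not lost or reprocessed.
        with transaction.atomic():
            processed = self._process_one_batch(batch_size)  # hypothetical helper
        if processed == 0:
            break
        total_processed += processed
    return total_processed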

    ready_tests: Sequence[PendingTest],
    ready_builds: dict[str, Builds],
) -> int:
    with transaction.atomic():


then this transaction is in process_pending_batch and not here

Comment on lines 321 to 332
pk = models.CompositePrimaryKey(
    "hardware_key",
    "entity_id",
    "entity_type",
)
hardware_key = models.BinaryField(
    max_length=32
)  # this holds a sha256, thus digest_size = 32 bytes
entity_id = models.TextField()
entity_type = models.CharField(
    max_length=1, choices=HardwareStatusEntityType.choices
)
@barbieri Nov 27, 2025

we should include entity_id/type inside the hardware_key. Currently the lookup for hardware_key will return all entities ever processed for that hardware; as the system grows and we never delete ProcessedHardwareStatus entries, the set will be large and this will hurt performance (and memory usage).

maybe we should keep a checkout_id field, and from a cron job remove the entries that are not related to the latest checkouts, same as we did with delete_unused_hardware_... (or even do it from inside that job)
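A sketch of that cleanup, assuming ProcessedHardwareStatus gains the suggested checkout_id column (the import path and function name are assumptions):

from django.db import transaction

from kernelCI_app.models import LatestCheckout, ProcessedHardwareStatus  # assumed path


def delete_stale_processed_entries() -> int:
    """Remove processed-entity rows whose checkout is no longer a latest checkout."""
    latest_ids = LatestCheckout.objects.values_list("checkout_id", flat=True)
    with transaction.atomic():
        deleted, _ = ProcessedHardwareStatus.objects.exclude(
            checkout_id__in=latest_ids  # requires the suggested checkout_id field
        ).delete()
    return deleted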

checkout_id: str
test_origin: str
platform: str
compatibles: Optional[str]


isn't this an array of strings?

Comment on lines 39 to 41
def get_hardware_key(origin: str, platform: str, checkout_id: str) -> bytes:
    """Generate a hash (hardware key) from origin, platform, and checkout ID."""
    return hashlib.sha256(f"{origin}|{platform}|{checkout_id}".encode("utf-8")).digest()


add entity_id/type
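A sketch of the extended helper, based on the quoted function; the parameter order and key layout are illustrative:

import hashlib


def get_hardware_key(
    origin: str, platform: str, checkout_id: str, entity_id: str, entity_type: str
) -> bytes:
    """Generate a hash (hardware key) that also distinguishes the processed entity."""
    return hashlib.sha256(
        f"{origin}|{platform}|{checkout_id}|{entity_type}|{entity_id}".encode("utf-8")
    ).digest()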

@gustavobtflores force-pushed the feat/hardware-denormalization branch from eb0fbff to c95ea60 on November 27, 2025 at 18:05
- Add HardwareStatus model to store aggregated test/build/boot status by platform
- Add LatestCheckout model to track latest checkout per origin/tree/repository
- Add PendingTest model to queue tests for aggregation processing
- Add ProcessedHardwareStatus model to track processed entities
- Add SimplifiedStatusChoices enum for aggregated status values
- Create database migration for new tables with appropriate indexes
- Add aggregation_helpers module with checkout and test aggregation logic
- Integrate aggregation into kcidbng_ingester db_worker
- Populate LatestCheckout table with checkout tracking
- Populate PendingTest table with test records for processing
- Run aggregations before buffer flush to ensure data consistency
- Add process_pending_aggregations management command for backlog processing
- Implement batch processing with configurable batch size and loop mode
- Add hardware status aggregation from PendingTest queue
- Add get_hardware_listing_data_from_status_table query function
- Query denormalized hardware_status table for improved performance
Collaborator

nit: rename migration to a more meaningful name
