Conversation

@m-blaha (Member) commented Nov 4, 2025

In addition to pagination slicing, the CoprBuildsList class uses Python code to filter out Copr builds that are waiting for an SRPM or whose SRPM build failed. As a result, the API can in some cases return fewer items than the user requested.

Moving the filter into SQL resolves the problem. The SQL filter relies on the fact that the build_id field is NULL until the build is actually created by submitting it to Copr.
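For illustration only (the actual diff is not shown in this conversation), the SQL-level filter described above could be expressed roughly like this in SQLAlchemy. CoprBuildTargetModel and build_id come from the code excerpt discussed later in the review; session, first, and last are assumed to be the existing database session and pagination bounds:

# Hedged sketch, not packit's actual code: exclude rows whose Copr build was
# never submitted (build_id IS NULL) before slicing, so the database itself
# returns a full page of "real" builds.
query = (
    session.query(CoprBuildTargetModel)
    .filter(CoprBuildTargetModel.build_id.isnot(None))  # waiting-for-SRPM / failed-SRPM rows have no build_id
    .order_by(CoprBuildTargetModel.build_id.desc())
    .offset(first)                  # assuming first/last are the inclusive slice bounds of the API page
    .limit(last - first + 1)
)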

Resolves: #2505

@gemini-code-assist (Contributor)

Summary of Changes

Hello @m-blaha, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the data retrieval process for Copr build lists by migrating the filtering of incomplete builds from application-level Python code to the database query. This enhancement ensures that the API consistently delivers the expected number of results and improves overall performance by optimizing data processing at its source.

Highlights

  • SQL-based Filtering: The filtering logic for Copr builds that are waiting for an SRPM or whose SRPM build failed has been moved from Python code to the SQL query. This is achieved by filtering out records where the build_id is NULL.
  • Improved API Efficiency: By performing the filtering at the database level, the API becomes more efficient as it no longer needs to fetch and then discard unwanted records in Python.
  • Accurate Pagination: This change resolves an issue where the API could return fewer items than requested due to post-query filtering, ensuring that pagination slicing works as expected (a small illustration follows this list).
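To make the pagination point concrete, here is a tiny self-contained Python illustration with made-up numbers (not packit code): filtering after the database slice shortens the page, while filtering in SQL keeps the slice full-sized.

# Made-up page of 20 rows, 3 of which have no Copr build yet (build_id is None).
rows = [{"build_id": None if i in (0, 7, 14) else i} for i in range(20)]

# Old behaviour: slice first (done by the database), then filter in Python.
old_page = [r for r in rows if r["build_id"] is not None]
print(len(old_page))  # 17, although the client asked for 20 items

# New behaviour: rows with build_id IS NULL never reach the page at all,
# so the slice itself already contains the 20 requested items.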

@m-blaha force-pushed the filter-out-no-source branch from ab917f1 to fe6327e on November 4, 2025, 12:22
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request correctly moves the filtering of Copr builds from Python code into the SQL query. This is a good change that resolves an issue with pagination and improves efficiency. The implementation is clean and the reasoning is sound. I have one suggestion for a further performance improvement to address a potential N+1 query problem in the API endpoint, which would make this part of the code even more efficient.

Comment on lines 35 to 37
for build in CoprBuildTargetModel.get_merged_chroots(first, last):
    build_info = CoprBuildTargetModel.get_by_build_id(build.build_id, None)
    if build_info.status == BuildStatus.waiting_for_srpm:
        continue
    if (
        build_info.status == BuildStatus.failure
        and not build_info.build_start_time
        and not build_info.build_logs_url
    ):
        # SRPM build failed, it doesn't make sense to list this build
        continue
    project_info = build_info.get_project()
@gemini-code-assist bot (Contributor)

medium

This loop currently causes an N+1 query problem. For each build returned by get_merged_chroots, you're making at least two more database queries: get_by_build_id and then get_project (which itself can trigger multiple lazy-loads). This can lead to significant performance degradation, especially with a large number of builds.

To resolve this, I recommend modifying CoprBuildTargetModel.get_merged_chroots to fetch all the necessary information in a single query by using joins and returning all required fields. This would eliminate the need for extra queries inside the loop.

For example, you could extend the query in get_merged_chroots to join with CoprBuildGroupModel, PipelineModel, ProjectEventModel, and GitProjectModel and retrieve fields like project_name, build_submitted_time, web_url, commit_sha, and the project details. Since the query groups results by build, you would need to apply an aggregate function (such as min or max) to these additional fields; the values are identical for all chroots of a given build, so the choice of aggregate does not matter.

This would make the API endpoint much more performant.
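A very rough sketch of the kind of joined, aggregated query suggested here is below. Only the model class names are taken from the comment above; the foreign-key columns (copr_build_group_id, pipeline_id, project_event_id, project_id), the join paths, and the selected project columns are hypothetical placeholders that do not necessarily match packit's real schema:

from sqlalchemy import func

# Hypothetical sketch of the suggestion, not packit's actual query: pull the
# per-build fields into the same grouped query so the endpoint loop needs no
# extra get_by_build_id() / get_project() round trips per build.
query = (
    session.query(
        CoprBuildTargetModel.build_id,
        func.max(CoprBuildTargetModel.build_submitted_time).label("build_submitted_time"),
        func.max(CoprBuildTargetModel.web_url).label("web_url"),
        func.max(GitProjectModel.namespace).label("namespace"),    # hypothetical columns
        func.max(GitProjectModel.repo_name).label("repo_name"),
    )
    .join(CoprBuildGroupModel, CoprBuildTargetModel.copr_build_group_id == CoprBuildGroupModel.id)
    .join(PipelineModel, PipelineModel.copr_build_group_id == CoprBuildGroupModel.id)
    .join(ProjectEventModel, ProjectEventModel.id == PipelineModel.project_event_id)
    .join(GitProjectModel, GitProjectModel.id == ProjectEventModel.project_id)
    .filter(CoprBuildTargetModel.build_id.isnot(None))
    .group_by(CoprBuildTargetModel.build_id)
    .order_by(CoprBuildTargetModel.build_id.desc())
    .offset(first)
    .limit(last - first + 1)
)
# The aggregates (max) are only there to satisfy GROUP BY; the values are the
# same for every chroot of a given build, as noted in the comment above.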

@m-blaha (Member, Author)

This makes sense, but it is out of scope here. If we want to optimize this code, it should be tracked as a separate issue.

Development

Successfully merging this pull request may close these issues.

Respect number of items requested for API responses
