perf(increment-approve): optimize OBS source report processing + development group filtering#495

Open
okurz wants to merge 2 commits into openSUSE:master from okurz:feature/051_poo198728_klp_improve_runtime

okurz (Member) commented Apr 16, 2026

okurz marked this pull request as draft April 16, 2026 14:24

codecov bot commented Apr 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (f2d9742) to head (77d13b1).

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #495   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           97        97           
  Lines         9690      9703   +13     
  Branches       514       515    +1     
=========================================
+ Hits          9690      9703   +13     


okurz force-pushed the feature/051_poo198728_klp_improve_runtime branch from 3d0946b to faa2124 April 16, 2026 19:49
okurz changed the title from "PART 2: perf(increment-approve): optimize OBS source report processing + development group filtering - After #480" to "perf(increment-approve): optimize OBS source report processing + development group filtering" Apr 16, 2026
okurz marked this pull request as ready for review April 16, 2026 19:49
okurz force-pushed the feature/051_poo198728_klp_improve_runtime branch from faa2124 to f0320c6 April 16, 2026 20:04
asmorodskyi (Contributor) left a comment

I would just like to raise a flag: multi-threading is quite an expensive thing that brings its own set of problems to the code, and we should always think hard about other ways to achieve the same goal. From what I understand, the only reason we are switching to multi-threading is that @okurz finds it annoying to wait 10 minutes for a single run during the development phase, which I find quite questionable reasoning. Multi-threading, for example, makes debugging this code challenging.

My blocking does not mean a hard block, but I would like to raise a discussion and gather more opinions before we actually dive into this river ...

okurz (Member, Author) commented Apr 17, 2026

> I would just like to raise a flag: multi-threading is quite an expensive thing that brings its own set of problems to the code

Multi-threading is not expensive. Here the code naturally maps out what we want to do: process all data points, regardless of order or any other dependency.

By the way, the two commits mostly optimize by caching and a more efficient lookup. The multi-threading is only secondary and follows the concept of futures already used in other locations in qem-bot.

okurz added 2 commits April 20, 2026 10:49
Motivation:
The script performed a sequential openQA API request (get_single_job) for
every individual job to check if it belonged to a development group. This
caused hundreds of redundant HTTP requests.

Design Choices:
Modified _filter_jobs to use the group_id already present in the job_stats
response. This allows checking the development group status once per group
instead of once per job ID, utilizing openQA's job grouping. Removed the
now-unused is_in_devel_group(job_id) helper.

Benefits:
Reduced execution time by approximately 80 seconds (over 50% improvement).
Significantly reduced load on the openQA API.

Related issue: https://progress.opensuse.org/issues/198728
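
The per-group check described in this commit message could look roughly like the sketch below. It is not the code from this PR: the job dictionary shape, `filter_non_devel_jobs`, and the `is_devel_group(group_id)` callback are hypothetical names chosen for illustration. The only idea taken from the commit message is that the development-group status is resolved once per `group_id` instead of once per job.

```python
from typing import Callable, Iterable


def filter_non_devel_jobs(
    jobs: Iterable[dict],
    is_devel_group: Callable[[int], bool],
) -> list[dict]:
    """Drop jobs that belong to a development group.

    The (assumed expensive) is_devel_group lookup runs once per distinct
    group_id; later jobs of the same group reuse the remembered answer.
    """
    devel_by_group: dict[int, bool] = {}
    kept = []
    for job in jobs:
        group_id = job["group_id"]  # assumed to be present in each job entry
        if group_id not in devel_by_group:
            devel_by_group[group_id] = is_devel_group(group_id)
        if not devel_by_group[group_id]:
            kept.append(job)
    return kept


# Stand-in for a remote lookup; it records every call so the saving is visible.
lookup_calls: list[int] = []


def fake_is_devel_group(group_id: int) -> bool:
    lookup_calls.append(group_id)
    return group_id == 7  # pretend group 7 is a development group


example_jobs = [
    {"job_id": 1, "group_id": 7},
    {"job_id": 2, "group_id": 7},
    {"job_id": 3, "group_id": 9},
    {"job_id": 4, "group_id": 9},
    {"job_id": 5, "group_id": 7},
]

kept_jobs = filter_non_devel_jobs(example_jobs, fake_is_devel_group)
print([job["job_id"] for job in kept_jobs], len(lookup_calls))
```

With five jobs spread over two groups, the stand-in lookup is called twice rather than five times, which is where the reduction in HTTP requests would come from under these assumptions.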
Motivation:
Processing multiple OBS actions for a request involved redundant repository
lookups and sequential loading of package data from source reports, leading
to high execution times for requests with many actions.

Design Choices:
1. Added caching for project repository lookups via get_repos_of_project.
2. Parallelized the loading of packages from source reports for all actions
   using ThreadPoolExecutor and thread-safe updates to the package lists.

Benefits:
Reduced runtime for package diffing in large requests (e.g., 16 actions)
by approximately 34 seconds.

Related issue: https://progress.opensuse.org/issues/198728
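
The two design choices above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the body of `get_repos_of_project` here is a stand-in, and `load_packages`, `load_all_packages`, and the action dictionary shape are hypothetical. It shows a cached repository lookup combined with a `ThreadPoolExecutor` whose workers append to a shared list under a lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=None)
def get_repos_of_project(project: str) -> tuple[str, ...]:
    # Stand-in for a remote OBS lookup; the real function is not shown here.
    return (f"{project}/standard",)


def load_packages(action: dict) -> list[str]:
    # Hypothetical loader: one entry per repository of the action's project.
    repos = get_repos_of_project(action["project"])
    return [f"{action['package']}@{repo}" for repo in repos]


def load_all_packages(actions: list[dict], max_workers: int = 4) -> list[str]:
    packages: list[str] = []
    lock = threading.Lock()

    def worker(action: dict) -> None:
        loaded = load_packages(action)
        with lock:  # guard the shared list against concurrent updates
            packages.extend(loaded)

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # list() drains the iterator so exceptions raised in workers surface here
        list(executor.map(worker, actions))
    return packages


example_actions = [
    {"project": "ProjA", "package": "kernel"},
    {"project": "ProjA", "package": "glibc"},
    {"project": "ProjB", "package": "bash"},
]

all_packages = load_all_packages(example_actions)
print(sorted(all_packages))
```

The order of the resulting list depends on thread scheduling, so callers that need a stable order should sort it. Passing `max_workers=1` runs the same code path serially, which may help when stepping through it in a debugger.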
okurz force-pushed the feature/051_poo198728_klp_improve_runtime branch from f0320c6 to 77d13b1 April 20, 2026 08:50