3 changes: 0 additions & 3 deletions research/automation-tools/rdopkg.md
@@ -36,13 +36,11 @@ authors: flachman
## Actions

- [fix](https://github.com/softwarefactory-project/rdopkg/blob/master/doc/rdopkg.1.adoc#action-fix) -- Apply changes to the `.spec` file.

1. Bump Release, prepare a new `%changelog` entry header.
2. Drop to shell, let user edit the `.spec` file.
3. After running `rdopkg`, ensure description was added to `%changelog` and commit changes in a new commit.

- [patch](https://github.com/softwarefactory-project/rdopkg/blob/master/doc/rdopkg.1.adoc#action-patch) -- Introduce new patches to the package.

1. Unless -l/--local-patches was used, reset the local patches branch to the remote patches branch.
2. Update patch files from local patches branch using git format-patch.
3. Update .spec file with correct patch files references.
@@ -52,7 +50,6 @@ authors: flachman
7. Display the diff.

- [new-version](https://github.com/softwarefactory-project/rdopkg/blob/master/doc/rdopkg.1.adoc#action-new-version) -- Update package to new upstream version.

1. Show changes between the previous version and the current one, especially modifications to requirements.txt.
2. Reset the local patches branch to the remote patches branch
3. Rebase the local patches branch on \$NEW_VERSION tag.
2 changes: 0 additions & 2 deletions research/celery/task-workflow-refactor.md
@@ -41,9 +41,7 @@ Possible solutions, which can be somehow combined:
1. serialize the info about objects and pass it into `send_task`
- this would require serializing and then deserializing again
2. save the info about objects in DB and pass IDs of models into `send_task`

- what models make sense to have? possibilities:

- project
- package config
- service config
2 changes: 0 additions & 2 deletions research/celery/tasks-prioritizing.md
@@ -38,7 +38,6 @@ authors: lbarczio
## How to do it

- from [FAQ](https://docs.celeryproject.org/en/master/faq.html#does-celery-support-task-priorities):

- Redis transport emulates priority support
- prioritize work by routing high-priority tasks to different workers; this usually works better than per-message priorities

@@ -49,7 +48,6 @@ authors: lbarczio
### Task priority

- docs are not so clear:

- [priority](https://docs.celeryproject.org/en/latest/reference/celery.app.task.html?highlight=celery.app.task#celery.app.task.Task.priority)
attribute of the `Task` - default task priority
- [priority](https://docs.celeryproject.org/en/stable/reference/celery.app.task.html?highlight=priority#celery.app.task.Task.apply_async)
3 changes: 0 additions & 3 deletions research/database/refresh.md
@@ -8,19 +8,16 @@ authors:
## Usecases

1. Show the whole workflow to the user.

- It's not clear what it is.

2. For each step, we get:

- previous step
- next steps
- other steps from this group (e.g. other chroots for this build)

3. It is possible to rerun the whole workflow.
4. It is possible to rerun one step (and all the follow-up steps).
5. It is possible to rerun a part of one step (and the follow-up step(s)).

- E.g. one chroot.

6. For project, we get all workflows.
2 changes: 0 additions & 2 deletions research/deployment/deployment-improvements/index.md
@@ -31,7 +31,6 @@ Areas to be covered:
### Installation source

- github

- pros:
- current changes in other projects are always in place - useful especially for stg branch
- cons:
@@ -46,7 +45,6 @@ Areas to be covered:
### Image build approach

- s2i: Source-to-Image (S2I) is a tool for building reproducible, Docker-formatted container images. It produces ready-to-run images by injecting application source into a container image and assembling a new image. The new image incorporates the base image (the builder) and built source and is ready to use with the docker run command. S2I supports incremental builds, which re-use previously downloaded dependencies, previously built artifacts, etc.

- pros:
- separating code and image development - probably an advantage in bigger projects where development and DevOps are separated
- cons:
7 changes: 0 additions & 7 deletions research/deprecation/index.md
@@ -14,7 +14,6 @@ authors: mfocko
Looked into the options suggested by @lachmanfrantisek which were:

- `Deprecated`

- seems like a good choice, offers a decorator that has optional parameters such as version or a custom message
- live GitHub repo
- fast release cycle
@@ -24,30 +23,24 @@ Looked into the options suggested by @lachmanfrantisek which were:
from the docs, all properties are optional; you can add a reason (usually an alternative) or the version in which it was deprecated

- `Python-Deprecated`

- dead version of `Deprecated`, which is probably kept in PyPI just for backward-compatibility

- `deprecationlib`

- seems like a hobby project, only one piece of information in the decorator (an alternative function name)

- `Dandelyon`

- looks like a nice project
- offers multiple decorators
- doesn't seem to be very active

- `deprecate`

- dead project

- `deprecation`

- not very active
- multiple issues

- `libdeprecation`

- dead version of `deprecation`

- `warnings` (built-in module)
1 change: 0 additions & 1 deletion research/integrations/console.md
@@ -44,7 +44,6 @@ the future.
How we can integrate with it?

1. We should use [AWS Launch Templates](https://docs.aws.amazon.com/autoscaling/ec2/userguide/launch-templates.html).

- One cannot configure instance parameters using the provisioning API.
- We also need to be strict about this: we don't want users to willy-nilly
change their instances (e.g. request 64G mem).
2 changes: 0 additions & 2 deletions research/integrations/downstream/index.md
@@ -253,7 +253,6 @@ User can configure a repository where Packit create an issue in case of failed u
There is a recurring question whether the new functionality/job will be incorporated into the existing service, done as a separate deployment, or done from scratch. Let's put down some benefits and problems.

1. New functionality added to the existing service

- The new functionality is added as a new handler (=a separate class).
- If we need to react to a new event, the parsing needs to be implemented.
- The mapping between event and handler is done by decorating the handler and explicitly setting the event we react on.
@@ -262,7 +261,6 @@ There is an occurring question if the new functionality/job will be incorporated
- Since we have one database, we can show some overall status and combine information from upstream part and downstream part (including the propose-downstream job that is somewhere in the middle).

2. New functionality as another deployment

- It's more a variation of the previous one.
- Benefits are independence, being able to have different identities and limits.
- The main downside is the duplicated effort needed to maintain the service and to run the shared part (task scheduler, listeners, API service).
133 changes: 133 additions & 0 deletions research/integrations/fedora-ci/deployment-move.md
@@ -0,0 +1,133 @@
---
title: Decoupling Fedora CI deployment from Packit Service
authors: lbarcziova
---

Related links:

- [issue](https://github.com/packit/packit-service/issues/2737)

This research describes the requirements and a plan to decouple the Fedora CI-specific worker functionality from
the [Packit Service repository](https://github.com/packit/packit-service) and deploy it as a separate, independent service.
This will improve maintainability and scalability, and make the deployment within Fedora infrastructure easier.

## Code

- create the new repo (`https://github.com/packit/fedora-ci`?) with a structure something like this (a small handler/task sketch follows at the end of this list):

> **Member:** I'd probably prefer more explicit, something like `fedora-ci-worker`


```
fedora-ci/
├── fedora_ci/
│ ├── handlers/ # Fedora CI handlers
│ ├── helpers/ # Helper classes like FedoraCIHelper
│ ├── checker/ # Checker classes
│ ├── jobs.py # Job processing logic
│ └── tasks.py # Celery tasks
├── tests/

```

- code migration:
  - identify and move all Fedora CI-related worker functionality from packit-service to the new repository; this concerns jobs that do not depend on the repository having a Packit configuration

> **Member:** I am wondering if there would be some code (I am thinking of the events as an example) that has to be moved to a third repository shared by both packit-service and fedora-ci?

> **Member (author):** Following how we did this in hardly, I would start with having the service code in one repository (i.e. all the events being placed in packit-service), as decoupling that could be more complex.

> **Member:** I am probably missing something here: if the events stay in packit-service, and packit-service will import from fedora-ci, but fedora-ci would need the events from packit-service... I don't think we can make this work?

> **Member (author):**
>
> > and packit-service will import from fedora-ci
>
> Do you mean this just for the "importing" solution for the transition period? That might be a good point. Regarding the transition, I am more inclined to try to minimise the transition time and the code changes.

> **Member:** Yes, I am referring to the transition-time solution. I am not against a quick solution, I just fear we may hit cyclic imports and not be able to solve them quickly.

> **Member (author):** @majamassarini yes, I see your point now and I agree. And it might still not even really be "quick" anyway. Let's discuss more tomorrow.

- set up tests and CI
- create files needed for deployment: `run_worker.sh`, Containerfile, docker-compose file, etc.

> **Member:** Note: Time to move on to podman-compose, I hate the complaints in shell each time I forget to pass COMPOSE

- remove the moved code from the `packit-service` repo
  (this should be done after the code from the new repo is deployed)
- needed changes:
  - once there is a separate dashboard page, change the URL paths in the code
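
To make the proposed repository layout more concrete, here is a minimal, hypothetical sketch of how a handler in `fedora_ci/handlers/` and a Celery task in `fedora_ci/tasks.py` could fit together; the class name, task name, and event shape are all assumptions, not decisions.

```python
from celery import Celery

# Broker/backend configuration would come from the deployment, not from the code.
app = Celery("fedora_ci")


class ScratchBuildHandler:
    """Example Fedora CI handler: reacts to a dist-git pull-request event."""

    def __init__(self, event: dict):
        self.event = event

    def run(self) -> dict:
        # e.g. submit a Koji scratch build and report the result back to the PR
        return {"success": True, "details": f"processed PR {self.event.get('pr_id')}"}


@app.task(name="task.fedora_ci.process_message")
def process_message(event: dict) -> dict:
    # tasks.py stays a thin layer: deserialize the event and delegate to a handler
    return ScratchBuildHandler(event=event).run()
```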

### Implementation changes during the transition period

- changes only to the new repo, without deploying
  - this would mean that for a few weeks the changes wouldn't take effect in the old deployment,
    and might cause some bugs to land when the new deployment happens, which might be harder to investigate (if there are a lot of new changes)
  - the old deployment might continue to run with known bugs
- changes to both repos
  - this involves duplicated work, and might be prone to errors (e.g. forgetting to apply a change to one repo)
  - first changing the code in the old repo, and then applying the same change to the new one
- importing the code from the new repo
  - cleaner transition
  - requires more initial effort and might be more complex to set up
  - how to implement:
    - git submodule - directly link the new repository as a subdirectory in the old one (a minimal sketch of this approach follows after this list)
    - open to other suggestions
  - this might not be easy to do, as it could cause circular imports: the fedora-ci code will need to import events from packit-service

- in any case, we could try to minimize the new features and focus only on bug fixing during this time
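
A minimal, purely illustrative sketch of the "import from the new repo" option; all repository, module, and class names here are assumptions. It also shows where the circular-import risk mentioned above would come from: packit-service would import handlers from the fedora-ci code (e.g. pulled in as a git submodule), while that code still needs the event classes that stay in packit-service.

```python
# Hypothetical transition-period wiring; in reality the two halves would live in
# two repositories (packit-service and the new fedora-ci repo).


# --- stays in packit-service ------------------------------------------------
class DistGitPREvent:
    """Stand-in for an event class that remains in packit-service for now."""

    def __init__(self, pr_id: int, project_url: str):
        self.pr_id = pr_id
        self.project_url = project_url


# --- would live in the fedora-ci repo (imported into packit-service) --------
def run_fedora_ci_checks(event: DistGitPREvent) -> str:
    # In the real split, this code would have to import DistGitPREvent from
    # packit-service, which closes the import cycle discussed above.
    return f"running Fedora CI checks for PR {event.pr_id} of {event.project_url}"


# --- transition-period glue inside packit-service ---------------------------
if __name__ == "__main__":
    event = DistGitPREvent(pr_id=42, project_url="https://src.fedoraproject.org/rpms/example")
    print(run_fedora_ci_checks(event))
```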

## DB and API

- schema same, empty tables
- do we want to migrate the data from the current deployment?
- API at e.g. `prod.fedora-ci.packit.dev/api`

## Dashboard

- keeping one instance or having 2
  - 1 instance: we can use a [`Context selector`](https://www.patternfly.org/components/menus/context-selector/design-guidelines) like OpenShift does
    - using different backends
    - we agreed we prefer this solution
  - 2 instances:
    - implementation-wise this would require more changes
    - to consider: this might also be required to be deployed in Fedora infra
- dashboard deployment is quite straightforward, shouldn't be an issue

## Identity

- we probably want a new identity (or 2, for stg and prod) to be set up on `src.fedoraproject.org`
- current Fedora CI user (`releng-bot`) is in these groups:

> **Member:** Is it used only for Fedora CI?

- cvsadmin
- fedora-contributor
- fedorabugs
- packager
- relenggroup

## OpenShift

### Ansible playbook, roles, OpenShift object definitions

- [Fedora infra ansible repo](https://pagure.io/fedora-infra/ansible)
- copy and adjust the existing [deployment playbook](https://github.com/packit/deployment/blob/main/playbooks/deploy.yml) and related files
- mimic existing Fedora infrastructure playbooks, such as https://pagure.io/fedora-infra/ansible/blob/main/f/playbooks/openshift-apps/openscanhub.yml, and remove any unneeded tasks specific to Packit Service
- copy and adjust the OpenShift object definitions, and also remove Packit Service-specific values (e.g. MPP-specific ones)
- log collection in Fedora infra?

### Configuration

- create the Packit service config (specific server name etc.) and variable file templates

### Secrets

- all the secrets should be new (different from Packit Service)
- certificates
- identity related files: token, SSH keys, keytab
- Fedora messaging
- Testing Farm
- Flower
- postgres
- Sentry
- ?

## To discuss

- repo naming
  - fedora-ci-worker
- identity
  - new one
- do we want both stg and prod? new code deployment strategy? weekly prod updates?
  - yes, stick with weekly updates for the beginning; this might need to be adjusted later on
- existing data migration
  - let's not do this and rather spend the time on other tasks
- how to handle code changes while in the process of decoupling
  - try to minimize changes; contribute urgent fixes to both repos

## Follow-up work (to be adjusted based on discussion)

- code migration as described above:
- functionality and tests
- CI setup
- deployment related files
- configuration and secrets generation
- integrate our deployment into https://pagure.io/fedora-infra/ansible/blob/main/f/playbooks
- dashboard changes
- reverse dependency tests run in the packit-service repo to make sure changes there do not break Fedora CI
1 change: 0 additions & 1 deletion research/integrations/gitlab/index.md
@@ -36,7 +36,6 @@ There are many ways available for us to move forward.
- This service can then be enabled by the project maintainer by going to Project settings -> Integrations -> Packit service, e.g. [test-instance](http://52.183.132.26:3000/testpackit/testing/-/settings/integrations).

- For adding project integration to gitlab instances we have two options to move forward:

1. We contribute to [GitLab](https://gitlab.com/gitlab-org/gitlab/tree/master/app/models/project_services) and can reach a large audience, but contributing to GitLab is a time-consuming process. (Currently looking into it)

2. Add our project integration code directly to the custom gitlab instances that we currently want to support.
1 change: 0 additions & 1 deletion research/integrations/image-builder/index.md
@@ -58,7 +58,6 @@ The integration is pretty straightforward:
2. Implement a handler for it (trigger = successful Copr build + explicit `/packit` command)
3. Wait for the build to finish: babysit/polling/celery
4. Auth - create a 'service' account for Packit on access.redhat.com

- Attach employee SKU to it
- Create a refresh token & store it in bitwarden
- Inform Image Builder team about this user so they are aware of it (maybe
2 changes: 0 additions & 2 deletions research/monitoring/error-budgets/index.md
@@ -8,7 +8,6 @@ authors:
## Next steps for Packit

1. Identify stakeholders who can help us to define our SLO

- Projects which are the most frequent users of the service.
- Prominent users:
- [rhinstaller/anaconda](https://github.com/rhinstaller/anaconda)
@@ -31,7 +30,6 @@ authors:

2. Discuss and document their expectations. At a minimum in terms of
(questions are provided as an example):

- latency
- How fast should builds/tests start? (First feedback from the service
that something is happening.)
8 changes: 4 additions & 4 deletions research/monitoring/metrics.md
@@ -7,6 +7,7 @@ authors: lbarczio

- [Prometheus Flask exporter](https://github.com/rycus86/prometheus_flask_exporter)
- metrics are configured via decorators, e.g. `@metrics.counter(..)`:

```python
@app.route('/<item_type>')
@metrics.do_not_track()
@@ -15,6 +16,7 @@ authors: lbarczio
def by_type(item_type):
pass # only the counter is collected, not the default metrics
```

- the metrics are by default exposed on the same Flask application on the /metrics endpoint,
this can be adjusted
- counters count invocations, other types (histogram, gauge, summary) collect metrics based on the
@@ -44,6 +46,7 @@ authors: lbarczio
- metrics:
- celery_workers - number of workers
- celery_tasks_total - number of tasks per state (labels name, state, queue and namespace):

```
celery_tasks_total{name="my_app.tasks.fetch_some_data",namespace="celery",queue="celery",state="RECEIVED"} 3.0
celery_tasks_total{name="my_app.tasks.fetch_some_data",namespace="celery",queue="celery",state="PENDING"} 0.0
@@ -53,6 +56,7 @@ authors: lbarczio
celery_tasks_total{name="my_app.tasks.fetch_some_data",namespace="celery",queue="celery",state="REVOKED"} 0.0
celery_tasks_total{name="my_app.tasks.fetch_some_data",namespace="celery",queue="celery",state="SUCCESS"} 7.0
```

- celery_tasks_runtime_seconds
- celery_tasks_latency_seconds - time until tasks are picked up by a worker - this can be helpful for us and is
not included in the first exporter metrics
@@ -80,9 +84,7 @@ authors: lbarczio
- builtin Monitoring view in clusters we use currently - this should use some of the tools below

- previous research:

1. [`kube-state-metrics`](https://github.com/kubernetes/kube-state-metrics)

- converts Kubernetes objects to metrics consumable by Prometheus
- not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods
- metrics are exported on the HTTP endpoint `/metrics` on the listening port, designed to be consumed either by
@@ -103,7 +105,6 @@ authors: lbarczio
[CLI args](https://github.com/kubernetes/kube-state-metrics/blob/master/docs/cli-arguments.md#command-line-arguments)

2. [Node exporter](https://github.com/prometheus/node_exporter)

- Prometheus exporter for hardware and OS metrics exposed by \*NIX kernels
- runs on a host, provides details on I/O, memory, disk and CPU pressure
- can be configured as a side-car container, [described](https://access.redhat.com/solutions/4406661)
@@ -131,7 +132,6 @@ authors: lbarczio
- container_network_receive_bytes_total - cumulative count of bytes received
- container_processes - number of processes running inside the container
4. [`kubernetes_sd_config`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config)

- in the Prometheus configuration, allows Prometheus to retrieve scrape targets from the Kubernetes REST API and stay synchronized with the cluster state.
- role types that can be configured to discover targets:
- `node` - discovers one target per cluster node with the address defaulting to the Kubelet's HTTP port
2 changes: 0 additions & 2 deletions research/monorepo-support/refactoring.md
@@ -50,7 +50,6 @@ I see two possible solutions to support monorepos.
Substitute the `self.event.package_config.jobs` calls like in this [commit](https://github.com/majamassarini/packit-service/commit/10d012bfddef815ad03781c2e3907998e20d8c7f). Where the `package_config.get_job_views` method looks like [this](https://github.com/majamassarini/packit/blob/multiple_distgit_external_package_config/packit/config/package_config.py#L157-L172).

The above solution resolves a test like [this](https://github.com/majamassarini/packit-service/blob/multiple_distgit_packit_api/tests/unit/test_jobs.py#L3134-L3234).

- **PROS**: we don't need to touch much more code than this. Our handlers are designed to work with one `JobConfig` and they will keep doing that, working in the same way with a `JobConfigView` (or just pick another name for it) and a `JobConfig`.

- **CONS**: if, for supporting monorepos, we need to deal with multiple packages in the same handler, then we need to group together the `JobConfigView`s, like in the `package_config.get_grouped_job_views` method [here](https://github.com/majamassarini/packit/blob/multiple_distgit_external_package_config/packit/config/package_config.py#L174-L196). And we should **add a new way to match jobs and handlers** in the `steve_job.process_jobs` method.
@@ -63,7 +62,6 @@ I see two possible solutions to support monorepos.
Modify `steve_job.process_jobs`, `steve_job.get_handlers_for_event`, `steve_job.get_config_for_handler_kls` methods to work with the new data structure returned by the `package_config.get_grouped_job_views`.

In the end, `steve_job.process_jobs` will only create handlers taking a list of `JobConfig` or `JobConfigView` objects, and for this reason we will modify all our handlers to loop over all the given configs.

- **PROS**: one single way to _match jobs and handlers_

- **CONS**: we are suggesting that all the handlers should be able to handle multiple configs, but this is probably not true.
1 change: 0 additions & 1 deletion research/source-git/dist-git-to-src-git/updates.md
@@ -118,7 +118,6 @@ We have various way, how to save metadata for regeneration:

- Easy solution that mimics the history overwriting without force push.
- There are multiple ways how to do this:

- (I) Regenerate the source-git from scratch and use
[ours](https://git-scm.com/docs/merge-strategies)
merging strategy to merge the new version on top of the old version ignoring its content.