3 changes: 0 additions & 3 deletions research/automation-tools/rdopkg.md
@@ -36,13 +36,11 @@ authors: flachman
## Actions

- [fix](https://github.com/softwarefactory-project/rdopkg/blob/master/doc/rdopkg.1.adoc#action-fix) -- Apply changes to the `.spec` file.

1. Bump Release, prepare a new `%changelog` entry header.
2. Drop to shell, let user edit the `.spec` file.
3. After running `rdopkg`, ensure description was added to `%changelog` and commit changes in a new commit.

- [patch](https://github.com/softwarefactory-project/rdopkg/blob/master/doc/rdopkg.1.adoc#action-patch) -- Introduce new patches to the package.

1. Unless -l/--local-patches was used, reset the local patches branch to the remote patches branch.
2. Update patch files from local patches branch using git format-patch.
3. Update .spec file with correct patch files references.
@@ -52,7 +50,6 @@ authors: flachman
7. Display the diff.

- [new-version](https://github.com/softwarefactory-project/rdopkg/blob/master/doc/rdopkg.1.adoc#action-new-version) -- Update package to new upstream version.

1. Show changes between the previous version and the current one, especially modifications to requirements.txt.
2. Reset the local patches branch to the remote patches branch
3. Rebase the local patches branch on \$NEW_VERSION tag.
2 changes: 0 additions & 2 deletions research/celery/task-workflow-refactor.md
@@ -41,9 +41,7 @@ Possible solutions, which can be somehow combined:
1. serialize the info about objects and pass it into `send_task`
- this would require serializing and then deserializing again
2. save the info about objects in DB and pass IDs of models into `send_task`

- what models make sense to have? possibilities:

- project
- package config
- service config
2 changes: 0 additions & 2 deletions research/celery/tasks-prioritizing.md
@@ -38,7 +38,6 @@ authors: lbarczio
## How to do it

- from [FAQ](https://docs.celeryproject.org/en/master/faq.html#does-celery-support-task-priorities):

- Redis transport emulates priority support
- prioritize work by routing high-priority tasks to different workers; this usually works better than per-message priorities

@@ -49,7 +48,6 @@ authors: lbarczio
### Task priority

- docs are not so clear:

- [priority](https://docs.celeryproject.org/en/latest/reference/celery.app.task.html?highlight=celery.app.task#celery.app.task.Task.priority)
attribute of the `Task` - default task priority
- [priority](https://docs.celeryproject.org/en/stable/reference/celery.app.task.html?highlight=priority#celery.app.task.Task.apply_async)
3 changes: 0 additions & 3 deletions research/database/refresh.md
@@ -8,19 +8,16 @@ authors:
## Usecases

1. Show the whole workflow to the user.

- It's not clear what it is.

2. For each step, we get:

- previous step
- next steps
- other steps from this group (e.g. other chroots for this build)

3. It is possible to rerun the whole workflow.
4. It is possible to rerun one step (and all the follow-up steps).
5. It is possible to rerun a part of one step (and the follow-up step(s)).

- E.g. one chroot.

6. For project, we get all workflows.
2 changes: 0 additions & 2 deletions research/deployment/deployment-improvements/index.md
@@ -31,7 +31,6 @@ Areas to be covered:
### Installation source

- github

- pros:
- current changes in other projects are always in place - useful especially for stg branch
- cons:
@@ -46,7 +45,6 @@ Areas to be covered:
### Image build approach

- s2i: Source-to-Image (S2I) is a tool for building reproducible, Docker-formatted container images. It produces ready-to-run images by injecting application source into a container image and assembling a new image. The new image incorporates the base image (the builder) and built source and is ready to use with the docker run command. S2I supports incremental builds, which re-use previously downloaded dependencies, previously built artifacts, etc.

- pros:
- separating code and image development - probably an advantage in bigger projects where development and DevOps are separated
- cons:
7 changes: 0 additions & 7 deletions research/deprecation/index.md
@@ -14,7 +14,6 @@ authors: mfocko
Looked into the options suggested by @lachmanfrantisek which were:

- `Deprecated`

- seems like a good choice, offers a decorator that has optional parameters such as version or a custom message
- live GitHub repo
- fast release cycle
@@ -24,30 +23,24 @@ Looked into the options suggested by @lachmanfrantisek which were:
from the docs, all properties are optional; you can add a reason (usually an alternative) or the version in which it was deprecated

- `Python-Deprecated`

- dead version of `Deprecated`, which is probably kept in PyPI just for backward-compatibility

- `deprecationlib`

- seems like a hobby project, only one piece of information in the decorator (an alternative function name)

- `Dandelyon`

- looks like a nice project
- offers multiple decorators
- doesn't seem to be very active

- `deprecate`

- dead project

- `deprecation`

- not very active
- multiple issues

- `libdeprecation`

- dead version of `deprecation`

- `warnings` (built-in module)
1 change: 0 additions & 1 deletion research/integrations/console.md
@@ -44,7 +44,6 @@ the future.
How we can integrate with it?

1. We should use [AWS Launch Templates](https://docs.aws.amazon.com/autoscaling/ec2/userguide/launch-templates.html).

- One cannot configure instance parameters using the provisioning API.
- We also need to be strict about this: we don't want users to willy-nilly
change their instances (e.g. request 64G mem).
2 changes: 0 additions & 2 deletions research/integrations/downstream/index.md
@@ -253,7 +253,6 @@ User can configure a repository where Packit create an issue in case of failed u
There is a recurring question whether the new functionality/job will be incorporated into the existing service, done as a separate deployment, or done from scratch. Let's put down some benefits and problems.

1. New functionality added to the existing service

- The new functionality is added as a new handler (=a separate class).
- If we need to react to a new event, the parsing needs to be implemented.
- The mapping between event and handler is done by decorating the handler and explicitly setting the event we react on.
@@ -262,7 +261,6 @@ There is an occurring question if the new functionality/job will be incorporated
- Since we have one database, we can show some overall status and combine information from upstream part and downstream part (including the propose-downstream job that is somewhere in the middle).

2. New functionality as another deployment

- It's more a variation of the previous one.
- Benefits are independence, being able to have different identities and limits.
- The main downside is the duplicated effort needed to maintain the service and to run the shared part (task scheduler, listeners, API service).
133 changes: 133 additions & 0 deletions research/integrations/fedora-ci/deployment-move.md
@@ -0,0 +1,133 @@
---
title: Decoupling Fedora CI deployment from Packit Service
authors: lbarcziova
---

Related links:

- [issue](https://github.com/packit/packit-service/issues/2737)

This research describes the requirements and a plan to decouple the Fedora CI-specific worker functionality from
the [Packit Service repository](https://github.com/packit/packit-service) and deploy it as a separate, independent service.
This will improve maintainability and scalability, and make the deployment within Fedora infrastructure easier.

## Code

- create the new repo (`https://github.com/packit/fedora-ci`?) with a structure something like this (a small handler/task sketch follows at the end of this list):

> **Member:** I'd probably prefer more explicit, something like `fedora-ci-worker`


```
fedora-ci/
├── fedora_ci/
│ ├── handlers/ # Fedora CI handlers
│ ├── helpers/ # Helper classes like FedoraCIHelper
│ ├── checker/ # Checker classes
│ ├── jobs.py # Job processing logic
│ └── tasks.py # Celery tasks
├── tests/

```

- code migration:
  - identify and move all Fedora CI-related worker functionality from packit-service to the new repository; this concerns jobs that do not depend on the repository having a Packit configuration

> **Member:** I am wondering if there would be some code (I am thinking of the events as an example) that has to be moved to a third repository shared by both packit-service and fedora-ci?

> **Member (author):** Following how we did this in hardly, I would start with having the service code in one repository (i.e. all the events being placed in packit-service), as decoupling that could be more complex.

> **Member:** I am probably missing something here: if the events stay in packit-service, and packit-service will import from fedora-ci, but fedora-ci would need the events from packit-service... I don't think we can make this work?

> **Member (author):**
>
> > and packit-service will import from fedora-ci
>
> Do you mean this just for the "importing" solution for the transition period? That might be a good point. Regarding the transition, I am more inclined to try to minimise the transition time and the code changes.

> **Member:** Yes, I am referring to the transition-time solution. I am not against a quick solution, I just fear we may hit cyclic imports and not be able to solve them quickly.

> **Member (author):** @majamassarini yes, I see your point now and I agree. And it might still not even really be "quick" anyway. Let's discuss more tomorrow.

- set up tests and CI
- create files needed for deployment: `run_worker.sh`, Containerfile, docker-compose file, etc.

> **Member:** Note: Time to move on to podman-compose, I hate the complaints in shell each time I forget to pass COMPOSE

- remove the moved code from the `packit-service` repo
  (this should be done after the code from the new repo is deployed)
- needed changes:
  - once there is a separate dashboard page, change the URL paths in the code
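
To make the proposed repository layout more concrete, here is a minimal, hypothetical sketch of how a handler in `fedora_ci/handlers/` and a Celery task in `fedora_ci/tasks.py` could fit together; the class name, task name, and event shape are all assumptions, not decisions.

```python
from celery import Celery

# Broker/backend configuration would come from the deployment, not from the code.
app = Celery("fedora_ci")


class ScratchBuildHandler:
    """Example Fedora CI handler: reacts to a dist-git pull-request event."""

    def __init__(self, event: dict):
        self.event = event

    def run(self) -> dict:
        # e.g. submit a Koji scratch build and report the result back to the PR
        return {"success": True, "details": f"processed PR {self.event.get('pr_id')}"}


@app.task(name="task.fedora_ci.process_message")
def process_message(event: dict) -> dict:
    # tasks.py stays a thin layer: deserialize the event and delegate to a handler
    return ScratchBuildHandler(event=event).run()
```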

### Implementation changes during the transition period

- changes only to the new repo, without deploying
  - this would mean that for a few weeks the changes wouldn't take effect in the old deployment,
    and might cause some bugs to land when the new deployment happens, which might be harder to investigate (if there are a lot of new changes)
  - the old deployment might continue to run with known bugs
- changes to both repos
  - this involves duplicated work, and might be prone to errors (e.g. forgetting to apply a change to one repo)
  - first changing the code in the old repo, and then applying the same change to the new one
- importing the code from the new repo
  - cleaner transition
  - requires more initial effort and might be more complex to set up
  - how to implement:
    - git submodule - directly link the new repository as a subdirectory in the old one (a minimal sketch of this approach follows after this list)
    - open to other suggestions
  - this might not be easy to do, as it could cause circular imports: the fedora-ci code will need to import events from packit-service

- in any case, we could try to minimize the new features and focus only on bug fixing during this time
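
A minimal, purely illustrative sketch of the "import from the new repo" option; all repository, module, and class names here are assumptions. It also shows where the circular-import risk mentioned above would come from: packit-service would import handlers from the fedora-ci code (e.g. pulled in as a git submodule), while that code still needs the event classes that stay in packit-service.

```python
# Hypothetical transition-period wiring; in reality the two halves would live in
# two repositories (packit-service and the new fedora-ci repo).


# --- stays in packit-service ------------------------------------------------
class DistGitPREvent:
    """Stand-in for an event class that remains in packit-service for now."""

    def __init__(self, pr_id: int, project_url: str):
        self.pr_id = pr_id
        self.project_url = project_url


# --- would live in the fedora-ci repo (imported into packit-service) --------
def run_fedora_ci_checks(event: DistGitPREvent) -> str:
    # In the real split, this code would have to import DistGitPREvent from
    # packit-service, which closes the import cycle discussed above.
    return f"running Fedora CI checks for PR {event.pr_id} of {event.project_url}"


# --- transition-period glue inside packit-service ---------------------------
if __name__ == "__main__":
    event = DistGitPREvent(pr_id=42, project_url="https://src.fedoraproject.org/rpms/example")
    print(run_fedora_ci_checks(event))
```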

## DB and API

- schema same, empty tables
- do we want to migrate the data from the current deployment?
- API at e.g. `prod.fedora-ci.packit.dev/api`

## Dashboard

- keeping one instance or having 2
  - 1 instance: we can use a [`Context selector`](https://www.patternfly.org/components/menus/context-selector/design-guidelines) like OpenShift does
    - using different backends
    - we agreed we prefer this solution
  - 2 instances:
    - implementation-wise this would require more changes
    - to consider: this might also be required to be deployed in Fedora infra
- dashboard deployment is quite straightforward, shouldn't be an issue

## Identity

- we probably want a new identity (or 2, for stg and prod) to be set up on `src.fedoraproject.org`
- current Fedora CI user (`releng-bot`) is in these groups:

> **Member:** Is it used only for Fedora CI?

- cvsadmin
- fedora-contributor
- fedorabugs
- packager
- relenggroup

## OpenShift

### Ansible playbook, roles, OpenShift object definitions

- [Fedora infra ansible repo](https://pagure.io/fedora-infra/ansible)
- copy and adjust the existing [deployment playbook](https://github.com/packit/deployment/blob/main/playbooks/deploy.yml) and related files
- mimic existing Fedora infrastructure playbooks, such as https://pagure.io/fedora-infra/ansible/blob/main/f/playbooks/openshift-apps/openscanhub.yml, and remove any unneeded tasks specific to Packit Service
- copy and adjust the OpenShift object definitions, and also remove Packit Service-specific values (e.g. MPP-specific ones)
- log collection in Fedora infra?

### Configuration

- create the Packit service config (specific server name etc.) and variable file templates

### Secrets

- all the secrets should be new (different from Packit Service)
- certificates
- identity related files: token, SSH keys, keytab
- Fedora messaging
- Testing Farm
- Flower
- postgres
- Sentry
- ?

## To discuss

- repo naming
  - fedora-ci-worker
- identity
  - new one
- do we want both stg and prod? new code deployment strategy? weekly prod updates?
  - yes, stick with weekly updates for the beginning; this might need to be adjusted later on
- existing data migration
  - let's not do this and rather spend the time on other tasks
- how to handle code changes while in the process of decoupling
  - try to minimize changes; contribute urgent fixes to both repos

## Follow-up work (to be adjusted based on discussion)

- code migration as described above:
- functionality and tests
- CI setup
- deployment related files
- configuration and secrets generation
- integrate our deployment into https://pagure.io/fedora-infra/ansible/blob/main/f/playbooks
- dashboard changes
- reverse dependency tests run in the packit-service repo to make sure changes there do not break Fedora CI
1 change: 0 additions & 1 deletion research/integrations/gitlab/index.md
@@ -36,7 +36,6 @@ There are many ways available for us to move forward.
- This service can then be enabled by the project maintainer by going to Project settings -> Integrations -> Packit service, e.g. [test-instance](http://52.183.132.26:3000/testpackit/testing/-/settings/integrations).

- For adding project integration to gitlab instances we have two options to move forward:

1. We contribute to [GitLab](https://gitlab.com/gitlab-org/gitlab/tree/master/app/models/project_services) and can reach a large audience, but contributing to GitLab is a time-consuming process. (Currently looking into it)

2. Add our project integration code directly to the custom gitlab instances that we currently want to support.
1 change: 0 additions & 1 deletion research/integrations/image-builder/index.md
@@ -58,7 +58,6 @@ The integration is pretty straightforward:
2. Implement a handler for it (trigger = successful Copr build + explicit `/packit` command)
3. Wait for the build to finish: babysit/polling/celery
4. Auth - create a 'service' account for Packit on access.redhat.com

- Attach employee SKU to it
- Create a refresh token & store it in bitwarden
- Inform Image Builder team about this user so they are aware of it (maybe
2 changes: 0 additions & 2 deletions research/monitoring/error-budgets/index.md
@@ -8,7 +8,6 @@ authors:
## Next steps for Packit

1. Identify stakeholders who can help us to define our SLO

- Projects which are the most frequent users of the service.
- Prominent users:
- [rhinstaller/anaconda](https://github.com/rhinstaller/anaconda)
@@ -31,7 +30,6 @@ authors:

2. Discuss and document their expectations. At a minimum in terms of
(questions are provided as an example):

- latency
- How fast should builds/tests start? (First feedback from the service
that something is happening.)
8 changes: 4 additions & 4 deletions research/monitoring/metrics.md
@@ -7,6 +7,7 @@ authors: lbarczio

- [Prometheus Flask exporter](https://github.com/rycus86/prometheus_flask_exporter)
- metrics are configured via decorators, e.g. `@metrics.counter(..)`:

```python
@app.route('/<item_type>')
@metrics.do_not_track()
@@ -15,6 +16,7 @@ authors: lbarczio
def by_type(item_type):
pass # only the counter is collected, not the default metrics
```

- the metrics are by default exposed on the same Flask application on the /metrics endpoint,
this can be adjusted
- counters count invocations, other types (histogram, gauge, summary) collect metrics based on the
@@ -44,6 +46,7 @@ authors: lbarczio
- metrics:
- celery_workers - number of workers
- celery_tasks_total - number of tasks per state (labels name, state, queue and namespace):

```
celery_tasks_total{name="my_app.tasks.fetch_some_data",namespace="celery",queue="celery",state="RECEIVED"} 3.0
celery_tasks_total{name="my_app.tasks.fetch_some_data",namespace="celery",queue="celery",state="PENDING"} 0.0
@@ -53,6 +56,7 @@ authors: lbarczio
celery_tasks_total{name="my_app.tasks.fetch_some_data",namespace="celery",queue="celery",state="REVOKED"} 0.0
celery_tasks_total{name="my_app.tasks.fetch_some_data",namespace="celery",queue="celery",state="SUCCESS"} 7.0
```

- celery_tasks_runtime_seconds
- celery_tasks_latency_seconds - time until tasks are picked up by a worker - this can be helpful for us and is
not included in the first exporter metrics
@@ -80,9 +84,7 @@ authors: lbarczio
- builtin Monitoring view in clusters we use currently - this should use some of the tools below

- previous research:

1. [`kube-state-metrics`](https://github.com/kubernetes/kube-state-metrics)

- converts Kubernetes objects to metrics consumable by Prometheus
- not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods
- metrics are exported on the HTTP endpoint `/metrics` on the listening port, designed to be consumed either by
@@ -103,7 +105,6 @@ authors: lbarczio
[CLI args](https://github.com/kubernetes/kube-state-metrics/blob/master/docs/cli-arguments.md#command-line-arguments)

2. [Node exporter](https://github.com/prometheus/node_exporter)

- Prometheus exporter for hardware and OS metrics exposed by \*NIX kernels
- runs on a host, provides details on I/O, memory, disk and CPU pressure
- can be configured as a side-car container, [described](https://access.redhat.com/solutions/4406661)
@@ -131,7 +132,6 @@ authors: lbarczio
- container_network_receive_bytes_total - cumulative count of bytes received
- container_processes - number of processes running inside the container
4. [`kubernetes_sd_config`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config)

- in the Prometheus configuration, allows Prometheus to retrieve scrape targets from the Kubernetes REST API and stay synchronized with the cluster state.
- role types that can be configured to discover targets:
- `node` - discovers one target per cluster node with the address defaulting to the Kubelet's HTTP port
2 changes: 0 additions & 2 deletions research/monorepo-support/refactoring.md
@@ -50,7 +50,6 @@ I see two possible solutions to support monorepos.
Substitute the `self.event.package_config.jobs` calls like in this [commit](https://github.com/majamassarini/packit-service/commit/10d012bfddef815ad03781c2e3907998e20d8c7f). Where the `package_config.get_job_views` method looks like [this](https://github.com/majamassarini/packit/blob/multiple_distgit_external_package_config/packit/config/package_config.py#L157-L172).

The above solution resolves a test like [this](https://github.com/majamassarini/packit-service/blob/multiple_distgit_packit_api/tests/unit/test_jobs.py#L3134-L3234).

- **PROS**: we don't need to touch much more code than this. Our handlers are designed to work with one `JobConfig` and they will keep doing that, working in the same way with a `JobConfigView` (or just pick another name for it) and a `JobConfig`.

- **CONS**: if, for supporting monorepos, we need to deal with multiple packages in the same handler, then we need to group together the `JobConfigView`s, like in the `package_config.get_grouped_job_views` method [here](https://github.com/majamassarini/packit/blob/multiple_distgit_external_package_config/packit/config/package_config.py#L174-L196). And we should **add a new way to match jobs and handlers** in the `steve_job.process_jobs` method.
@@ -63,7 +62,6 @@ I see two possible solutions to support monorepos.
Modify `steve_job.process_jobs`, `steve_job.get_handlers_for_event`, `steve_job.get_config_for_handler_kls` methods to work with the new data structure returned by the `package_config.get_grouped_job_views`.

In the end, `steve_job.process_jobs` will only create handlers taking a list of `JobConfig` or `JobConfigView` objects, and for this reason we will modify all our handlers to loop over all the given configs.

- **PROS**: one single way to _match jobs and handlers_

- **CONS**: we are suggesting that all the handlers should be able to handle multiple configs, but this is probably not true.
1 change: 0 additions & 1 deletion research/source-git/dist-git-to-src-git/updates.md
@@ -118,7 +118,6 @@ We have various way, how to save metadata for regeneration:

- Easy solution that mimics the history overwriting without force push.
- There are multiple ways how to do this:

- (I) Regenerate the source-git from scratch and use
[ours](https://git-scm.com/docs/merge-strategies)
merging strategy to merge the new version on top of the old version ignoring its content.