Pull labs implementation #1371

@nuclearcat

During recent Lab Working Group meetings, we discussed an idea for pull-based labs.

One of the major problems new labs face when joining the KernelCI network is the complexity of setting up job submission infrastructure. In particular, the push model requires a lab to let an external party (KernelCI) reach into its infrastructure to submit jobs, which means exposing ports and opening firewalls.

In the pull model, instead of pushing jobs to participating labs, labs pull job definitions from the KernelCI server when they are ready to run new jobs.

This approach offers several advantages:

  • Security: Labs do not need to expose any ports or open firewalls, as they initiate the connection to the KernelCI server. This reduces the attack surface and enhances security.
  • Control: Labs have more control over when and how they receive job definitions. They can pull jobs at their convenience, allowing for better resource management.
  • Simplicity: The setup process becomes simpler, as labs only need to configure a client to pull jobs, rather than setting up a server to receive pushed jobs.
  • Scalability: As the number of participating labs grows, the pull model scales more easily, since each lab manages its own job retrieval process and can simply dispatch jobs once they are pulled.

However, there are also some challenges to consider:

  • Latency: There may be a delay between when a job becomes available and when a lab pulls it, which could impact job turnaround times. Maintainers expect to see results as soon as possible.
  • Complexity in Job Management: The burden of managing job queues and ensuring that jobs are pulled in a timely manner shifts to the labs, which may require additional infrastructure or processes on their end. However, this becomes significant only when labs have a large number of jobs to run.
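The latency concern is largely a function of the polling interval: with a fixed interval of T seconds, a job waits at most T seconds before a lab notices it, and about T/2 on average if jobs arrive at random times relative to the polls. One way a lab client might balance turnaround against server load is to poll quickly while jobs are arriving and back off while idle. A minimal sketch of that idea; the function names and the default bounds are illustrative assumptions, not part of KernelCI:

```python
def next_poll_interval(current, found_jobs, min_interval=5.0, max_interval=300.0):
    """Return the delay in seconds before the next poll.

    Hypothetical helper: resets to min_interval when the last poll
    returned jobs, otherwise doubles the delay up to max_interval.
    """
    if found_jobs:
        return min_interval
    return min(max(current, min_interval) * 2, max_interval)


def idle_schedule(polls, min_interval=5.0, max_interval=300.0):
    """List the delays used across `polls` consecutive empty polls."""
    delays = []
    delay = min_interval
    for _ in range(polls):
        delays.append(delay)
        delay = next_poll_interval(delay, False, min_interval, max_interval)
    return delays
```

With these defaults a busy lab polls every five seconds and an idle lab settles at one request every five minutes, so the worst-case added latency is whatever the current delay happens to be.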

Additionally, we are introducing lab-agnostic job definitions. Previously, job definitions were tied to specific software (LAVA) and specific lab infrastructure. With lab-agnostic job definitions, jobs are defined independently of the underlying infrastructure, allowing for greater flexibility and easier integration with different labs.
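To illustrate what "lab-agnostic" could mean in practice, here is a rough sketch in which the job definition only states what to test, and each lab supplies its own adapter that turns it into whatever its runner expects. Everything here is hypothetical: the field names, the `my-runner` command and its flags, and the document layout are made up for illustration and do not reflect the actual KernelCI schema or any real tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class JobDefinition:
    """Hypothetical lab-agnostic job: what to test, not how to run it."""
    job_id: str
    arch: str
    kernel_url: str
    tests: tuple = ()
    artifacts: dict = field(default_factory=dict)  # e.g. {"rootfs": "<url>"}


def render_cli(job, runner="my-runner"):
    """Adapter for an imaginary command-line runner (flags are made up)."""
    args = [runner, "--arch", job.arch, "--kernel", job.kernel_url]
    for test in job.tests:
        args += ["--test", test]
    return args


def render_document(job):
    """Adapter for an imaginary runner that takes a structured document."""
    return {
        "name": job.job_id,
        "target": {"arch": job.arch},
        "boot": {"kernel": job.kernel_url, **job.artifacts},
        "tests": list(job.tests),
    }
```

The same `JobDefinition` instance can be passed to either adapter, so adding a new kind of lab would mean writing one more adapter rather than a new job format.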

Overall, the pull model for job submission in KernelCI offers a promising alternative to the traditional push model, with potential benefits in security, control, simplicity, and scalability. However, careful consideration of the challenges and implementation details will be necessary to ensure its success.

As a first step, I implemented a simple prototype for pull-based job definition retrieval using a Python script that periodically polls the KernelCI server for events. Ben Copeland from Linaro helped with a more complete implementation of the client side that can run jobs using tuxrun.
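For readers unfamiliar with the pattern, the general shape of such a polling client might look like the sketch below. This is not the prototype itself: the event shape (a dict with an `id` and an optional `job`), the cursor handling, and the `fetch_events` / `run_job` callables are all assumptions made for illustration. In a real client, `fetch_events` would wrap an outbound HTTPS request to the KernelCI server and `run_job` would hand the job to the lab's runner.

```python
import time


def poll_for_jobs(fetch_events, run_job, interval=30.0, max_polls=None,
                  sleep=time.sleep):
    """Repeatedly pull events and dispatch the ones that describe jobs.

    fetch_events(cursor) -> list of event dicts newer than `cursor`
                            (assumed shape: {"id": ..., "job": ...}).
    run_job(job)         -> runs one job on local lab infrastructure.
    Returns the number of jobs dispatched.
    """
    cursor = None      # id of the last event seen; None means "from the start"
    handled = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        events = fetch_events(cursor)
        for event in events:
            cursor = event["id"]       # advance even for non-job events
            job = event.get("job")
            if job is not None:
                run_job(job)
                handled += 1
        polls += 1
        if not events:
            sleep(interval)            # only wait when there was nothing to do
    return handled
```

Because the lab always initiates the request, nothing in this loop needs an inbound port; the trade-off is that `interval` bounds how quickly a newly available job is noticed.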

Follow-up and related issues will be posted here.

Labels: roadmap (API & Pipeline Roadmap)
