docs: add design proposal for integrating Kubernetes into EMT #273
Manual cluster creation is an existing method in which users create a cluster by selecting a template and hosts. Before discussing the manual cluster creation workflow, it is important to understand the coupling between the EMT image version and the cluster template that integrating Kubernetes into EMT introduces. For instance, creating a K3s cluster with a K3s v1.32 template may not work as expected on an EMT machine provisioned with K3s v1.30. Such discrepancies can occur when Edge Orchestration undergoes multiple upgrades, resulting in a mix of cluster template and EMT machine versions. Although this issue affects only K3s on EMT, that combination is the primary use case, so the user workflow must address it properly. There are a few options to mitigate this issue.

**Option 1)** Restrict cluster creation to onboarded hosts only, excluding provisioned hosts. The Cluster Manager automatically selects the appropriate EMT version and requests instance creation from the Infra Manager to install the OS and Kubernetes. This could be the simplest option.
Doable by looking at the content of the manifest.json associated with an OSProfile. It lists basically all the package versions.
Thanks! This is the part I wanted confirmation on.
Who would be doing that? UI? Cluster Orch? Some component must parse the manifest and infer the installed k3s version. Please clarify in the proposal
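Whichever component ends up owning this step, the parsing itself could look roughly like the sketch below. The manifest layout used here (a top-level `packages` list with `name` and `version` keys) and the function name `infer_k3s_version` are assumptions made for illustration only; the actual manifest.json schema is not described in this thread.

```python
import json
from typing import Optional


def infer_k3s_version(manifest_text: str) -> Optional[str]:
    """Return the K3s version listed in a manifest, or None if it is absent.

    ASSUMPTION: the manifest is a JSON object with a "packages" list whose
    entries have "name" and "version" keys. The real layout may differ.
    """
    manifest = json.loads(manifest_text)
    for package in manifest.get("packages", []):
        if package.get("name") == "k3s":
            return package.get("version")
    return None


# Hypothetical manifest content, for illustration only.
sample = json.dumps({
    "packages": [
        {"name": "containerd", "version": "1.7.0"},
        {"name": "k3s", "version": "1.32.4"},
    ]
})
print(infer_k3s_version(sample))
```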
**1. Automatically when a host is deauthorized**: Users can opt to delete the cluster when the host is deauthorized, with this option enabled by default during cluster creation. The Cluster Manager will automatically delete the cluster when all hosts within it are deauthorized.
**2. Manually through a direct request to the Cluster Manager**: Users can initiate a cluster deletion by making a direct API call to the Cluster Manager.
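The automatic deletion rule in item 1 can be sketched as a small predicate. This is a minimal illustration, not the Cluster Manager's implementation: the `Cluster` type, its field names, and the choice to never auto-delete a cluster with no hosts are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Cluster:
    """Hypothetical cluster record; field names are illustrative."""
    name: str
    # The proposal says the option is enabled by default at cluster creation.
    delete_on_deauthorize: bool = True
    # Maps host ID to whether that host is still authorized.
    host_authorized: Dict[str, bool] = field(default_factory=dict)


def should_delete_cluster(cluster: Cluster) -> bool:
    """True when the user opted in and every host in the cluster is deauthorized."""
    if not cluster.delete_on_deauthorize:
        return False
    # ASSUMPTION: a cluster with no hosts is left alone rather than deleted.
    if not cluster.host_authorized:
        return False
    return not any(cluster.host_authorized.values())


cluster = Cluster("edge-a", host_authorized={"host-1": False, "host-2": True})
print(should_delete_cluster(cluster))  # one host is still authorized
cluster.host_authorized["host-2"] = False
print(should_delete_cluster(cluster))  # all hosts are now deauthorized
```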
For manual deletion, the handling of cluster nodes depends on the approach taken for manual creation. If manual creation is restricted to onboarded hosts only (Option 1), deleting a host should deauthorize the cluster nodes, requiring users to re-onboard the hosts for reuse. If either Option 2 or Option 3 is chosen, deleting a host should only clean up Kubernetes, which aligns with the current behavior.
I don't fully understand this paragraph: "deleting a host should deauthorize the cluster nodes, requiring users to re-onboard the hosts for reuse."
-> How does this relate to Option 1 or Option 3?
It is counter-intuitive: I would expect that deleting a cluster node automatically deletes the Instance and Host in EIM so that they can be onboarded again.
See updates, I tried to explain it better. In short, it's for consistency across workflows.
Co-authored-by: Pier Luigi Ventre <[email protected]>
## Abstract

The Edge Microvisor Toolkit (EMT) is an operating system specifically designed for hosting edge workloads, streamlining traditional general-purpose operating systems by including only the essential components needed to run container-based applications. Our experience from previous releases has demonstrated that EMT's design principles (image-based deployment and an immutable root filesystem) enhance the reliability and consistency of cluster creation compared to even well-maintained general-purpose operating systems, such as Ubuntu.
Btw, in the long term, will these be the only images we plan to support for EMT/EMT-S? cc @krishnajs
The first option introduces a streamlined process for users seeking simplicity or bulk registration of hosts, or both. Both the Web UI and CLI will offer a new toggle for `Create Cluster Automatically` (potentially combined with `Provision Automatically`, pending final design decisions) during registration. This option is only valid when `Provision Automatically` is enabled. When enabled, it automatically creates a single-node cluster for each host.

Users can provide a default cluster template for all hosts in the registration request, with the flexibility to override it for specific hosts if needed. For EMT machines, once a specific OS profile is selected, only cluster templates compatible with that EMT version will be displayed. This includes all RKE2 templates and K3s templates that match the K3s version embedded in the selected EMT image.
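The template filtering described above could be sketched as follows. The `ClusterTemplate` type, the flavor strings, and the decision to compare K3s versions at major.minor granularity are assumptions for illustration; the proposal does not specify how a "match" is determined.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterTemplate:
    """Hypothetical template record; field names are illustrative."""
    name: str
    flavor: str   # "k3s" or "rke2"
    version: str  # e.g. "v1.32"


def major_minor(version: str) -> str:
    """Reduce 'v1.32.4' or '1.32' to '1.32'."""
    return ".".join(version.lstrip("v").split(".")[:2])


def compatible_templates(templates: List[ClusterTemplate],
                         embedded_k3s_version: str) -> List[ClusterTemplate]:
    """Keep every RKE2 template, plus K3s templates matching the embedded K3s.

    ASSUMPTION: two K3s versions match when major.minor are equal.
    """
    wanted = major_minor(embedded_k3s_version)
    return [
        t for t in templates
        if t.flavor == "rke2"
        or (t.flavor == "k3s" and major_minor(t.version) == wanted)
    ]


templates = [
    ClusterTemplate("k3s-132", "k3s", "v1.32"),
    ClusterTemplate("k3s-130", "k3s", "v1.30"),
    ClusterTemplate("rke2-130", "rke2", "v1.30"),
]
# An EMT image assumed to embed K3s v1.30.x.
print([t.name for t in compatible_templates(templates, "v1.30.6")])
```

Filtering at display time keeps the mismatch out of the user's reach instead of rejecting the request later.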
How do you override the template with a specific one? Can we have a drop-down in the registration instead of a toggle, where the drop-down allows you to pick the template from any K8s type?
That is more about UX design. Patricia has an initial design, so feel free to reach out to her, but the "toggle" is for the user to enable or disable automatic cluster creation. When it is enabled, the user should provide more information in the subsequent steps of registration.
- **EMT Image Version**: Specifies the operating system and embedded software, including the K3s version, on the edge device.
- **Cluster Template**: Defines the configuration for Kubernetes cluster creation, including the Kubernetes flavor, version, and custom control plane settings.

For instance, attempting to create a K3s cluster using a template for K3s v1.32 on an EMT machine embedding K3s v1.30 may result in compatibility issues. Such mismatches can occur when Edge Orchestration undergoes multiple upgrades, leading to a mix of cluster templates and EMT machines with varying versions. While this issue is specific to the K3s-on-EMT scenario, it is critical to address it effectively in the user workflow, as this represents the most common use case for EMF.
I would suggest including A/B upgrades in the discussion or creating a new ADR. In this logic CO is not involved; what is the impact on our ms? cc @daniele-moro, who is working on day-2 improvements.
I have added an "Upgrade Cluster" section. There is a small task for the Infra Manager to address, which involves preventing an unintended K3s version update through an EMT update. Note that for the 3.1 release, we will not support multiple K3s versions or K8s version upgrades in CO; we're still waiting for CAPI to implement the foundation for in-place upgrades. So upgrades can wait for a future release.
#### Automatic Cluster Creation

The first option introduces a streamlined process for users seeking simplicity or bulk registration of hosts, or both. Both the Web UI and CLI will offer a new toggle for `Create Cluster Automatically` (potentially combined with `Provision Automatically`, pending final design decisions) during registration. This option is only valid when `Provision Automatically` is enabled. When enabled, it automatically creates a single-node cluster for each host.
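As a rough sketch of the rule above, the registration handler could validate the two toggles and then plan one single-node cluster per host, using the default template unless a host-specific override is given. The function name, parameter names, and the mapping it returns are hypothetical and do not reflect an existing API.

```python
from typing import Dict, List, Optional


def plan_single_node_clusters(
    hosts: List[str],
    provision_automatically: bool,
    create_cluster_automatically: bool,
    default_template: str,
    overrides: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Map each host to the template for its own single-node cluster.

    Raises ValueError when cluster creation is requested without
    automatic provisioning, since the option is only valid in that case.
    """
    if create_cluster_automatically and not provision_automatically:
        raise ValueError(
            "Create Cluster Automatically requires Provision Automatically"
        )
    if not create_cluster_automatically:
        return {}  # nothing to create; clusters can still be created manually
    overrides = overrides or {}
    # A per-host override wins over the registration-wide default template.
    return {host: overrides.get(host, default_template) for host in hosts}


plan = plan_single_node_clusters(
    hosts=["host-1", "host-2"],
    provision_automatically=True,
    create_cluster_automatically=True,
    default_template="k3s-default",
    overrides={"host-2": "rke2-custom"},
)
print(plan)
```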
For some deployments (like a minimal OXM deployment) we won't deploy cluster orchestration. In that case we need to configure the UI so that it does not show the new toggle for CO. CC: @teone
@hyunsun @gcgirish and @adorney99 we had a long discussion in the EIM team, and here is a summary of the feedback, specifically showing the difference between the OXM use case and the end-customer use case with EMF. We need to reconcile on this.
@krishnajs : Thanks. Two questions.
Original equipment manufacturer like Dell, Lenovo, Advantech.
Yes, it's the multi-tenancy GW. We don't think OXM would need this (not sure yet). But for now, what we are saying is that we will not disable or change the MT/MT Gateway in 3.1.
@hyunsun I have read and reviewed the doc of the proposed implementation.
Description
Add design proposal for integrating Kubernetes into Edge Microvisor Toolkit.