design: Extract Micro-OS from PV storage #217

Open
wants to merge 8 commits into
base: main
from

Conversation

osinstom
Contributor

@osinstom osinstom commented Apr 23, 2025

Description

This PR brings the design proposal to extract the Micro-OS image from PV storage in order to improve EN provisioning time.

Any Newly Introduced Dependencies

N/A

How Has This Been Tested?

N/A

Checklist:

  • I agree to use the APACHE-2.0 license for my code changes
  • I have not introduced any 3rd party dependency changes
  • I have performed a self-review of my code

@github-actions github-actions bot left a comment

Thank you for your contribution! Please make sure to review our Contributing Guide.

…latform/edge-manageability-framework into proposal-eim-decouple-hookos

Last updated: 24.04.2025

## Abstract
Contributor

Another benefit to list is the startup time of DKAM, given all the issues we faced with the readiness probe implementation.

DKAM should not do any of the steps above. Therefore, the main challenge to solve is how to provide CA certs and environment variables to the Micro-OS image downloaded from an external source
without sacrificing security.

### Solution
Contributor

Are the last 2 steps impacted if we move from HookOS to emt-tinker?


dkam->>pa: Store signed iPXE EFI binary

user->>+rs: Download RS certificate & Micro-OS DER key
Contributor

The RS certificate might not be required; use opt if the certificate is self-signed only.

Contributor Author

I'm not sure about this. The signing of Micro-OS is used to add an additional security level when downloading via iPXE, which is only secured by HTTPS with one-way TLS.

Contributor

The iPXE client is not as rich as other clients and does not carry well-known CAs. So, irrespective of the source, we might need to add the trust chain for TLS.

Comment on lines +135 to +140
Steps 4-8: At deployment time, DKAM reads the orchestrator CA certificate and cluster-wide variables,
and generates two artifacts: 1) the signed Micro-OS bundle, and 2) the signed iPXE script that downloads
the Micro-OS image and the Micro-OS bundle.
We assume that both artifacts are signed with the same key that is generated by DKAM.
Note that iPXE includes standard public certificates (like Microsoft or AWS), so we don't need to embed
these certificates. We only need to embed a certificate that iPXE does not trust, such as a self-signed certificate used by local storage.
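A minimal sketch of how the Micro-OS bundle from steps 4-8 could be assembled before signing. The tar format, the file names (`ca.crt`, `env`), and the variable `ORCH_HOST` are assumptions for illustration only; the proposal does not fix the archive layout, and the actual signing step with the DKAM-generated key is left out.

```python
import hashlib
import io
import tarfile


def build_bundle(files: dict[str, bytes]) -> bytes:
    """Pack the given files into an uncompressed tar archive held in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):  # stable member order
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = 0  # fixed timestamp so repeated builds are byte-identical
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def bundle_digest(bundle: bytes) -> str:
    """SHA-256 digest of the bundle, i.e. the value a signer would sign."""
    return hashlib.sha256(bundle).hexdigest()


# Hypothetical contents: a placeholder CA certificate and one cluster-wide variable.
bundle = build_bundle({
    "ca.crt": b"-----BEGIN CERTIFICATE-----\n(placeholder)\n-----END CERTIFICATE-----\n",
    "env": b"ORCH_HOST=orch.example.com\n",
})
print(bundle_digest(bundle))
```

Keeping the archive reproducible (sorted members, fixed timestamps) means the same inputs always yield the same digest, which makes it easier to tell whether a bundle needs to be re-signed.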
Contributor

Can we skip certificate bundling if it has a public root? Do we need to handle certificate rotation?

- the orchestrator CA certificate (uploaded now)
- the secure key used to sign iPXE and orchestrator bundle (uploaded now)
- the secure key used to sign EMT (uploaded now) - only if EMT is used
- the secure key used to sign Micro-OS image (new). This key is downloaded from the external storage.
Contributor

Suggested change
- the secure key used to sign Micro-OS image (new). This key is downloaded from the external storage.
- the secure key used to sign Micro-OS image (uploaded now). This key is downloaded from the external storage (new).

Contributor Author

Are you thinking of the DER key that is currently used to sign EMT images? I'm not sure if we can use the same one.

Comment on lines +164 to +165
**Phase 1**: Work on a PoC by storing the current non-curated HookOS on the RS file server. The HookOS image is currently distributed as an OCI artifact.
The PoC would require storing it as a non-OCI artifact in the S3 bucket. The PoC should show whether the solution described in this proposal is feasible.
Contributor

Can we keep it as OCI and use services such as ECR?

Contributor Author

Nope, we cannot download OCI artifacts with iPXE.

1. How to handle authentication to external storage? It must be an HTTPS endpoint and TLS certificates must be embedded into
the iPXE binary, but if an external storage requires some additional authentication (e.g., JWT like the internal RS)
iPXE is not able to support that.
2. The design doesn't assume that the trusted CA certificates can be rotated/refreshed.
Contributor

Please elaborate more on point 2.


note over en: EN powered on

en->>pa: Download iPXE binary
Contributor

The signature verification key and the CA cert chain for orch still need to be configured into the EN.

Contributor

@AdithyaBaglody, please help review if the SB flow remains intact.

@AdithyaBaglody AdithyaBaglody left a comment

I believe there are inherent security risks in this proposal.

environment files that are needed for Micro-OS to work with the given Edge Orchestrator instance. The files package will be called a "Micro-OS bundle" hereinafter.
- The Micro-OS bundle is a signed archive. It is signed by the same keychain as the iPXE script.
- The iPXE script and the Micro-OS bundle are still stored on the orchestrator PV. Since these are small files, the provisioning time would not be impacted.
- ENs will download the Micro-OS bundle and the Micro-OS image during the iPXE phase and inject the Micro-OS bundle into the Micro-OS image (via `initrd`), similarly to how the initramfs is loaded.
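
One way to picture the iPXE phase described in the bullets above is a generated script that loads the Micro-OS image from external storage and the bundle from the orchestrator as a second `initrd`. This is only a sketch under assumptions: the URLs, the file names (`vmlinuz`, `initramfs`, `bundle.cpio`, and their `.sig` companions), and the use of iPXE's `imgtrust` and `imgverify` commands (which depend on the iPXE build configuration) are not taken from the proposal.

```python
def render_ipxe_script(image_base: str, bundle_base: str) -> str:
    """Render an iPXE script that loads a kernel, an initramfs, and an extra bundle.

    image_base:  hypothetical HTTPS location of the Micro-OS image (external storage)
    bundle_base: hypothetical HTTPS location of the Micro-OS bundle (orchestrator)
    """
    lines = [
        "#!ipxe",
        # Refuse to boot images that have not been verified.
        "imgtrust --permanent",
        f"kernel {image_base}/vmlinuz",
        f"imgverify vmlinuz {image_base}/vmlinuz.sig",
        f"initrd {image_base}/initramfs",
        f"imgverify initramfs {image_base}/initramfs.sig",
        # The bundle is loaded as an additional initrd image.
        f"initrd {bundle_base}/bundle.cpio",
        f"imgverify bundle.cpio {bundle_base}/bundle.cpio.sig",
        "boot",
    ]
    return "\n".join(lines) + "\n"


script = render_ipxe_script(
    "https://files.example.com/micro-os",
    "https://orch.example.com/bundle",
)
print(script)
```

In this sketch each downloaded file is verified individually at the iPXE stage; whether that is sufficient for the Secure Boot flow is a separate question raised later in the review.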

IMHO this is not going to work. Did someone do a TR?

Contributor Author

I did a quick PoC and I was able to inject a tarball containing a cert file along with the initramfs

What is this cert used for?

Contributor Author

This is the orchestrator CA certificate that is needed for Micro-OS services (device discovery, tink worker, ...) to connect to the orchestrator.

@osinstom
Contributor Author

> I believe there are inherent security risks in this proposal.

Can you please elaborate on the security risks so that we can think how we can fix them?

@AdithyaBaglody

> > I believe there are inherent security risks in this proposal.
>
> Can you please elaborate on the security risks so that we can think how we can fix them?

@osinstom This proposal talks about signing the image on the edge. We have no way to trust the edge node at that time.

You intend to extend the cpio archive of the initramfs by using a signed archive. This would create a new archive that is effectively an unsigned image. GRUB can only verify the complete image. If you are creating a cpio archive on the fly, there is no way to sign it and make it trusted.
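
The concern above can be illustrated with a small sketch: a check that was computed over the original initramfs no longer matches once extra bytes are appended to it. A bare SHA-256 comparison stands in here for real signature verification, and the byte strings are placeholders rather than actual cpio archives.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def covers(signed_digest: str, image: bytes) -> bool:
    """True if the digest that was signed matches the image being booted."""
    return digest(image) == signed_digest


# Placeholder payloads; real inputs would be cpio archives.
initramfs = b"placeholder bytes standing in for the signed initramfs"
bundle = b"placeholder bytes standing in for the Micro-OS bundle"

# Digest recorded when the original initramfs was signed.
signed_digest = digest(initramfs)

# Image assembled on the edge node by appending the bundle.
combined = initramfs + bundle

print(covers(signed_digest, initramfs))  # original image still matches
print(covers(signed_digest, combined))   # combined image does not match
```

Verifying the two parts separately before they are concatenated is a different trust model from verifying the final image as a whole, which is the distinction the comment draws.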

@osinstom osinstom changed the title Design proposal to extract Micro-OS from PV storage design: Extract Micro-OS from PV storage May 8, 2025
@ajaythakurintel ajaythakurintel added the Proposal Identify a PR as a design proposal to be reviewed. label May 9, 2025
5 participants