[bug] Built-in KMS encryption doesn't work with a custom CA chain #991

Open
@Clockwork-Muse

Description

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Omni cannot be used as the automatic KMS when it serves a certificate issued from a custom root chain; a loop of "untrusted certificate" errors is printed to the logs. Any attached nodes are effectively bricked and require re-installation.

Image
(sorry this is an image, but it's a remote console view)

Note that installing a cluster with a custom root chain otherwise works across the entire process, by specifying the trusted roots config on initial ISO boot and in the cluster config during install.
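For reference, a minimal sketch of building the trusted roots config mentioned above from a PEM file. It assumes the standard Talos `TrustedRootsConfig` document layout (`apiVersion`, `kind`, `name`, `certificates`); the function name `trusted_roots_patch` and the `custom-ca` name in the usage line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a TrustedRootsConfig document that embeds
# the PEM certificate(s) found in the given file.
trusted_roots_patch() {
    name="$1"       # value for the document's "name" field
    pem_file="$2"   # path to a PEM file with one or more CA certificates

    if [ ! -r "$pem_file" ]; then
        echo "cannot read PEM file: $pem_file" >&2
        return 1
    fi

    printf 'apiVersion: v1alpha1\n'
    printf 'kind: TrustedRootsConfig\n'
    printf 'name: %s\n' "$name"
    printf 'certificates: |-\n'
    # Indent every PEM line so it sits inside the YAML block scalar.
    sed 's/^/    /' "$pem_file"
}
```

Something like `trusted_roots_patch custom-ca root.crt > trusted-roots.yaml` would then produce a file that can be pasted in as the machine or cluster config patch.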

Expected Behavior

The automatic KMS setup should include Omni's certificates, or alternatively provide a way to specify them.

Steps To Reproduce

  1. On Ubuntu (or any other host), generate a custom root certificate and chain (or get one signed by a non-standard certificate chain).
  2. Install the certificate root to the host CA store.
    • On Ubuntu, move the new certificate chain into /usr/local/share/ca-certificates, then run sudo update-ca-certificates
  3. Follow the standard instructions for self-hosting. Instead of using the directions for a normal trusted cert, mount in the created certs.
    • In addition to mounting in the custom certificates, also mount the host certificate chains by adding the following mounts, e.g.:
    # For an Ubuntu host, but should be similar on many distros
    docker run \
    ...
     -v /etc/ssl/certs:/etc/ssl/certs:ro \
     -v /usr/share/ca-certificates:/usr/share/ca-certificates:ro \
    ...
    ghcr.io/siderolabs/omni:v0.46.3
    ...
    
  4. Access the Omni console as normal and register a new machine:
    • When downloading installation media, specify the kernel parameter talos.config=metal-iso
    • Follow the directions to generate the required ISO; its contents are (at least partially) the TrustedRootsConfig document
    • Make sure to also add the TrustedRootsConfig as a machine config patch after the machine has registered itself.
  5. Attempt to spin up a new cluster with Omni as the KMS system:
    • Add the TrustedRootsConfig as a cluster config patch
    • Also select the checkbox for "Encrypt Disks":

Image

When you hit "Create Cluster", observe that Talos Linux installs, but after the machine reboots it fails to decrypt the STATE partition (which isn't entirely surprising given the warnings about any network config).
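Steps 1 and 2 can be sketched roughly as follows for anyone who needs a throwaway chain to reproduce this. This is only one way to do it with plain `openssl`; the function name `make_test_chain`, the subject names, the file names, and the host `omni.example.internal` are all placeholders, not what my actual setup uses.

```shell
#!/bin/sh
# Sketch: create a throwaway root CA and a server certificate signed by it.
# File names and subject names are placeholders for illustration only.
make_test_chain() {
    out_dir="$1"   # directory to write keys and certificates into
    host="$2"      # DNS name the server certificate should cover

    mkdir -p "$out_dir" || return 1

    # Self-signed root certificate (the custom root).
    openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
        -subj "/CN=Example Test Root CA" \
        -keyout "$out_dir/root.key" -out "$out_dir/root.crt" 2>/dev/null || return 1

    # Private key and signing request for the server.
    openssl req -newkey rsa:2048 -nodes \
        -subj "/CN=$host" \
        -keyout "$out_dir/server.key" -out "$out_dir/server.csr" 2>/dev/null || return 1

    # Sign the request with the root, adding the host as a subjectAltName.
    printf 'subjectAltName=DNS:%s\n' "$host" > "$out_dir/san.ext"
    openssl x509 -req -in "$out_dir/server.csr" \
        -CA "$out_dir/root.crt" -CAkey "$out_dir/root.key" -CAcreateserial \
        -days 90 -extfile "$out_dir/san.ext" \
        -out "$out_dir/server.crt" 2>/dev/null || return 1

    # Confirm the server certificate chains to the new root.
    openssl verify -CAfile "$out_dir/root.crt" "$out_dir/server.crt" >/dev/null
}
```

After `make_test_chain ./certs omni.example.internal`, the `root.crt` file is what gets copied into /usr/local/share/ca-certificates (followed by sudo update-ca-certificates) on an Ubuntu host, and the server key and certificate are what get mounted into the Omni container.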

What browsers are you seeing the problem on?

No response

Anything else?

There are multiple ways Omni could end up with such a chain:

  • Corporate MITM proxy for SaaS solution
  • On-prem hosting with custom trust roots for security or other reasons

Metadata

    Labels

    bug: Something isn't working
