
Canonical K8s MAAS cluster creation with CAPI gets stuck when default CNI is set to false #169

@vishu2498

Issue:

When deploying a Canonical K8s MAAS cluster using CAPI with the default CNI option set to false, the management cluster waits forever for the bootstrap cluster to be created and reach the running state, because the CNI needs to be installed manually on the MAAS instance.

Steps to reproduce:

An option is present to set default CNI to false as shown here:

spec:
  initConfig:
    enableDefaultNetwork: false

Reference: https://documentation.ubuntu.com/canonical-kubernetes/latest/capi/reference/configs/#initconfig

Create a cluster using this configuration in the CK8sControlPlane object and the issue is reproducible.

Expected Behaviour:

The bootstrap cluster inside the MAAS instance should be up and running, and setting default CNI to false should not impact the bootstrap cluster. Later, once the bootstrap cluster is up (possibly using the default CNI), the user/CAPI can install their own CNI on the created cluster.

Actual Behaviour:

The config used by the script disables the CNI installation (Cilium). This method works correctly when deploying with the k8s-snap package, but it fails via the CAPI method.

The bootstrap cluster is created without CNI and is stuck with only these pods:

NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE
kube-system   coredns-fc9c778db-bhqwc           0/1     Pending   0          11s
kube-system   metrics-server-8694c96fb7-6b79j   0/1     Pending   0          11s

At this stage, k8s status does not report a ready state, and its output does not change.

On the CAPMAAS side, the controller proceeds only once it can create a client for the target cluster (using the kubeconfig secret) and list the nodes; only then does it consider the API server online.

It waits with logs like this:

I0717 05:05:24.427993       1 maasmachine_controller.go:353] "API Server is not online; requeue" logger="controllers.MaasMachine" maasmachine="vishu-maas-1-cp-61386-8vvln" cluster="vishu-maas-1" state="Deploying" m-id="aqhn73"

Because the bootstrap cluster inside the MAAS instance has the CNI disabled, nothing can be done from the management side to bring this cluster up. As per the documentation, a k8s cluster deployed with the snap also needs the default CNI disabled first and an alternative CNI applied manually afterwards. However, CAPI can't proceed here, because the user first has to SSH into the instance, access the k8s cluster, and manually install a CNI for the process to go through.
