Network issue when colocating k8s and lxd #811

@shundezhang

Bug Description

In a typical COS deployment, k8s and ceph are deployed on the same nodes: https://pastebin.ubuntu.com/p/XKhqw7bJjZ/
ceph-mon is therefore deployed in an lxd container to keep it separate from ceph-osd.
The problem is that the lxd container (where ceph-mon runs) cannot reach the traefik LB address on k8s.
K8s is Charmed Canonical k8s with the cilium CNI.
Running curl against the traefik LB address from inside the lxd container hangs and eventually times out.

root@juju-13c073-0-lxd-0:~# curl 10.250.120.100/cos-lite-grafana
curl: (28) Failed to connect to 10.250.120.100 port 80 after 129173 ms: Connection timed out

Interestingly, colocating microk8s and lxd works: the lxd container can reach metallb's LB address.
Since microk8s uses Calico, maybe some config in cilium needs to be adjusted for this to work.

To Reproduce

  1. Set up MAAS and create 3 machines (4 cores, 8GB RAM each). Each machine should have 2 disks, one of which is for ceph-osd. Tag the machines with "k8s".
  2. Deploy k8s and ceph: https://pastebin.ubuntu.com/p/QCrVWTqZ5Y/
  3. Add k8s to juju as a cloud:
     juju exec -u k8s/0 -- sudo k8s config | tee kubeconfig.yaml
     KUBECONFIG=kubeconfig.yaml juju add-k8s k8s-cos --client --controller foundations-maas
  4. Deploy cos: https://pastebin.ubuntu.com/p/xBhfPrP9Fz/
  5. SSH into one ceph-mon unit and curl the traefik LB address.
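
The final step (curling the traefik LB address from a ceph-mon unit) takes over two minutes to fail with curl's default timeouts. A minimal sketch of a quicker probe follows. `probe_lb` is a hypothetical helper name, not part of any charm or tool, and the 5 s and 10 s limits are arbitrary assumptions. It only classifies curl's exit status, so any HTTP response (even an error page) counts as reachable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: probe a URL once and classify the result by curl's exit status.
probe_lb() {
  local url="$1"
  local rc=0
  # --connect-timeout bounds the connection phase; --max-time bounds the whole transfer.
  curl --silent --show-error --output /dev/null \
       --connect-timeout 5 --max-time 10 "$url" || rc=$?
  case "$rc" in
    0)  echo "reachable" ;;        # a response was received
    7)  echo "connect-failed" ;;   # curl exit 7: failed to connect to host
    28) echo "timeout" ;;          # curl exit 28: operation timed out
    *)  echo "curl-exit-$rc" ;;    # anything else, reported verbatim
  esac
}
```

Running `probe_lb http://10.250.120.100/cos-lite-grafana` both inside the lxd container and on the host machine would show whether the failure is specific to the container. A `timeout` result in the container matches the behaviour reported above.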

Environment

Please see above

Relevant log output

Please see above

Additional context

No response
