Unable to add Agents - cert issue #3901

@npsoniembark

Hey everyone, I am trying to install rke2 on Google Compute Engine instances to get a feel for how it works.

Here's what I've done so far.

  1. I have one master instance where rke2-server is up and running. I can use the kubeconfig file to list all the namespaces, and the pods seem to be okay.
  2. I am now trying to connect an agent to it, which fails during startup.
  3. I have ensured that there is no firewall blocking the connection.
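As a rough way to double-check step 3 from the agent node, here is a minimal sketch that tries to open a TCP connection to a given host and port. The function name `can_connect` is hypothetical, and which address and port to test is an assumption that depends on how the agent is configured.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `can_connect("10.138.112.66", 9345)` from the worker would test the server address that appears in the agent log against RKE2's default supervisor port. A successful TCP connection only shows that something is listening and reachable; it does not prove that the listener is the rke2 supervisor.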

Here are the logs from the agent:

Feb 13 09:18:41 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:41Z" level=info msg="Running load balancer rke2-agent-load-balancer 127.0.0.1:6444 -> [10.138.112.66:9435]"
Feb 13 09:18:41 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:41Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:52510->127.0.0.1:6444: read: connection reset by peer"
Feb 13 09:18:43 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:43Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": EOF"
Feb 13 09:18:45 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:45Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:33892->127.0.0.1:6444: read: connection reset by peer"
Feb 13 09:18:47 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:47Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:33906->127.0.0.1:6444: read: connection reset by peer"
Feb 13 09:18:49 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:49Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:33914->127.0.0.1:6444: read: connection reset by peer"
Feb 13 09:18:51 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:51Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:33930->127.0.0.1:6444: read: connection reset by peer"
Feb 13 09:18:53 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:53Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:33954->127.0.0.1:6444: read: connection reset by peer"
Feb 13 09:18:55 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:55Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:59122->127.0.0.1:6444: read: connection reset by peer"
Feb 13 09:18:57 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:57Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:59142->127.0.0.1:6444: read: connection reset by peer"
Feb 13 09:18:59 rke2-worker01 rke2[3644]: time="2023-02-13T09:18:59Z" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:59154->127.0.0.1:6444: read: connection reset by peer"
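The first log line shows the agent's local load balancer forwarding to 10.138.112.66:9435. RKE2's default supervisor port is 9345, so it may be worth confirming that 9435 is what the server actually listens on. The sketch below pulls the backend addresses out of that kind of log line and flags any port that differs from an expected value. The function names are hypothetical, the expected port is an assumption about this cluster, and the pattern only handles IPv4 or hostname backends.

```python
import re
from typing import List, Tuple

# RKE2's default supervisor port; an assumption about how this server is set up.
DEFAULT_SUPERVISOR_PORT = 9345

# Matches: Running load balancer <name> <local addr> -> [<backend> <backend> ...]
LB_PATTERN = re.compile(r"Running load balancer \S+ \S+ -> \[([^\]]*)\]")


def load_balancer_targets(log_text: str) -> List[Tuple[str, int]]:
    """Extract (host, port) backends from 'Running load balancer' log messages."""
    targets = []
    for match in LB_PATTERN.finditer(log_text):
        for backend in match.group(1).split():
            host, _, port = backend.rpartition(":")
            targets.append((host, int(port)))
    return targets


def unexpected_ports(
    targets: List[Tuple[str, int]], expected: int = DEFAULT_SUPERVISOR_PORT
) -> List[Tuple[str, int]]:
    """Return the backends whose port differs from the expected supervisor port."""
    return [(host, port) for host, port in targets if port != expected]
```

Run against the log above, `load_balancer_targets` yields `[("10.138.112.66", 9435)]`, and `unexpected_ports` returns that same entry because 9435 is not 9345. Whether that is the actual cause depends on the server's configuration, which is not shown here.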

I am not sure at what point it is failing. I don't see anything in particular when I check the rke2-server logs on the master node.

My questions

  1. Is there anything you can spot here that tells me what's going on?
  2. Does the master log anything when an agent tries to connect? If so, could those logs show what's going wrong?

Thanks!
