The command rke up --ssh-agent-auth doesn't work with ssh-agent on windows #2515

@beckjkl

RKE version:
rke version v1.1.0
Docker version: (docker version,docker info preferred)
Version: 20.10.5
Operating system and kernel: (cat /etc/os-release, uname -r preferred)
Windows 10
[System.Environment]::OSVersion.Version
Major Minor Build Revision
10 0 18363 0
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
Ubuntu 18.04 VMs on VMware
cluster.yml file:
nodes:
- address: 10.x.x.x
  port: "22"
  internal_address: ""
  role:
  - etcd
  - controlplane
  - worker
  hostname_override: ""
  user: jan
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: 'C:\Users*\Documents\privat_key-*'
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    uid: 0
    gid: 0
    snapshot: null
    retention: ""
    creation: ""
    backup_config: null
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
    secrets_encryption_config: null
    audit_log: null
    admission_configuration: null
    event_rate_limit: null
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
  kubelet:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_domain: demo.lab
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
    generate_serving_certificate: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
network:
  plugin: canal
  options: {}
  mtu: 0
  node_selector: {}
  update_strategy: null
authentication:
  strategy: x509
  sans: []
  webhook: null
addons: ""
addons_include: []
system_images:
  etcd: rancher/coreos-etcd:v3.4.3-rancher1
  alpine: rancher/rke-tools:v0.1.56
  nginx_proxy: rancher/rke-tools:v0.1.56
  cert_downloader: rancher/rke-tools:v0.1.56
  kubernetes_services_sidecar: rancher/rke-tools:v0.1.56
  kubedns: rancher/k8s-dns-kube-dns:1.15.0
  dnsmasq: rancher/k8s-dns-dnsmasq-nanny:1.15.0
  kubedns_sidecar: rancher/k8s-dns-sidecar:1.15.0
  kubedns_autoscaler: rancher/cluster-proportional-autoscaler:1.7.1
  coredns: rancher/coredns-coredns:1.6.5
  coredns_autoscaler: rancher/cluster-proportional-autoscaler:1.7.1
  nodelocal: rancher/k8s-dns-node-cache:1.15.7
  kubernetes: rancher/hyperkube:v1.17.4-rancher1
  flannel: rancher/coreos-flannel:v0.11.0-rancher1
  flannel_cni: rancher/flannel-cni:v0.3.0-rancher5
  calico_node: rancher/calico-node:v3.13.0
  calico_cni: rancher/calico-cni:v3.13.0
  calico_controllers: rancher/calico-kube-controllers:v3.13.0
  calico_ctl: rancher/calico-ctl:v2.0.0
  calico_flexvol: rancher/calico-pod2daemon-flexvol:v3.13.0
  canal_node: rancher/calico-node:v3.13.0
  canal_cni: rancher/calico-cni:v3.13.0
  canal_flannel: rancher/coreos-flannel:v0.11.0
  canal_flexvol: rancher/calico-pod2daemon-flexvol:v3.13.0
  weave_node: weaveworks/weave-kube:2.5.2
  weave_cni: weaveworks/weave-npc:2.5.2
  pod_infra_container: rancher/pause:3.1
  ingress: rancher/nginx-ingress-controller:nginx-0.25.1-rancher1
  ingress_backend: rancher/nginx-ingress-controller-defaultbackend:1.5-rancher1
  metrics_server: rancher/metrics-server:v0.3.6
  windows_pod_infra_container: rancher/kubelet-pause:v0.1.3
ssh_key_path: C:\Users\beckj\Documents\privat_key-prod.ppk
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: false
kubernetes_version: ""
private_registries: []
ingress:
  provider: ""
  options: {}
  node_selector: {}
  extra_args: {}
  dns_policy: ""
  extra_envs: []
  extra_volumes: []
  extra_volume_mounts: []
  update_strategy: null
cluster_name: ""
cloud_provider:
  name: ""
prefix_path: ""
addon_job_timeout: 0
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
monitoring:
  provider: ""
  options: {}
  node_selector: {}
  update_strategy: null
  replicas: null
restore:
  restore: false
  snapshot_name: ""
dns: null

Steps to Reproduce:
This is the same problem as described in issue #2136. I ran some additional tests to answer the questions left open in that issue.
I installed rke on my Windows laptop as described in the documentation and generated the config above. The Windows service "OpenSSH Authentication Agent" is running, and I used ssh-add.exe to add the private key referenced in cluster.yml. I can SSH to my machine from PowerShell with ssh [email protected], but rke up fails both with and without --ssh-agent-auth:

rke up
time="2021-04-14T17:42:02+02:00" level=info msg="Running RKE version: v1.1.0"
time="2021-04-14T17:42:02+02:00" level=info msg="Initiating Kubernetes cluster"
time="2021-04-14T17:42:02+02:00" level=info msg="[certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates"
time="2021-04-14T17:42:02+02:00" level=info msg="[certificates] Generating admin certificates and kubeconfig"
time="2021-04-14T17:42:02+02:00" level=info msg="Successfully Deployed state file at [./cluster.rkestate]"
time="2021-04-14T17:42:02+02:00" level=info msg="Building Kubernetes cluster"
time="2021-04-14T17:42:02+02:00" level=info msg="[dialer] Setup tunnel for host [10.x.x.x]"
time="2021-04-14T17:42:02+02:00" level=warning msg="Failed to set up SSH tunneling for host [10.x.x.x]: Can't retrieve Docker Info: error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.24/info: Unable to access node with address [10...*:22] using SSH. Using encrypted private keys is only supported using ssh-agent. Please configure the option ssh_agent_auth: true in the configuration file or use --ssh-agent-auth as a parameter when running RKE. This will use the SSH_AUTH_SOCK environment variable. Error: Error configuring SSH: ssh: cannot decode encrypted private keys"
time="2021-04-14T17:42:02+02:00" level=warning msg="Removing host [10.x.x.x] from node lists"
time="2021-04-14T17:42:02+02:00" level=fatal msg="Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) [10.x.x.x]"

rke up --ssh-agent-auth
time="2021-04-14T17:43:44+02:00" level=info msg="Running RKE version: v1.1.0"
time="2021-04-14T17:43:45+02:00" level=info msg="Initiating Kubernetes cluster"
time="2021-04-14T17:43:45+02:00" level=info msg="[certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates"
time="2021-04-14T17:43:45+02:00" level=info msg="[certificates] Generating admin certificates and kubeconfig"
time="2021-04-14T17:43:45+02:00" level=info msg="Successfully Deployed state file at [./cluster.rkestate]"
time="2021-04-14T17:43:45+02:00" level=info msg="Building Kubernetes cluster"
time="2021-04-14T17:43:45+02:00" level=info msg="[dialer] Setup tunnel for host [10.x.x.x]"
time="2021-04-14T17:43:45+02:00" level=warning msg="Failed to set up SSH tunneling for host [10.x.x.x]: Can't retrieve Docker Info: error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.24/info: Unable to access node with address [10...*:22] using SSH. Please check if the configured key or specified key file is a valid SSH Private Key. Error: Error configuring SSH: ssh: no key found"
time="2021-04-14T17:43:45+02:00" level=warning msg="Removing host [10.x.x.x] from node lists"
time="2021-04-14T17:43:45+02:00" level=fatal msg="Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) [10.x.x.x]"
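
The first log says that ssh-agent support relies on the SSH_AUTH_SOCK environment variable. The Windows "OpenSSH Authentication Agent" service normally listens on a named pipe and ssh.exe finds it without that variable being set, so one thing worth checking is what an SSH_AUTH_SOCK-based client would actually see. The following is only a diagnostic sketch, not part of rke: the function name describe_agent_endpoint is made up for illustration, and whether rke v1.1.0 can open a named pipe at all is an open question here, not something this snippet answers.

```python
import os
import sys

# Named pipe used by the Windows "OpenSSH Authentication Agent" service.
WINDOWS_OPENSSH_AGENT_PIPE = r"\\.\pipe\openssh-ssh-agent"


def describe_agent_endpoint(env, platform):
    """Report where an SSH_AUTH_SOCK-based client would look for the agent.

    `env` is a mapping of environment variables and `platform` is a value
    like sys.platform; both are parameters so the function is easy to test.
    This helper is hypothetical and only inspects its arguments.
    """
    sock = env.get("SSH_AUTH_SOCK", "")
    if sock:
        return "SSH_AUTH_SOCK is set to " + sock
    if platform.startswith("win"):
        return (
            "SSH_AUTH_SOCK is not set; the Windows OpenSSH agent listens on "
            + WINDOWS_OPENSSH_AGENT_PIPE
        )
    return "SSH_AUTH_SOCK is not set; no agent socket is advertised"


if __name__ == "__main__":
    # Run in the same shell session that is used to start rke.
    print(describe_agent_endpoint(os.environ, sys.platform))
```

Running this in the same PowerShell session as rke up --ssh-agent-auth shows whether the variable named in the log message is set at all.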

The key was generated by puttygen using "Export OpenSSH Key". Here is the header:
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,*

I also generated a key on the Ubuntu server with ssh-keygen, copied it to my Windows machine, changed the key path in cluster.yml to the new key, and ran "rke up --ssh-agent-auth" again, with no luck. Plain SSH worked with the new key. Here is the header of the new key:
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: AES-128-CBC,*

I then removed the passphrase from the newly generated key and tried again as described above, with the same result.
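
Since several key files are involved (the puttygen export, the ssh-keygen key with and without passphrase, and a .ppk path in the top-level ssh_key_path), it may help to confirm what format each file really has before handing it to rke. The snippet below is a rough sketch that classifies a key by its header lines. The function name classify_private_key and the returned labels are invented for this example, and it makes no claim about which formats rke accepts.

```python
import base64
import struct


def classify_private_key(text):
    """Guess the container format of a private key from its text.

    Returns one of: "putty-ppk", "pem-encrypted", "pem-plain",
    "openssh-encrypted", "openssh-plain", "unknown".
    The labels are made up for this sketch.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines:
        return "unknown"
    first = lines[0]
    # PuTTY's native format starts with "PuTTY-User-Key-File-<version>:".
    if first.startswith("PuTTY-User-Key-File-"):
        return "putty-ppk"
    # PKCS#8 encrypted keys announce encryption in the BEGIN line itself.
    if first == "-----BEGIN ENCRYPTED PRIVATE KEY-----":
        return "pem-encrypted"
    if first == "-----BEGIN OPENSSH PRIVATE KEY-----":
        # New-style OpenSSH keys store the cipher name inside the blob:
        # magic "openssh-key-v1\0", then a length-prefixed cipher name.
        body = "".join(line for line in lines[1:] if not line.startswith("-----"))
        blob = base64.b64decode(body)
        magic = b"openssh-key-v1\x00"
        if not blob.startswith(magic):
            return "unknown"
        start = len(magic)
        (length,) = struct.unpack(">I", blob[start:start + 4])
        cipher = blob[start + 4:start + 4 + length]
        return "openssh-plain" if cipher == b"none" else "openssh-encrypted"
    if first.startswith("-----BEGIN ") and first.endswith("PRIVATE KEY-----"):
        # Legacy PEM keys mark encryption with a "Proc-Type: 4,ENCRYPTED" header.
        for line in lines[1:4]:
            if line.startswith("Proc-Type:") and "ENCRYPTED" in line:
                return "pem-encrypted"
        return "pem-plain"
    return "unknown"
```

Both headers quoted above would come back as "pem-encrypted", while a file saved in PuTTY's own .ppk format would come back as "putty-ppk".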
Results:
rke cannot connect from Windows to the Ubuntu VM because it cannot access the SSH key.
