[BUG] [Fedora40][OKD] OpenShift cluster becomes Unreachable #4234

Open
@souovan

Description

General information

  • OS: Linux
  • Hypervisor: KVM
  • Did you run crc setup before starting it (Yes/No)? Yes
  • Running CRC on: Laptop

CRC version

CRC version: 2.37.1+36d451
OpenShift version: 4.15.14

CRC status

CRC VM:          Running
OpenShift:       Unreachable (v4.15.0-0.okd-2024-02-23-163410)
RAM Usage:       647MB of 10.95GB
Disk Usage:      21.48GB of 32.68GB (Inside the CRC VM)
Cache Usage:     27.69GB
Cache Directory: /home/van/.crc/cache

CRC config

- consent-telemetry                     : yes
- preset                                : okd

Host Operating System

NAME="Fedora Linux"
VERSION="40 (Workstation Edition)"
ID=fedora
VERSION_ID=40
VERSION_CODENAME=""
PLATFORM_ID="platform:f40"
PRETTY_NAME="Fedora Linux 40 (Workstation Edition)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:40"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f40/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=40
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=40
SUPPORT_END=2025-05-13
VARIANT="Workstation Edition"
VARIANT_ID=workstation

Steps to reproduce

  1. Run crc setup
  2. Run crc start
  3. Get Error running post start: Executing systemctl action failed: ssh command error: command : sudo systemctl enable dnsmasq.service err : Process exited with status 1 : Failed to enable unit: Unit file /etc/systemd/system/dnsmasq.service is masked.
  4. Run systemctl unmask dnsmasq.service
  5. Run crc start again
  6. Get
INFO A CRC VM for OKD 4.15.0-0.okd-2024-02-23-163410 is already running 
Started the OpenShift cluster.

The server is accessible via web console at:
  https://console-openshift-console.apps-crc.testing

Log in as administrator:
  Username: kubeadmin
  Password: PHX79-zmtbm-9PPmT-I9JMB

Log in as user:
  Username: developer
  Password: developer

Use the 'oc' command line interface:
  $ eval $(crc oc-env)
  $ oc login -u developer https://api.crc.testing:6443


NOTE:
This cluster was built from OKD - The Community Distribution of Kubernetes that powers Red Hat OpenShift.
If you find an issue, please report it at https://github.com/openshift/okd
  7. Run eval $(crc oc-env)

Expected

Access the oc CLI and the OKD Web Console

Actual

When I run oc login -u developer https://api.crc.testing:6443, I get:

error: dial tcp 192.168.130.11:6443: connect: connection refused - verify you have provided the correct host and port and that the server is currently running.

I am also unable to access the OKD Web Console.
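
To tell whether the connection refused error comes from the API server port itself rather than from oc or DNS, the port can be probed directly from the host. This is a minimal sketch, assuming Python 3 is available on the host; the function name probe_tcp is made up for illustration, and the address and port are the ones from the error message above.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling probe_tcp("192.168.130.11", 6443) should return False while the error above persists. Probing port 22 on the same address would show whether the VM is reachable at all, which the successful SSH login suggests it is.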

Logs when I SSH into the running CRC VM

van@fedora:~$ crc ip
192.168.130.11
van@fedora:~$ ssh -i ~/.crc/machines/crc/id_ecdsa -o StrictHostKeyChecking=no [email protected]

I get:

Fedora CoreOS 39.20240128.3.0
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/tag/coreos

Last login: Fri Jun 14 18:49:51 2024 from 192.168.130.1
[systemd]
Failed Units: 1
  qemu-guest-agent.service

Then root@crc:~# systemctl status qemu-guest-agent.service

× qemu-guest-agent.service - QEMU Guest Agent
     Loaded: loaded (/etc/systemd/system/qemu-guest-agent.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: failed (Result: exit-code) since Fri 2024-06-14 18:41:41 UTC; 9min ago
   Duration: 3ms
    Process: 1159 ExecStart=/usr/bin/qemu-ga --method=vsock-listen --path=3:1234 --blacklist=${BLACKLIST_RPC} -F${FSFREEZE_HOOK_PATHNAME} (code=exited, status=1/FAILURE)
   Main PID: 1159 (code=exited, status=1/FAILURE)
        CPU: 2ms

Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Scheduled restart job, restart counter is at 5.
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Start request repeated too quickly.
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Failed with result 'exit-code'.
Jun 14 18:41:41 crc systemd[1]: Failed to start qemu-guest-agent.service - QEMU Guest Agent.

Then journalctl -u qemu-guest-agent.service --since 18:38:30

Jun 14 18:41:41 crc systemd[1]: Started qemu-guest-agent.service - QEMU Guest Agent.
Jun 14 18:41:41 crc (qemu-ga)[1015]: qemu-guest-agent.service: Referenced but unset environment variable evaluates to an empty string: BLACKLIST_RPC
Jun 14 18:41:41 crc qemu-ga[1015]: 1718390501.650499: critical: Failed to create socket: Address family not supported by protocol
Jun 14 18:41:41 crc qemu-ga[1015]: 1718390501.650536: critical: failed to create guest agent channel
Jun 14 18:41:41 crc qemu-ga[1015]: 1718390501.650538: critical: failed to initialize guest agent channel
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Main process exited, code=exited, status=1/FAILURE
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Failed with result 'exit-code'.
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Scheduled restart job, restart counter is at 1.
Jun 14 18:41:41 crc systemd[1]: Started qemu-guest-agent.service - QEMU Guest Agent.
Jun 14 18:41:41 crc (qemu-ga)[1071]: qemu-guest-agent.service: Referenced but unset environment variable evaluates to an empty string: BLACKLIST_RPC
Jun 14 18:41:41 crc qemu-ga[1071]: 1718390501.753307: critical: Failed to create socket: Address family not supported by protocol
Jun 14 18:41:41 crc qemu-ga[1071]: 1718390501.754795: critical: failed to create guest agent channel
Jun 14 18:41:41 crc qemu-ga[1071]: 1718390501.754799: critical: failed to initialize guest agent channel
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Main process exited, code=exited, status=1/FAILURE
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Failed with result 'exit-code'.
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Scheduled restart job, restart counter is at 2.
Jun 14 18:41:41 crc systemd[1]: Started qemu-guest-agent.service - QEMU Guest Agent.
Jun 14 18:41:41 crc (qemu-ga)[1115]: qemu-guest-agent.service: Referenced but unset environment variable evaluates to an empty string: BLACKLIST_RPC
Jun 14 18:41:41 crc qemu-ga[1115]: 1718390501.814844: critical: Failed to create socket: Address family not supported by protocol
Jun 14 18:41:41 crc qemu-ga[1115]: 1718390501.815354: critical: failed to create guest agent channel
Jun 14 18:41:41 crc qemu-ga[1115]: 1718390501.815373: critical: failed to initialize guest agent channel
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Main process exited, code=exited, status=1/FAILURE
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Failed with result 'exit-code'.
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Scheduled restart job, restart counter is at 3.
Jun 14 18:41:41 crc systemd[1]: Started qemu-guest-agent.service - QEMU Guest Agent.
Jun 14 18:41:41 crc (qemu-ga)[1145]: qemu-guest-agent.service: Referenced but unset environment variable evaluates to an empty string: BLACKLIST_RPC
Jun 14 18:41:41 crc qemu-ga[1145]: 1718390501.840374: critical: Failed to create socket: Address family not supported by protocol
Jun 14 18:41:41 crc qemu-ga[1145]: 1718390501.840388: critical: failed to create guest agent channel
Jun 14 18:41:41 crc qemu-ga[1145]: 1718390501.840390: critical: failed to initialize guest agent channel
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Main process exited, code=exited, status=1/FAILURE
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Failed with result 'exit-code'.
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Scheduled restart job, restart counter is at 4.
Jun 14 18:41:41 crc systemd[1]: Started qemu-guest-agent.service - QEMU Guest Agent.
Jun 14 18:41:41 crc (qemu-ga)[1159]: qemu-guest-agent.service: Referenced but unset environment variable evaluates to an empty string: BLACKLIST_RPC
Jun 14 18:41:41 crc qemu-ga[1159]: 1718390501.866131: critical: Failed to create socket: Address family not supported by protocol
Jun 14 18:41:41 crc qemu-ga[1159]: 1718390501.866145: critical: failed to create guest agent channel
Jun 14 18:41:41 crc qemu-ga[1159]: 1718390501.866147: critical: failed to initialize guest agent channel
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Main process exited, code=exited, status=1/FAILURE
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Failed with result 'exit-code'.
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Scheduled restart job, restart counter is at 5.
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Start request repeated too quickly.
Jun 14 18:41:41 crc systemd[1]: qemu-guest-agent.service: Failed with result 'exit-code'.
Jun 14 18:41:41 crc systemd[1]: Failed to start qemu-guest-agent.service - QEMU Guest Agent.
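
The "Address family not supported by protocol" message in the journal is the text of the EAFNOSUPPORT error, which suggests qemu-ga could not create a vsock socket in the guest. The following is a minimal sketch of an independent check, assuming a Python 3 interpreter is available in the environment being tested (which may not be the case inside the CRC VM itself). The function name vsock_supported is made up for illustration.

```python
import socket


def vsock_supported() -> bool:
    """Return True if an AF_VSOCK stream socket can be created here."""
    # AF_VSOCK is only defined on Python builds for Linux that support it.
    family = getattr(socket, "AF_VSOCK", None)
    if family is None:
        return False
    try:
        sock = socket.socket(family, socket.SOCK_STREAM)
    except OSError:
        # EAFNOSUPPORT ("Address family not supported by protocol") lands here.
        return False
    sock.close()
    return True
```

A False result would be consistent with the qemu-ga failure above, although it does not by itself say why vsock is unavailable.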

Then I run:

root@crc:~# podman ps
CONTAINER ID  IMAGE                                   COMMAND     CREATED       STATUS         PORTS       NAMES
74e5ad82fc3e  quay.io/crcont/gvisor-tap-vsock:latest              3 months ago  Up 14 minutes              gvisor-tap-vsock
root@crc:~# podman logs 74e5ad82fc3e

And get:

...
ERRO[0889] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0890] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0891] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0892] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0893] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0894] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0895] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0896] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0897] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0898] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0899] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0900] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0901] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0902] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0903] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0904] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer 
ERRO[0905] cannot connect to host: dial vsock host(2):1024: connect: connection reset by peer
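
Since the container log repeats the same line, it can be easier to share a summary of it. This is a minimal sketch, assuming the log lines follow the LEVEL[number] message layout shown above and that the bracketed number is a counter in seconds; the function name summarize is made up for illustration.

```python
import re

# Matches lines such as "ERRO[0889] cannot connect to host: ...".
LINE_RE = re.compile(r"^(?P<level>[A-Z]+)\[(?P<secs>\d+)\]\s+(?P<msg>.*?)\s*$")


def summarize(lines):
    """Map (level, message) to (count, first counter value, last counter value)."""
    summary = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # Skip lines that do not follow the expected layout.
        secs = int(match["secs"])
        key = (match["level"], match["msg"])
        count, first, last = summary.get(key, (0, secs, secs))
        summary[key] = (count + 1, min(first, secs), max(last, secs))
    return summary
```

Feeding it the output of podman logs, split into lines, gives one entry per distinct message with its count and the range of counter values over which it was seen.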

I would appreciate help and any hints on where to continue debugging this issue.
