Submariner looks like it is not in working condition #3801

@nikitarm1611

ISSUE Description

Submariner does not appear to be healthy: both the subctl show all and subctl diagnose all commands report problems.

Setup:
Site 1: OCP 4.18.22
Site 2: OCP 4.18.22

Submariner version: v0.19.2

Output Collected

Output of the show all and diagnose all commands for Site 2

This is the error we are seeing from the pod on the Site 2 metrodr rack:

sh-5.1$ /root/.local/bin/subctl diagnose all --kubeconfig /tmp/local-kubeconfig
Cluster "local-config"
 ✓ Checking Submariner support for the Kubernetes version
 ✓ Kubernetes version "v1.31.11" is supported

 ✓ Non-Globalnet deployment detected - checking that cluster CIDRs do not overlap
 ✓ Checking DaemonSet "submariner-gateway"
 ✓ Checking DaemonSet "submariner-routeagent"
 ✓ Checking DaemonSet "submariner-metrics-proxy"
 ✓ Checking Deployment "submariner-lighthouse-agent"
 ✓ Checking Deployment "submariner-lighthouse-coredns"
 ⚠ Checking the status of all Submariner pods
 ⚠ Pod "submariner-gateway-7vn4h" has restarted 63 times
 ✓ Checking that gateway metrics are accessible from non-gateway nodes 

 ✓ Checking Submariner support for the CNI network plugin
 ✓ The detected CNI network plugin ("OVNKubernetes") is supported
 ✓ Checking OVN version 
 ✓ The ovn-nb database version 7.6.0 is supported
 ✗ Checking gateway connections
 ✗ Connection to cluster "site1" is in progress
 ✗ Checking route agent connections
 ✗ Connection to cluster "site1" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"10.129.0.2\"",
  "spec": {
    "cluster_id": "site1",
    "cable_name": "submariner-cable-site1-10-48-96-195",
    "healthCheckIP": "10.129.0.2",
    "hostname": "control-1-ru2.f27l003.fusion.tadn.ibm.com",
    "subnets": [
      "172.30.0.0/16",
      "10.128.0.0/14"
    ],
    "private_ip": "10.48.96.195",
    "public_ip": "129.41.86.6",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✗ Connection to cluster "site1" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"10.129.0.2\"",
  "spec": {
    "cluster_id": "site1",
    "cable_name": "submariner-cable-site1-10-48-96-195",
    "healthCheckIP": "10.129.0.2",
    "hostname": "control-1-ru2.f27l003.fusion.tadn.ibm.com",
    "subnets": [
      "172.30.0.0/16",
      "10.128.0.0/14"
    ],
    "private_ip": "10.48.96.195",
    "public_ip": "129.41.86.6",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✗ Connection to cluster "site1" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"10.129.0.2\"",
  "spec": {
    "cluster_id": "site1",
    "cable_name": "submariner-cable-site1-10-48-96-195",
    "healthCheckIP": "10.129.0.2",
    "hostname": "control-1-ru2.f27l003.fusion.tadn.ibm.com",
    "subnets": [
      "172.30.0.0/16",
      "10.128.0.0/14"
    ],
    "private_ip": "10.48.96.195",
    "public_ip": "129.41.86.6",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✗ Connection to cluster "site1" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"10.129.0.2\"",
  "spec": {
    "cluster_id": "site1",
    "cable_name": "submariner-cable-site1-10-48-96-195",
    "healthCheckIP": "10.129.0.2",
    "hostname": "control-1-ru2.f27l003.fusion.tadn.ibm.com",
    "subnets": [
      "172.30.0.0/16",
      "10.128.0.0/14"
    ],
    "private_ip": "10.48.96.195",
    "public_ip": "129.41.86.6",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✗ Connection to cluster "site1" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"10.129.0.2\"",
  "spec": {
    "cluster_id": "site1",
    "cable_name": "submariner-cable-site1-10-48-96-195",
    "healthCheckIP": "10.129.0.2",
    "hostname": "control-1-ru2.f27l003.fusion.tadn.ibm.com",
    "subnets": [
      "172.30.0.0/16",
      "10.128.0.0/14"
    ],
    "private_ip": "10.48.96.195",
    "public_ip": "129.41.86.6",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✓ Checking Submariner support for the kube-proxy mode
 ✓ Cluster is running with "OVNKubernetes" CNI which internally implements kube-proxy functionality
 ✓ Checking that firewall configuration allows intra-cluster VXLAN traffic

 ✓ Checking that services have been exported properly

Skipping inter-cluster firewall check as it requires two kubeconfigs. Please run "subctl diagnose firewall inter-cluster" command manually.
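
As a quick sanity check on the connection details above, the sketch below confirms that the healthCheckIP the route agents fail to ping falls inside one of the subnets advertised for site1. The JSON is copied from the report and trimmed to the fields used here; the helper name `health_check_subnet` is hypothetical and is not part of subctl or Submariner. It only rules out a CIDR mismatch and says nothing about why the ping fails.

```python
import ipaddress
import json

# Connection details copied from the route agent report above,
# trimmed to the fields this check needs.
connection_json = """
{
  "status": "error",
  "spec": {
    "cluster_id": "site1",
    "healthCheckIP": "10.129.0.2",
    "subnets": ["172.30.0.0/16", "10.128.0.0/14"],
    "private_ip": "10.48.96.195",
    "public_ip": "129.41.86.6",
    "nat_enabled": true
  }
}
"""


def health_check_subnet(connection):
    """Return the advertised subnet that contains healthCheckIP, or None.

    Hypothetical helper for illustration only; not a Submariner API.
    """
    spec = connection["spec"]
    ip = ipaddress.ip_address(spec["healthCheckIP"])
    for cidr in spec["subnets"]:
        if ip in ipaddress.ip_network(cidr):
            return cidr
    return None


connection = json.loads(connection_json)
matched = health_check_subnet(connection)
print(matched)  # prints 10.128.0.0/14
```

Since 10.129.0.2 lies within 10.128.0.0/14, the health check target is consistent with the subnets site1 advertises, so the failure is more likely in the tunnel or data path than in the endpoint definition.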

The metrodr CR also reports the following Submariner gateway status:

✓ Showing Gateways
          NODE                             HA STATUS   SUMMARY                                  
          control-1-ru2.f27l003.fusion.t   passive     There are no connections                 
          control-1-ru3.f27l003.fusion.t   passive     There are no connections                 
          control-1-ru4.f27l003.fusion.t   active      0 connections out of 1 are established

The Submariner logs are attached below:

F29-submariner-20260129072032.zip

F27-submariner-20260129073303.zip
