[BUG] Release v0.1.3 deployed by Helm chart: Service ome-controller-manager-service has no endpoints #220

@wangzhuzhen

What happened?

Installed with Helm from source:

# Install from local charts
helm install ome-crd charts/ome-crd --namespace ome --create-namespace
helm install ome charts/ome-resources --namespace ome

After the install finished, the Service ome-controller-manager-service had no endpoints:

# kubectl describe svc -n ome ome-controller-manager-service
Name:              ome-controller-manager-service
Namespace:         ome
Labels:            app.kubernetes.io/managed-by=Helm
                   control-plane=ome-controller-manager
                   controller-tools.k8s.io=1.0
Annotations:       meta.helm.sh/release-name: ome
                   meta.helm.sh/release-namespace: ome
Selector:          control-plane=ome-controller-manager,controller-tools.k8s.io=1.0
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.233.41.243
IPs:               10.233.41.243
Port:              <unset>  8443/TCP
TargetPort:        https/TCP
Endpoints:         <none>
Session Affinity:  None
Events:            <none>
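
A Service only gets endpoints for ready Pods whose labels satisfy every pair in its selector, so a selector mismatch is worth ruling out before looking at ports. The following is a minimal sketch of that check for equality-based selectors like the one above. The function name `selector_matches` and the example Pod labels are illustrative assumptions; the real labels were not captured in this report.

```shell
#!/bin/sh
# selector_matches SELECTOR LABELS
# Both arguments are comma-separated key=value lists.
# Returns 0 only if every selector pair also appears among the labels.
selector_matches() {
  selector=$1
  labels=$2
  old_ifs=$IFS
  IFS=,
  for pair in $selector; do
    case ",$labels," in
      *",$pair,"*) ;;                      # pair present, keep checking
      *) IFS=$old_ifs; return 1 ;;         # pair missing, selector fails
    esac
  done
  IFS=$old_ifs
  return 0
}

# Selector copied from the `kubectl describe svc` output above.
svc_selector="control-plane=ome-controller-manager,controller-tools.k8s.io=1.0"

# Hypothetical Pod labels for illustration only; the real ones come from
#   kubectl get pod -n ome <pod-name> --show-labels
pod_labels="control-plane=ome-controller-manager,pod-template-hash=7665db5bc7"

if selector_matches "$svc_selector" "$pod_labels"; then
  echo "selector matches"
else
  echo "selector does not match"
fi
```

On a live cluster the same question can be asked directly with `kubectl get pods -n ome -l 'control-plane=ome-controller-manager,controller-tools.k8s.io=1.0'`, which lists only the Pods the Service would select.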

What did you expect to happen?

  1. The Service ome-controller-manager-service should have endpoints pointing to the ome controller manager Pods.
  2. Which container port is the named 'https' port supposed to be? I only see webhook-server (9443), metrics (8080), and 8081 for the readinessProbe.

How can we reproduce it (as minimally and precisely as possible)?

# Install from local charts
helm install ome-crd charts/ome-crd --namespace ome --create-namespace
helm install ome charts/ome-resources --namespace ome

Anything else we need to know?

The ome controller manager Pods are ready:

# kubectl get pod -n ome -o wide
NAME                                      READY   STATUS    RESTARTS   AGE   IP               NODE                NOMINATED NODE   READINESS GATES
ome-controller-manager-7665db5bc7-5xtzk   1/1     Running   0          72m   10.100.244.23    gpu21-n204-b14-6u   <none>           <none>
ome-controller-manager-7665db5bc7-7rf45   1/1     Running   0          72m   10.100.95.92     gpu25-n204-c01-6u   <none>           <none>
ome-controller-manager-7665db5bc7-ltmds   1/1     Running   0          72m   10.100.23.233    gpu18-n204-b09-6u   <none>           <none>

But the ome controller manager Pod's container has no port named https:

# kubectl get pod -n ome ome-controller-manager-7665db5bc7-5xtzk -o json | jq '.spec.containers[].ports'
[
  {
    "containerPort": 9443,
    "name": "webhook-server",
    "protocol": "TCP"
  },
  {
    "containerPort": 8080,
    "name": "metrics",
    "protocol": "TCP"
  }
]
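
When a Service's targetPort is a name rather than a number, Kubernetes resolves it against the container port names of each selected Pod, and a Pod with no port of that name contributes no endpoint for that Service port. The sketch below mimics that lookup for the ports listed above. The function name `port_resolves` and the `name:port` input format are assumptions made for this example, not part of kubectl or OME.

```shell
#!/bin/sh
# port_resolves TARGET_PORT "NAME:PORT NAME:PORT ..."
# Prints the container port that TARGET_PORT resolves to.
# Returns 1 when TARGET_PORT is a name that no container port declares.
port_resolves() {
  target=$1
  # A purely numeric targetPort is used as-is and needs no name lookup.
  case $target in
    ''|*[!0-9]*) ;;
    *) printf '%s\n' "$target"; return 0 ;;
  esac
  for entry in $2; do
    name=${entry%%:*}
    number=${entry#*:}
    if [ "$target" = "$name" ]; then
      printf '%s\n' "$number"
      return 0
    fi
  done
  return 1
}

# Container ports taken from the jq output above. On a live cluster a list
# in this format can be produced with:
#   kubectl get pod -n ome <pod-name> \
#     -o jsonpath='{range .spec.containers[*].ports[*]}{.name}:{.containerPort}{" "}{end}'
ports="webhook-server:9443 metrics:8080"

# The Service's targetPort, as shown by `kubectl describe svc`, is "https".
port_resolves https "$ports" || echo "no container port named https"
```

With the ports from this report the lookup for `https` fails, which is consistent with the empty Endpoints field, while `webhook-server` and `metrics` would resolve to 9443 and 8080.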

The ome controller manager process is listening on these ports:

# nsenter -t 57910 -n ss -tunlp
Netid  State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process
tcp    LISTEN  0       4096    *:9443              *:*                users:(("manager",pid=57910,fd=9))
tcp    LISTEN  0       4096    *:8081              *:*                users:(("manager",pid=57910,fd=3))
tcp    LISTEN  0       4096    *:8080              *:*                users:(("manager",pid=57910,fd=7))

Environment

  • OME version: v0.1.3
  • Kubernetes version (use kubectl version): v1.30.10
  • Cloud provider or hardware configuration: INTEL(R) XEON(R) PLATINUM 8558
  • OS (e.g., from /etc/os-release): Ubuntu 22.04.5 LTS (Jammy Jellyfish)
  • Runtime (SGLang, vLLM, etc.) and version:
  • Model being served (if applicable):
  • Install method (Helm, kubectl, etc.): Helm from source
