gateway2: Fix missing liveness probe on gateway deployments #10329

Closed
wants to merge 7 commits into from

Contributor

@davidjumani davidjumani commented Nov 12, 2024

Description

Fixes the missing liveness probe on the kube gateway pods that come up.

Before :

Name:             gloo-proxy-gw-868db69887-pccks
Namespace:        gloo-system
Priority:         0
Service Account:  gloo-proxy-gw
Node:             gloo-control-plane/172.18.0.3
Start Time:       Tue, 12 Nov 2024 11:55:23 -0500
Labels:           app.kubernetes.io/instance=gw
                  app.kubernetes.io/name=gloo-proxy-gw
                  gateway.networking.k8s.io/gateway-name=gw
                  gloo=kube-gateway
                  pod-template-hash=868db69887
Annotations:      prometheus.io/path: /metrics
                  prometheus.io/port: 9091
                  prometheus.io/scrape: true
Status:           Running
IP:               10.244.0.9
IPs:
  IP:           10.244.0.9
Controlled By:  ReplicaSet/gloo-proxy-gw-868db69887
Containers:
  gloo-gateway:
    Container ID:  containerd://0ed2200bd71eeef53a88b1e36c61d540a1a08cdb299e43d7fcdb5eea47dd951a
    Image:         quay.io/solo-io/gloo-envoy-wrapper:1.0.0-ci1
    Image ID:      docker.io/library/import-2024-11-12@sha256:c8f0a2be7329f67a10b08d1b1c61f10b0f8659a3c9a86b77c032e1937428959e
    Ports:         8080/TCP, 9091/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      --disable-hot-restart
      --service-node
      $(POD_NAME).$(POD_NAMESPACE)
    State:          Running
      Started:      Tue, 12 Nov 2024 11:55:24 -0500
    Ready:          True
    Restart Count:  0
    Environment:
      POD_NAME:       gloo-proxy-gw-868db69887-pccks (v1:metadata.name)
      POD_NAMESPACE:  gloo-system (v1:metadata.namespace)
      ENVOY_UID:      0
    Mounts:
      /etc/envoy from envoy-config (rw)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  envoy-config:
    Type:        ConfigMap (a volume populated by a ConfigMap)
    Name:        gloo-proxy-gw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>

After :

Name:             gloo-proxy-gw-58c8f6d687-rg6jk
Namespace:        default
Priority:         0
Service Account:  gloo-proxy-gw
Node:             gloo-control-plane/172.18.0.3
Start Time:       Tue, 12 Nov 2024 16:44:50 -0500
Labels:           app.kubernetes.io/instance=gw
                  app.kubernetes.io/name=gloo-proxy-gw
                  gateway.networking.k8s.io/gateway-name=gw
                  gloo=kube-gateway
                  pod-template-hash=58c8f6d687
Annotations:      prometheus.io/path: /metrics
                  prometheus.io/port: 9091
                  prometheus.io/scrape: true
Status:           Running
IP:               10.244.0.10
IPs:
  IP:           10.244.0.10
Controlled By:  ReplicaSet/gloo-proxy-gw-58c8f6d687
Containers:
  gloo-gateway:
    Container ID:  containerd://bec13b8b0444b5c06736c1f0f028836cda236879cdac42ee6ecda20f653576ea
    Image:         quay.io/solo-io/gloo-envoy-wrapper:1.0.0-ci1
    Image ID:      docker.io/library/import-2024-11-12@sha256:c8f0a2be7329f67a10b08d1b1c61f10b0f8659a3c9a86b77c032e1937428959e
    Ports:         8080/TCP, 9091/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      --disable-hot-restart
      --service-node
      $(POD_NAME).$(POD_NAMESPACE)
    State:          Running
      Started:      Tue, 12 Nov 2024 16:44:51 -0500
    Ready:          True
    Restart Count:  0
    Liveness:       exec [wget -O /dev/null 127.0.0.1:19000/ready] delay=0s timeout=1s period=10s #success=1 #failure=3.   <--------- Liveness probe
    Environment:
      POD_NAME:       gloo-proxy-gw-58c8f6d687-rg6jk (v1:metadata.name)
      POD_NAMESPACE:  default (v1:metadata.namespace)
      ENVOY_UID:      0
    Mounts:
      /etc/envoy from envoy-config (rw)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  envoy-config:
    Type:        ConfigMap (a volume populated by a ConfigMap)
    Name:        gloo-proxy-gw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  16s   default-scheduler  Successfully assigned default/gloo-proxy-gw-58c8f6d687-rg6jk to gloo-control-plane
  Normal  Pulled     15s   kubelet            Container image "quay.io/solo-io/gloo-envoy-wrapper:1.0.0-ci1" already present on machine
  Normal  Created    15s   kubelet            Created container gloo-gateway
  Normal  Started    15s   kubelet            Started container gloo-gateway
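
For reference, the `Liveness:` line in the "After" output corresponds to a container probe spec roughly like the one below. This is a sketch reconstructed from the `kubectl describe` fields, not the manifest from this PR. The timing values are copied from that output, and they also happen to be the Kubernetes defaults.

```yaml
# Sketch: probe spec equivalent to the "Liveness:" line in the describe output.
livenessProbe:
  exec:
    # The probe passes when the command exits 0, i.e. when wget gets a
    # successful response from 127.0.0.1:19000/ready inside the container.
    command:
    - wget
    - -O
    - /dev/null
    - 127.0.0.1:19000/ready
  initialDelaySeconds: 0   # delay=0s
  timeoutSeconds: 1        # timeout=1s
  periodSeconds: 10        # period=10s
  successThreshold: 1      # #success=1
  failureThreshold: 3      # #failure=3
```

With these values the kubelet restarts the container after three consecutive failed checks, so roughly 30 seconds after the endpoint stops responding.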

@solo-changelog-bot

Issues linked to changelog:
https://github.com/solo-io/solo-projects/issues/7084

@github-actions github-actions bot added the labels "keep pr updated" (signals bulldozer to keep pr up to date with base branch) and "work in progress" (signals bulldozer to keep pr open; don't auto-merge) Nov 12, 2024
Contributor

@ashishb-solo ashishb-solo left a comment

happy to approve as is, but any chance we can create a test for this?

@davidjumani davidjumani changed the title from "gateway: Fix missing liveliness probe" to "gateway: Fix missing liveness probe on gateway deployments" Nov 12, 2024
Comment on lines +92 to +101
livenessProbe:
  exec:
    command:
    - wget
    - -O
    - /dev/null
    - 127.0.0.1:19000/ready
  initialDelaySeconds: 3
  periodSeconds: 10
  failureThreshold: 3
Contributor

Suggested change
livenessProbe:
  exec:
    command:
    - wget
    - -O
    - /dev/null
    - 127.0.0.1:19000/ready
  initialDelaySeconds: 3
  periodSeconds: 10
  failureThreshold: 3
{- $gateway.probes.livenessprobe }

or something like that
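
One way the suggestion above could look in a Helm template, as a sketch only. `$gateway.probes.livenessProbe` is a hypothetical values key used for illustration, and the `nindent` width is an assumption that depends on where the block sits in the real template. `with`, `toYaml`, and `nindent` are standard Helm template functions.

```yaml
        {{- /* Hypothetical: render a liveness probe only when one is set in values. */}}
        {{- with $gateway.probes.livenessProbe }}
        livenessProbe:
          {{- toYaml . | nindent 10 }}
        {{- end }}
```

With this shape no probe is rendered unless the user supplies one. Keeping the hard-coded probe as a fallback would need an `{{- else }}` branch before `{{- end }}`.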

Contributor Author

Will add that once we add support for custom liveness probes

Contributor

but I don't think we want a default liveness probe at all

Contributor

The issue description says "I'd like to have proper probes by default". Seems like you and Jesus (issue author) should align on desired behavior here.

Contributor

we should consider how it's done in the edge proxies and decide if we want similar configurability here


github-actions bot commented Nov 12, 2024

Visit the preview URL for this PR (updated for commit f26d900):

https://gloo-edge--pr10329-fix-liveliness-probe-ss51itsq.web.app

(expires Wed, 20 Nov 2024 13:11:38 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

Sign: 77c2b86e287749579b7ff9cadb81e099042ef677

@davidjumani davidjumani changed the title from "gateway: Fix missing liveness probe on gateway deployments" to "gateway2: Fix missing liveness probe on gateway deployments" Nov 12, 2024
@davidjumani
Contributor Author

Closing in favour of solo-io#10332

Labels: keep pr updated, work in progress