Gateway API coupling#604

Merged
shengnuo merged 5 commits into NVIDIA:main from shengnuo:gateway-api
Aug 13, 2025

Collaborator

@shengnuo shengnuo commented Aug 3, 2025

Pros:

The PR creates an HTTPRoute resource (httproutes.gateway.networking.k8s.io) if .spec.expose.httpRoute is specified for a NIMService or a NeMo microservice.

Cons:

  • Adds extra coupling with third-party APIs to the already complex NIMService, NIMPipeline, and NeMo microservice specs
  • Adds unnecessary bloat to the NIM Operator

Pre-requisites

  • A Gateway API controller (e.g. Istio) must be installed
  • A Gateway object with allowable hostnames must be defined. Below is a sample Gateway powered by Istio.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: istio-gateway
  namespace: nemo
spec:
  gatewayClassName: istio
  listeners:
  - allowedRoutes:
      namespaces:
        from: All
    hostname: foobar.nim
    name: foobar
    port: 8000
    protocol: HTTP
  - allowedRoutes:
      namespaces:
        from: All
    hostname: datastore.nemo
    name: datastore
    port: 8000
    protocol: HTTP
  - allowedRoutes:
      namespaces:
        from: All
    hostname: entitystore.nemo
    name: entitystore
    port: 8000
    protocol: HTTP
  - allowedRoutes:
      namespaces:
        from: All
    hostname: customizer.nemo
    name: customizer
    port: 8000
    protocol: HTTP
  - allowedRoutes:
      namespaces:
        from: All
    hostname: evaluator.nemo
    name: evaluator
    port: 8000
    protocol: HTTP
  - allowedRoutes:
      namespaces:
        from: All
    hostname: guardrail.nemo
    name: guardrail
    port: 8000
    protocol: HTTP
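
For reference, the `istio` value of gatewayClassName above points at a GatewayClass resource. Istio normally registers this class itself once the Gateway API CRDs are installed, so the manifest below is only a sketch of what that object looks like when checking the prerequisite. It is not something this PR creates.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: istio   # must match spec.gatewayClassName in the Gateway above
spec:
  # Controller name used by Istio's Gateway API implementation
  controllerName: istio.io/gateway-controller
```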

Sample NIMService object with HTTPRoute coupling

Below is a sample NIMService object that routes the domain foobar.nim to its backend Service object.
Note that the .spec.expose.httpRoute field is new.

apiVersion: apps.nvidia.com/v1alpha1
kind: NIMService
metadata:
  name: meta-llama3-8b-instruct
spec:
  image:
    repository: nvcr.io/nim/meta/llama-3.1-8b-instruct
    tag: "1.8"
    pullPolicy: IfNotPresent
    pullSecrets:
      - ngc-secret
  authSecret: ngc-api-secret
  storage:
    nimCache:
      name: meta-llama3-8b-instruct
      profile: ''
  replicas: 1
  resources:
    limits:
      nvidia.com/gpu: 1
  expose:
    httpRoute:
      enabled: true
      spec:
        parentRefs:
          - name: istio-gateway
        host: foobar.nim
        paths:
          - type: PathPrefix
            value: /
    service:
      type: ClusterIP
      port: 8000
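
To make the mapping concrete, here is a sketch of the HTTPRoute one would expect the sample above to produce. The route name and the backend Service name are assumptions (both taken to match the NIMService name). The exact object the operator generates is not shown in this PR description and may differ.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: meta-llama3-8b-instruct   # assumed: same name as the NIMService
spec:
  parentRefs:
    - name: istio-gateway
      # Without a namespace, a parentRef resolves to the route's own namespace.
      # The sample Gateway lives in "nemo", so a route elsewhere would need:
      # namespace: nemo
  hostnames:
    - foobar.nim                  # from .spec.expose.httpRoute.spec.host
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: meta-llama3-8b-instruct   # assumed Service name
          port: 8000                      # from .spec.expose.service.port
```

The hostname has to match one of the listener hostnames on the Gateway (here the `foobar` listener), otherwise the route will not attach to it.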

Sample NeMo Microservice with HTTPRoute coupling

Below is a sample NemoDatastore object that routes the domain datastore.nemo to its backend Service object.

apiVersion: apps.nvidia.com/v1alpha1
kind: NemoDatastore
metadata:
  name: nemodatastore-sample
  namespace: nemo
spec:
  secrets:
    datastoreConfigSecret: "nemo-ms-nemo-datastore"
    datastoreInitSecret: "nemo-ms-nemo-datastore-init"
    datastoreInlineConfigSecret: "nemo-ms-nemo-datastore-inline-config"
    giteaAdminSecret: "gitea-admin-credentials"
    lfsJwtSecret: "nemo-ms-nemo-datastore--lfs-jwt"
  databaseConfig:
    credentials:
      user: ndsuser
      secretName: datastore-pg-existing-secret
      passwordKey: password
    host: datastore-pg-postgresql.nemo.svc.cluster.local
    port: 5432
    databaseName: ndsdb
  pvc:
    name: "pvc-shared-data"
    create: true
    storageClass: ""
    volumeAccessMode: ReadWriteOnce
    size: "10Gi"
  expose:
    httpRoute:
      enabled: true
      spec:
        parentRefs:
          - name: istio-gateway
        host: datastore.nemo
        paths:
          - type: PathPrefix
            value: /
    service:
      type: ClusterIP
      port: 8000
  image:
    repository: nvcr.io/nvidia/nemo-microservices/datastore
    tag: "25.08"
    pullPolicy: IfNotPresent
    pullSecrets:
      - ngc-secret
  replicas: 1
  resources:
    requests:
      memory: "256Mi"
      cpu: "500m"
    limits:
      memory: "512Mi"
      cpu: "1"


copy-pr-bot bot commented Aug 3, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@shengnuo shengnuo force-pushed the gateway-api branch 2 times, most recently from e6faaf6 to 6c2d265 Compare August 11, 2025 21:53
@shengnuo shengnuo changed the title DRAFT: Gateway API coupling Gateway API coupling Aug 11, 2025
@shengnuo shengnuo force-pushed the gateway-api branch 2 times, most recently from fb26486 to f543116 Compare August 11, 2025 22:11
Comment thread api/apps/v1alpha1/common_types.go Outdated
@shengnuo shengnuo force-pushed the gateway-api branch 2 times, most recently from 3ff51d7 to 289f539 Compare August 12, 2025 22:47
shivamerla previously approved these changes Aug 12, 2025
Collaborator

@shivamerla shivamerla left a comment

LGTM

@shengnuo shengnuo force-pushed the gateway-api branch 6 times, most recently from 7e5cbbe to 2765332 Compare August 13, 2025 16:10
@shengnuo shengnuo enabled auto-merge August 13, 2025 21:40
Signed-off-by: Sheng Lin <shelin@nvidia.com>
Signed-off-by: Sheng Lin <shelin@nvidia.com>
Signed-off-by: Sheng Lin <shelin@nvidia.com>
Signed-off-by: Sheng Lin <shelin@nvidia.com>
Signed-off-by: Sheng Lin <shelin@nvidia.com>
@shengnuo shengnuo merged commit c17b53d into NVIDIA:main Aug 13, 2025
9 checks passed