Can't deploy LLM inference YAML #616

@Lsgoose

Description

What happened?

I use the following YAML to deploy LLM inference:

apiVersion: v1
kind: Pod
metadata:
  name: test-opt-half-gpu-1
  namespace: default
  annotations:
    gpu-fraction: "0.5"
  labels:
    kai.scheduler/queue: test
spec:
  schedulerName: kai-scheduler
  containers:
  - name: opt-half-gpu
    image: lsgoose/ray-llm-infer-kai:latest
    imagePullPolicy: IfNotPresent
    resources:
      # limits:
      #   nvidia.com/gpu: 0.5
    command: ["python", "/app/opt-1.3b.py"]
    volumeMounts:
      - mountPath: /mnt/public/lyt/model_path
        name: model-path
  volumes:
    - name: model-path
      hostPath:
        path: /mnt/public/lyt/model_path
        type: Directory

I was able to deploy this YAML last week, but after redeploying kai-scheduler the Pod now fails with the error shown below:

(screenshot of the scheduler error; not reproduced here)
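
Since the error only exists as a screenshot, the same information can be captured as text. A minimal sketch, assuming the Pod is stuck unscheduled in the default namespace and that KAI Scheduler runs in the kai-scheduler namespace (pod and namespace names may differ in your install):

  # Scheduling events recorded on the Pod explain why it is not running
  kubectl describe pod test-opt-half-gpu-1 -n default

  # The same events via the events API, filtered to this Pod
  kubectl get events -n default --field-selector involvedObject.name=test-opt-half-gpu-1

  # Scheduler-side logs: list the pods first, then read the scheduler pod's log
  kubectl get pods -n kai-scheduler
  kubectl logs -n kai-scheduler <scheduler-pod-name>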

What did you expect to happen?

No response

Environment

  • Kubernetes version: 1.28.2
  • KAI Scheduler version: 0.9.6
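
To confirm which scheduler version is actually running after the redeploy, a sketch assuming KAI Scheduler was installed via Helm into the kai-scheduler namespace (release and namespace names are assumptions):

  # Installed release, chart, and app version
  helm list -n kai-scheduler

  # Image tags of the running scheduler pods
  kubectl get pods -n kai-scheduler -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'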
