I've followed the llm-d quick start to install llm-d on my OpenShift 4.17 cluster and deploy the meta-llama/Llama-3.2-3B-Instruct model as part of the "sampleApplication" during setup. This model deploys properly and all the resources, including the HTTPRoute, are created. I can run ./test-request.sh and everything completes successfully.
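For reference, this is roughly how I verified the working sample deployment. The namespace and the assumption that there is a single Gateway resource come from a default quick start install, so adjust as needed:

```shell
# Sanity checks for the working sampleApplication (namespace "llm-d" is an
# assumption from a default install):
oc get httproute -n llm-d
GATEWAY_HOST=$(oc get gateway -n llm-d -o jsonpath='{.items[0].status.addresses[0].value}')
# The gateway exposes an OpenAI-compatible API, so listing models should show
# meta-llama/Llama-3.2-3B-Instruct:
curl -s "http://${GATEWAY_HOST}/v1/models"
```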
When I try to deploy my own model, applying the ModelService does not create an HTTPRoute, and the model is not registered with the gateway. I can query the model directly from the decode pod, but not through the gateway. I have tried the sample Scenario 2: https://github.com/llm-d/llm-d-model-service/tree/main/samples/nixl-xpyd. When I apply the baseconfig and ModelService CR from that sample, all the components are created except the HTTPRoute. I have also tried the config that the sampleApplication uses, basic-gpu-with-nixl-and-redis-lookup-preset, and I run into the same issue (see the sketch below).
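For concreteness, this is the shape of what I am applying, modeled on the Scenario 2 sample. The resource names, namespace, and replica counts are specific to my environment, and the field layout is my reading of the sample, so it may not match the CRD exactly:

```shell
cat <<'EOF' | oc apply -f -
apiVersion: llm-d.ai/v1alpha1
kind: ModelService
metadata:
  name: llama-xpyd          # placeholder name
  namespace: llm-d          # assumed install namespace
spec:
  # Same preset the sampleApplication uses
  baseConfigMapRef:
    name: basic-gpu-with-nixl-and-redis-lookup-preset
  routing:
    modelName: meta-llama/Llama-3.2-3B-Instruct
  modelArtifacts:
    uri: hf://meta-llama/Llama-3.2-3B-Instruct
  decode:
    replicas: 1
  prefill:
    replicas: 1
EOF

# The decode/prefill pods and services all come up, but no HTTPRoute appears:
oc get httproute -n llm-d
```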
I have also tried specifying the model I actually want to deploy, RedHatAI/granite-3.1-2b-instruct-quantized.w4a16, in a ModelService, and I run into the same issue described above. However, when I modify values.yaml (https://github.com/llm-d/llm-d-deployer/blob/main/charts/llm-d/values.yaml#L109) to point the sampleApplication at RedHatAI/granite-3.1-2b-instruct-quantized.w4a16, or any other model, all the components, including the HTTPRoute, are created successfully.
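In other words, the only approach that has worked is overriding the sampleApplication model in values.yaml and re-running the install, roughly as sketched below. The exact value keys and installer flag are my reading of the llm-d-deployer repo and may not be exact:

```shell
# Override the sampleApplication model (key names assumed from values.yaml):
cat <<'EOF' > my-values.yaml
sampleApplication:
  model:
    modelArtifactURI: hf://RedHatAI/granite-3.1-2b-instruct-quantized.w4a16
    modelName: RedHatAI/granite-3.1-2b-instruct-quantized.w4a16
EOF
# Re-run the quick start installer with the override (flag name assumed):
./llmd-installer.sh --values-file my-values.yaml
```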
So far, I have not been able to create a ModelService outside of the sampleApplication that results in an HTTPRoute.
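What I plan to compare next (and can attach here if useful) is the ModelService the chart generates for the sampleApplication versus my own, since the former produces an HTTPRoute and the latter does not. The resource and deployment names below are placeholders:

```shell
# Dump both ModelService objects and diff them (names are placeholders):
oc get modelservice -n llm-d <sample-app-modelservice> -o yaml > sample.yaml
oc get modelservice -n llm-d <my-modelservice> -o yaml > mine.yaml
diff sample.yaml mine.yaml

# Also check the controller logs for HTTPRoute reconcile errors
# (controller deployment name is a placeholder):
oc logs -n llm-d deploy/<modelservice-controller> | grep -i httproute
```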