Add support to install by kustomize#179
avinashsingh77 wants to merge 5 commits into llm-d-incubation:main
Context for maintainers: this PR exists to aggregate feedback on the potential migration and to settle whether and how it should work. It is not code that should be merged in this repo; it will eventually land in the main repo.
I don't have time to do the full review right now, but I've called out some things to start. My overall objection at the moment is that there are too many configuration overlays. Another pattern we could consider is having one modelserver directory per guide and varying only by hardware accelerator. We could move monitoring into base as described below. This would also let us get rid of single vs. multi-node, because each guide explicitly has either a multi-node or a single-node deployment for its pattern. We could make DRA depend on the accelerator: for NVIDIA or AMD GPUs we can typically use the k8s device plugin system, and I've only really ever seen Intel devices go through DRA. Take that point with a grain of salt, though, because I am definitely no DRA expert.
The point I am making is that we need to aggregate more of these overlays into more "whole" deployments. We can try to group things per guide: the guides' purpose is to demonstrate patterns within inference, which I think is not being shown here. The project is aimed at providing "guides" / "well-lit paths", which are fleshed-out examples pre-tuned to work in production; it is the user's responsibility to walk back up the path and build their own to suit their use case. Hope this context framing helps inform the design here.
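To make the suggestion concrete, here is a rough sketch of the per-guide layout described above. The directory names, file names, and the `nvidia` overlay are hypothetical placeholders, not paths from this PR; only the Kustomization fields themselves (`resources`, `patches`) are standard Kustomize.

```yaml
# Hypothetical file: guides/<guide-name>/modelserver/base/kustomization.yaml
# One base per guide; monitoring is part of the base rather than its own overlay.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml    # hypothetical model server manifest for this guide
  - service.yaml       # hypothetical
  - podmonitor.yaml    # hypothetical monitoring manifest, folded into base
---
# Hypothetical file: guides/<guide-name>/modelserver/overlays/nvidia/kustomization.yaml
# The only axis of variation is the hardware accelerator.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - path: accelerator-patch.yaml   # hypothetical patch setting accelerator-specific resources
    target:
      kind: Deployment
```

Under this assumed layout, a DRA-based accelerator would simply be another overlay directory next to `nvidia`, and single vs. multi-node would be decided by which guide the user picks rather than by an extra overlay.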
Very cool I haven't seen FMA used yet
This PR is marked as stale after 21d of inactivity. After an additional 14d of inactivity (7d to become rotten, then 7d more), it will be closed. To prevent this PR from being closed, add a comment or remove the |
Co-authored-by: Greg Pereira <grpereir@redhat.com>
Signed-off-by: Avinash Singh <avinashsingh.rcoem@gmail.com>
@Gregory-Pereira Does it look better now? Changes:
| Overlay | → | Accelerator Variants | Questions/Notes:
This PR introduces Kustomize as an alternative installation method for llm-d-modelservice, providing users with a declarative, composable deployment approach alongside the existing Helm charts.
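As an illustration of the declarative, composable approach, a user-side `kustomization.yaml` might look roughly like the sketch below. The overlay path and the namespace are assumptions made for the example and are not taken from this PR.

```yaml
# Hypothetical user-owned kustomization.yaml that builds on an overlay from this repo.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: llm-d-example          # hypothetical namespace applied to all resources
resources:
  - ./overlays/example-overlay    # hypothetical path to one of the provided overlays
# Render with:  kubectl kustomize .
# Apply with:   kubectl apply -k .
```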