PLTF-4: Automate LiteLLM setup#253
Conversation
| # This var will cause the openhands deploy to create the LLM team name if it doesn't exist already | ||
| LITE_LLM_TEAM_NAME: openhands |
There was a problem hiding this comment.
This comment was misleading, and as far as I can tell LITE_LLM_TEAM_NAME isn't used anywhere within this chart or the entire OpenHands github org. I've removed it.
| # NOTE: We recommend using the provided LiteLLM instance for simplicity and | ||
| # because it is the most extensively tested scenario. Our automation uses an | ||
| # admin key to do user management for the LiteLLM instance. | ||
| enabled: false |
There was a problem hiding this comment.
I'm a little confused on the purpose of this flag - without it set to true, LITE_LLM_API_URL is never configured at all, regardless of whether you're using the provided LiteLLM instance or not.
In this PR, I've used it to gate off LiteLLM configuration in general, but let me know if I'm misinterpreting.
| labels: | ||
| app: openhands | ||
| annotations: | ||
| checksum/keycloak-config: {{ include (print $.Template.BasePath "/keycloak-config-script.yaml") . | sha256sum }} |
There was a problem hiding this comment.
Adding a missing checksum for keycloak while I'm here.
With the check sum in place, pods will rollover whenever the config script is updated.
| \"team_id\": \"$LITE_LLM_TEAM_ID\", | ||
| \"team_alias\": \"$LITE_LLM_TEAM_ID\" |
There was a problem hiding this comment.
Just a callout that team_alias is what we're referring to as Team Name in our documentation. It's the same model name you see in the LiteLLM UI.
I've had this script explicitly set an ID rather than a name because IDs are unique while aliases are not. If we really want to lean on resolving the team by name, we'd have to add some sort of locking mechanism to this script to prevent creating duplicate teams.
|
Thanks for this. @neubig and @xingyaoww might be the best to help look at this? |
neubig
left a comment
There was a problem hiding this comment.
This all looked reasonable to me! I assume this means that then argocd will manage the litellm configuration, so we can test this on staging next right?
One question: where do we get the version of litellm? Is this inherited from the other infra files in some way? I didn't see it in the PR.
|
Yep, we can test this in staging. I believe we're installing the The LiteLLM version is set through the |
|
I was able to validate that the upgrade path works on the staging GKE cluster through this preview PR, where I deployed I can see logs indicating that the config script ran as intended. @raymyers or @neubig, you mind swinging back for an approval? |
neubig
left a comment
There was a problem hiding this comment.
Looks good, now that this is confirmed to be working!
Description
Closes #251
Configuring LiteLLM requires manually configuring a team, then populating the helm values with the ID of that team that was created. We need this process to be automated to facilitate using Replicated for managed installations.
This PR adds an init container which automatically provisions a new team if the
teamIdset in the values doesn't exist. The defaultteamIdisopenhands.Helm Chart Checklist
versionfield inChart.yamlfor each modified chartAdditional Notes
I've added a couple of comments already to explain some of the changes I made.