Thank you for your interest in contributing! This guide will help you get started.
Be respectful and constructive. We are committed to providing a welcoming and inclusive experience for everyone.
- Go 1.24+
- Docker 17.03+
- kubectl v1.28+
- Access to a Kubernetes v1.28+ cluster (or Kind for local development)
- Operator SDK v1.42+
git clone https://github.com/<your-username>/litellm-operator.git
cd litellm-operator
go mod download
make install
make run
git checkout -b feat/my-feature
Use conventional branch prefixes: feat/, fix/, docs/, refactor/, test/.
- Follow existing code patterns and conventions (see below)
- Add or update tests for any new or changed functionality
After modifying CRD types or RBAC markers:
make generate # Regenerate DeepCopy methods
make manifests # Regenerate CRD YAMLs, RBAC, webhooks
Run the full test suite:
make test # Unit + integration tests (envtest)
Run the linter:
make lint
Use Conventional Commits for commit messages:
feat(controller): add budget reset support to team reconciler
fix(api-client): handle 429 rate limit responses with retry
docs: update CRD specification for new fields
test(model): add integration tests for model deletion
refactor(resources): extract common label builder
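As a rough illustration of the message shape above, the following sketch checks a commit subject against a `type(scope): description` pattern. The helper name `isConventional` and the list of allowed types are assumptions made for this example (the types mirror the branch prefixes named earlier); the project may accept other types.

```go
package main

import "regexp"

// commitPattern matches "type(scope): description" with an optional scope
// and an optional "!" breaking-change marker.
// Assumption: the allowed types mirror the branch prefixes in this guide.
var commitPattern = regexp.MustCompile(
	`^(feat|fix|docs|refactor|test)(\([a-z0-9-]+\))?!?: \S.*$`,
)

// isConventional is a hypothetical helper that reports whether a commit
// subject line follows the pattern above.
func isConventional(subject string) bool {
	return commitPattern.MatchString(subject)
}
```

A subject such as `docs: update CRD specification for new fields` passes this check, while a free-form subject such as `Added stuff` does not.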
- Fill in the PR template
- Reference any related issues
- Ensure CI passes (tests, lint)
- Keep PRs focused — one feature or fix per PR
- Follow standard Go conventions and Effective Go
- Use controller-runtime idioms: client.Client, ctrl.Result, ctrl.Request
- Use logr for structured logging; never use fmt.Println or log
- Wrap errors with context: fmt.Errorf("reconciling model %s: %w", name, err)
- CRD type files: litellm<resource>_types.go
- Controller files: litellm<resource>_controller.go
- Constants use PascalCase with the litellm.palena.ai/ prefix for annotations and labels
Every controller follows the standard pattern:
- Fetch the CR (return early if not found)
- Handle deletion via finalizer
- Ensure finalizer is present
- Resolve instanceRef to get LiteLLM endpoint and master key
- Reconcile the resource against the LiteLLM API
- Update status conditions
- Transient errors (network, API 5xx): requeue with RequeueAfter: 30 * time.Second
- Permanent errors (invalid spec, 400): set a status condition, do not requeue
- Always update status conditions on both success and failure
- Never silently swallow errors
- Unit tests: individual functions, resource generation, diff logic
- Integration tests: use envtest (controller-runtime's test environment)
- E2E tests: run against a real cluster with make test-e2e
- Mock the litellm.Client interface for unit and integration tests
api/v1alpha1/          CRD type definitions
internal/controller/   Reconciliation controllers
internal/litellm/      LiteLLM REST API client
internal/resources/    Kubernetes resource generators
config/crd/bases/      Generated CRD manifests
config/samples/        Example custom resources
bundle/                OLM bundle manifests
deploy/charts/         Helm chart
- Scaffold with Operator SDK:
operator-sdk create api --group litellm --version v1alpha1 --kind LiteLLMNewResource --resource --controller
- Define types in api/v1alpha1/litellmnewresource_types.go
- Implement the controller in internal/controller/litellmnewresource_controller.go
- Add API client methods in internal/litellm/ if the CRD syncs with the LiteLLM API
- Add mock methods in internal/litellm/mock_client.go
- Add a sample CR in config/samples/
- Run make generate manifests test
- Use GitHub Issues
- Include steps to reproduce, expected vs actual behavior, and relevant logs
- For security vulnerabilities, email security@palena.ai instead of opening a public issue
By contributing, you agree that your contributions will be licensed under the Apache License 2.0.