InferencePool config proposal for API review #162

Merged on Feb 1, 2025 (9 commits)
4 changes: 4 additions & 0 deletions Makefile
@@ -105,6 +105,10 @@ vet: ## Run go vet against code.
test: manifests generate fmt vet envtest ## Run tests.
KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)" go test $$(go list ./... | grep -v /e2e) -coverprofile cover.out

.PHONY: test-integration
test-integration: manifests generate fmt vet envtest ## Run integration tests.
KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)" go test ./test/integration -coverprofile cover.out

.PHONY: test-e2e
test-e2e: ## Run end-to-end tests against an existing Kubernetes cluster with at least 3 available GPUs.
go test ./test/e2e/ -v -ginkgo.v
82 changes: 82 additions & 0 deletions api/v1alpha1/inferencepool_types.go
@@ -59,8 +59,90 @@ type InferencePoolSpec struct {
// +kubebuilder:validation:Maximum=65535
// +kubebuilder:validation:Required
TargetPortNumber int32 `json:"targetPortNumber"`

// EndpointPickerConfig specifies the configuration needed by the proxy to discover and connect to the endpoint
// picker service that picks endpoints for the requests routed to this pool.
EndpointPickerConfig `json:",inline"`
}

// EndpointPickerConfig specifies the configuration needed by the proxy to discover and connect to the endpoint picker extension.
// This type is intended to be a union of mutually exclusive configuration options that we may add in the future.
type EndpointPickerConfig struct {
// ExtensionRef configures an endpoint picker as an extension service.
//
// +kubebuilder:validation:Required
ExtensionRef *Extension `json:"extensionRef,omitempty"`
}

// Extension specifies how to configure an extension that runs the endpoint picker.
type Extension struct {
// ExtensionReference is a reference to a service extension.
ExtensionReference `json:",inline"`

// ExtensionConnection configures the connection between the gateway and the extension.
ExtensionConnection `json:",inline"`
}

// ExtensionReference is a reference to the extension deployment.
type ExtensionReference struct {
// Group is the group of the referent.
// When unspecified or empty string, core API group is inferred.
//
// +optional
// +kubebuilder:default=""
Group *string `json:"group,omitempty"`

// Kind is the Kubernetes resource kind of the referent. For example
// "Service".
//
// Defaults to "Service" when not specified.
//
// ExternalName services can refer to CNAME DNS records that may live
// outside of the cluster and as such are difficult to reason about in
// terms of conformance. They also may not be safe to forward to (see
// CVE-2021-25740 for more information). Implementations MUST NOT
// support ExternalName Services.
//
// +optional
// +kubebuilder:default=Service
Kind *string `json:"kind,omitempty"`

// Name is the name of the referent.
//
// +kubebuilder:validation:Required
Name string `json:"name"`

// The port number on the pods running the extension. When unspecified, implementations SHOULD infer a
// default value of 9002 when the Kind is Service.
//
// +kubebuilder:validation:Minimum=1
// +kubebuilder:validation:Maximum=65535
// +optional
TargetPortNumber *int32 `json:"targetPortNumber,omitempty"`
}
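
The defaulting rules described in the field comments above (Kind falls back to "Service", and the port falls back to 9002 when the Kind is Service) could be applied by an implementation roughly as follows. This is a minimal sketch, not code from this PR: `resolvedRef` and `resolve` are hypothetical names, the struct is a local copy without JSON tags, and returning an error for a non-Service kind with no port is an assumption, since the comments do not say what should happen in that case.

```go
package main

import (
	"errors"
	"fmt"
)

// ExtensionReference is a local copy of the type in the diff (tags omitted).
type ExtensionReference struct {
	Group            *string
	Kind             *string
	Name             string
	TargetPortNumber *int32
}

// defaultExtensionPort is the value the field comment says implementations
// SHOULD infer when the Kind is Service.
const defaultExtensionPort int32 = 9002

// resolvedRef is a hypothetical type holding the reference with defaults applied.
type resolvedRef struct {
	Group string
	Kind  string
	Name  string
	Port  int32
}

// resolve is a hypothetical helper that applies the documented defaults.
func resolve(ref ExtensionReference) (resolvedRef, error) {
	if ref.Name == "" {
		return resolvedRef{}, errors.New("extensionRef.name is required")
	}
	// An empty group means the core API group; Kind defaults to "Service".
	out := resolvedRef{Kind: "Service", Name: ref.Name}
	if ref.Group != nil {
		out.Group = *ref.Group
	}
	if ref.Kind != nil && *ref.Kind != "" {
		out.Kind = *ref.Kind
	}
	switch {
	case ref.TargetPortNumber != nil:
		out.Port = *ref.TargetPortNumber
	case out.Kind == "Service":
		out.Port = defaultExtensionPort
	default:
		// Assumption: no default is defined for other kinds, so fail loudly.
		return resolvedRef{}, fmt.Errorf("no default port for kind %q", out.Kind)
	}
	return out, nil
}

func main() {
	ref, err := resolve(ExtensionReference{Name: "epp"})
	fmt.Println(ref, err)
}
```

Keeping the defaulting in one helper means the rest of a controller can work with plain values instead of repeating nil checks on the pointer fields.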

// ExtensionConnection encapsulates options that configure the connection to the extension.
type ExtensionConnection struct {
// FailureMode configures how the gateway handles the case when the extension is not responsive.
// Defaults to FailClose.
//
// +optional
// +kubebuilder:default="FailClose"
FailureMode *ExtensionFailureMode `json:"failureMode,omitempty"`
}
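
Because `EndpointPickerConfig`, `ExtensionReference`, and `ExtensionConnection` are embedded with empty JSON names, their fields are flattened into the parent object when serialized. The sketch below uses trimmed local copies of the types and the standard `encoding/json` package to show the resulting shape. `marshalSpec` is a hypothetical helper, and the values `8000` and `"epp"` are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ExtensionFailureMode string

const FailClose ExtensionFailureMode = "FailClose"

// Trimmed local copies of the types in the diff, keeping the same JSON tags.
type ExtensionReference struct {
	Group            *string `json:"group,omitempty"`
	Kind             *string `json:"kind,omitempty"`
	Name             string  `json:"name"`
	TargetPortNumber *int32  `json:"targetPortNumber,omitempty"`
}

type ExtensionConnection struct {
	FailureMode *ExtensionFailureMode `json:"failureMode"`
}

type Extension struct {
	ExtensionReference  `json:",inline"`
	ExtensionConnection `json:",inline"`
}

type EndpointPickerConfig struct {
	ExtensionRef *Extension `json:"extensionRef,omitempty"`
}

type InferencePoolSpec struct {
	TargetPortNumber     int32 `json:"targetPortNumber"`
	EndpointPickerConfig `json:",inline"`
}

// marshalSpec is a hypothetical helper that returns the compact JSON form.
func marshalSpec(spec InferencePoolSpec) (string, error) {
	b, err := json.Marshal(spec)
	return string(b), err
}

func main() {
	mode := FailClose
	spec := InferencePoolSpec{
		TargetPortNumber: 8000, // illustrative value
		EndpointPickerConfig: EndpointPickerConfig{
			ExtensionRef: &Extension{
				ExtensionReference:  ExtensionReference{Name: "epp"}, // illustrative name
				ExtensionConnection: ExtensionConnection{FailureMode: &mode},
			},
		},
	}
	out, err := marshalSpec(spec)
	if err != nil {
		panic(err)
	}
	// extensionRef appears next to targetPortNumber, and failureMode
	// appears inside extensionRef next to name.
	fmt.Println(out)
}
```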

// ExtensionFailureMode defines the options for how the gateway handles the case when the extension is not
// responsive.
// +kubebuilder:validation:Enum=FailOpen;FailClose
type ExtensionFailureMode string

const (
// FailOpen specifies that the proxy should not drop the request and should instead forward it to an endpoint of its choosing.
FailOpen ExtensionFailureMode = "FailOpen"
// FailClose specifies that the proxy should drop the request.
FailClose ExtensionFailureMode = "FailClose"
)
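
The two failure modes above can be read as a small decision rule on the proxy side. The following is a minimal sketch under stated assumptions, not the behavior of any particular gateway: `selectEndpoint`, `pickFunc`, `unavailablePicker`, and `errRequestDropped` are hypothetical names, the fallback endpoint stands in for "an endpoint of its picking", and a nil mode is treated as `FailClose` to match the kubebuilder default.

```go
package main

import (
	"errors"
	"fmt"
)

type ExtensionFailureMode string

const (
	FailOpen  ExtensionFailureMode = "FailOpen"
	FailClose ExtensionFailureMode = "FailClose"
)

// errRequestDropped is a hypothetical sentinel for the FailClose path.
var errRequestDropped = errors.New("request dropped: endpoint picker not responsive")

// pickFunc is a hypothetical stand-in for a call to the endpoint picker extension.
type pickFunc func() (string, error)

// unavailablePicker simulates an extension that does not respond.
func unavailablePicker() (string, error) {
	return "", errors.New("deadline exceeded")
}

// selectEndpoint asks the picker for an endpoint and applies the failure mode
// only when the picker call fails.
func selectEndpoint(pick pickFunc, mode *ExtensionFailureMode, fallback string) (string, error) {
	endpoint, err := pick()
	if err == nil {
		return endpoint, nil
	}
	// FailOpen: keep the request and use an endpoint chosen by the proxy.
	if mode != nil && *mode == FailOpen {
		return fallback, nil
	}
	// FailClose (also the assumed behavior for a nil mode): drop the request.
	return "", fmt.Errorf("%w: %v", errRequestDropped, err)
}

func main() {
	open := FailOpen
	endpoint, err := selectEndpoint(unavailablePicker, &open, "10.0.0.2:8000")
	fmt.Println(endpoint, err)

	_, err = selectEndpoint(unavailablePicker, nil, "10.0.0.2:8000")
	fmt.Println(err)
}
```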

// LabelKey was originally copied from: https://github.com/kubernetes-sigs/gateway-api/blob/99a3934c6bc1ce0874f3a4c5f20cafd8977ffcb4/apis/v1/shared_types.go#L694-L731
// Duplicated so as not to take an unexpected dependency on gw's API.
//
Generated files (not rendered by default):

88 changes: 88 additions & 0 deletions api/v1alpha1/zz_generated.deepcopy.go
38 changes: 38 additions & 0 deletions client-go/applyconfiguration/api/v1alpha1/endpointpickerconfig.go
75 changes: 75 additions & 0 deletions client-go/applyconfiguration/api/v1alpha1/extension.go
42 changes: 42 additions & 0 deletions client-go/applyconfiguration/api/v1alpha1/extensionconnection.go