API Reference

Packages

llmd.ai/v1alpha1

llmd.ai/v1alpha1

Package v1alpha1 contains API Schema definitions for the llmd v1alpha1 API group.

Resource Types

VariantAutoscaling
VariantAutoscalingList

ActuationStatus

ActuationStatus provides details about the actuation process and its current status.

Appears in:

VariantAutoscalingStatus

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `applied` boolean | Applied indicates whether the actuation was successfully applied. | | |

OptimizedAlloc

OptimizedAlloc describes the target optimized allocation for a model variant.

Appears in:

VariantAutoscalingStatus

| Field | Description | Validation |
| --- | --- | --- |
| `lastRunTime` Time | LastRunTime is the timestamp of the last optimization run. | |
| `accelerator` string | Accelerator is the type of accelerator for the optimized allocation. Deprecated: this field will be removed in a future version. Use the node selector or node affinity of the scale target instead. | |
| `numReplicas` integer | NumReplicas is the number of replicas for the optimized allocation. A nil value means no optimization decision has been made yet. | Minimum: 0 |

VariantAutoscaling

VariantAutoscaling is the Schema for the variantautoscalings API. It represents the autoscaling configuration and status for a model variant.

Appears in:

VariantAutoscalingList

| Field | Description | Validation |
| --- | --- | --- |
| `apiVersion` string | `llmd.ai/v1alpha1`. APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | Optional: {} |
| `kind` string | `VariantAutoscaling`. Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | Optional: {} |
| `metadata` ObjectMeta | Refer to the Kubernetes API documentation for the fields of `metadata`. | |
| `spec` VariantAutoscalingSpec | Spec defines the desired state for autoscaling the model variant. | |
| `status` VariantAutoscalingStatus | Status represents the current status of autoscaling for the model variant. | |
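
To illustrate how the top-level fields fit together, the sketch below builds a minimal VariantAutoscaling manifest as a Python dictionary and serializes it to JSON. The resource name, namespace, model ID, and target Deployment are hypothetical placeholders, and the `scaleTargetRef` sub-fields (`apiVersion`, `kind`, `name`) are assumed from the HorizontalPodAutoscaler pattern that the spec says it follows. The `spec` fields themselves are described under VariantAutoscalingSpec.

```python
import json


def build_variant_autoscaling(name, namespace, model_id, target_name,
                              min_replicas=1, max_replicas=2,
                              variant_cost="10.0"):
    """Return a VariantAutoscaling manifest as a plain dict.

    Keyword defaults mirror the defaults listed in this reference.
    """
    return {
        "apiVersion": "llmd.ai/v1alpha1",
        "kind": "VariantAutoscaling",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Assumed shape: the same three fields that the
            # HorizontalPodAutoscaler's scaleTargetRef uses.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": target_name,
            },
            "modelID": model_id,
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # variantCost is typed as a string, so it stays quoted.
            "variantCost": variant_cost,
        },
    }


# All values below are hypothetical placeholders.
manifest = build_variant_autoscaling(
    name="example-variant",
    namespace="example-namespace",
    model_id="example-org/example-model",
    target_name="example-deployment",
    min_replicas=0,   # 0 enables scale-to-zero
    max_replicas=4,
)
print(json.dumps(manifest, indent=2))
```

`status` is omitted here because it reports observed state rather than desired state.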

VariantAutoscalingConfigSpec

VariantAutoscalingConfigSpec holds the optional tuning fields for a VariantAutoscaling. It is extracted as a standalone embeddable type so that higher-level controllers (e.g. KServe) can inline it without duplicating field definitions.

Appears in:

VariantAutoscalingSpec

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `variantCost` string | VariantCost specifies the cost per replica for this variant (used in saturation analysis). | 10.0 | Optional: {} Pattern: `^\d+(\.\d+)?$` |
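
The `variantCost` pattern accepts only unsigned decimal strings. A small check like the following can catch invalid values before a manifest is submitted. This is a client-side sketch and not the API server's own validation; the helper name `is_valid_variant_cost` is hypothetical.

```python
import re

# Pattern copied from the Validation column for `variantCost`.
# re.ASCII restricts \d to the ASCII digits 0-9.
VARIANT_COST_PATTERN = re.compile(r"^\d+(\.\d+)?$", re.ASCII)


def is_valid_variant_cost(value):
    """Return True if value is a string that matches the documented pattern."""
    return (isinstance(value, str)
            and VARIANT_COST_PATTERN.fullmatch(value) is not None)


for candidate in ["10.0", "7", "0.25", "-1", "1e3", ".5", "10."]:
    print(f"{candidate!r}: {is_valid_variant_cost(candidate)}")
```

Of the sample inputs, only `10.0`, `7`, and `0.25` pass. Signs, exponents, and a missing integer or fractional part are rejected, as is a numeric (non-string) value.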

VariantAutoscalingList

VariantAutoscalingList contains a list of VariantAutoscaling resources.

| Field | Description | Validation |
| --- | --- | --- |
| `apiVersion` string | `llmd.ai/v1alpha1`. APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | Optional: {} |
| `kind` string | `VariantAutoscalingList`. Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | Optional: {} |
| `metadata` ListMeta | Refer to the Kubernetes API documentation for the fields of `metadata`. | |
| `items` VariantAutoscaling array | Items is the list of VariantAutoscaling resources. | |
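
As a rough sketch of consuming a VariantAutoscalingList, the helper below walks `items` and extracts each resource's namespace, name, and `modelID`. It assumes the list has already been loaded into a Python dictionary (for example, from JSON output); the helper name and the sample data are hypothetical.

```python
def list_variants(va_list):
    """Yield (namespace, name, modelID) for each item in a VariantAutoscalingList dict."""
    for item in va_list.get("items", []):
        meta = item.get("metadata", {})
        spec = item.get("spec", {})
        yield meta.get("namespace"), meta.get("name"), spec.get("modelID")


# Hypothetical sample list with two variants of the same model.
sample_list = {
    "apiVersion": "llmd.ai/v1alpha1",
    "kind": "VariantAutoscalingList",
    "items": [
        {"metadata": {"namespace": "example-namespace", "name": "variant-a"},
         "spec": {"modelID": "example-org/example-model"}},
        {"metadata": {"namespace": "example-namespace", "name": "variant-b"},
         "spec": {"modelID": "example-org/example-model"}},
    ],
}

variants = list(list_variants(sample_list))
for namespace, name, model_id in variants:
    print(f"{namespace}/{name}: {model_id}")
```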

VariantAutoscalingSpec

VariantAutoscalingSpec defines the desired state for autoscaling a model variant.

Appears in:

VariantAutoscaling

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `scaleTargetRef` CrossVersionObjectReference | ScaleTargetRef references the scalable resource to manage. This follows the same pattern as HorizontalPodAutoscaler. | | Required: {} |
| `modelID` string | ModelID specifies the unique identifier of the model to be autoscaled. | | MinLength: 1 Required: {} |
| `minReplicas` integer | MinReplicas is the lower bound on the number of replicas for this variant. A value of 0 enables scale-to-zero when the model is idle. Defaults to 1, which preserves existing behavior for VariantAutoscaling resources that omit this field. | 1 | Minimum: 0 Optional: {} |
| `maxReplicas` integer | MaxReplicas is the upper bound on the number of replicas for this variant. The autoscaler will never scale beyond this value regardless of load. | 2 | Minimum: 1 |
| `variantCost` string | VariantCost specifies the cost per replica for this variant (used in saturation analysis). | 10.0 | Optional: {} Pattern: `^\d+(\.\d+)?$` |
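
The defaults and validation rules for the spec can be mirrored in a small client-side check, sketched below. It covers only the constraints listed in this reference and is not the controller's or API server's implementation; the helper names and sample values are hypothetical.

```python
import re

VARIANT_COST_PATTERN = re.compile(r"^\d+(\.\d+)?$", re.ASCII)


def apply_spec_defaults(spec):
    """Return a copy of spec with the documented defaults filled in."""
    out = dict(spec)
    out.setdefault("minReplicas", 1)
    out.setdefault("maxReplicas", 2)
    out.setdefault("variantCost", "10.0")
    return out


def validate_spec(spec):
    """Return a list of violations of the documented rules (empty if none)."""
    errors = []
    if "scaleTargetRef" not in spec:
        errors.append("scaleTargetRef is required")
    model_id = spec.get("modelID")
    if not isinstance(model_id, str) or len(model_id) < 1:
        errors.append("modelID is required and must have length >= 1")
    min_replicas = spec.get("minReplicas")
    if not isinstance(min_replicas, int) or min_replicas < 0:
        errors.append("minReplicas must be an integer >= 0")
    max_replicas = spec.get("maxReplicas")
    if not isinstance(max_replicas, int) or max_replicas < 1:
        errors.append("maxReplicas must be an integer >= 1")
    cost = spec.get("variantCost")
    if not isinstance(cost, str) or VARIANT_COST_PATTERN.fullmatch(cost) is None:
        errors.append("variantCost must be a string matching ^\\d+(\\.\\d+)?$")
    return errors


# Hypothetical spec that relies on the defaults for maxReplicas and variantCost.
good = apply_spec_defaults({
    "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment",
                       "name": "example-deployment"},
    "modelID": "example-org/example-model",
    "minReplicas": 0,  # scale-to-zero
})
# Hypothetical spec that violates four of the rules.
bad = apply_spec_defaults({"modelID": "", "maxReplicas": 0, "variantCost": "ten"})

print(validate_spec(good))
for problem in validate_spec(bad):
    print(problem)
```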

VariantAutoscalingStatus

VariantAutoscalingStatus represents the current status of autoscaling for a variant, including the current allocation, desired optimized allocation, and actuation status.

Appears in:

VariantAutoscaling

| Field | Description | Validation |
| --- | --- | --- |
| `desiredOptimizedAlloc` OptimizedAlloc | DesiredOptimizedAlloc indicates the target optimized allocation based on autoscaling logic. | |
| `actuation` ActuationStatus | Actuation provides details about the actuation process and its current status. | |
| `conditions` Condition array | Conditions represent the latest available observations of the VariantAutoscaling's state. | Optional: {} |
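
When reading the status, note that `numReplicas` can legitimately be 0 (scale-to-zero), so a consumer should test for a missing value rather than for truthiness. The sketch below summarizes a status dictionary under that rule. It assumes each condition carries `type` and `status` keys, as in the standard Kubernetes Condition type; the helper name and the condition type `ExampleReady` are hypothetical.

```python
def summarize_status(status):
    """Condense a VariantAutoscalingStatus dict into a few flat values."""
    alloc = status.get("desiredOptimizedAlloc") or {}
    replicas = alloc.get("numReplicas")  # None means no decision yet
    actuation = status.get("actuation") or {}
    return {
        # Compare against None so that 0 replicas still counts as a decision.
        "decided": replicas is not None,
        "desiredReplicas": replicas,
        "lastRunTime": alloc.get("lastRunTime"),
        "applied": bool(actuation.get("applied", False)),
        "conditions": {c["type"]: c["status"]
                       for c in status.get("conditions", [])},
    }


# Hypothetical status where the optimizer decided to scale to zero.
scaled_to_zero = summarize_status({
    "desiredOptimizedAlloc": {"lastRunTime": "2025-01-01T00:00:00Z",
                              "numReplicas": 0},
    "actuation": {"applied": True},
    "conditions": [{"type": "ExampleReady", "status": "True"}],
})
# A status with no optimization run recorded yet.
undecided = summarize_status({})

print(scaled_to_zero)
print(undecided)
```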
