docs: Remove references to deprecated Allocation API types #554

Merged
asm582 merged 1 commit into main from docs/remove-allocation-api-references-a292c5d0430359d8
Jan 9, 2026

@github-actions github-actions bot commented Jan 9, 2026

Summary

This PR updates documentation to reflect the API changes made in PR #553, which removed the Allocation and LoadProfile types from the VariantAutoscaling API as part of the model-based scaling cleanup.

Changes Made

Removed Documentation

  • Allocation type definition: Fields for accelerator, numReplicas, maxBatch, itlAverage, ttftAverage, and load
  • LoadProfile type definition: Fields for arrivalRate, avgInputTokens, and avgOutputTokens
  • currentAlloc field: Removed from VariantAutoscalingStatus documentation
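For readers who never saw the removed documentation, the shape of the two types can be sketched from the field names listed above. This is a reconstruction for illustration only: the JSON keys come from this description, while the Go field names, the Go types (strings for the averages and rates, ints for the counts), and the helper `allocationJSON` are assumptions, not the definitions that PR #553 deleted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LoadProfile sketches the removed LoadProfile type.
// JSON keys are taken from the PR description; the Go types are assumptions.
type LoadProfile struct {
	ArrivalRate     string `json:"arrivalRate"`
	AvgInputTokens  string `json:"avgInputTokens"`
	AvgOutputTokens string `json:"avgOutputTokens"`
}

// Allocation sketches the removed Allocation type.
// JSON keys are taken from the PR description; the Go types are assumptions.
type Allocation struct {
	Accelerator string      `json:"accelerator"`
	NumReplicas int         `json:"numReplicas"`
	MaxBatch    int         `json:"maxBatch"`
	ITLAverage  string      `json:"itlAverage"`
	TTFTAverage string      `json:"ttftAverage"`
	Load        LoadProfile `json:"load"`
}

// allocationJSON is a hypothetical helper that renders an Allocation as JSON
// so the key layout is visible. It returns an empty string on marshal failure.
func allocationJSON(a Allocation) string {
	b, err := json.Marshal(a)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// The accelerator name and replica count are made-up sample values.
	fmt.Println(allocationJSON(Allocation{Accelerator: "example-accelerator", NumReplicas: 1}))
}
```

Manifests or tooling that still read these keys from the VariantAutoscaling status will no longer find them after PR #553.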

Updated Examples

  • Replaced currentAllocation with desiredOptimizedAlloc in status examples
  • Updated controller behavior documentation to reflect current API state
  • Updated configuration guide deployment deletion examples

Files Modified

  • docs/user-guide/crd-reference.md - Removed Allocation and LoadProfile type definitions, updated VariantAutoscalingStatus
  • docs/user-guide/configuration.md - Updated status examples
  • docs/design/controller-behavior.md - Updated deployment deletion status examples

Context

These types were part of the model-based scaling implementation that tracked detailed allocation metrics (ITL, TTFT, batch size, load characteristics). With the shift to saturation-based scaling as the primary autoscaling approach, these fields are no longer part of the API surface.

The internal interfaces.Allocation type still exists for use by the optimizer and model analyzer components, but it is not exposed in the Kubernetes API.

Verification

  • All references to removed API types have been cleaned from user-facing documentation
  • Status examples updated to reflect current API structure
  • No breaking changes to existing valid configurations
  • Documentation accurately reflects code in main branch
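The first check above could be repeated mechanically with a small scanner. The following is a minimal sketch, not the tooling this PR used: `findStaleReferences`, the `removedIdentifiers` list, and the sample document are all hypothetical, and the match is a plain substring test, so `currentAlloc` also catches `currentAllocation`.

```go
package main

import (
	"fmt"
	"strings"
)

// removedIdentifiers lists names that this PR description says were removed
// from the user-facing docs. The selection is illustrative, not exhaustive.
var removedIdentifiers = []string{"currentAlloc", "LoadProfile"}

// findStaleReferences returns the 1-based numbers of the lines in doc that
// contain at least one of the removed identifiers (substring match).
func findStaleReferences(doc string, removed []string) []int {
	var hits []int
	for i, line := range strings.Split(doc, "\n") {
		for _, name := range removed {
			if strings.Contains(line, name) {
				hits = append(hits, i+1)
				break // report each line once
			}
		}
	}
	return hits
}

func main() {
	// Made-up sample text; real input would be the contents of a docs file.
	doc := "status:\n  currentAlloc: {}\n  desiredOptimizedAlloc: {}"
	fmt.Println(findStaleReferences(doc, removedIdentifiers))
}
```

The bare name `Allocation` is left out of the list on purpose, since the internal interfaces.Allocation type still exists and may legitimately be mentioned.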

Related

AI generated by Update Docs

Remove documentation for CurrentAlloc, Allocation, and LoadProfile types
that were removed in PR #553 as part of the model-based scaling cleanup.

Changes:
- Remove Allocation and LoadProfile type definitions from CRD reference
- Remove currentAlloc field from VariantAutoscalingStatus documentation
- Update status examples to use desiredOptimizedAlloc instead
- Update VariantAutoscalingStatus description to reflect current state

These types were part of the model-based scaling implementation and are
no longer needed for saturation-based scaling.
@lionelvillard lionelvillard requested a review from asm582 January 9, 2026 16:01
@asm582 asm582 marked this pull request as ready for review January 9, 2026 16:28
@asm582 asm582 merged commit 6b4dd1c into main Jan 9, 2026
github-actions bot pushed a commit that referenced this pull request Jan 9, 2026
- Update VariantAutoscalingStatus godoc comment to remove mention of 'current allocation'
- Add clarifying notes to HPA and KEDA integration docs about log format changes
- Mark debug log examples as illustrative from older model-based scaling version

Follow-up to PR #554, which removed Allocation types from CRD reference docs.
ev-shindin pushed a commit to ev-shindin/workload-variant-autoscaler that referenced this pull request Jan 14, 2026
…tion-api-references-a292c5d0430359d8

docs: Remove references to deprecated Allocation API types