Skip to content

fix(BA-5557): register appproxy endpoints for deployments entering DEPLOYING via ActivateRevision#10728

Open
jopemachine wants to merge 9 commits intomainfrom
fix/BA-5557-deploying-appproxy-registration
Open

fix(BA-5557): register appproxy endpoints for deployments entering DEPLOYING via ActivateRevision#10728
jopemachine wants to merge 9 commits intomainfrom
fix/BA-5557-deploying-appproxy-registration

Conversation

@jopemachine
Copy link
Copy Markdown
Member

@jopemachine jopemachine commented Apr 2, 2026

Summary

  • Deployments entering DEPLOYING state via ActivateRevision (bypassing check_pending) were not getting registered with appproxy, causing them to stay DEGRADED
  • DeployingProvisioningHandler now directly calls register_endpoint for unregistered deployments using deploying_revision_id
  • Made register_endpoint public on DeploymentExecutor for cross-handler reuse
  • Restored check_pending_deployments name for the pending handler path

Test plan

  • Verify deployment via CreateDeployment → AddRevision → ActivateRevision registers endpoint in appproxy
  • Verify legacy deployment path (check_pending) still works unchanged
  • Verify deployments that already have URLs are skipped (no double registration)

🤖 Generated with Claude Code

Fixes BA-5557

Copilot AI review requested due to automatic review settings April 2, 2026 01:03
@github-actions github-actions bot added size:M 30~100 LoC comp:manager Related to Manager component labels Apr 2, 2026
@jopemachine jopemachine marked this pull request as draft April 2, 2026 01:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes BA-5557 by ensuring deployments that enter DEPLOYING via ActivateRevision (skipping the check_pending path) still get registered with appproxy, preventing them from staying DEGRADED.

Changes:

  • Add appproxy endpoint registration to the DEPLOYING/PROVISIONING handler for deployments missing network.url (using deploying_revision_id).
  • Update check_pending_deployments() to skip already-registered deployments and to resolve the revision via current_revision_id or deploying_revision_id.
  • Make DeploymentExecutor.register_endpoint() public and wire additional dependencies through the coordinator.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
src/ai/backend/manager/sokovan/deployment/handlers/deploying.py Pre-registers missing appproxy endpoints during DEPLOYING provisioning and bulk-persists URLs.
src/ai/backend/manager/sokovan/deployment/executor.py Adjusts pending registration logic (skip when URL exists; revision fallback) and exposes register_endpoint() publicly.
src/ai/backend/manager/sokovan/deployment/coordinator.py Injects DeploymentExecutor + DeploymentRepository into the deploying provisioning handler.
Comments suppressed due to low confidence (1)

src/ai/backend/manager/sokovan/deployment/executor.py:446

  • This method is now public (register_endpoint) but it still sits under a "Private helper methods" section header. Consider updating the section header/comment to avoid confusion about intended API surface (or move register_endpoint above the private helpers section).
    # Private helper methods

    async def register_endpoint(
        self,
        deployment: DeploymentInfo,
        scaling_group_target: ScalingGroupProxyTarget,
        revision_id: UUID,
    ) -> str:
        """Resolve the target revision's model definition and register the endpoint to the app proxy.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@jopemachine jopemachine added this to the 26.4 milestone Apr 2, 2026
@github-actions github-actions bot added size:L 100~500 LoC and removed size:M 30~100 LoC labels Apr 2, 2026
jopemachine and others added 2 commits April 2, 2026 15:01
…PLOYING via ActivateRevision

Deployments that enter DEPLOYING through CreateDeployment → AddRevision →
ActivateRevision bypass check_pending, which normally registers endpoints
with appproxy. This caused routes to have no appproxy endpoint, leaving
deployments stuck in DEGRADED status.

- Add _register_unregistered_endpoints to DeployingProvisioningHandler
  that directly calls register_endpoint for deployments without a URL
- Make register_endpoint public on DeploymentExecutor for cross-handler use
- Restore check_pending_deployments name for the pending handler path
- Inject DeploymentRepository into DeployingProvisioningHandler for
  proxy target loading and URL persistence

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ews fragment

The _register_endpoint method was renamed to register_endpoint (public) but
tests still referenced the old private name, causing AttributeError in CI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jopemachine jopemachine force-pushed the fix/BA-5557-deploying-appproxy-registration branch from ba2c662 to 148d2b1 Compare April 2, 2026 06:07
jopemachine and others added 7 commits April 2, 2026 15:42
- Remove unnecessary network.url check and deploying_revision_id fallback from check_pending_deployments
- Replace asyncio.ensure_future with direct coroutines in gather
- Rename _register_unregistered_endpoints to _ensure_endpoints_registered
- Extract _build_registration_batch and _execute_registration_batch methods
- Introduce _EndpointRegistrationBatch dataclass for type safety

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests now accurately simulate the bug: deployment created without a
revision, then ActivateRevision'd into DEPLOYING, must get its
appproxy endpoint registered via deploying_revision_id.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jopemachine jopemachine marked this pull request as ready for review April 2, 2026 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:manager Related to Manager component size:L 100~500 LoC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants