Support multi-endpoint failover #47

@killme2008

Summary

The current multi-endpoint client configuration provides client-side load balancing across ready subchannels, but it does not guarantee per-RPC failover semantics. When a request is routed to an endpoint that becomes unavailable or returns a transport error, the failed RPC is not automatically retried on another endpoint.

Current behavior

  • Single endpoint uses a direct GrpcChannel.
  • Multi-endpoint uses a static resolver plus random or round_robin load balancing.
  • Endpoint selection happens across ready subchannels.
  • There is no explicit retry policy or request-level failover.

Expected behavior

Multi-endpoint configuration should support failover for transient transport-level endpoint failures, so a failed RPC can be retried against another healthy endpoint when it is safe to do so.
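
As a starting point for discussion, here is a minimal sketch of what "retry on another endpoint when it is safe" could mean. It is not a proposed implementation: Python is used only for illustration, and every name in it (`call_with_failover`, `TransientTransportError`, the `idempotent` flag) is hypothetical and does not come from the client library.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class TransientTransportError(Exception):
    """Hypothetical stand-in for a transport failure such as gRPC UNAVAILABLE."""


def call_with_failover(
    endpoints: Sequence[str],
    send: Callable[[str], T],
    idempotent: bool,
) -> T:
    """Try `send` against each endpoint in order until one succeeds.

    Non-idempotent calls are attempted on the first endpoint only, because
    the server may have applied a request whose response was lost.
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    candidates = endpoints if idempotent else endpoints[:1]
    last_error: Optional[TransientTransportError] = None
    for endpoint in candidates:
        try:
            return send(endpoint)
        except TransientTransportError as exc:
            last_error = exc  # remember the failure and try the next endpoint
    assert last_error is not None  # candidates is non-empty, so a failure was recorded
    raise last_error
```

Errors other than `TransientTransportError` propagate immediately, so application-level failures are never retried. Whether writes can ever be marked idempotent is the open question this issue needs to settle.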

Scope

  • Define the desired failover contract clearly.
  • Evaluate whether this should rely on gRPC retry policy, client-side retry logic, or another mechanism.
  • Clarify interaction with idempotency / write semantics.
  • Add tests that cover endpoint failure scenarios in multi-endpoint mode.
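
For the gRPC retry policy option, retries are declared per method in the gRPC service config. The sketch below builds such a config as a Python dict and serializes it to JSON. The field names follow the gRPC service config schema, but the values are illustrative assumptions, and whether a retried attempt actually lands on a different endpoint depends on the load-balancing policy and would need to be verified in tests.

```python
import json

# Illustrative values only; they are assumptions, not recommended settings.
retry_service_config = {
    "methodConfig": [
        {
            # An empty name entry makes this the default for all methods.
            "name": [{}],
            "retryPolicy": {
                "maxAttempts": 3,
                "initialBackoff": "0.1s",
                "maxBackoff": "1s",
                "backoffMultiplier": 2,
                # Retry only on transport-level unavailability.
                "retryableStatusCodes": ["UNAVAILABLE"],
            },
        }
    ],
}

# Most gRPC clients accept the service config as a JSON string.
service_config_json = json.dumps(retry_service_config)
```

Note that restricting retries to `UNAVAILABLE` does not by itself make writes safe: the status can also be returned after the server has started processing a request, so the idempotency question above still applies.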

Notes

This is distinct from load balancing. The current implementation can route new calls to other ready endpoints, but it does not provide request-level automatic failover for a call that already failed.
