Ensuring Deterministic Authentication Boundaries in gRPC-Go Bidirectional Streams #8861

@Trajanv

Please see the FAQ in our main README.md before submitting your issue.

Use case(s) - what problem will this feature solve?

High-throughput, real-time bidirectional streaming systems (e.g., financial market data) require deterministic failure boundaries. Currently, grpc-go allows SendMsg() to succeed immediately after stream creation, even if the underlying TLS/mTLS handshake is still in progress. When authentication fails (expired certificates, an untrusted CA during rotation, or failed JWT authentication at the RPC level), this causes three problems:

Wasted Resources: The client may serialize and buffer a significant volume of data that is certain to be dropped.

Indeterminate State: Application logic cannot easily distinguish between a "network blip" and a "security rejection" until RecvMsg() eventually surfaces a transport error, often after the application believes the data was successfully sent.

Retry Logic Complexity: Without knowing if the handshake (or JWT Auth at the RPC level) ever succeeded, it is difficult to implement safe, state-aware retry strategies during certificate rotation windows.

Proposed Solution

One or more of the following would help address this:

Stream-level authentication readiness indicator
A channel, callback, or context signal indicating successful transport authentication for a stream.

Fail-fast stream semantics
An option to fail stream creation or SendMsg() immediately if authentication is not complete.

Authentication-aware send behavior
Prevent SendMsg() from accepting messages until authentication succeeds, or return an immediate error if authentication has failed.

Explicit grpc-go documentation
Clear documentation describing when authentication is guaranteed relative to stream creation and SendMsg() calls.
For example, introduce a mechanism that makes the transport authentication state observable and gateable at the stream level:

  1. Stream.AuthReady(context.Context) error: Add a method to the grpc.ClientStream interface (or a concrete wrapper) that blocks until the handshake is complete and peer information is verified.

  2. grpc.WaitForAuth() CallOption: An optional flag for stream creation that prevents SendMsg() from returning nil until the transport security layer has signaled success. If the handshake fails, SendMsg() should return the transport error immediately.

  3. Enhanced Peer Visibility: Ensure peer.FromContext(stream.Context()) is guaranteed to be populated with verified identity data once the "Ready" signal is triggered.

Alternatives Considered

Manually implementing a gating mechanism with chan struct{} or sync.WaitGroup to block SendMsg(). This requires application-level boilerplate and does not solve the underlying issue: the gRPC transport has already accepted, and potentially buffered, data before the latch is triggered.

Additional Context

The current behavior is an artifact of gRPC’s "transparent" design, where the transport is abstracted away to maximize concurrency. While this is ideal for standard request/response RPCs, it creates a semantic gap for security-sensitive streaming (bidirectional streams, for example). As organizations move toward Zero Trust architectures with short-lived certificates and JWT authentication for bidirectional streams, handshake failures during certificate rotation or JWT refresh become more frequent, making this gap a production stability risk.

Labels: Type: Feature (new features or improvements in behavior)
