Skip to content

Broken gRPC calls with Istio Service Mesh and Envoy #1237

Open
@paoloyx

Description

Hi everyone,

I don't really know if here is the right place to ask, but I'll try anyway. For the sake of clarity, I've asked the same on Envoy GitHub as I don't really know where the problem lies: envoyproxy/envoy#16811

A bit of context

We've an Istio 1.8.3 installation on a 1.18.x Kubernetes clusters. There are two pods (let's say ORIGIN and DEST) where ORIGIN is making a gRPC call to DEST, through a Kubernetes service that is proxying the DEST pod.

The situation is this:

  • We are using higherkindness/mu-scala
  • without Istio sidecars (i.e. Envoy proxies) all is going well.
  • With the sidecars the gRPC communication is not working, we receive an error like Received unexpected EOS on empty DATA frame from server.

Here are the logs, one for successful communication (without sidecars) and one for bad communication (with sidecars)

Successful communications logs

ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.417 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND SETTINGS: ack=false settings={ENABLE_PUSH=0, MAX_CONCURRENT_STREAMS=0, INITIAL_WINDOW_SIZE=1048576, MAX_HEADER_LIST_SIZE=8192}																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.430 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=983041																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.515 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND SETTINGS: ack=false settings=[id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100]																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.516 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND SETTINGS: ack=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.535 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND SETTINGS: ack=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.542 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND HEADERS: streamId=3 headers=GrpcHttp2OutboundHeaders[:authority: dest-service:9100, :path: /x.y.z/Create, :method: POST, :scheme: http, content-type: application/grpc, te: trailers, user-agent: grpc-java-netty/1.36.1, grpc-accept-encoding: gzip] streamDependency=0 weight=16 exclusive=false padding=0 endStream=false																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.551 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND DATA: streamId=3 padding=0 endStream=true length=50 bytes=000000002d0a2439383937636436392d646636632d346362612d386631392d30613562316332613237663412057465737439																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.718 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=9934465																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.500 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND HEADERS: streamId=3 headers=GrpcHttp2ResponseHeaders[:status: 200, content-type: application/grpc+proto, grpc-encoding: gzip, date: Thu, 03 Jun 2021 16:18:45 GMT] padding=0 endStream=false																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.512 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=false length=89 bytes=01000000541f8b08000000000000000dc8bb1580201004c0dcd01a48f73de0be9473cad27f093ae15c6df98c71b6224c079429a8ca05299fbd18215177eb41ef...																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.512 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND PING: ack=false bytes=1234																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.521 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND HEADERS: streamId=3 headers=GrpcHttp2ResponseHeaders[grpc-status: 0, date: Thu, 03 Jun 2021 16:18:45 GMT] padding=0 endStream=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.530 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND PING: ack=true bytes=1234																									
ORIGIN-POD ORIGIN-POD INFO  2021-06-03 16:18:45.542 [ioapp-compute-0] i.r.c.Server$ - [APPLICATION_LOGS_BLAH_BLAH]																								
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.552 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND GO_AWAY: lastStreamId=0 errorCode=0 length=0 bytes=																									

Bad Communication logs

ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.427 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND SETTINGS: ack=false settings={ENABLE_PUSH=0, MAX_CONCURRENT_STREAMS=0, INITIAL_WINDOW_SIZE=1048576, MAX_HEADER_LIST_SIZE=8192}																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.475 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=983041																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.487 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND SETTINGS: ack=false settings={HEADER_TABLE_SIZE=4096, MAX_CONCURRENT_STREAMS=2147483647, INITIAL_WINDOW_SIZE=268435456,=0}																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.490 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND SETTINGS: ack=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.511 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND SETTINGS: ack=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.512 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=268369921																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.521 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND HEADERS: streamId=3 headers=GrpcHttp2OutboundHeaders[:authority: dest-service:9100, :path: /x.y.z/Create, :method: POST, :scheme: http, content-type: application/grpc, te: trailers, user-agent: grpc-java-netty/1.36.1, grpc-accept-encoding: gzip] streamDependency=0 weight=16 exclusive=false padding=0 endStream=false																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.533 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND DATA: streamId=3 padding=0 endStream=true length=50 bytes=000000002d0a2439383937636436392d646636632d346362612d386631392d30613562316332613237663412057465737439																																															
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.071 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND HEADERS: streamId=3 headers=GrpcHttp2ResponseHeaders[:status: 200, grpc-encoding: gzip, date: Thu, 03 Jun 2021 15:13:31 GMT, content-type: application/grpc+proto, x-envoy-upstream-service-time: 529, server: envoy] padding=0 endStream=false																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.113 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=false length=89 bytes=01000000541f8b08000000000000000dc9b911c0300804c0dca16b50ca0c423ca21c71c6fd97606fbad710e6ee9a2099aca406a1b4d8a4484b632b71dc2375c9...																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.114 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND PING: ack=false bytes=1234																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.125 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=true length=0 bytes=																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.129 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND PING: ack=true bytes=1234																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.160 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND GO_AWAY: lastStreamId=0 errorCode=0 length=0 bytes=
[APPLICATION-LOGS_BLAH_BLAH]																							
ORIGIN-POD ORIGIN-POD io.grpc.StatusRuntimeException: INTERNAL: Received unexpected EOS on empty DATA frame from server																									
ORIGIN-POD ORIGIN-POD	at io.grpc.Status.asRuntimeException(Status.java:535)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:464)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:428)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:461)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)	

Differences in logs

Unfortunately I'm no expert in debugging gGPC services, and There's one thing that i clearly see:

  • in succesful calls (without envoys) we've got only one INBOUND DATA log line:
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.512 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=false length=89 bytes=01000000541f8b08000000000000000dc8bb1580201004c0dcd01a48f73de0be9473cad27f093ae15c6df98c71b6224c079429a8ca05299fbd18215177eb41ef...	
  • in errored calls (with envoys) we've got TWO INBOUND DATA log lines:
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.113 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=false length=89 bytes=01000000541f8b08000000000000000dc9b911c0300804c0dca16b50ca0c423ca21c71c6fd97606fbad710e6ee9a2099aca406a1b4d8a4484b632b71dc2375c9...																									
																								
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.125 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=true length=0 bytes=	

and the last one has length=0 bytes, so I suppose that this is giving the Received unexpected EOS on empty DATA frame from server error

Envoy logs

Just to add an important info, here are the Envoy proxies logs for above communication

ORIGIN pod

[2021-05-28T15:32:52.440Z] "POST /x.y.z/Create HTTP/2" 200 - via_upstream - "-" 50 89 534 529 "-" "grpc-java-netty/1.36.1" 2b95ea42-d089-4ae5-9d45-b13dba9e983f"  "dest-service:9100" 10.172.24.7:9000 outbound|9100||dest-service.default.svc.cluster.local 10.172.22.162:35838 10.175.246.61:9100 10.172.22.162:33746 - default

DEST pod

[2021-05-28T15:32:52.442Z]  "POST /x.y.z/Create HTTP/2"  200  -  via_upstream  -  "-"  0  135  643  640  "-"  "grpc-java-netty/1.36.1"  "96fa90ba-ab5e-4d03-818e-c762f5 │  │ 92aeb1"  "dest-service:9100"  "127.0.0.1:9000"  inbound|9000||  127.0.0.1:45170  10.172.24.7:9000  10.172.22.162:35838  outbound_.9100_._.dest-service.default.svc.cluster.local  default

Hope that this is enough to let someone of you understand what my problem is.
Thanks

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions