Proxy functionality broken in Aleph 0.7.0 and later #731
Description
Since updating from 0.6.4 to 0.8.0 I see strange exceptions for proxied calls.
io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 485454502f312e31203430302042616420526571756573740d0a636f6e74656e742d6c656e6774683a2031310d0a636f6e74656e742d747970653a20746578742f706c61696e0d0a646174653a205468752c203133204a756e20323032342032323a35353a323720474d540d0a7365727665723a20656e766f790d0a636f6e6e656374696f6e3a20636c6f73650d0a0d0a4261642052657175657374
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:500)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1407)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:918)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:994)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at manifold.executor$thread_factory$reify__1227$f__1228.invoke(executor.clj:71)
at clojure.lang.AFn.run(AFn.java:22)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 485454502f312e31203430302042616420526571756573740d0a636f6e74656e742d6c656e6774683a2031310d0a636f6e74656e742d747970653a20746578742f706c61696e0d0a646174653a205468752c203133204a756e20323032342032323a35353a323720474d540d0a7365727665723a20656e766f790d0a636f6e6e656374696f6e3a20636c6f73650d0a0d0a4261642052657175657374
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1314)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
... 19 common frames omitted
Decoding the hex string yields:
$ xxd -r -p <<<485454502f312e31203430302042616420526571756573740d0a636f6e74656e742d6c656e6774683a2031310d0a636f6e74656e742d747970653a20746578742f706c61696e0d0a646174653a205468752c203133204a756e20323032342032323a35353a323720474d540d0a7365727665723a20656e766f790d0a636f6e6e656374696f6e3a20636c6f73650d0a0d0a4261642052657175657374
HTTP/1.1 400 Bad Request
content-length: 11
content-type: text/plain
date: Thu, 13 Jun 2024 22:55:27 GMT
server: envoy
connection: close
Bad Request
So it seems it tries to interpret an http response as https. (I try to reach a https endpoint via an http proxy, but maybe it's making an unproxied http call instead while still expecting https.)
It's difficult for me, though, to reproduce the proxy settings we use in prod locally. Therefore, I will use a local proxy for this bug report here:
$ docker run --rm -p 3128:3128/tcp ghcr.io/tarampampam/3proxy
/bin/3proxy: Starting 3proxy
{"time_unix":1718326196, "proxy":{"type:":"PROXY", "port":3128}, "error":{"code":"00000"}, "auth":{"user":"-"}, "client":{"ip":"0.0.0.0", "port":3128}, "server":{"ip":"0.0.0.0", "port":0}, "bytes":{"sent":0, "received":0}, "request":{"hostname":"[0.0.0.0]"}, "message":"Accepting connections [17/4281620160]"}
{"time_unix":1718326196, "proxy":{"type:":"SOCKS", "port":1080}, "error":{"code":"00000"}, "auth":{"user":"-"}, "client":{"ip":"0.0.0.0", "port":1080}, "server":{"ip":"0.0.0.0", "port":0}, "bytes":{"sent":0, "received":0}, "request":{"hostname":"[0.0.0.0]"}, "message":"Accepting connections [17/4281566912]"}
After this, we can confirm that Aleph 0.6.4 works as expected
(def pool (http/connection-pool {:connection-options {:proxy-options {:host "localhost" :port 3128}}}))
@(http/get "http://example.com" {:pool pool}) =>
{:request-time 52,
:aleph/keep-alive? true,
:headers
{"x-cache" "HIT",
"server" "ECAcc (sed/58AF)",
"age" "245573",
"content-type" "text/html; charset=UTF-8",
"content-length" "1256",
"accept-ranges" "bytes",
"expires" "Fri, 21 Jun 2024 00:16:36 GMT",
"etag" "\"3147526947+gzip\"",
"date" "Fri, 14 Jun 2024 00:16:36 GMT",
"vary" "Accept-Encoding",
"last-modified" "Thu, 17 Oct 2019 07:18:26 GMT",
"cache-control" "max-age=604800"},
:status 200,
:connection-time 96,
:body
#object[java.io.ByteArrayInputStream 0x36c1143f "java.io.ByteArrayInputStream@36c1143f"]}
@(http/get "https://example.com" {:pool pool}) =>
{:request-time 21,
:aleph/keep-alive? true,
:headers
{"x-cache" "HIT",
"server" "ECAcc (sed/5885)",
"age" "271652",
"content-type" "text/html; charset=UTF-8",
"content-length" "1256",
"expires" "Fri, 21 Jun 2024 00:16:40 GMT",
"etag" "\"3147526947+gzip+ident\"",
"date" "Fri, 14 Jun 2024 00:16:40 GMT",
"vary" "Accept-Encoding",
"last-modified" "Thu, 17 Oct 2019 07:18:26 GMT",
"cache-control" "max-age=604800"},
:status 200,
:connection-time 194,
:body
#object[java.io.ByteArrayInputStream 0x2f625fd5 "java.io.ByteArrayInputStream@2f625fd5"]}
Trying the same in 0.7.0 or higher yields
@(http/get "https://example.com" {:pool pool}) =>
Execution error (ExceptionInfo) at aleph.http.client/http-connection$setup-client$http-req-fn (client.clj:913).
Channel is inactive/closed.
{:req
{:path "/",
:user-info nil,
:pool
#object[io.aleph.dirigiste.Pool 0x124e951f "io.aleph.dirigiste.Pool@124e951f"],
:request-url "https://example.com",
:server-port nil,
:uri "/",
:server-name "example.com",
:query-string nil,
:scheme :https,
:request-method :get},
:ch NioSocketChannel [8efb471b]():
Handlers:
,
:open? false,
:active? false}
client.clj: 913 aleph.http.client/http-connection/setup-client/http-req-fn
client_middleware.clj: 263 aleph.http.client-middleware/wrap-exceptions/fn/fn
AFn.java: 154 clojure.lang.AFn/applyToHelper
AFn.java: 144 clojure.lang.AFn/applyTo
core.clj: 667 clojure.core/apply
core.clj: 1990 clojure.core/with-bindings*
core.clj: 1990 clojure.core/with-bindings*
RestFn.java: 142 clojure.lang.RestFn/applyTo
core.clj: 671 clojure.core/apply
core.clj: 2020 clojure.core/bound-fn*/fn
RestFn.java: 408 clojure.lang.RestFn/invoke
deferred.clj: 882 manifold.deferred/eval17318/chain'-
client_middleware.clj: 263 aleph.http.client-middleware/wrap-exceptions/fn
client_middleware.clj: 598 aleph.http.client-middleware/wrap-request-timing/fn
client_middleware.clj: 976 aleph.http.client-middleware/wrap-request/fn
http.clj: 403 aleph.http/eval28275/request/fn/fn/fn
deferred.clj: 900 manifold.deferred/eval17318/chain'-
deferred.clj: 854 manifold.deferred/eval17318/subscribe/fn
deferred.clj: 344 manifold.deferred.Listener/onSuccess
deferred.clj: 523 manifold.deferred.Deferred/fn/fn
AFn.java: 22 clojure.lang.AFn/run
Executor.java: 320 io.aleph.dirigiste.Executor$3/run
Executor.java: 62 io.aleph.dirigiste.Executor$Worker$1/run
executor.clj: 70 manifold.executor/thread-factory/reify/f
Or alternatively, after a timeout of a minute or so:
@(http/get "https://example.com" {:pool pool}) =>
Unhandled io.netty.handler.proxy.ProxyConnectException
http, none, localhost/127.0.0.1:3128 => null, timeout
ProxyHandler.java: 201 io.netty.handler.proxy.ProxyHandler$2/run
PromiseTask.java: 98 io.netty.util.concurrent.PromiseTask/runTask
ScheduledFutureTask.java: 153 io.netty.util.concurrent.ScheduledFutureTask/run
AbstractEventExecutor.java: 173 io.netty.util.concurrent.AbstractEventExecutor/runTask
AbstractEventExecutor.java: 166 io.netty.util.concurrent.AbstractEventExecutor/safeExecute
SingleThreadEventExecutor.java: 469 io.netty.util.concurrent.SingleThreadEventExecutor/runAllTasks
NioEventLoop.java: 569 io.netty.channel.nio.NioEventLoop/run
SingleThreadEventExecutor.java: 994 io.netty.util.concurrent.SingleThreadEventExecutor$4/run
ThreadExecutorMap.java: 74 io.netty.util.internal.ThreadExecutorMap$2/run
executor.clj: 70 manifold.executor/thread-factory/reify/f
AFn.java: 22 clojure.lang.AFn/run
FastThreadLocalRunnable.java: 30 io.netty.util.concurrent.FastThreadLocalRunnable/run
Thread.java: 1583 java.lang.Thread/run
Trying http yields a strange 503:
@(http/get "http://example.com" {:pool pool})
Execution error (ExceptionInfo) at aleph.http.client-middleware/wrap-exceptions$fn$fn$fn (client_middleware.clj:275).
status: 503
{:aleph/keep-alive? false,
:aleph/destroy-conn? #<SuccessDeferred@5a793212: false>,
:headers
{"content-type" "text/html; charset=UTF-8",
"content-length" "1284",
"connection" "close",
"pragma" "no-cache",
"expires" "Thu, 01 Jan 1970 00:00:00 GMT",
"p3p" "CP=\"CAO PSA OUR\"",
"cache-control"
"no-store, no-cache, must-revalidate, post-check=0, pre-check=0"},
:status 503,
:body
#object[java.io.ByteArrayInputStream 0x77490577 "java.io.ByteArrayInputStream@77490577"]}
client_middleware.clj: 275 aleph.http.client-middleware/wrap-exceptions/fn/fn/fn
deferred.clj: 882 manifold.deferred/eval17318/chain'-
deferred.clj: 848 manifold.deferred/eval17318/subscribe/fn
deferred.clj: 344 manifold.deferred.Listener/onSuccess
deferred.clj: 523 manifold.deferred.Deferred/fn
deferred.clj: 523 manifold.deferred.Deferred/success
deferred.clj: 368 manifold.deferred/success!
deferred.clj: 365 manifold.deferred/success!
client_middleware.clj: 272 aleph.http.client-middleware/wrap-exceptions/fn/fn/f--auto--
AFn.java: 22 clojure.lang.AFn/run
Executor.java: 320 io.aleph.dirigiste.Executor$3/run
Executor.java: 62 io.aleph.dirigiste.Executor$Worker$1/run
executor.clj: 70 manifold.executor/thread-factory/reify/f
If I try https afterwards it seems it still tries to intepret the result as http:
@(http/get "https://example.com" {:pool pool}) =>
Execution error (IllegalArgumentException) at io.netty.handler.codec.http.HttpVersion/<init> (HttpVersion.java:123).
invalid version format: Ú¶�&¿IÐQ¡_QI�Q�©ÎÜÕCÛ�D��Á�Î�6Ó§��L�³K�·�¹}�§¾¿¶.¯}¦KSSǶ'}ÇÊÂÍ��0I\�R�ÊJ:Ñ�Ú�/CY�̱Þ
HttpVersion.java: 123 io.netty.handler.codec.http.HttpVersion/<init>
HttpVersion.java: 85 io.netty.handler.codec.http.HttpVersion/valueOf
HttpResponseDecoder.java: 194 io.netty.handler.codec.http.HttpResponseDecoder/createMessage
HttpObjectDecoder.java: 354 io.netty.handler.codec.http.HttpObjectDecoder/decode
HttpClientCodec.java: 320 io.netty.handler.codec.http.HttpClientCodec$Decoder/decode
ByteToMessageDecoder.java: 530 io.netty.handler.codec.ByteToMessageDecoder/decodeRemovalReentryProtection
ByteToMessageDecoder.java: 469 io.netty.handler.codec.ByteToMessageDecoder/callDecode
ByteToMessageDecoder.java: 290 io.netty.handler.codec.ByteToMessageDecoder/channelRead
CombinedChannelDuplexHandler.java: 251 io.netty.channel.CombinedChannelDuplexHandler/channelRead
HttpProxyHandler.java: 284 io.netty.handler.proxy.HttpProxyHandler$HttpClientCodecWrapper/channelRead
AbstractChannelHandlerContext.java: 444 io.netty.channel.AbstractChannelHandlerContext/invokeChannelRead
AbstractChannelHandlerContext.java: 420 io.netty.channel.AbstractChannelHandlerContext/invokeChannelRead
AbstractChannelHandlerContext.java: 412 io.netty.channel.AbstractChannelHandlerContext/fireChannelRead
LoggingHandler.java: 280 io.netty.handler.logging.LoggingHandler/channelRead
AbstractChannelHandlerContext.java: 442 io.netty.channel.AbstractChannelHandlerContext/invokeChannelRead
AbstractChannelHandlerContext.java: 420 io.netty.channel.AbstractChannelHandlerContext/invokeChannelRead
AbstractChannelHandlerContext.java: 412 io.netty.channel.AbstractChannelHandlerContext/fireChannelRead
DefaultChannelPipeline.java: 1407 io.netty.channel.DefaultChannelPipeline$HeadContext/channelRead
AbstractChannelHandlerContext.java: 440 io.netty.channel.AbstractChannelHandlerContext/invokeChannelRead
AbstractChannelHandlerContext.java: 420 io.netty.channel.AbstractChannelHandlerContext/invokeChannelRead
DefaultChannelPipeline.java: 918 io.netty.channel.DefaultChannelPipeline/fireChannelRead
AbstractNioByteChannel.java: 166 io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe/read
NioEventLoop.java: 788 io.netty.channel.nio.NioEventLoop/processSelectedKey
NioEventLoop.java: 724 io.netty.channel.nio.NioEventLoop/processSelectedKeysOptimized
NioEventLoop.java: 650 io.netty.channel.nio.NioEventLoop/processSelectedKeys
NioEventLoop.java: 562 io.netty.channel.nio.NioEventLoop/run
SingleThreadEventExecutor.java: 994 io.netty.util.concurrent.SingleThreadEventExecutor$4/run
ThreadExecutorMap.java: 74 io.netty.util.internal.ThreadExecutorMap$2/run
executor.clj: 70 manifold.executor/thread-factory/reify/f
AFn.java: 22 clojure.lang.AFn/run
FastThreadLocalRunnable.java: 30 io.netty.util.concurrent.FastThreadLocalRunnable/run
Thread.java: 1583 java.lang.Thread/run
although, I sometimes can get this error right away, without trying http first, so it doesn't seem to be a matter of misusing a pool for different protocols (which we don't do in prod anyway).
So even though I can not reproduce the exact error I see in prod locally, I think we can conclude that there is a regression in 0.7.0 and later compared to 0.6.4.