intermittent connection reset by peer errors when using neon.tech #1706

Open
@bittermandel

Ory Network Project

No response

Describe the bug

We are running upstream Keto in an on-premise Kubernetes cluster. We recently moved our Postgres instance to https://neon.tech instead of running a CNPG instance in the cluster itself.

This has led to very frequent "connection reset by peer" errors in both Hydra and Keto. I've been in contact with the Neon team, and they suspect that something at the application level is causing the issue.

Have you seen this issue before internally, or with other users running on Neon? At this point I am not sure how to debug this.

I did find jackc/pgx#984, though, which looks similar to what we are seeing.

Reproducing the bug

  1. Run Keto with default configuration and PG DSN pointing at a Neon.tech instance without connection pooling.
  2. Ensure there is no active connection open.
  3. Send any read or write request that reads from the database. In our case, GetRelationships.

Relevant log output

Keto logs:

time=2025-02-06T15:08:49Z level=error msg=failed to look up direct access in db audience=application error=map[message:write failed: write tcp 10.0.72.159:50814->72.144.105.10:5432: write: connection reset by peer] method=checkDirect service_name=Ory Keto service_version=v0.11.1-alpha.0


Neon-side logs:

2025-02-28T13:17:21.220670Z  WARN connect_request{protocol=tcp session_id=3890068a-cae8-483d-a1b0-37ebb16f77a0 conn_info=65.109.68.175 role=neondb_owner ep=ep-flat-glitter-a9n862jk}: per-client task finished with an error: peer closed connection without sending TLS close_notify: https://docs.rs/rustls/latest/rustls/manual/_03_howto/index.html#unexpected-eof

2025-02-28T13:17:21.220602Z  INFO connect_request{protocol=tcp session_id=3890068a-cae8-483d-a1b0-37ebb16f77a0 conn_info=65.109.68.175 role=neondb_owner ep=ep-flat-glitter-a9n862jk}:{user="neondb_owner" db=Some("neondb") app=None}: forwarding error to user kind="clientdisconnect" error=peer closed connection without sending TLS close_notify: https://docs.rs/rustls/latest/rustls/manual/_03_howto/index.html#unexpected-eof msg="Internal error"

2025-02-28T13:17:21.220576Z  WARN connect_request{protocol=tcp session_id=3890068a-cae8-483d-a1b0-37ebb16f77a0 conn_info=65.109.68.175 role=neondb_owner ep=ep-flat-glitter-a9n862jk}:authenticate{allow_cleartext=false}: error processing scram messages error=Io(Custom { kind: UnexpectedEof, error: "peer closed connection without sending TLS close_notify: https://docs.rs/rustls/latest/rustls/manual/_03_howto/index.html#unexpected-e
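
To help narrow down where the reset in the Keto log originates, one option is to classify the error on the client side and see whether a single retry on a fresh connection succeeds. The following is a minimal Go sketch using only the standard library; it is not Keto's or pgx's code. The helper names `isConnReset`, `retryOnReset`, and `simulatedReset` are hypothetical, and it assumes the driver preserves the error chain so that `errors.Is` can reach the underlying `ECONNRESET`.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// isConnReset reports whether err wraps ECONNRESET anywhere in its chain.
// Assumption: the database driver wraps the network error instead of
// flattening it into a plain string.
func isConnReset(err error) bool {
	return errors.Is(err, syscall.ECONNRESET)
}

// retryOnReset calls op up to attempts times and retries only when the
// failure is a connection reset. It returns the number of calls made and
// the last error. Only use this for operations that are safe to repeat.
func retryOnReset(attempts int, op func() error) (int, error) {
	var err error
	for i := 1; i <= attempts; i++ {
		err = op()
		if err == nil || !isConnReset(err) {
			return i, err
		}
	}
	return attempts, err
}

// simulatedReset builds an error shaped like the one in the Keto log
// ("write tcp ...: write: connection reset by peer") for demonstration.
func simulatedReset() error {
	return &net.OpError{
		Op:  "write",
		Net: "tcp",
		Err: os.NewSyscallError("write", syscall.ECONNRESET),
	}
}

func main() {
	calls := 0
	// Hypothetical operation: fails once with a reset, then succeeds,
	// mimicking a stale pooled connection followed by a fresh one.
	op := func() error {
		calls++
		if calls == 1 {
			return fmt.Errorf("write failed: %w", simulatedReset())
		}
		return nil
	}

	tries, err := retryOnReset(3, op)
	fmt.Println(tries, err)
}
```

If a retry consistently succeeds right after a reset, that would point toward an idle connection having been closed on the remote side before the client reused it, rather than a failure of the query itself.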

Relevant configuration

Version

v0.11.1-alpha.0

On which operating system are you observing this issue?

Linux

In which environment are you deploying?

Kubernetes

Additional Context

No response

Labels: bug