[DocDB] PgSingleTServerTest.ManyYsqlConnections deadlocks via TriggerRelcacheInitConnection on kNormal pool #31420

@ellabaron-code

Description

PgSingleTServerTest.ManyYsqlConnections fails on both Linux and macOS due to a self-deadlock in
the kNormal PgClient RPC pool.

PgClientService::TriggerRelcacheInitConnection is a synchronous RPC handler. It schedules
MakeRelcacheInitConnection on the messenger's scheduler and then blocks in future::wait_for on the
in-flight promise's future, parking the kNormal worker. MakeRelcacheInitConnection itself runs on
kNormal and opens a libpq connection to the local PG listener, whose backend issues PgClient RPCs
that also land on kNormal.

Under enough concurrent relcache-init callers (e.g. ManyYsqlConnections with 64 backends and a
32-thread kNormal pool), every kNormal worker parks in wait_for, and the task that would call
promise.set_value can never get a worker: a classic self-deadlock.
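
The starvation pattern can be reproduced outside the codebase. The following is a minimal sketch, not YugabyteDB code: TinyPool and InnerTasksRunBeforeDeadline are hypothetical names, the pool is a plain FIFO thread pool standing in for kNormal, and the 500 ms wait and 200 ms observation window are arbitrary values chosen so the sketch unwinds instead of hanging. Each "handler" queues follow-up work on its own pool and then parks on the resulting future, as the handler described above does.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical fixed-size FIFO pool standing in for the kNormal pool.
class TinyPool {
 public:
  void Submit(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }

  // Workers start only after the handlers are queued, so handlers are
  // always dequeued before the inner tasks they submit.
  void Start(int num_workers) {
    for (int i = 0; i < num_workers; ++i) {
      workers_.emplace_back([this] { Run(); });
    }
  }

  // Lets workers drain whatever is still queued, then joins them.
  void Shutdown() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stopping_ = true;
    }
    cv_.notify_all();
    for (auto& worker : workers_) worker.join();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // Stopping and nothing left to run.
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  std::vector<std::thread> workers_;
  bool stopping_ = false;
};

// Returns how many inner tasks ran during a 200 ms observation window while
// the handlers that queued them were parked on the same pool.
int InnerTasksRunBeforeDeadline(int num_workers, int num_handlers) {
  assert(num_workers > 0 && num_handlers > 0);
  TinyPool pool;
  std::atomic<int> ran{0};
  for (int i = 0; i < num_handlers; ++i) {
    pool.Submit([&pool, &ran] {
      auto inner = std::make_shared<std::promise<void>>();
      std::future<void> inner_done = inner->get_future();
      pool.Submit([inner, &ran] {
        ++ran;
        inner->set_value();
      });
      // Parks this worker until the inner task runs. The timeout exists
      // only so the sketch can unwind; it is not part of the pattern.
      inner_done.wait_for(std::chrono::milliseconds(500));
    });
  }
  pool.Start(num_workers);
  std::this_thread::sleep_for(std::chrono::milliseconds(200));
  const int observed = ran.load();
  pool.Shutdown();
  return observed;
}
```

With as many handlers as workers, e.g. InnerTasksRunBeforeDeadline(2, 2), every worker is parked and no inner task runs inside the window, so the call should return 0. With one spare worker, InnerTasksRunBeforeDeadline(3, 2), the spare thread runs both inner tasks and the call should return 2. The sketch only illustrates why a blocking handler must not depend on work that needs a thread from its own pool; it does not model the libpq connection or the backend's PgClient RPCs.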

Steps to reproduce

yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test \
  --gtest_filter PgSingleTServerTest.ManyYsqlConnections

Affects

  • Linux (release + debug)
  • macOS

Issue Type

kind/failing-test
