Skip to content

Expose database connection pool settings #5333

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

syeopite
Copy link
Member

@syeopite syeopite commented Jun 1, 2025

Unless an instance maintainer was using the database_url attribute, and knew about the connection pool query parameters, the database connection pool settings was relegated to the default settings...

And the default settings are not great for any large instance... With it essentially only allowing a single connection within the pool, a maximum checkout time of a 5 seconds, and basically no additional retries whatsoever its no wonder that PgBouncer has became an necessity among Invidious instances.

This PR changes that by exposing the ability to configure the database connection pool for Postgres. I've also added some saner defaults that should hopefully help Invidious run better out of the box.

For reference here are the default values: https://crystal-lang.org/reference/1.16/database/connection_pool.html#configuration*

*The default retry delay is actually 0.2 seconds now but the documentation hasn't been updated to reflect this.

Unless an instance maintainer was using the `database_url` attribute,
and knew about the connection pool query parameters, the database
connection pool settings was relegated to the default settings...

And the default settings are... not great for any large instance. With it
essentially only allowing a single connection within the pool, a maximum
checkout time of a 5 seconds, and basically no additional retries whatsoever
its no wonder that PgBouncer has became a staple among Invidious instances.

This PR changes that by exposing the ability to configure the database
connection pool that is used within the library that Invidious uses to
interact with Postgres.
## Accepted values: a positive integer
## Default: 100
##
#max_idle_pool_size: 100
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally also exposed the idle pool size config for the HTTP connection pool in #5081 but later reverted the change after I learnt what idle actually meant in the context of the connection pool.

I've added it here for completeness but I don't know if there's actually any benefits in exposing this setting over making it just always equal the max pool size.

## Accepted values: a positive integer
## Default: 100
##
#max_pool_size: 100
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a probably a non issue but

Although all of this is nested under the db field, the options here are still really similar to the ones for the HTTP connection pool which might get confusing. And this could be made worse if we ever decide to offer more granular control for the latter (eg configuring the pool options for each domain) which will lead to a lot of essentially duplicate options.

@syeopite syeopite requested a review from Fijxu June 12, 2025 18:17
Copy link
Member

@Fijxu Fijxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the requested changes, works great from what I tested.

Comment on lines +10 to +19
# How many connections to construct on start-up, and keep it there.
property initial_pool_size : Int32 = 1
# The maximum size of the connection pool
property max_pool_size : Int32 = 100
# The maximum amount of idle connections within the pool
# idle connections are defined as created connections that are
# sitting around in the pool. Exceeding this number will cause new connections
# to be created on checkout and then simply dropped on release, till the maximum pool size
# from which there will be a checkout timeout.
property max_idle_pool_size : Int32 = 100
Copy link
Member

@Fijxu Fijxu Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should note two things.

The value of max_idle_pool_size has to be higher than initial_pool_size, if initial_pool_size is 500, Invidious will start with 500 connections, and then it will reduce them to 100 if max_idle_pool_size is set to 100.

max_connections on PostgreSQL has to be slightly higher than the value used here, the default value of max_connections is 100 (https://postgresqlco.nf/doc/en/param/max_connections/), so if someone reason Invidious needs to create new clients on the pool, it will hit 100 connections on the DB, depleting all the remaining connections and preventing other services using the same PostgreSQL server to connect to it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about forcing a user to change their default postgres configuration just to accommodate Invidious' default behavior.

Maybe we could just reduce the default max_pool_size instead? I'm not sure what'll be a reasonable value though.

Copy link
Member

@Fijxu Fijxu Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could just reduce the default max_pool_size instead? I'm not sure what'll be a reasonable value though.

That will depend of the traffic of the instance. I think 32 or even 16 is fine. My instance used to run using a unix socket without pgbouncer (but with ~4 replicas, so at the end, the total connections to the database were 24 (4 replicas * 6 connections)) at all and everything worked fine.

I tested using max_pool_size 32 and this are the results:

When using wrk with 2 threads and 10 connections, I got 1772req/s with ~5ms average, against an Invidious playlist with 1 video using the Invidious API /api/v1/playlists/:plid. Invidious used 10 db connections.

When using 2 threads and 50 connections, I got 1826req/s with 123ms average, Invidious used 32 connections.

What matters is the latency of each operation. I'm not a database expert, but I still think this setting will depend of the current setup of Invidious people have.

People that use a shared database used along other services like me may need to set a higher max_connections on PostgreSQL or set a lower max_pool_size so Invidious doesn't eat all the connections for any reason (mainly DoS).

People that use the Docker guide of https://docs.invidious.io/installation/#docker-compose-method-production should be fine since max_pool_size matches with the default max_connections value of PostgreSQL, and that instance of PostgreSQL runs exclusively for Invidious.

People that use Docker replicas or some sort of replicas, will have to adjust the max_pool_size value to limit how many connections each replica can use, which before this PR, it used to be fixed to 6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants