-
-
Notifications
You must be signed in to change notification settings - Fork 2k
Expose database connection pool settings #5333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Unless an instance maintainer was using the `database_url` attribute, and knew about the connection pool query parameters, the database connection pool settings was relegated to the default settings... And the default settings are... not great for any large instance. With it essentially only allowing a single connection within the pool, a maximum checkout time of a 5 seconds, and basically no additional retries whatsoever its no wonder that PgBouncer has became a staple among Invidious instances. This PR changes that by exposing the ability to configure the database connection pool that is used within the library that Invidious uses to interact with Postgres.
## Accepted values: a positive integer | ||
## Default: 100 | ||
## | ||
#max_idle_pool_size: 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally also exposed the idle pool size config for the HTTP connection pool in #5081 but later reverted the change after I learnt what idle actually meant in the context of the connection pool.
I've added it here for completeness but I don't know if there's actually any benefits in exposing this setting over making it just always equal the max pool size.
## Accepted values: a positive integer | ||
## Default: 100 | ||
## | ||
#max_pool_size: 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a probably a non issue but
Although all of this is nested under the db field, the options here are still really similar to the ones for the HTTP connection pool which might get confusing. And this could be made worse if we ever decide to offer more granular control for the latter (eg configuring the pool options for each domain) which will lead to a lot of essentially duplicate options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the requested changes, works great from what I tested.
# How many connections to construct on start-up, and keep it there. | ||
property initial_pool_size : Int32 = 1 | ||
# The maximum size of the connection pool | ||
property max_pool_size : Int32 = 100 | ||
# The maximum amount of idle connections within the pool | ||
# idle connections are defined as created connections that are | ||
# sitting around in the pool. Exceeding this number will cause new connections | ||
# to be created on checkout and then simply dropped on release, till the maximum pool size | ||
# from which there will be a checkout timeout. | ||
property max_idle_pool_size : Int32 = 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should note two things.
The value of max_idle_pool_size
has to be higher than initial_pool_size
, if initial_pool_size
is 500
, Invidious will start with 500
connections, and then it will reduce them to 100
if max_idle_pool_size
is set to 100
.
max_connections
on PostgreSQL has to be slightly higher than the value used here, the default value of max_connections
is 100
(https://postgresqlco.nf/doc/en/param/max_connections/), so if someone reason Invidious needs to create new clients on the pool, it will hit 100
connections on the DB, depleting all the remaining connections and preventing other services using the same PostgreSQL server to connect to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about forcing a user to change their default postgres configuration just to accommodate Invidious' default behavior.
Maybe we could just reduce the default max_pool_size
instead? I'm not sure what'll be a reasonable value though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could just reduce the default max_pool_size instead? I'm not sure what'll be a reasonable value though.
That will depend of the traffic of the instance. I think 32 or even 16 is fine. My instance used to run using a unix socket without pgbouncer (but with ~4 replicas, so at the end, the total connections to the database were 24 (4 replicas * 6 connections)) at all and everything worked fine.
I tested using max_pool_size
32 and this are the results:
When using wrk
with 2 threads and 10 connections, I got 1772req/s with ~5ms
average, against an Invidious playlist with 1 video using the Invidious API /api/v1/playlists/:plid
. Invidious used 10 db connections.
When using 2 threads and 50 connections, I got 1826req/s with 123ms
average, Invidious used 32 connections.
What matters is the latency of each operation. I'm not a database expert, but I still think this setting will depend of the current setup of Invidious people have.
People that use a shared database used along other services like me may need to set a higher max_connections
on PostgreSQL or set a lower max_pool_size
so Invidious doesn't eat all the connections for any reason (mainly DoS).
People that use the Docker guide of https://docs.invidious.io/installation/#docker-compose-method-production should be fine since max_pool_size
matches with the default max_connections
value of PostgreSQL, and that instance of PostgreSQL runs exclusively for Invidious.
People that use Docker replicas or some sort of replicas, will have to adjust the max_pool_size
value to limit how many connections each replica can use, which before this PR, it used to be fixed to 6
Unless an instance maintainer was using the
database_url
attribute, and knew about the connection pool query parameters, the database connection pool settings was relegated to the default settings...And the default settings are not great for any large instance... With it essentially only allowing a single connection within the pool, a maximum checkout time of a 5 seconds, and basically no additional retries whatsoever its no wonder that PgBouncer has became an necessity among Invidious instances.
This PR changes that by exposing the ability to configure the database connection pool for Postgres. I've also added some saner defaults that should hopefully help Invidious run better out of the box.
For reference here are the default values: https://crystal-lang.org/reference/1.16/database/connection_pool.html#configuration*
*The default retry delay is actually 0.2 seconds now but the documentation hasn't been updated to reflect this.