
Commit c43f368

Move gel-connpool from edgedb repo
1 parent fa85c0b commit c43f368

File tree

14 files changed (+5683 -0 lines)


README.md

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ docs can currently be found on docs.rs:
 | [gel-babelfish](https://docs.rs/gel-babelfish) | [Source](./gel-babelfish) | Babelfish is a Gel socket frontend that speaks Gel, Postgres, HTTP and more. |
 | [gel-captive](https://docs.rs/gel-captive) | [Source](./gel-captive) | Run a captive Gel server for testing purposes. |
 | [gel-config](https://docs.rs/gel-config) | [Source](./gel-config) | Configuration file parser for Gel. |
+| [gel-connpool](https://docs.rs/gel-connpool) | [Source](./gel-connpool) | Load-balancing connection pool for Gel database with QoS optimization. |
 | [gel-derive](https://docs.rs/gel-derive) | [Source](./gel-derive) | Derive macros for Gel database client. |
 | [gel-db-protocol](https://docs.rs/gel-db-protocol) | [Source](./gel-db-protocol) | Low-level protocol implementation of the EdgeDB/Gel wire protocol. |
 | [gel-dsn](https://docs.rs/gel-dsn) | [Source](./gel-dsn) | Data-source name (DSN) parser for Gel and PostgreSQL databases. |

gel-connpool/Cargo.toml

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
[package]
name = "gel-connpool"
version = "0.1.0"
license = "MIT/Apache-2.0"
authors = ["MagicStack Inc. <hello@magic.io>"]
edition = "2021"
description = "Load-balancing connection pool for Gel database with QoS optimization."
readme = "README.md"
rust-version.workspace = true

[lints]
workspace = true

[features]
optimizer = ["genetic_algorithm", "lru", "rand", "statrs", "anyhow", "tokio/test-util"]

[dependencies]
tokio.workspace = true
tracing.workspace = true

futures = "0"
scopeguard = "1"
itertools = "0"
thiserror = "2"
strum = { version = "0.26", features = ["derive"] }
consume_on_drop = "0"
smart-default = "0"
serde = { version = "1", features = ["derive"] }

# For the optimizer
genetic_algorithm = { version = "0.9.0", optional = true }
lru = { version = "0.12.4", optional = true }
rand = { version = "0.8.5", optional = true }
statrs = { version = "0.17.1", optional = true }
anyhow = { version = "1", optional = true }

derive_more = { version = "2", features = ["full"] }

[dev-dependencies]
tokio = { workspace = true, features = ["test-util"] }

pretty_assertions = "1.2.0"
test-log = { version = "0", features = ["trace"] }
anyhow = "1"
rstest = "0"

statrs = "0.17.1"
rand = "0.8.5"

[lib]

[[bin]]
name = "optimizer"
required-features = ["optimizer"]

gel-connpool/README.md

Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
# Connection Pool

## Overview

The load-balancing algorithm is designed to optimize the allocation and
management of database connections in a way that maximizes Quality of Service
(QoS). This involves minimizing the overall time spent on connecting and
reconnecting (connection efficiency) while ensuring that latencies remain
similar across different streams of connections (fairness).

## Architecture

This library is split into four major components:

1. The low-level blocks/block, connections, and metrics code. This code
   creates, destroys and transfers connections without any understanding of
   policies, quotas or algorithms. We ensure that the blocks and metrics are
   reliable, and use this as a building block for our pool.
2. The algorithm. This performs planning operations for acquisition, release
   and rebalancing of the pool. The algorithm does not perform operations, but
   rather informs the caller what it should do (see the sketch after this
   list).
3. The pool itself. This drives the blocks and the connector interface, and
   polls the algorithm to plan next steps during acquisition, release and
   during the timer-based planning callback.
4. The Python integration code. This is behind an optional feature, and exposes
   a PyO3-based interface that allows a connection factory to be implemented in
   Python.
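
To illustrate the split between components 2 and 3, here is a minimal sketch of
the "algorithm plans, pool executes" pattern. The type and function names are
hypothetical and do not match the crate's actual API; the point is only that
the planning layer returns instructions and performs no I/O itself.

```rust
/// Hypothetical sketch: the planning layer returns instructions,
/// while the pool layer owns the side effects (connecting, closing, moving).
#[derive(Debug)]
enum PlanStep {
    /// Open a new connection for the given block.
    Connect { block: String },
    /// Close an idle connection held by the given block.
    Close { block: String },
    /// Move an idle connection from one block to another.
    Transfer { from: String, to: String },
}

/// The algorithm: pure planning, no I/O.
fn plan_acquire(block: &str, pool_has_capacity: bool) -> Vec<PlanStep> {
    if pool_has_capacity {
        vec![PlanStep::Connect { block: block.to_string() }]
    } else {
        // In the real pool this choice would consult the hunger/overfull metrics.
        vec![PlanStep::Transfer { from: "idle-block".into(), to: block.to_string() }]
    }
}

fn main() {
    // The pool drives the connector based on the returned steps.
    for step in plan_acquire("db1", true) {
        println!("pool executes: {step:?}");
    }
}
```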

## Details

Demand for connections is measured in terms of “database time,” which is
calculated as the product of the number of connections and the average hold
time of these connections. This metric provides a basis for determining how
resources should be distributed among different database blocks to meet their
needs effectively.
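
A rough sketch of this "database time" metric follows; the struct and field
names are illustrative only, not part of the crate's API.

```rust
use std::time::Duration;

/// Illustrative per-block statistics.
struct BlockStats {
    /// Average time a connection is held before release.
    avg_hold_time: Duration,
    /// Number of connections currently demanded (held + waiting).
    connections_demanded: usize,
}

/// Demand as estimated total sequential "database time":
/// average hold time multiplied by the number of connections in demand.
fn demand(stats: &BlockStats) -> Duration {
    stats.avg_hold_time * stats.connections_demanded as u32
}

fn main() {
    let stats = BlockStats {
        avg_hold_time: Duration::from_millis(20),
        connections_demanded: 8,
    };
    // 20 ms * 8 = 160 ms of estimated sequential database time.
    println!("demand = {:?}", demand(&stats));
}
```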

To maximize QoS, the algorithm aims to minimize the time spent on managing
connections and keep the latencies low and uniform across various connection
streams. This involves allocation strategies that balance the immediate needs
of different database blocks with the overall system capacity and future demand
predictions.

When a connection is acquired, the system may be in a state where the pool is
not currently constrained by demand. In such cases, connections can be
allocated greedily without complex balancing, as there are sufficient resources
to meet all demands. This allows for quick connection handling without
additional overhead.

When the pool is constrained, the “stealing” algorithm aims to transfer
connections from less utilized or idle database blocks (victims) to those
experiencing high demand (hunger) to ensure efficient resource use and maintain
QoS. A victim block is chosen based on its idle state, characterized by holding
connections but having low or no immediate demand for them.
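
A simplified sketch of this acquire-time decision, under the description in the
last two paragraphs: allocate greedily when the pool is unconstrained, otherwise
pick an idle, low-demand victim to steal from. All names and thresholds are
hypothetical.

```rust
/// Illustrative block snapshot used at acquire time.
struct Block {
    name: &'static str,
    idle_connections: usize,
    demand: u64, // estimated "database time", as described above
}

enum AcquirePlan {
    /// Pool is below capacity: just open a new connection.
    ConnectNew,
    /// Pool is constrained: steal an idle connection from a victim block.
    StealFrom(&'static str),
    /// Nothing available right now: the caller must wait.
    Wait,
}

fn plan_acquire(total: usize, max: usize, blocks: &[Block]) -> AcquirePlan {
    if total < max {
        // Unconstrained: greedy allocation, no balancing needed.
        return AcquirePlan::ConnectNew;
    }
    // Constrained: choose the idlest victim, i.e. a block holding idle
    // connections but with the lowest demand for them.
    blocks
        .iter()
        .filter(|b| b.idle_connections > 0)
        .min_by_key(|b| b.demand)
        .map(|victim| AcquirePlan::StealFrom(victim.name))
        .unwrap_or(AcquirePlan::Wait)
}

fn main() {
    let blocks = [
        Block { name: "reports", idle_connections: 3, demand: 5 },
        Block { name: "app", idle_connections: 0, demand: 90 },
    ];
    match plan_acquire(100, 100, &blocks) {
        AcquirePlan::ConnectNew => println!("connect a new connection"),
        AcquirePlan::StealFrom(victim) => println!("steal from {victim}"),
        AcquirePlan::Wait => println!("wait for a release"),
    }
}
```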

Upon releasing a connection, the algorithm evaluates which backend (database
block) needs the connection the most (the hungriest). This decision is based on
current demand, wait times, and historical usage patterns. By reallocating
connections to the blocks that need them most, the algorithm ensures that
resources are utilized efficiently and effectively.

Unused connection capacity is eventually reclaimed to prevent wastage. The
algorithm includes mechanisms to identify and collect these idle connections,
redistributing them to blocks with higher demand or returning them to the pool
for future use. This helps maintain an optimal number of active connections,
reducing unnecessary resource consumption.
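
One way to picture the reclamation pass is the sketch below; the threshold and
type names are made up for illustration and are not the crate's actual
mechanism.

```rust
use std::time::{Duration, Instant};

/// Illustrative idle-connection record.
struct IdleConn {
    block: &'static str,
    idle_since: Instant,
}

/// Collect connections that have sat idle past a threshold so the pool can
/// close them or hand them to blocks with higher demand.
fn reclaimable(idle: &[IdleConn], max_idle: Duration, now: Instant) -> Vec<&'static str> {
    idle.iter()
        .filter(|c| now.duration_since(c.idle_since) > max_idle)
        .map(|c| c.block)
        .collect()
}

fn main() {
    let start = Instant::now();
    let idle = [IdleConn { block: "reports", idle_since: start }];
    // Pretend 120 seconds pass; with a 60 s threshold the connection is reclaimed.
    let later = start + Duration::from_secs(120);
    println!("{:?}", reclaimable(&idle, Duration::from_secs(60), later));
}
```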

To avoid excessive thrashing, the algorithm ensures that connections are held
for a minimum period, which is longer than the time it takes to reconnect to a
database or a configured minimum threshold. This reduces the frequency of
reallocation, preventing performance degradation due to constant connection
churn and ensuring that blocks can maintain stable and predictable access to
resources.
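
A sketch of this anti-thrashing guard: a connection only becomes eligible for
transfer once it has been held longer than both the observed reconnect cost and
a configured floor. The numbers and names are illustrative assumptions.

```rust
use std::time::Duration;

/// A connection may be moved to another block only after it has been held
/// longer than both the measured reconnect time and a configured minimum.
fn eligible_for_transfer(
    held_for: Duration,
    avg_reconnect_time: Duration,
    configured_min: Duration,
) -> bool {
    held_for > avg_reconnect_time.max(configured_min)
}

fn main() {
    let held = Duration::from_millis(150);
    // Reconnecting costs ~40 ms and the configured floor is 100 ms,
    // so a connection held for 150 ms may be transferred.
    assert!(eligible_for_transfer(held, Duration::from_millis(40), Duration::from_millis(100)));
}
```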

## Detailed Algorithm

The algorithm is designed to 1) maximize time spent running queries in a
database and 2) minimize latency of queries waiting for their turn to run.
These goals may be in conflict at times. We do this by minimizing the time
spent switching between databases, which is considered "dead time" -- as the
database is not actively performing operations.

The demand for a connection is based on estimated total sequential processing
time. We use the average time that a connection is held, times the number of
connections in demand, as a rough idea of how much total sequential time a
certain block will demand in the future.

At a regular interval, we compute two items for each block: a quota, and a
"hunger" metric. The hunger metric may indicate that a block is "hungry"
(wanting more connections), "satisfied" (having the expected number of
connections) or "overfull" (holding more connections than it should). The
"hungry" score is determined by the estimated total sequential time needed for
a block. The "overfull" score is determined by the number of extra connections
held by this block, in combination with how old the longest-held connection is.
The quota is determined by the connection rate.
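
The periodic classification can be sketched as follows. The actual scoring in
the crate is more involved; the struct, fields, and score formulas here are
illustrative assumptions that follow the description above.

```rust
use std::time::Duration;

/// Illustrative per-block inputs to the periodic planning pass.
struct BlockSnapshot {
    target_quota: usize,       // derived from the connection rate
    held_connections: usize,
    demand_time: Duration,     // estimated total sequential time needed
    oldest_held_for: Duration, // age of the longest-held connection
}

enum Hunger {
    /// Wants more connections; score grows with estimated sequential time.
    Hungry { score: u128 },
    /// Holds the expected number of connections.
    Satisfied,
    /// Holds more than it should; score grows with the surplus and the age
    /// of the longest-held connection.
    Overfull { score: u128 },
}

fn classify(b: &BlockSnapshot) -> Hunger {
    if b.held_connections < b.target_quota {
        Hunger::Hungry { score: b.demand_time.as_millis() }
    } else if b.held_connections > b.target_quota {
        let surplus = (b.held_connections - b.target_quota) as u128;
        Hunger::Overfull { score: surplus * b.oldest_held_for.as_millis() }
    } else {
        Hunger::Satisfied
    }
}

fn main() {
    let b = BlockSnapshot {
        target_quota: 4,
        held_connections: 7,
        demand_time: Duration::from_millis(10),
        oldest_held_for: Duration::from_secs(30),
    };
    match classify(&b) {
        Hunger::Hungry { score } => println!("hungry, score {score}"),
        Hunger::Satisfied => println!("satisfied"),
        Hunger::Overfull { score } => println!("overfull, score {score}"),
    }
}
```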

We then use the hunger metric and quota in an attempt to rebalance the pool
proactively to ensure that the connection capacity of each block reflects its
most recent demand profile. Blocks are sorted into lists of hungry and overfull
blocks, and we attempt to transfer connections from the most overfull block to
the most hungry block until we run out of either list. We may not be able to
perform the rebalance fully because of block activity that cannot be
interrupted.
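
The rebalancing pass can be pictured as pairing the two sorted lists until one
is exhausted. The sketch below uses illustrative types; in practice a planned
transfer may be refused when a block's connections are all busy.

```rust
/// Illustrative rebalance plan: pair the hungriest blocks with the most
/// overfull ones and emit one transfer per pair until either list runs out.
fn plan_rebalance(
    mut hungry: Vec<(&'static str, u64)>,   // (block, hunger score)
    mut overfull: Vec<(&'static str, u64)>, // (block, overfull score)
) -> Vec<(&'static str, &'static str)> {
    // Sort so that the highest scores come first.
    hungry.sort_by(|a, b| b.1.cmp(&a.1));
    overfull.sort_by(|a, b| b.1.cmp(&a.1));
    hungry
        .iter()
        .zip(overfull.iter())
        .map(|(h, o)| (o.0, h.0)) // transfer: from overfull block to hungry block
        .collect()
}

fn main() {
    let transfers = plan_rebalance(
        vec![("app", 90), ("auth", 40)],
        vec![("reports", 70)],
    );
    // Only one overfull block, so only one transfer is planned.
    println!("{transfers:?}"); // [("reports", "app")]
}
```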

If a connection is requested for a block that is hungry, it is allowed to steal
a connection from the block that is most overfull and has idle connections. As
the "overfull" score is calculated in part from the longest-held connection's
age, we minimize context switching.

When a connection is released, we choose what happens based on its state. If
more acquire requests are waiting on this block, we return the connection to
the block to be re-used immediately. If no acquires are waiting but the block
is hungry, we return it. If the block is satisfied or overfull and we have
hungry blocks waiting, we transfer it to a hungry block that has waiters.
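
A sketch of this release-time decision tree, with illustrative state and names
(not the crate's actual types):

```rust
/// Illustrative state consulted when a connection is released.
struct ReleaseContext {
    /// Acquire requests currently waiting on the releasing block.
    waiters_here: usize,
    /// Whether the releasing block is still below its quota.
    block_is_hungry: bool,
    /// A hungry block elsewhere that has waiters, if any.
    hungry_block_with_waiters: Option<&'static str>,
}

enum ReleasePlan {
    /// Hand the connection straight to the next waiter on this block.
    ReuseHere,
    /// Keep the connection attached to this block for future use.
    KeepHere,
    /// Move the connection to a hungry block that has waiters.
    TransferTo(&'static str),
    /// Nobody needs it right now; let it sit idle (or be reclaimed later).
    Idle,
}

fn plan_release(ctx: &ReleaseContext) -> ReleasePlan {
    if ctx.waiters_here > 0 {
        ReleasePlan::ReuseHere
    } else if ctx.block_is_hungry {
        ReleasePlan::KeepHere
    } else if let Some(target) = ctx.hungry_block_with_waiters {
        ReleasePlan::TransferTo(target)
    } else {
        ReleasePlan::Idle
    }
}

fn main() {
    let ctx = ReleaseContext {
        waiters_here: 0,
        block_is_hungry: false,
        hungry_block_with_waiters: Some("app"),
    };
    match plan_release(&ctx) {
        ReleasePlan::ReuseHere => println!("reuse on this block"),
        ReleasePlan::KeepHere => println!("keep on this block"),
        ReleasePlan::TransferTo(b) => println!("transfer to {b}"),
        ReleasePlan::Idle => println!("leave idle"),
    }
}
```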

## Error Handling

The pool will attempt to provide a connection where possible, but connection
operations may not always be reliable. The error for a connection failure will
be routed through the acquire operation if the pool detects there are no other
potential sources for a connection for the acquire. Sources for a connection
may be a currently-connecting connection, a reconnecting connection, a
connection that is actively held by someone else, or a connection that is
sitting idle.

The pool does not currently retry, and retry logic should be included in the
connect operation.
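
A sketch of the routing rule above: a connect failure is surfaced to a waiting
acquire only when no other potential source of a connection remains. The types
here are illustrative assumptions, not the crate's API.

```rust
/// Illustrative view of where a connection for a waiting acquire could
/// still come from when a connect attempt fails.
struct BlockSources {
    connecting: usize,   // connections currently being established
    reconnecting: usize, // connections being re-established
    held: usize,         // connections actively held by other callers
    idle: usize,         // connections sitting idle on the block
}

/// Route the connect error to the acquirer only if there is no other
/// potential source of a connection; otherwise keep the acquire waiting.
fn should_surface_error(sources: &BlockSources) -> bool {
    sources.connecting + sources.reconnecting + sources.held + sources.idle == 0
}

fn main() {
    let sources = BlockSources { connecting: 0, reconnecting: 1, held: 0, idle: 0 };
    // A reconnecting connection might still succeed, so the acquire keeps waiting.
    assert!(!should_surface_error(&sources));
}
```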
