feat: add connection timeout, TCP keepalive, and pool health features#283
Conversation
061ce14 to
86757fa
Compare
madchicken
left a comment
There was a problem hiding this comment.
It looks good on my side. @knutwalker please, take a look and merge it if you agree
neo-walker
left a comment
There was a problem hiding this comment.
Thanks for the PR, this looks good. I only got a small suggestions for API improvements.
For the builds, can you run cargo fmt and cargo xtask msrv min (might have to install jq for this) and commit the changes?
FYI, I am using a different account here, but I am still @knutwalker -- trying to get the old account back into this repo though :) |
4c5f15b to
d9a418f
Compare
I can't get fn pin_msrv_versions(dry_run: bool, sh: &Shell, cargo: &str, lockfile: &str) -> Result<()> {
cmd!(sh, "rm {lockfile}").run_if(dry_run)?;
let pin_versions: &[(&str, &str)] = &[
("backon", "1.5.2"), // 1.6.0 bumped MSRV to 1.85
("idna_adapter", "1.2.0"), // 1.2.1 requires 1.82, transitive from url
("nalgebra", "0.32.6"), // transitive requirement from nav_types,
+ ("serde_with", "3.16.1"), // 3.17+ bumped MSRV to 1.82; 3.18+ requires time ~0.3.47 (edition2024)
+ ("time", "0.3.44"), // 0.3.45+ pulls time-macros >=0.2.25 which needs edition2024
+ ("uuid", "1.20.0"), // 1.21+ requires getrandom 0.4 which needs edition2024
+ ("deranged", "0.5.5"), // 0.5.6+ bumped MSRV to 1.85
];Not sure whether I should include this or leave it be. I did commit the resulting changes to the codebase from running that, but I haven't pushed the |
|
Thanks, I'll take another look
Can you please commit those as well? I try to keep the MSRV at or below what debian stable ships, which is 1.85 currently, but for now we're still on 1.81. Those lockfiles and the version pins in the task are how we test MSRV compatibility. With the ecosystem moving along, we have to keep maintaining those. |
|
@neo-walker just pushed! |
|
@ajmeese7 I noticed that you added socket2 in version 0.5 to the Cargo.toml, the latest is 0.6 (0.6.3), which also works with the MSRV. Is there a particular reason you chose 0.5 over 0.6? |
|
Also looks like |
|
@ajmeese7 I just merged an update to the lockfiles and the version pins, can you rebase your PR? You can ignore the changes in the main.rs and the lockfiles from your side, and then re-run the xtask to update the lockfiles for the new dependency |
|
@neo-walker will do, have been without internet for a bit but it should be repaired soon! |
- Add connection_timeout (default 30s) wrapping TcpStream::connect and recv - Add tcp_keepalive (default 60s) via socket2 on new connections - Add idle_timeout and max_lifetime config options wired to deadpool - Add 5s timeout on pool recycle() to prevent broken connection poisoning - Add ConnectionTimedOut error variant Fixes indefinite hangs when TCP connections become half-open (container restarts, network interruptions), which previously exhausted the deadpool connection pool permanently.
- Add 7 unit tests covering config defaults, custom timeouts, pool creation, and error display - Update lib.rs Configurations doc section with new options - All 290 tests pass
Rebase onto upstream/main which includes MSRV pin updates (#284). Re-ran `cargo xtask msrv min` to regenerate lockfiles and added missing timeout/keepalive fields to Config initializers in tests.
b95a26c to
91994db
Compare
|
Should be handled @neo-walker, let me know if I need to do anything else |
Problem
neo4rs has no mechanism to detect or recover from broken TCP connections. When a connection to Neo4j becomes half-open (peer closes, network interruption, container restart),
recv()inconnection.rshangs indefinitely becauseTcpStreamhas no read timeout configured. This poisons the deadpool connection pool:Graph::run()acquires a connection from deadpoolsend()— succeeds (OS buffers the write)recv()— hangs forever (peer is gone, no TCP RST received)recycle()callsConnection::reset()→send_recv()→ also hangs onrecv()There is no workaround within neo4rs — users must wrap every
graph.run()/graph.execute()call intokio::time::timeout, which is error-prone and doesn't prevent pool poisoning.Fix
Connection timeouts
TcpStream::connect()andrecv()intokio::time::timeout(default 30s, configurable viaConfigBuilder::connection_timeout)Error::ConnectionTimedOutvariant for timeout errorsTCP keepalive
socket2(default 60s interval, 10s probe interval, 3 retries)ConfigBuilder::tcp_keepalive, set toNoneto disablePool health
recycle()in a 5-second timeout — broken connections are evicted instead of blocking the pool foreveridle_timeoutandmax_lifetimethrough to deadpool'sPool::builder(), allowing automatic eviction of stale/old connectionsNew
ConfigBuilderoptionsTests
7 new unit tests covering config defaults, custom timeout values, pool creation with timeout options, and error display. All 290 tests pass.
Dependencies
socket2 = "0.5"(withallfeatures) for TCP keepalive configuration