feat: add connection timeout, TCP keepalive, and pool health features #283

Merged: neo-walker merged 5 commits into neo4j-labs:main from ajmeese7:feat/connection-timeouts on Apr 7, 2026

Conversation

@ajmeese7
Contributor

@ajmeese7 ajmeese7 commented Mar 26, 2026

Problem

neo4rs has no mechanism to detect or recover from broken TCP connections. When a connection to Neo4j becomes half-open (peer closes, network interruption, container restart), recv() in connection.rs hangs indefinitely because TcpStream has no read timeout configured. This poisons the deadpool connection pool:

  1. Graph::run() acquires a connection from deadpool
  2. Sends the Bolt request via send() — succeeds (OS buffers the write)
  3. Waits for response via recv() → hangs forever (peer is gone, no TCP RST received)
  4. Deadpool's recycle() calls Connection::reset() → send_recv() → also hangs on recv()
  5. The connection is never returned, permanently reducing available connections
  6. With enough broken connections, the pool is exhausted and all operations deadlock

There is no workaround within neo4rs — users must wrap every graph.run()/graph.execute() call in tokio::time::timeout, which is error-prone and doesn't prevent pool poisoning.
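The hang in step 3 can be reproduced without neo4rs. The sketch below is only an analogue of the neo4rs code path: it uses blocking std::net sockets rather than tokio, and read_from_silent_peer and is_timeout are hypothetical helper names. It connects to a local listener that never replies and shows that a read only returns once a read timeout is set.

```rust
use std::io::{self, Read};
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

/// Connects to a local listener that never writes anything, then tries to read.
/// With `timeout = None` this read would block forever, like the `recv()` above.
fn read_from_silent_peer(timeout: Option<Duration>) -> io::Result<usize> {
    // Bind to an ephemeral port. The listener stays alive for the whole
    // function but never accepts or answers, so the peer is "silent".
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut stream = TcpStream::connect(listener.local_addr()?)?;

    // Without this call there is no upper bound on how long `read` may block.
    stream.set_read_timeout(timeout)?;

    let mut buf = [0u8; 16];
    stream.read(&mut buf)
}

/// std reports an expired read timeout as `WouldBlock` on Unix and
/// typically as `TimedOut` on Windows, so both are accepted here.
fn is_timeout(err: &io::Error) -> bool {
    matches!(
        err.kind(),
        io::ErrorKind::WouldBlock | io::ErrorKind::TimedOut
    )
}
```

Calling read_from_silent_peer(Some(Duration::from_millis(200))) should return an error for which is_timeout is true; passing None would never return.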

Fix

Connection timeouts

  • Wrap TcpStream::connect() and recv() in tokio::time::timeout (default 30s, configurable via ConfigBuilder::connection_timeout)
  • New Error::ConnectionTimedOut variant for timeout errors
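As a rough illustration of the two bullets above, the following std-only sketch bounds both the connect and later reads, and maps timeout-like I/O errors to a dedicated variant. It is not the neo4rs implementation (which wraps async calls in tokio::time::timeout): the Error enum, its Display text, and connect_with_timeout are assumptions made for the example, and only the ConnectionTimedOut name comes from the PR.

```rust
use std::fmt;
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical error type; only the `ConnectionTimedOut` name is from the PR.
#[derive(Debug)]
enum Error {
    ConnectionTimedOut,
    Io(io::Error),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The message text is an assumption, not the crate's wording.
            Error::ConnectionTimedOut => write!(f, "connection timed out"),
            Error::Io(e) => write!(f, "I/O error: {e}"),
        }
    }
}

impl From<io::Error> for Error {
    fn from(e: io::Error) -> Self {
        match e.kind() {
            // Expired socket timeouts surface as one of these two kinds.
            io::ErrorKind::TimedOut | io::ErrorKind::WouldBlock => Error::ConnectionTimedOut,
            _ => Error::Io(e),
        }
    }
}

/// Connects with an upper bound on the handshake and applies the same
/// bound to every later blocking read on the returned stream.
fn connect_with_timeout(addr: SocketAddr, timeout: Duration) -> Result<TcpStream, Error> {
    let stream = TcpStream::connect_timeout(&addr, timeout)?;
    stream.set_read_timeout(Some(timeout))?;
    Ok(stream)
}
```

Keeping the timeout as its own variant lets callers distinguish "peer did not answer in time" from other I/O failures and decide whether to retry.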

TCP keepalive

  • Configure OS-level TCP keepalive on new connections via socket2 (default 60s interval, 10s probe interval, 3 retries)
  • Configurable via ConfigBuilder::tcp_keepalive, set to None to disable
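To get a feel for what these defaults mean in practice, here is a small worked calculation. It assumes the "60s interval" is the idle time before the first probe and that the OS follows the usual idle + interval × retries model; exact timing varies by platform. The struct and function names are hypothetical.

```rust
use std::time::Duration;

/// Keepalive parameters as listed in the PR text (field names are hypothetical).
struct Keepalive {
    /// Quiet time on the connection before the first probe is sent (assumed meaning of "60s").
    idle: Duration,
    /// Gap between consecutive unanswered probes.
    interval: Duration,
    /// Number of unanswered probes before the OS declares the peer dead.
    retries: u32,
}

/// Approximate time until the OS reports a silent peer as dead:
/// the idle period plus one probe interval per retry.
fn approx_detection_time(k: &Keepalive) -> Duration {
    k.idle + k.interval * k.retries
}

/// The defaults quoted in the PR description.
fn pr_defaults() -> Keepalive {
    Keepalive {
        idle: Duration::from_secs(60),
        interval: Duration::from_secs(10),
        retries: 3,
    }
}
```

Under these assumptions the defaults give 60s + 3 × 10s = 90s before a dead peer is noticed by keepalive alone, which is why the shorter read timeout is still useful.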

Pool health

  • Wrap recycle() in a 5-second timeout — broken connections are evicted instead of blocking the pool forever
  • Wire idle_timeout and max_lifetime through to deadpool's Pool::builder(), allowing automatic eviction of stale/old connections
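The two pool-health ideas can be sketched independently of deadpool. The code below is an assumption-laden illustration, not the PR's code: recycle_with_timeout uses a helper thread and a channel in place of an async timeout, and should_evict is a hypothetical rule for how idle_timeout and max_lifetime might be applied (whether the real pool compares with >= or > is not stated in the PR).

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs a health check on a helper thread and waits at most `limit` for it.
/// Returns `true` only if the check finished in time and reported healthy;
/// a check that hangs is treated as a failed recycle.
fn recycle_with_timeout<F>(check: F, limit: Duration) -> bool
where
    F: FnOnce() -> bool + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the caller already gave up, the send fails and is ignored.
        let _ = tx.send(check());
    });
    rx.recv_timeout(limit).unwrap_or(false)
}

/// Hypothetical eviction rule: a connection is dropped once it is older than
/// `max_lifetime` or has been unused for `idle_timeout`. `None` disables a limit.
fn should_evict(
    age: Duration,
    idle: Duration,
    max_lifetime: Option<Duration>,
    idle_timeout: Option<Duration>,
) -> bool {
    max_lifetime.map_or(false, |max| age >= max)
        || idle_timeout.map_or(false, |max| idle >= max)
}
```

In the blocking version a hung check leaves its helper thread behind; with an async timeout the pending future is simply dropped, which is one reason the async approach fits a connection pool better.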

New ConfigBuilder options

use std::time::Duration;
use neo4rs::ConfigBuilder;

let config = ConfigBuilder::new()
    .uri("bolt://localhost:7687")
    .user("neo4j")
    .password("password")
    .connection_timeout(Duration::from_secs(10))      // default: 30s
    .tcp_keepalive(Some(Duration::from_secs(30)))     // default: Some(60s)
    .idle_timeout(Some(Duration::from_secs(300)))     // default: None
    .max_lifetime(Some(Duration::from_secs(3600)))    // default: None
    .build()
    .unwrap();

Tests

7 new unit tests covering config defaults, custom timeout values, pool creation with timeout options, and error display. All 290 tests pass.

Dependencies

  • Added socket2 = "0.5" (with all features) for TCP keepalive configuration

@ajmeese7 force-pushed the feat/connection-timeouts branch from 061ce14 to 86757fa on March 26, 2026 19:25
Collaborator

@madchicken madchicken left a comment

It looks good on my side. @knutwalker, please take a look and merge it if you agree.

Collaborator

@neo-walker neo-walker left a comment

Thanks for the PR, this looks good. I only have a few small suggestions for API improvements.

For the builds, can you run cargo fmt and cargo xtask msrv min (might have to install jq for this) and commit the changes?

3 comment threads on lib/src/config.rs (outdated)
@neo-walker
Collaborator

> It looks good on my side. @knutwalker, please take a look and merge it if you agree.

FYI, I am using a different account here, but I am still @knutwalker -- trying to get the old account back into this repo though :)

@ajmeese7 force-pushed the feat/connection-timeouts branch from 4c5f15b to d9a418f on March 31, 2026 18:35
@ajmeese7
Contributor Author

> Thanks for the PR, this looks good. I only have a few small suggestions for API improvements.
>
> For the builds, can you run cargo fmt and cargo xtask msrv min (might have to install jq for this) and commit the changes?

I can't get cargo xtask msrv min to work with my Rust v1.81 without updating pin_msrv_versions. I was able to get it to build with the following modifications:

fn pin_msrv_versions(dry_run: bool, sh: &Shell, cargo: &str, lockfile: &str) -> Result<()> {
    cmd!(sh, "rm {lockfile}").run_if(dry_run)?;

    let pin_versions: &[(&str, &str)] = &[
        ("backon", "1.5.2"),       // 1.6.0 bumped MSRV to 1.85
        ("idna_adapter", "1.2.0"), // 1.2.1 requires 1.82, transitive from url
        ("nalgebra", "0.32.6"),    // transitive requirement from nav_types,
+       ("serde_with", "3.16.1"),   // 3.17+ bumped MSRV to 1.82; 3.18+ requires time ~0.3.47 (edition2024)
+       ("time", "0.3.44"),        // 0.3.45+ pulls time-macros >=0.2.25 which needs edition2024
+       ("uuid", "1.20.0"),        // 1.21+ requires getrandom 0.4 which needs edition2024
+       ("deranged", "0.5.5"),     // 0.5.6+ bumped MSRV to 1.85
    ];

Not sure whether I should include this or leave it be. I did commit the resulting changes to the codebase from running that, but I haven't pushed the Cargo.lock.msrv, Cargo.lock.min, or main.rs changes.

@neo-walker
Collaborator

Thanks, I'll take another look

> Not sure whether I should include this or leave it be. I did commit the resulting changes to the codebase from running that, but I haven't pushed the Cargo.lock.msrv, Cargo.lock.min, or main.rs changes.

Can you please commit those as well? I try to keep the MSRV at or below what Debian stable ships, which is 1.85 currently, but for now we're still on 1.81.

Those lockfiles and the version pins in the task are how we test MSRV compatibility. With the ecosystem moving along, we have to keep maintaining those.
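For readers unfamiliar with this kind of pinning, one common way to hold a dependency at a specific version in a lockfile is `cargo update -p <name> --precise <version>`. Whether the xtask uses exactly this command is an assumption; the sketch below (with a hypothetical pin_commands helper) only shows how a pin table like the one quoted earlier could be turned into such invocations.

```rust
/// Builds the argument list for `cargo update -p <name> --precise <version>`
/// for each (crate, version) pin. Running the commands is left to the caller.
fn pin_commands(pins: &[(&str, &str)]) -> Vec<Vec<String>> {
    pins.iter()
        .map(|&(name, version)| {
            ["update", "-p", name, "--precise", version]
                .iter()
                .map(|s| s.to_string())
                .collect()
        })
        .collect()
}
```

Each inner vector could then be passed as arguments to a `cargo` process, for example via std::process::Command, once per pinned crate.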

@ajmeese7
Contributor Author

ajmeese7 commented Apr 1, 2026

@neo-walker just pushed!

@neo-walker
Collaborator

@ajmeese7 I noticed that you added socket2 in version 0.5 to the Cargo.toml. The latest is 0.6 (0.6.3), which also works with the MSRV. Is there a particular reason you chose 0.5 over 0.6?

@neo-walker
Collaborator

Also looks like security-framework has to be pinned to 3.6.0 for MSRV

@neo-walker
Collaborator

@ajmeese7 I just merged an update to the lockfiles and the version pins, can you rebase your PR? You can ignore the changes in the main.rs and the lockfiles from your side, and then re-run the xtask to update the lockfiles for the new dependency

@ajmeese7
Contributor Author

ajmeese7 commented Apr 2, 2026

@neo-walker will do, have been without internet for a bit but it should be repaired soon!

ajmeese7 added 5 commits April 2, 2026 19:54

- Add connection_timeout (default 30s) wrapping TcpStream::connect and recv
- Add tcp_keepalive (default 60s) via socket2 on new connections
- Add idle_timeout and max_lifetime config options wired to deadpool
- Add 5s timeout on pool recycle() to prevent broken connection poisoning
- Add ConnectionTimedOut error variant

Fixes indefinite hangs when TCP connections become half-open (container
restarts, network interruptions), which previously exhausted the
deadpool connection pool permanently.

- Add 7 unit tests covering config defaults, custom timeouts, pool
  creation, and error display
- Update lib.rs Configurations doc section with new options
- All 290 tests pass

Rebase onto upstream/main which includes MSRV pin updates (#284).
Re-ran `cargo xtask msrv min` to regenerate lockfiles and added
missing timeout/keepalive fields to Config initializers in tests.

@ajmeese7 force-pushed the feat/connection-timeouts branch from b95a26c to 91994db on April 2, 2026 23:56
@ajmeese7
Contributor Author

ajmeese7 commented Apr 6, 2026

Should be handled @neo-walker, let me know if I need to do anything else

@neo-walker neo-walker merged commit 45a649b into neo4j-labs:main Apr 7, 2026
10 checks passed
@ajmeese7 ajmeese7 deleted the feat/connection-timeouts branch April 7, 2026 12:51