
Add closed connection tracking with close reasons to DIAG HTTP server #55

@joamag

Description


Problem (Why)

In the infra-bemisc deployment, client connections from omni-gateway (Deno) to ldj.frontdoorhd.com (behind haproxy + netius proxy_c) are killed every 5-10 minutes without warning, despite KEEPALIVE_TIMEOUT=3600. HAProxy's idle timeout is set to 2 hours, ruling it out as the cause. Diagnosing this is difficult because the DIAG HTTP server only exposes currently active connections: once a connection closes, all context is lost. Log-based diagnostics have proven inefficient given the high traffic volume. We need a way to inspect recently closed connections and their close reasons via the DIAG HTTP endpoint to identify the root cause of these disconnections.

Description (What)

Add a ring buffer of recently closed connections to the DIAG system, capturing close reason, timestamps, duration, last activity time, error details, and paired connection ID (for proxy correlation). Expose this via a new GET /connections/closed endpoint on DiagApp. Close reasons will be string constants (e.g., "timeout", "client_eof", "upstream_error", "error", "explicit"). The ring buffer defaults to 512 entries, configurable via DIAG_CLOSED_MAX. Tracking is active when running under DIAG mode. The endpoint returns the full buffer, most recent first.
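The ring buffer semantics described above can be sketched as follows. This is an illustrative standalone sketch, not netius's actual API: the class name `ClosedConnectionBuffer` and the environment-variable lookup are assumptions (the real implementation would read `DIAG_CLOSED_MAX` through the netius conf system).

```python
import collections
import os
import time

# Default ring size, matching the proposed DIAG_CLOSED_MAX default.
DIAG_CLOSED_MAX_DEFAULT = 512


class ClosedConnectionBuffer:
    """Illustrative ring buffer of closed-connection snapshots.
    Hypothetical name; netius would size this via its conf system."""

    def __init__(self, max_size=None):
        if max_size is None:
            max_size = int(os.environ.get("DIAG_CLOSED_MAX", DIAG_CLOSED_MAX_DEFAULT))
        # deque(maxlen=N) silently evicts the oldest entry on overflow
        self._entries = collections.deque(maxlen=max_size)

    def push(self, snapshot):
        # snapshot is a plain dict captured at close time (the connection's
        # info_dict() plus close metadata), so it outlives the connection object
        snapshot.setdefault("close_timestamp", time.time())
        self._entries.append(snapshot)

    def list(self):
        # full buffer contents, most recent first, as the endpoint would return
        return list(reversed(self._entries))
```

With `maxlen` set, overflow handling comes for free from `collections.deque`: appending to a full deque drops the entry at the opposite end, so the buffer always holds the N most recently closed connections.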

Implementation (How)

  1. Define close reason constants in src/netius/base/conn.py — add the string constants "timeout", "client_eof", "upstream_error", "error", "explicit", and "unknown" (plus others as needed)
  2. Add a close_reason field to BaseConnection — initialize it to None, set it before close() is called; include close_reason, close_timestamp, and last_activity_timestamp in the info_dict() output
  3. Implement a ring buffer for closed connections in src/netius/base/common.py (or a new utility) — use collections.deque(maxlen=N) sized from the DIAG_CLOSED_MAX conf (default 512); store a snapshot dict of the connection's info_dict() plus close metadata at close time
  4. Hook into Base.on_connection_d() — when DIAG is active, capture the closed connection's info dict (close reason, close timestamp, connection duration, last activity timestamp, error details) and append it to the ring buffer
  5. Propagate close reasons at all close call sites — audit BaseConnection.close(), the timeout handlers, and the EOF/error handlers in src/netius/base/common.py, ensuring each sets close_reason before closing
  6. Propagate close reasons in the proxy server — in src/netius/servers/proxy.py, set appropriate close reasons in _on_prx_close() (upstream error), _on_raw_close() (tunnel close), on_connection_d(), and on_stream_d(); include the paired/correlated connection ID in the close metadata
  7. Add the paired connection ID to proxy close records — when a proxy frontend or backend connection closes, include the paired connection's ID (from conn_map) in the close snapshot so frontend/backend closures can be correlated
  8. Add a GET /connections/closed endpoint to DiagApp in src/netius/base/diag.py — return the full ring buffer contents as JSON, most recent first
  9. Add DIAG_CLOSED_MAX conf support — read it from the netius conf system, defaulting to 512, and use it to size the deque
  10. Test — add tests for ring buffer behavior (overflow, ordering), close reason propagation, and the new DIAG endpoint
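Steps 1-2 and 4-7 above can be sketched together. The `Connection` class below is a hypothetical stand-in for BaseConnection showing only the proposed new fields and the close-time snapshot; it is not netius's real connection API, and the constant names are assumptions based on the string values listed in the issue.

```python
import time

# Proposed close reason constants (step 1); the string values come from the issue.
REASON_TIMEOUT = "timeout"
REASON_CLIENT_EOF = "client_eof"
REASON_UPSTREAM_ERROR = "upstream_error"
REASON_ERROR = "error"
REASON_EXPLICIT = "explicit"
REASON_UNKNOWN = "unknown"


class Connection:
    """Hypothetical stand-in for BaseConnection, illustrating only the
    new close-tracking fields, not the real netius class."""

    def __init__(self, id, paired_id=None):
        self.id = id
        self.paired_id = paired_id  # correlated proxy connection (from conn_map)
        self.open_timestamp = time.time()
        self.last_activity_timestamp = self.open_timestamp
        self.close_reason = None  # set by the close call site (step 5)
        self.close_timestamp = None
        self.error = None

    def close(self, reason=None, error=None):
        # call sites pass a reason; anything unattributed falls back to "unknown"
        self.close_reason = reason or self.close_reason or REASON_UNKNOWN
        self.close_timestamp = time.time()
        self.error = error

    def info_dict(self):
        # the snapshot dict appended to the ring buffer at close time (step 4)
        return dict(
            id=self.id,
            paired_id=self.paired_id,
            close_reason=self.close_reason,
            close_timestamp=self.close_timestamp,
            last_activity_timestamp=self.last_activity_timestamp,
            duration=(self.close_timestamp or time.time()) - self.open_timestamp,
            error=self.error,
        )
```

Because the snapshot is a plain dict taken at close time, the DIAG endpoint can serve it as JSON long after the connection object itself has been garbage collected, and the `paired_id` field lets a frontend closure be matched to its backend counterpart.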

Metadata


Assignees

Labels

enhancement (New feature or request), p-high (High priority issue)
