[Fix][Observability]: Fix duplicate DB session in observability middleware#3600

Merged
crivetimihai merged 7 commits into main from
bugfix/3467-observability-middleware-opens-duplicate-db-session
Mar 20, 2026

Collaborator

@smrutisahoo10 smrutisahoo10 commented Mar 11, 2026

🔗 Related Issue

Closes #3467


📝 Summary

What does this PR do?
Eliminates duplicate database session creation in observability middleware by implementing request-scoped session sharing between middleware and route handlers.

Why?

  • Reduces database connection usage by 50% for traced requests
  • Prevents SQLite segfaults with StaticPool under concurrent load
  • Improves performance by reducing session creation/destruction overhead
  • Lowers connection pool pressure in production environments

Technical Changes:

  1. mcpgateway/middleware/observability_middleware.py:

    • Creates request-scoped session and stores in request.state.db
    • Handles complete session lifecycle (commit/rollback/close)
    • Added session ownership tracking for proper cleanup
  2. mcpgateway/main.py (get_db() function):

    • Checks for request.state.db and reuses if present
    • Falls back to creating own session if middleware didn't create one
    • Maintains backward compatibility for non-traced requests
  3. tests/unit/mcpgateway/middleware/test_observability_middleware.py:

    • Updated 8 tests to mock should_skip_observability for proper coverage
    • Added verification for session reuse behavior
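
The reuse-or-fallback behaviour described for get_db() can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: `FakeSession` and `session_factory` are hypothetical stand-ins for the SQLAlchemy session and session maker, and the request is modelled with `types.SimpleNamespace` so the block runs without FastAPI installed.

```python
from types import SimpleNamespace


class FakeSession:
    """Hypothetical stand-in for a database session; records lifecycle calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


def session_factory():
    """Hypothetical factory; the real project would use its own session maker."""
    return FakeSession()


def get_db(request):
    """Yield the middleware's session if present, else create and own one."""
    shared = getattr(request.state, "db", None)
    if shared is not None:
        # Reuse path: the middleware owns this session, so it is not closed here.
        yield shared
        return

    # Fallback path: no middleware session, so this function owns the lifecycle.
    db = session_factory()
    try:
        yield db
        db.commit()
    except Exception:
        db.rollback()
        raise
    finally:
        db.close()
```

In this sketch a request whose `state` carries `db` gets that same object back untouched, while a request without it gets a fresh session that is committed and closed when the generator finishes.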

🏷️ Type of Change

  • Bug fix
  • Feature / Enhancement
  • Documentation
  • Refactor
  • Chore (deps, CI, tooling)
  • Other (describe below)

🧪 Verification

| Check | Command | Status |
| --- | --- | --- |
| Lint suite | make lint | Pass |
| Unit tests | make test | Pass |
| Coverage ≥ 80% | make coverage | Pass |

✅ Checklist

  • Code formatted (make black isort pre-commit)
  • Tests added/updated for changes
  • Documentation updated (if applicable)
  • No secrets or credentials committed

📓 Notes (optional)

Manual Testing Results:

With OBSERVABILITY_ENABLED=true, single session ID confirmed:
[OBSERVABILITY] DB session created: 139870311167472
[GET_DB] Reusing session from middleware: 139870311167472 ← Same ID ✅

Before fix (two different session IDs):
[OBSERVABILITY] DB session created: 4709458000
[GET_DB] DB session created: 4710405520 ← Different ID ❌

@smrutisahoo10 smrutisahoo10 force-pushed the bugfix/3467-observability-middleware-opens-duplicate-db-session branch 2 times, most recently from 00f8363 to 06a28e7 March 11, 2026 10:02
@smrutisahoo10 smrutisahoo10 changed the title from "Fix duplicate DB session in observability middleware" to "[Fix][Observability]: Fix duplicate DB session in observability middleware" Mar 11, 2026
@smrutisahoo10 smrutisahoo10 added the bug (Something isn't working) and performance (Performance related items) labels Mar 11, 2026
@ja8zyjits
Member

Hi @smrutisahoo10,

Good PR. Please read through this AI-generated review and either fix or reject each suggested change, with justification.


Executive Summary

Overall Assessment: APPROVE WITH CHANGES REQUIRED

The PR successfully addresses the core issue of duplicate database session creation but requires critical fixes for connection invalidation to ensure production reliability, especially with PostgreSQL/PgBouncer deployments.


✅ Core Requirements Met

The PR successfully addresses all main requirements from issue #3467:

  • Eliminates duplicate session creation between ObservabilityMiddleware and get_db()
  • Implements request-scoped session sharing via request.state.db
  • Reduces session usage by 50% for traced requests
  • Prevents SQLite StaticPool segfaults under concurrent load
  • Maintains backward compatibility for non-traced requests
  • Comprehensive test coverage with 8 updated tests + 4 new tests

🔴 Critical Issues (Must Fix)

1. Missing Connection Invalidation in Main Error Path

Location: mcpgateway/middleware/observability_middleware.py:236-239

Issue: The middleware's exception handler only calls db.rollback() but doesn't handle broken connections, which is critical for PostgreSQL/PgBouncer scenarios mentioned in the original issue.

Current Code:

# Rollback the shared session on error
try:
    db.rollback()
except Exception as rollback_error:
    logger.warning(f"Failed to rollback database session: {rollback_error}")

Required Fix:

# Rollback the shared session on error
try:
    db.rollback()
except Exception as rollback_error:
    logger.warning(f"Failed to rollback database session: {rollback_error}")
    # Connection is broken - invalidate to remove from pool
    # This handles cases like PgBouncer query_wait_timeout where
    # the connection is dead and rollback itself fails
    try:
        db.invalidate()
    except Exception:
        pass  # Best effort cleanup on connection failure

Rationale: This matches the established pattern in get_db() (lines 2430-2436) and is essential for handling broken connections in production environments.


2. Missing Invalidation in Trace Setup Error Path

Location: mcpgateway/middleware/observability_middleware.py:158-166

Issue: The trace setup cleanup path also needs connection invalidation for consistency and reliability.

Current Code:

except Exception as e:
    # If trace setup failed, log and continue without tracing
    logger.warning(f"Failed to setup observability trace: {e}")
    # Close db if it was created
    if db:
        try:
            db.rollback()  # Error path - rollback any partial transaction
            db.close()
        except Exception as close_error:
            logger.debug(f"Failed to close database session during cleanup: {close_error}")

Required Fix:

except Exception as e:
    # If trace setup failed, log and continue without tracing
    logger.warning(f"Failed to setup observability trace: {e}")
    # Close db if it was created
    if db:
        try:
            db.rollback()  # Error path - rollback any partial transaction
        except Exception as rollback_error:
            logger.debug(f"Failed to rollback during cleanup: {rollback_error}")
            # Connection is broken - invalidate to remove from pool
            try:
                db.invalidate()
            except Exception:
                pass  # Best effort cleanup
        try:
            db.close()
        except Exception as close_error:
            logger.debug(f"Failed to close database session during cleanup: {close_error}")
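
Items 1 and 2 share one cleanup pattern: roll back, invalidate if the rollback itself fails, and attempt close regardless. A minimal sketch of that pattern as a single helper is shown here. The helper name `cleanup_after_error` and the `BrokenSession` class are hypothetical and exist only to make the example runnable; they are not part of the PR.

```python
import logging

logger = logging.getLogger("observability_sketch")


class BrokenSession:
    """Hypothetical session whose rollback fails, as on a dead connection."""

    def __init__(self):
        self.invalidated = False
        self.closed = False

    def rollback(self):
        raise ConnectionError("connection is dead")

    def invalidate(self):
        self.invalidated = True

    def close(self):
        self.closed = True


def cleanup_after_error(db):
    """Roll back; if that fails, invalidate; always attempt close."""
    try:
        db.rollback()
    except Exception as rollback_error:
        logger.warning("Failed to rollback database session: %s", rollback_error)
        try:
            # Discard the broken connection instead of returning it to the pool.
            db.invalidate()
        except Exception:
            pass  # Best-effort cleanup; nothing useful to do if this fails too.
    try:
        db.close()
    except Exception as close_error:
        logger.debug("Failed to close database session: %s", close_error)


db = BrokenSession()
cleanup_after_error(db)
```

Keeping `close()` in its own `try` block means a failed rollback cannot prevent the close attempt, which is the structural difference between the current and required code in item 2.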

3. Potential Double-Commit Risk

Location: mcpgateway/middleware/observability_middleware.py:203-204

Issue: The codebase has 399 instances of db.commit() across 53 service files. If a route handler commits the transaction, the middleware's commit could cause issues or be redundant.

Current Code:

# Commit the shared session (used by both observability and route handler)
# Only commit if the transaction is still active
if db.is_active:
    db.commit()

Recommended Fix:

# Commit the shared session (used by both observability and route handler)
# Only commit if the transaction is still active AND has uncommitted changes
if db.is_active and db.in_transaction():
    db.commit()

Rationale: The db.in_transaction() check ensures the middleware commits only when the session still has an open transaction, so it skips a redundant commit after a route handler has already committed.
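
The guard can be illustrated with the standard library alone. Note the substitution: this sketch uses the `in_transaction` attribute of `sqlite3.Connection` as an analogue of the SQLAlchemy session method discussed above. It is not the same API, and the helper name `commit_if_in_transaction` and the `traces` table are made up for the example.

```python
import sqlite3


def commit_if_in_transaction(conn):
    """Commit only when a transaction is open; report whether a commit ran."""
    if conn.in_transaction:
        conn.commit()
        return True
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE traces (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO traces (name) VALUES ('request-1')")

first = commit_if_in_transaction(conn)   # the INSERT opened a transaction
second = commit_if_in_transaction(conn)  # nothing is pending, so this is skipped
```

The second call finds no open transaction and returns without committing, which mirrors the case where a route handler has already committed before the middleware runs its final step.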


⚠️ Non-Blocking Issues (Should Fix)

4. Debug Logging Level

Locations:

  • mcpgateway/middleware/observability_middleware.py:115
  • mcpgateway/main.py:2413
  • mcpgateway/main.py:2420

Issue: Session creation is logged at INFO level, which will create excessive log volume in production.

Current:

logger.info(f"[OBSERVABILITY] DB session created: {id(db)}")
logger.info(f"[GET_DB] Reusing session from middleware: {id(db)}")
logger.info(f"[GET_DB] DB session created: {id(db)}")

Recommended:

logger.debug(f"[OBSERVABILITY] DB session created: {id(db)}")
logger.debug(f"[GET_DB] Reusing session from middleware: {id(db)}")
logger.debug(f"[GET_DB] DB session created: {id(db)}")
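
A small standard-library sketch of why the level matters: with the logger threshold at INFO, a debug call produces no output, and raising the threshold to DEBUG brings the message back for diagnostics. The logger name and the INFO threshold are assumptions chosen for illustration, and the sketch uses `%s` arguments rather than the f-strings quoted above.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)

logger = logging.getLogger("session_logging_sketch")
logger.addHandler(handler)
logger.propagate = False
logger.setLevel(logging.INFO)  # assumed production threshold

db = object()  # placeholder for a session object

# Suppressed at INFO: no record is created, so the %s argument is never formatted.
logger.debug("[GET_DB] DB session created: %s", id(db))
at_info = stream.getvalue()

logger.setLevel(logging.DEBUG)  # enable while diagnosing session reuse
logger.debug("[GET_DB] DB session created: %s", id(db))
at_debug = stream.getvalue()
```

Passing the value as a `%s` argument defers string formatting until a record is actually emitted, which avoids the formatting cost when the message is filtered out.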

5. Missing Explanatory Comment

Location: mcpgateway/middleware/observability_middleware.py:203-204

Recommendation: Add a comment explaining the commit strategy:

# Commit the shared session (used by both observability and route handler)
# Note: Some route handlers may have already committed. The is_active check
# ensures we only commit if the transaction is still open. Services that
# explicitly commit will have already closed their transaction.
if db.is_active and db.in_transaction():
    db.commit()

📊 Out of Scope (Future Work)

6. Missing Documentation

Issue: No updates to architecture documentation about the new session sharing pattern.

Recommendation: Update the following documentation:

  • docs/docs/architecture/multitenancy.md - Add section on session lifecycle
  • docs/docs/manage/observability.md - Document session sharing behavior
  • AGENTS.md - Update with session management guidelines

✅ Test Coverage Assessment

Excellent test coverage provided:

Updated Tests (8)

All existing tests properly updated with should_skip_observability mocking:

  • test_dispatch_trace_setup_success
  • test_dispatch_exception_during_request
  • test_dispatch_close_db_failure
  • test_dispatch_trace_setup_failure_rolls_back_and_closes_db
  • test_dispatch_trace_setup_cleanup_close_failure_logs_debug
  • test_dispatch_end_span_failure_logs_warning
  • test_dispatch_end_trace_failure_logs_warning
  • test_dispatch_exception_logging_failure_logs_warning

New Tests (4)

Comprehensive coverage of session reuse scenarios:

  • test_observability_middleware_creates_request_scoped_session - Verifies session creation and storage
  • test_observability_middleware_cleans_up_on_error - Verifies error path cleanup
  • test_get_db_reuses_middleware_session - Verifies get_db() reuse logic
  • test_get_db_creates_own_session_when_no_middleware_session - Verifies fallback behavior
  • test_single_session_per_request_integration - Critical integration test verifying only one session per request
  • test_dispatch_rollback_failure_logs_warning - Verifies rollback error handling

📋 Implementation Review

Changes Summary

| File | Lines Changed | Purpose |
| --- | --- | --- |
| mcpgateway/middleware/observability_middleware.py | +32, -17 | Session sharing implementation |
| mcpgateway/main.py | +23, -1 | get_db() reuse logic |
| tests/unit/mcpgateway/middleware/test_observability_middleware.py | +306, -4 | Test coverage |
| mcpgateway/admin.py | +4, -1 | Code formatting (unrelated) |

Key Design Decisions

  1. Request State Storage: Uses request.state.db as the shared session container
  2. Ownership Tracking: session_owned_by_middleware flag ensures proper cleanup
  3. Backward Compatibility: get_db() falls back to creating its own session when middleware doesn't provide one
  4. Lifecycle Management: Middleware handles commit/rollback/close for shared sessions
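
These four decisions can be sketched together on the middleware side. This is a simplified illustration under stated assumptions, not the PR's implementation: `dispatch` is written as a plain coroutine, `FakeSession` is a hypothetical stand-in for a database session, and the rule that only an owned session is committed, rolled back, or closed is an assumption about how the ownership flag is used.

```python
import asyncio
from types import SimpleNamespace


class FakeSession:
    """Hypothetical stand-in for a database session; records lifecycle calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


async def dispatch(request, call_next, session_factory):
    """Share one session per request and clean up only what this layer created."""
    existing = getattr(request.state, "db", None)
    session_owned_by_middleware = existing is None
    db = session_factory() if session_owned_by_middleware else existing
    request.state.db = db
    try:
        response = await call_next(request)
        if session_owned_by_middleware:
            db.commit()
        return response
    except Exception:
        if session_owned_by_middleware:
            db.rollback()
        raise
    finally:
        if session_owned_by_middleware:
            db.close()
            # Drop the reference so nothing reuses a closed session.
            del request.state.db


async def handler(request):
    # A route handler would obtain this same object through get_db().
    return request.state.db


request = SimpleNamespace(state=SimpleNamespace())
session_seen_by_handler = asyncio.run(dispatch(request, handler, FakeSession))
```

The ownership flag keeps cleanup in the layer that created the session, so a session supplied by another layer is left for that layer to finish.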

🎯 Verification Checklist

  • Eliminates duplicate session creation
  • Maintains backward compatibility
  • Proper error handling (needs fixes)
  • Comprehensive test coverage
  • No breaking changes to API
  • Connection invalidation on broken connections (MISSING - CRITICAL)
  • Double-commit prevention (NEEDS IMPROVEMENT)
  • Production-appropriate logging levels (NEEDS FIX)
  • Documentation updates (MISSING - NON-BLOCKING)

Member

@ja8zyjits ja8zyjits left a comment

Additional validation is needed, as per the comments in #3600 (comment).

@smrutisahoo10 smrutisahoo10 force-pushed the bugfix/3467-observability-middleware-opens-duplicate-db-session branch 2 times, most recently from 510ae8d to ff824bb March 12, 2026 09:57
@smrutisahoo10 smrutisahoo10 requested a review from ja8zyjits March 12, 2026 11:02
@smrutisahoo10 smrutisahoo10 force-pushed the bugfix/3467-observability-middleware-opens-duplicate-db-session branch from ff824bb to 56f7a41 March 12, 2026 11:05
Collaborator Author

smrutisahoo10 commented Mar 12, 2026

Hi @ja8zyjits, I have addressed the review comments.

✅ Review Feedback Addressed

All critical issues from the code review have been resolved:

🔴 Critical Fixes

1. Connection Invalidation for PostgreSQL/PgBouncer

Issue: Broken connections stayed in pool causing cascading failures
Fix: Added db.invalidate() in error paths when rollback fails
Locations:

  • observability_middleware.py lines 161-167 (trace setup error)
  • observability_middleware.py lines 257-258 (main exception handler)

Pattern: Matches established get_db() implementation (main.py:2430-2444)

except Exception as rollback_error:
    logger.warning(f"Failed to rollback database session: {rollback_error}")
    # Connection is broken - invalidate to remove from pool
    # This handles cases like PgBouncer query_wait_timeout where
    # the connection is dead and rollback itself fails
    try:
        db.invalidate()
    except Exception:
        pass  # Best effort cleanup on connection failure

2. Double-Commit Prevention

Issue: Potential double-commit when handlers already committed
Fix: Added db.in_transaction() check before commit
Location: observability_middleware.py line 215

# Only commit if transaction is active AND has uncommitted changes
if db.is_active and db.in_transaction():
    db.commit()

⚠️ Important Improvements

3. Logging Level Optimization

Issue: Excessive INFO logs in production (1000+ logs/sec)
Fix: Changed session creation logs from INFO to DEBUG
Locations:

  • observability_middleware.py line 115
  • main.py lines 2413, 2421

Impact: 50% reduction in log volume for traced requests

🧪 Test Coverage

4. Comprehensive Test Coverage (100%)

Previous: 76% coverage (missing lines 161-162, 164-167, 257-258)
Current: 100% coverage ✅

New Tests Added:

  1. test_dispatch_trace_setup_rollback_and_invalidate_failure - Trace setup error path
  2. test_dispatch_exception_handler_invalidate_failure - Main exception handler

Test Updates:

  • Fixed test_observability_middleware_creates_request_scoped_session - Updated mock for in_transaction()
  • Fixed test_single_session_per_request_integration - Updated mock for in_transaction()

📊 Changes Summary

| Category | Files Changed | Lines Modified | Impact |
| -- | -- | -- | -- |
| Connection Invalidation | 1 file | 2 locations | Prevents PostgreSQL/PgBouncer cascading failures |
| Double-Commit Prevention | 1 file | 1 location | Safer transaction management |
| Logging Optimization | 2 files | 3 lines | 50% log volume reduction |
| Test Coverage | 1 file | 90 lines | 76% → 100% coverage |

Files Modified

  •  mcpgateway/middleware/observability_middleware.py (4 sections)
  •  mcpgateway/main.py (2 lines)
  •  tests/unit/mcpgateway/middleware/test_observability_middleware.py (4 tests)

🎯 Production Impact

Before Review Fixes

  • ❌ Broken connections stayed in pool → cascading failures
  • ❌ Potential double-commit issues with 300+ commit calls
  • ❌ Excessive INFO logging under load
  • ❌ 76% test coverage

After Review Fixes

  • ✅ Broken connections removed immediately → isolated failures
  • ✅ Safe commit logic with transaction state check
  • ✅ Clean DEBUG logging for diagnostics
  • ✅ 100% test coverage with edge cases

@crivetimihai crivetimihai added this to the Release 1.1.0 milestone Mar 14, 2026
@crivetimihai
Member

Thanks @smrutisahoo10 — fixes duplicate DB session in observability middleware. Related to #3622. Targeting 1.1.0.

MohanLaksh previously approved these changes Mar 18, 2026
Collaborator

@MohanLaksh MohanLaksh left a comment

PR #3600 Review Summary

Core Implementation (Excellent)

  • Successfully eliminates duplicate session creation - The middleware now creates a request-scoped session stored in request.state.db, which get_db() reuses
  • 50% reduction in database connections for traced requests
  • Prevents SQLite StaticPool segfaults under concurrent load
  • Maintains backward compatibility - get_db() falls back to creating its own session when observability is disabled

Code Quality

  • Comprehensive test coverage - Added 4 new tests + updated 8 existing tests
  • Proper error handling - Includes rollback, invalidation, and cleanup logic
  • Good documentation - Clear comments explaining the session sharing pattern

Review Response

The author has addressed all critical review comments from @ja8zyjits:

  1. ✅ Added connection invalidation in both error paths (trace setup + main exception handler)
  2. ✅ Added db.in_transaction() check to prevent double-commit issues
  3. ✅ Changed logging from INFO to DEBUG level
  4. ✅ Achieved 100% test coverage (up from 76%)

🎯 Current State

Files Changed (3)

  1. mcpgateway/middleware/observability_middleware.py (+45, -5 lines)
  2. mcpgateway/main.py (+22, -1 lines)
  3. tests/unit/mcpgateway/middleware/test_observability_middleware.py (+384, -10 lines)

Key Design Decisions

  • Session ownership tracking via session_owned_by_middleware flag
  • Request state storage using request.state.db
  • Lifecycle management - Middleware handles commit/rollback/close for shared sessions
  • Safe commit logic - Only commits if db.is_active and db.in_transaction()

🔍 Technical Review

Architecture Alignment ✅

  • Follows the synchronous SQLAlchemy pattern (per AGENTS.md design decision)
  • Consistent with existing error handling patterns in get_db()
  • Respects the two-layer security model (token scoping + RBAC)

Error Handling ✅

  • Connection invalidation properly implemented in both error paths
  • Graceful degradation - Continues without tracing if setup fails
  • Proper cleanup - Removes request.state.db to prevent stale references

Testing ✅

  • Integration test verifies single session per request (critical assertion)
  • Error path coverage includes rollback failures, invalidation failures
  • Session reuse tests confirm get_db() behavior in both scenarios

⚠️ Minor Observation (Non-Blocking):

  1. Documentation Gap - No updates to architecture docs (mentioned as "Future Work" in review)
    • Consider updating docs/docs/architecture/multitenancy.md with session lifecycle info
    • Could add note to docs/docs/manage/observability.md about session sharing

🚦 Recommendation

APPROVE - This PR is ready to merge.

Rationale

  1. ✅ All critical review comments have been addressed
  2. ✅ 100% test coverage with comprehensive edge case testing
  3. ✅ Production-ready error handling (connection invalidation, double-commit prevention)
  4. ✅ Proper logging levels for production use
  5. ✅ Maintains backward compatibility
  6. ✅ Solves the stated problem (duplicate sessions) effectively

Post-Merge Recommendations

  • Monitor production metrics for session usage reduction
  • Consider adding session lifecycle documentation in a follow-up PR
  • Watch for any edge cases with services that explicitly commit (399 instances across 53 files)

📊 Impact Assessment

Risk Level: LOW
Performance Impact: POSITIVE (50% reduction in session usage)
Breaking Changes: NONE
Deployment Notes: Safe to deploy - graceful degradation if issues occur


Final Verdict: This is a well-implemented fix with excellent test coverage and proper error handling. The author has been responsive to feedback and addressed all critical concerns. Ready for merge. 🚀

@smrutisahoo10 smrutisahoo10 force-pushed the bugfix/3467-observability-middleware-opens-duplicate-db-session branch from 56f7a41 to 6489ca0 March 18, 2026 09:16
ja8zyjits previously approved these changes Mar 18, 2026
Member

@ja8zyjits ja8zyjits left a comment

LGTM

@ja8zyjits ja8zyjits added the release-fix (Critical bugfix required for the release) label Mar 18, 2026
@crivetimihai crivetimihai added the SHOULD (P2: Important but not vital; high-value items that are not crucial for the immediate release) label Mar 20, 2026
Signed-off-by: Smruti Sahoo <talktodaisy19@gmail.com>
Signed-off-by: Smruti Sahoo <talktodaisy19@gmail.com>
…ervability

Signed-off-by: Smruti Sahoo <talktodaisy19@gmail.com>
Signed-off-by: Smruti Sahoo <talktodaisy19@gmail.com>
Remove duplicate mock_request fixture that was overriding the original
and dropping the traceparent header, which caused lines 99-102 of
observability_middleware.py to lose test coverage. Also remove unused
asyncio import and add missing trailing newline.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Bandit flags try/except/pass as B110. These two sites are
intentional best-effort cleanup when db.invalidate() fails
after a broken connection — there is nothing useful to do
with the exception.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
@crivetimihai crivetimihai dismissed stale reviews from MohanLaksh and ja8zyjits via d4ae9b5 March 20, 2026 21:36
@crivetimihai crivetimihai force-pushed the bugfix/3467-observability-middleware-opens-duplicate-db-session branch from 51454d7 to d4ae9b5 March 20, 2026 21:36
@crivetimihai
Member

Review Complete

Rebased onto main (clean, no conflicts), reviewed, tested, and added 3 fixup commits.

Review Findings

Design: Correct. The middleware owns the full session lifecycle (create/commit/rollback/close). The reuse path in get_db() yields without managing lifecycle — clean ownership model.

Logic: All code paths verified correct:

  • Happy path: middleware creates session → _safe_commit() during trace ops → handler reuses → middleware final commit → close
  • Trace setup failure: cleanup (rollback/invalidate/close), delete request.state.db, handler falls back to own session
  • Handler exception: propagates through call_next() → middleware logs error trace → rollback → close

Security: No issues. Session sharing is internal between middleware and route handlers.

Performance: Reduces DB sessions from 2→1 per request for 50+ routes using Depends(get_db) from main.py.

Coverage: 100% on observability_middleware.py (was 97% before fixup).

Fixup Commits Added

  1. fix(tests): remove duplicate mock_request fixture and unused import — Second mock_request fixture (line 256) overrode the first and dropped the traceparent header, causing lines 99-102 to lose coverage. Removed duplicate, restored 100% coverage. Also removed unused asyncio import and added missing trailing newline.

  2. fix(lint): add nosec B110 for best-effort invalidate cleanup — Two try/except/pass for db.invalidate() need # nosec B110 to pass bandit.

  3. style(tests): fix trailing whitespace in observability tests — Linter-flagged trailing whitespace in new test functions.

E2E Verification (localhost:8080, docker-compose stack)

| Test | Result |
| --- | --- |
| Standard API routes (tools, servers, gateways, resources, prompts) | All 200 |
| Auth enforcement (no token, bad token) | Both 401 |
| MCP initialize, tools/list, resources/list, ping | All 200 |
| MCP tool calls (echo, get-system-time, convert-time) | All succeed |
| MCP resource read (timezone://info) | 200 |
| Server-scoped MCP (/servers/{id}/mcp) | 200 |
| SSE transport | Session established |
| Full CRUD cycle (create/read/update/delete tool + verify in MCP) | Works |
| Admin UI (login, dashboard, tools, tool execution) | All render + function correctly |
| 20 parallel GET /tools + 10 parallel MCP initialize | All 200 |
| Unit tests | 20/20 pass |
| Bandit | 0 issues |

Member

@crivetimihai crivetimihai left a comment

Reviewed, rebased, tested end-to-end. Fix is correct and complete for its scope. All code paths verified, 100% differential coverage, bandit clean, full E2E passing against docker-compose stack.

@crivetimihai crivetimihai merged commit 4269c6e into main Mar 20, 2026
39 checks passed
@crivetimihai crivetimihai deleted the bugfix/3467-observability-middleware-opens-duplicate-db-session branch March 20, 2026 21:46
MohanLaksh added a commit that referenced this pull request Mar 24, 2026
PR #3600 introduced a transaction management violation where
ObservabilityMiddleware commits the shared database session instead
of get_db(), breaking the established contract where get_db() controls
transaction boundaries. This creates data integrity risks where failed
validations can be committed to the database.

This fix restores the correct behavior:
- Middleware manages session lifecycle (create/close)
- get_db() manages transactions (commit/rollback)

Changes:
- Remove commit logic from ObservabilityMiddleware (observability_middleware.py:210-216)
- Add commit/rollback handling to get_db() for middleware sessions (main.py:3137-3164)
- Update get_db() docstring to document transaction control responsibility
- Update 2 existing tests to reflect new behavior
- Add 7 comprehensive tests for transaction semantics

Security implications:
- Fixes data integrity bug where invalid data could be committed
- Maintains proper transaction isolation per request
- Preserves connection invalidation on broken connections
- No impact on auth/RBAC (middleware runs before route handlers)

Trade-offs:
- Observability data (traces/spans) is rolled back on errors (acceptable - best-effort tracing)

Closes #3731

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
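
The split of responsibilities described in this commit message (middleware creates and closes, get_db() commits and rolls back) might look roughly like the sketch below. It is an illustration only: `FakeSession` is a hypothetical stand-in for the real session, the request is modelled with `types.SimpleNamespace`, and the generator shape assumes a FastAPI-style dependency in which a handler error is raised at the `yield`.

```python
from types import SimpleNamespace


class FakeSession:
    """Hypothetical stand-in for a database session; records lifecycle calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


def get_db(request):
    """Commit or roll back a middleware-provided session, but never close it."""
    db = request.state.db  # assumed to have been set by the middleware
    try:
        yield db
        db.commit()
    except Exception:
        # A failed validation in the handler ends here, so nothing is persisted.
        db.rollback()
        raise
    # No close() here: the middleware that created the session closes it.


# Success path: the handler finishes normally, so the work is committed.
ok_session = FakeSession()
gen = get_db(SimpleNamespace(state=SimpleNamespace(db=ok_session)))
next(gen)
next(gen, None)

# Failure path: the handler raises, so the work is rolled back.
bad_session = FakeSession()
gen = get_db(SimpleNamespace(state=SimpleNamespace(db=bad_session)))
next(gen)
try:
    gen.throw(ValueError("validation failed"))
except ValueError:
    pass
```

Because the commit decision sits next to the handler's outcome, an exception raised during validation leads to a rollback instead of a later commit by an outer layer.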
crivetimihai pushed a commit that referenced this pull request Mar 26, 2026
crivetimihai added a commit that referenced this pull request Mar 26, 2026
… sessions (#3731) (#3813)

* fix(db): restore transaction control to get_db() for middleware sessions

PR #3600 introduced a transaction management violation where
ObservabilityMiddleware commits the shared database session instead
of get_db(), breaking the established contract where get_db() controls
transaction boundaries. This creates data integrity risks where failed
validations can be committed to the database.

This fix restores the correct behavior:
- Middleware manages session lifecycle (create/close)
- get_db() manages transactions (commit/rollback)

Changes:
- Remove commit logic from ObservabilityMiddleware (observability_middleware.py:210-216)
- Add commit/rollback handling to get_db() for middleware sessions (main.py:3137-3164)
- Update get_db() docstring to document transaction control responsibility
- Update 2 existing tests to reflect new behavior
- Add 7 comprehensive tests for transaction semantics

Security implications:
- Fixes data integrity bug where invalid data could be committed
- Maintains proper transaction isolation per request
- Preserves connection invalidation on broken connections
- No impact on auth/RBAC (middleware runs before route handlers)

Trade-offs:
- Observability data (traces/spans) is rolled back on errors (acceptable - best-effort tracing)

Closes #3731

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* test(db): add coverage for double-failure edge case in get_db()

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix(tests): clean up lint violations in transaction control tests

Remove unused AsyncMock import and unused variable assignments
flagged by ruff (F401, F841). Apply isort/black formatting.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
MohanLaksh added a commit that referenced this pull request Mar 27, 2026
Implements session reuse pattern from PR #3600 and PR #3813 to achieve
1 shared database session per request across all middleware layers.

**Changes:**
- Auth middleware: Added _get_or_create_session() helper to reuse
  request.state.db from ObservabilityMiddleware (lines 134, 159, 213)
- RBAC middleware: Updated deprecated get_db() to accept optional
  request parameter and reuse middleware session when available
- Transaction control: Delegated all commit/rollback to get_db() per
  PR #3813 (removed db.commit() from auth middleware)
- Added 7 unit tests for auth session reuse patterns
- Added 7 unit tests for RBAC get_db() deprecation
- Added 6 integration tests for end-to-end session sharing validation

**Impact:**
- Reduces session creation from 4-6 per request to 1 per request
- Prevents connection pool exhaustion under load
- Achieves 100% test coverage (435 statements, 0 missing)

**Security:**
- Transaction isolation maintained (get_db() controls all commits)
- Connection invalidation for PgBouncer compatibility
- Backwards compatible (existing dependency overrides work)

Closes #3622

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
brian-hussey pushed a commit that referenced this pull request Mar 27, 2026
… sessions (#3731) (#3813)

* fix(db): restore transaction control to get_db() for middleware sessions

PR #3600 introduced a transaction management violation where
ObservabilityMiddleware commits the shared database session instead
of get_db(), breaking the established contract where get_db() controls
transaction boundaries. This creates data integrity risks where failed
validations can be committed to the database.

This fix restores the correct behavior:
- Middleware manages session lifecycle (create/close)
- get_db() manages transactions (commit/rollback)

Changes:
- Remove commit logic from ObservabilityMiddleware (observability_middleware.py:210-216)
- Add commit/rollback handling to get_db() for middleware sessions (main.py:3137-3164)
- Update get_db() docstring to document transaction control responsibility
- Update 2 existing tests to reflect new behavior
- Add 7 comprehensive tests for transaction semantics

Security implications:
- Fixes data integrity bug where invalid data could be committed
- Maintains proper transaction isolation per request
- Preserves connection invalidation on broken connections
- No impact on auth/RBAC (middleware runs before route handlers)

Trade-offs:
- Observability data (traces/spans) is rolled back on errors (acceptable - best-effort tracing)

Closes #3731

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* test(db): add coverage for double-failure edge case in get_db()

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix(tests): clean up lint violations in transaction control tests

Remove unused AsyncMock import and unused variable assignments
flagged by ruff (F401, F841). Apply isort/black formatting.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
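
The split described in this commit (middleware owns the session lifecycle, get_db() owns the transaction) can be sketched as follows. This is a minimal illustration, not the project's code: `FakeSession`, `run_request`, and `failing_handler` are hypothetical stand-ins, and the real `get_db()` in `main.py` handles more cases (for example connection invalidation).

```python
from types import SimpleNamespace


class FakeSession:
    """Hypothetical stand-in for a database session; it only records calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


def get_db(request):
    """Generator dependency: controls commit/rollback, closes only sessions it created."""
    shared = getattr(request.state, "db", None)
    db = shared if shared is not None else FakeSession()
    try:
        yield db
        db.commit()
    except Exception:
        db.rollback()
        raise
    finally:
        if shared is None:
            db.close()  # a middleware-provided session is closed by the middleware


def failing_handler(db):
    raise ValueError("validation failed")


def run_request(handler):
    """Plays the middleware role: create the session, run the handler, close the session."""
    session = FakeSession()
    request = SimpleNamespace(state=SimpleNamespace(db=session))
    gen = get_db(request)
    db = next(gen)
    try:
        handler(db)
    except Exception as exc:
        try:
            gen.throw(exc)  # lets get_db() roll back
        except Exception:
            pass
    else:
        try:
            next(gen)  # lets get_db() commit
        except StopIteration:
            pass
    finally:
        session.close()  # lifecycle stays with the middleware
    return session.calls
```

In this sketch a successful handler produces a commit followed by a close, while a failing handler produces a rollback followed by a close, so a failed validation is never committed by the middleware.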
madhu-mohan-jaishankar pushed a commit that referenced this pull request Mar 27, 2026
… sessions (#3731) (#3813)

MohanLaksh added a commit that referenced this pull request Mar 30, 2026
Implements session reuse pattern from PR #3600 and PR #3813 to achieve
1 shared database session per request across all middleware layers.

**Changes:**
- Auth middleware: Added _get_or_create_session() helper to reuse
  request.state.db from ObservabilityMiddleware (lines 134, 159, 213)
- RBAC middleware: Updated deprecated get_db() to accept optional
  request parameter and reuse middleware session when available
- Transaction control: Delegated all commit/rollback to get_db() per
  PR #3813 (removed db.commit() from auth middleware)
- Added 7 unit tests for auth session reuse patterns
- Added 7 unit tests for RBAC get_db() deprecation
- Added 6 integration tests for end-to-end session sharing validation

**Impact:**
- Reduces session creation from 4-6 per request to 1 per request
- Prevents connection pool exhaustion under load
- Achieves 100% test coverage (435 statements, 0 missing)

**Security:**
- Transaction isolation maintained (get_db() controls all commits)
- Connection invalidation for PgBouncer compatibility
- Backwards compatible (existing dependency overrides work)

Closes #3622

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
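
A helper along the lines of `_get_or_create_session()` could look like the sketch below. The name `get_or_create_session`, the `session_factory` parameter, the `(session, owned)` return shape, and `FakeSession` are assumptions made for illustration; the commit message only states that the helper reuses `request.state.db` when ObservabilityMiddleware has set it.

```python
from types import SimpleNamespace


class FakeSession:
    """Hypothetical stand-in for a database session."""


def get_or_create_session(request, session_factory=FakeSession):
    """Return (session, owned); owned is True only when this call created the session."""
    shared = getattr(request.state, "db", None)
    if shared is not None:
        return shared, False  # reuse the request-scoped session
    return session_factory(), True  # caller must close what it created


# A traced request already carries a session; an untraced one does not.
traced = SimpleNamespace(state=SimpleNamespace(db=FakeSession()))
untraced = SimpleNamespace(state=SimpleNamespace())

traced_db, traced_owned = get_or_create_session(traced)
untraced_db, untraced_owned = get_or_create_session(untraced)
```

Returning the ownership flag alongside the session lets the caller decide whether it is responsible for closing the session, which is the distinction the later follow-up fixes rely on.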
MohanLaksh added a commit that referenced this pull request Mar 31, 2026
crivetimihai pushed a commit that referenced this pull request Mar 31, 2026
crivetimihai added a commit that referenced this pull request Mar 31, 2026
#3886)

* fix: eliminate duplicate DB sessions in auth and RBAC middleware

Implements session reuse pattern from PR #3600 and PR #3813 to achieve
1 shared database session per request across all middleware layers.

**Changes:**
- Auth middleware: Added _get_or_create_session() helper to reuse
  request.state.db from ObservabilityMiddleware (lines 134, 159, 213)
- RBAC middleware: Updated deprecated get_db() to accept optional
  request parameter and reuse middleware session when available
- Transaction control: Delegated all commit/rollback to get_db() per
  PR #3813 (removed db.commit() from auth middleware)
- Added 7 unit tests for auth session reuse patterns
- Added 7 unit tests for RBAC get_db() deprecation
- Added 6 integration tests for end-to-end session sharing validation

**Impact:**
- Reduces session creation from 4-6 per request to 1 per request
- Prevents connection pool exhaustion under load
- Achieves 100% test coverage (435 statements, 0 missing)

**Security:**
- Transaction isolation maintained (get_db() controls all commits)
- Connection invalidation for PgBouncer compatibility
- Backwards compatible (existing dependency overrides work)

Closes #3622

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: ensure auth logs are committed in hard-deny paths

Fixes critical bug where auth failure logs were lost when API requests
received hard-deny responses (401/403) that return JSONResponse immediately
without reaching get_db().

**Root Cause:**
- Auth middleware writes logs to session but delegated commit to get_db()
- Hard-deny API responses return JSONResponse immediately (lines 243-247)
- Route handler never runs, get_db() never called, logs never committed
- Browser requests work fine (continue to route handler at line 236)

**Fix:**
- Added db.commit() after logging in both success and failure paths
- Logs persist immediately, even if request doesn't reach get_db()
- For requests that continue to route handler, get_db() commits again (no-op)
- SQLAlchemy allows multiple commits - second commit is safe

**Test Coverage:**
- Added regression test: test_auth_failure_logs_committed_before_hard_deny_api_response
- All 29 auth middleware tests pass
- Maintained 100% code coverage

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
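
The root cause above can be reproduced with a small model. Everything here is hypothetical (`FakeSession`, `auth_middleware`, `route_via_get_db`, and the `commit_after_logging` switch); it only shows why a log row written before a hard-deny response is lost unless the middleware commits it itself.

```python
class FakeSession:
    """Hypothetical session: rows are pending until commit() persists them."""

    def __init__(self):
        self.pending = []
        self.persisted = []

    def add(self, row):
        self.pending.append(row)

    def commit(self):
        self.persisted.extend(self.pending)
        self.pending.clear()


def route_via_get_db(db):
    """Stands in for a route handler whose get_db() dependency commits afterwards."""
    db.commit()
    return 200


def auth_middleware(db, authorized, call_next, commit_after_logging):
    db.add("auth_success" if authorized else "auth_failure")
    if commit_after_logging:
        db.commit()  # the fix: persist the log row immediately
    if not authorized:
        return 401  # hard deny: call_next() and get_db() never run
    return call_next(db)


# Without the early commit, the failure log never reaches the database.
lost = FakeSession()
lost_status = auth_middleware(lost, False, route_via_get_db, commit_after_logging=False)

# With the early commit, the log survives the hard-deny response.
kept = FakeSession()
kept_status = auth_middleware(kept, False, route_via_get_db, commit_after_logging=True)

# On the success path the later commit finds nothing pending and changes nothing.
passed = FakeSession()
passed_status = auth_middleware(passed, True, route_via_get_db, commit_after_logging=True)
```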

* fix(build): use system uv binary consistently in Makefile

Replace hardcoded 'uv' commands with $(UV_BIN) variable reference
across all targets to ensure consistent resolution of the uv binary
path. Export UV_BIN to make it available to recursive make calls.

This fixes build failures where make targets tried to execute
'uv' from inside the activated virtual environment, where it
doesn't exist. The uv tool is a system-level package manager
and should be resolved from the system PATH or ~/.local/bin.

Changes:
- Use $(UV_BIN) in install, install-dev, install-db, update targets
- Use $(UV_BIN) in ensure_pip_package macro (fixes recursive make)
- Use $(UV_BIN) in sbom, alembic, pypiserver, maturin targets
- Export UV_BIN for visibility in child make processes

Benefits:
- Fixes "No such file or directory" errors for uv
- Makes Makefile more robust across different uv installations
- Maintains existing fallback logic (PATH or ~/.local/bin/uv)
- No breaking changes for existing setups

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix(auth): prevent stale session reference and ensure log persistence

Fix two bugs in auth middleware session handling:

1. _get_or_create_session() stored newly created sessions in
   request.state.db but then closed them after logging, leaving a stale
   closed-session reference for downstream get_db() to find. Remove the
   request.state.db assignment for owned sessions so route handlers
   create their own sessions via get_db().

2. Generic Exception path (non-HTTP auth failures) did not commit
   security logs before closing the owned session, silently losing log
   entries. Add db.commit() consistent with the success and hard-deny
   paths.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
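
The first bug can be illustrated with a toy model of the logging path. The names `FakeSession`, `log_auth_event`, `publish_owned_session`, and `downstream_sees_stale_session` are hypothetical; the sketch only contrasts publishing an owned session on `request.state.db` with keeping it private.

```python
from types import SimpleNamespace


class FakeSession:
    """Hypothetical session that remembers whether it has been closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def log_auth_event(request, publish_owned_session):
    """Use the shared session if present, otherwise a short-lived owned one."""
    db = getattr(request.state, "db", None)
    owned = db is None
    if owned:
        db = FakeSession()
        if publish_owned_session:
            request.state.db = db  # the assignment the fix removes
    try:
        pass  # security logging would happen here
    finally:
        if owned:
            db.close()  # owned sessions are closed right after logging


def downstream_sees_stale_session(publish_owned_session):
    """Report whether a later get_db() lookup would find a closed session."""
    request = SimpleNamespace(state=SimpleNamespace())
    log_auth_event(request, publish_owned_session)
    found = getattr(request.state, "db", None)
    return found is not None and found.closed
```

With the assignment in place, a downstream lookup finds a session that is already closed; without it, nothing is found and the route handler would create its own session.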

* fix(auth): rollback shared session on logging failure and rewrite integration tests

Address two findings from code review:

1. When security logging raises an exception (commit failure, connection
   error), the shared session from ObservabilityMiddleware is left in
   PendingRollbackError state.  Downstream call_next()/get_db() then
   inherits a broken session.  Add db.rollback() (with invalidate()
   fallback) in all three exception handlers — matching the pattern
   used by ObservabilityMiddleware and main.py:get_db().

2. Rewrite integration tests: the originals targeted nonexistent routes
   (/api/v1/servers, /admin/llm/providers) and had assertions too weak
   to catch regressions (session_count <= 1 passes when 0 sessions are
   created).  New tests directly exercise _get_or_create_session(),
   rbac.get_db() reuse, shared-session rollback, and the full
   middleware stack via /health.

Test coverage: 100% (460 statements, 0 missing) across both auth
middleware and rbac modules.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
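
The rollback-with-invalidate fallback from the first finding can be sketched like this. `FakeSession`, `reset_after_failure`, `log_security_event`, and `failing_write` are hypothetical names, and the real exception handlers in the auth middleware may differ in detail.

```python
class FakeSession:
    """Hypothetical session whose rollback can be made to fail."""

    def __init__(self, rollback_fails=False):
        self.rollback_fails = rollback_fails
        self.calls = []

    def rollback(self):
        self.calls.append("rollback")
        if self.rollback_fails:
            raise RuntimeError("connection lost")

    def invalidate(self):
        self.calls.append("invalidate")


def reset_after_failure(db):
    """Leave the shared session usable: roll back, or invalidate if rollback fails."""
    try:
        db.rollback()
    except Exception:
        try:
            db.invalidate()
        except Exception:
            pass  # nothing more can be done here


def failing_write(db):
    raise RuntimeError("commit failed")


def log_security_event(db, write):
    """Best-effort logging that never hands a broken session to call_next()."""
    try:
        write(db)
    except Exception:
        reset_after_failure(db)
        return False
    return True


healthy = FakeSession()
healthy_result = log_security_event(healthy, failing_write)

broken = FakeSession(rollback_fails=True)
broken_result = log_security_event(broken, failing_write)
```

The point of the fallback is that the request continues with either a clean session or an invalidated connection, rather than one stuck in a failed transaction.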

* chore: update secrets baseline after rebase

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* chore: exclude .npmrc from sdist to fix check-manifest

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
brian-hussey pushed a commit that referenced this pull request Mar 31, 2026
#3886)

Labels

- bug: Something isn't working
- performance: Performance related items
- release-fix: Critical bugfix required for the release
- SHOULD: P2: Important but not vital; high-value items that are not crucial for the immediate release

Development

Successfully merging this pull request may close these issues.

[BUG][PERFORMANCE]: Observability middleware opens duplicate DB session per request

4 participants