Feature/logging overhaul by dcox761 · Pull Request #226 · aws-samples/bedrock-access-gateway

dcox761 · 2026-03-03T11:13:36Z

Issue #225

Description of changes:
1. Centralise TRACE_LEVEL in setting.py

Register the custom TRACE level (5, below DEBUG at 10) once in setting.py and import it everywhere:

# setting.py
TRACE_LEVEL = 5
logging.addLevelName(TRACE_LEVEL, "TRACE")

2. Three-tier logging

INFO (default) — startup, errors, CORS warnings, and USAGE summary per request
DEBUG (DEBUG=true) — request/response details, model resolution, tool calls, cache points
TRACE (TRACE=true) — per-chunk streaming data, raw request/response bodies, request headers

Only the api logger hierarchy is affected — boto3/botocore/urllib3 stay at INFO:

if TRACE or DEBUG:
    logging.getLogger("api").setLevel(TRACE_LEVEL if TRACE else logging.DEBUG)

3. ECS-aware log format

Detect ECS/Fargate via ECS_CONTAINER_METADATA_URI and skip log formatting (CloudWatch adds its own timestamps and metadata). Outside ECS, use human-readable format:

_log_kwargs = {"level": logging.INFO}
if not os.environ.get("ECS_CONTAINER_METADATA_URI"):
    _log_kwargs["format"] = "%(asctime)s [%(levelname)s] %(message)s"
logging.basicConfig(**_log_kwargs)

4. Lazy %s formatting and isEnabledFor guards

All logger.debug() calls use %s lazy formatting instead of f-strings. Expensive debug operations (JSON serialisation, dict formatting) are guarded with isEnabledFor:

# Before (eager f-string evaluation even when debug is off):
logger.debug(f"Resolved {model_id} → {result}")

# After (lazy — no string formatting unless DEBUG is enabled):
logger.debug("Resolved %s -> %s", model_id, result)

# Expensive operations guarded:
if logger.isEnabledFor(TRACE_LEVEL):
    logger.log(TRACE_LEVEL, "Request body: %s",
               json.dumps(chat_request.model_dump(), indent=2, default=str))

5. Per-request USAGE logging (INFO level)

A structured one-line summary logged after every request completes:

USAGE | user=admin@example.com | chat=abc-123 | model=claude-sonnet-4 | max_tokens=4096 | in=1523 | out=847 | cache_write=0 | cache_read=1200 | ua=OpenAI/Python

User and chat ID are extracted from configurable HTTP headers (empty defaults — only active when configured):

# Optional: configure headers from your upstream proxy
export USAGE_USER_HEADER=x-openwebui-user-email
export USAGE_CHAT_ID_HEADER=x-openwebui-chat-id

6. Improved validation error handler

The validation exception handler now logs error count at WARNING level and optionally logs the rejected request body at TRACE level:

@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request, exc):
    error_count = len(exc.errors()) if hasattr(exc, 'errors') else 'unknown'
    logger.warning("Request validation failed: %s %s - %s error(s)",
                   request.method, request.url.path, error_count)
    if logger.isEnabledFor(TRACE_LEVEL):
        body = (await request.body()).decode("utf-8", errors="replace")[:2000]
        logger.log(TRACE_LEVEL, "Rejected request body: %s", body)
    return PlainTextResponse(str(exc), status_code=400)

New environment variables

Variable	Default	Description
`TRACE`	`false`	Enable TRACE-level logging (very verbose)
`USAGE_USER_HEADER`	`""`	HTTP header for user identity in USAGE lines
`USAGE_CHAT_ID_HEADER`	`""`	HTTP header for chat/session ID in USAGE lines

Files: src/api/setting.py, src/api/app.py, src/api/routers/chat.py, src/api/models/bedrock.py

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…trumentation - Register custom TRACE level (5) for per-chunk streaming logs - Refactor all 'if DEBUG: logger.info' to logger.debug with isEnabledFor guards - Replace per-chunk logging with TRACE level (2 sites in bedrock.py) - Add INFO-level USAGE line per request: user, chat, model, max_tokens, tokens in/out/cache, user-agent - Add configurable USAGE_USER_HEADER / USAGE_CHAT_ID_HEADER env vars for extracting request attribution from any upstream proxy (empty by default; set to your proxy's header names to enable) - Add TRACE env var in setting.py - Enhanced validation error handler with error details and TRACE body logging

Use Python's default basicConfig format (LEVEL:name:message) instead of a custom format with timestamps. This avoids: 1. Duplicate timestamps — CloudWatch adds its own 2. Inconsistent format — uvicorn access logs use 'INFO:' prefix while our custom format used '[INFO]' brackets

…tion Three debug log calls that originally had 'if DEBUG:' guards in upstream lost their protection when converted to logger.debug() in the logging overhaul. This restores evaluation guards using isEnabledFor(DEBUG) to avoid unnecessary str() conversion on every request when debug logging is disabled. Also converts one string concatenation debug call to f-string for consistency (model_name is already a string, no str() cost).

Remove verbose per-field error enumeration and JSON body truncation logic from the validation exception handler. The WARNING log now just reports method, path, and error count. The TRACE-level body dump is a simple raw decode with a 2000-char cap.

The profile resolution block was outside the isEnabledFor(DEBUG) guard, causing unnecessary dictionary lookup and method call on every request even when debug logging is disabled.

- Remove unused 'import json' from app.py (no longer needed after handler simplification) - Convert 7 f-string logger.debug() calls to lazy %s formatting in bedrock.py to avoid eager string interpolation when DEBUG is disabled

- Collapse if/elif TRACE/DEBUG into single conditional setLevel call - Restore upstream's log format (asctime + level + message) for local use - Auto-detect ECS_CONTAINER_METADATA_URI: skip format in Fargate so CloudWatch doesn't get duplicate timestamps

TRACE_LEVEL=5 was defined in three files (app.py, bedrock.py, chat.py). Move the definition and addLevelName() call into setting.py so it is registered once and imported by all consumers.

zxkane

Thanks for the thorough logging overhaul! The three-tier logging architecture (INFO/DEBUG/TRACE), ECS-aware formatting, and scoping to the api logger hierarchy are all well-designed improvements.

There are a couple of security concerns that should be addressed before merging:

Critical:

TRACE logging dumps full HTTP headers including Authorization: Bearer <API_KEY> — credentials should be redacted from logs even at TRACE level.
Silent except Exception: pass in the validation error handler swallows all errors when TRACE is enabled for debugging.

Important:

Mutable class-level default request_meta: dict = {} is a classic Python pitfall — currently safe by coincidence but fragile.
User identity extracted from untrusted HTTP headers is logged without sanitization (log injection risk) and at always-on INFO level (PII concern).

Suggestions:

Some debug lines still use eager string concatenation / f-strings, inconsistent with the PR's stated goal of lazy %s formatting.

See inline comments for details and suggested fixes.

zxkane · 2026-03-06T13:16:45Z

+            TRACE_LEVEL,
+            "Request headers: %s",
+            json.dumps(dict(request.headers), indent=2),
+        )
+        logger.log(
+            TRACE_LEVEL,


Critical: This dumps full HTTP headers including Authorization: Bearer <API_KEY> to the log stream.

This application uses HTTPBearer authentication (see auth.py). When TRACE is enabled, every caller's API key is written to logs in plaintext. In AWS deployments, CloudWatch logs may have broader access than secrets management, and logs may be forwarded to third-party observability platforms (Datadog, Splunk, etc.).

Please redact sensitive headers before logging:

REDACTED_HEADERS = {"authorization", "cookie", "x-api-key"} if logger.isEnabledFor(TRACE_LEVEL): safe_headers = { k: ("***REDACTED***" if k.lower() in REDACTED_HEADERS else v) for k, v in request.headers.items() } logger.log(TRACE_LEVEL, "Request headers: %s", json.dumps(safe_headers, indent=2))

zxkane · 2026-03-06T13:16:47Z

+            body = (await request.body()).decode("utf-8", errors="replace")[:2000]
+            logger.log(TRACE_LEVEL, "Rejected request body: %s", body)
+        except Exception:
+            pass
+


Critical: Silent except Exception: pass swallows all errors.

When an operator has explicitly enabled TRACE for debugging, silently discarding the body-read failure defeats the purpose — it's exactly the kind of diagnostic info they need. At minimum, log the exception:

except Exception as e: logger.warning("Failed to read rejected request body for TRACE logging: %s", e)

Also, the WARNING-level message above (line 69-73) only logs the error count now but no longer shows what the errors are. The previous code at least included the first line of the exception string. Consider adding a brief summary:

error_summary = "; ".join( f"{e.get('loc', '?')}: {e.get('type', '?')}" for e in (exc.errors()[:3] if hasattr(exc, 'errors') else []) ) logger.warning( "Request validation failed: %s %s - %s error(s): %s", request.method, request.url.path, error_count, error_summary or "no details", )

zxkane · 2026-03-06T13:16:48Z



 class BedrockModel(BaseChatModel):
+    request_meta: dict = {}


Important: Mutable class-level default dict = {} is shared across all instances.

BedrockModel inherits from BaseChatModel(ABC) — a plain Python class, not Pydantic. This {} is created once at class definition time and shared across all instances. Currently safe by coincidence (because chat.py always reassigns a fresh dict), but fragile — any future code that does model.request_meta["key"] = val (mutation) instead of reassignment would leak data across requests.

Also, _log_usage_summary uses getattr(self, "request_meta", {}) defensively, which suggests uncertainty about whether the attribute exists — further sign of a fragile design.

Fix: use None as default and initialize per-instance:

class BedrockModel(BaseChatModel): request_meta: dict | None = None def _log_usage_summary(self, ...): meta = self.request_meta or {} ...

zxkane · 2026-03-06T13:16:49Z

+    model.request_meta = {
+        "user_email": request.headers.get(USAGE_USER_HEADER, "-") if USAGE_USER_HEADER else "-",
+        "chat_id": request.headers.get(USAGE_CHAT_ID_HEADER, "-") if USAGE_CHAT_ID_HEADER else "-",
+        "max_tokens": effective_max_tokens,
+        "user_agent": request.headers.get("user-agent", "-"),
+    }
    model.validate(chat_request)


Important: User identity from untrusted HTTP headers — log injection and PII concerns.

Two issues here:

1. Log injection risk — Header values are logged without any sanitization. A caller can set the header to a value containing newlines to create fake log entries, e.g.:

x-openwebui-user-email: admin@evil.com\n2026-03-06 USAGE | user=legitimate@company.com | ...

Python's %s formatting does not escape newlines. This could be used for impersonation in audit logs or to pollute monitoring/alerting systems.

2. PII at INFO level — When USAGE_USER_HEADER is configured, user email addresses are logged at INFO level (always-on) for every request. This is PII under GDPR/CCPA. In AWS, CloudWatch logs may be retained indefinitely by default. Consider logging USAGE at DEBUG level, or at minimum document clearly that configuring these headers causes PII to be written to logs.

Suggestion: sanitize header values before logging:

import re def _sanitize_header(value: str, max_len: int = 128) -> str: return re.sub(r'[\x00-\x1f\x7f]', '', value)[:max_len] model.request_meta = { "user_email": _sanitize_header(request.headers.get(USAGE_USER_HEADER, "-")) if USAGE_USER_HEADER else "-", ... }

zxkane · 2026-03-06T13:16:51Z

        """Common logic for invoke bedrock models"""
-        if DEBUG:
-            logger.info("Raw request: " + chat_request.model_dump_json())
+        if logger.isEnabledFor(logging.DEBUG):
+            logger.debug("Raw request: " + chat_request.model_dump_json())

            # Log profile resolution for debugging
            if chat_request.model in profile_metadata:
                resolved = self._resolve_to_foundation_model(chat_request.model)
                profile_type = profile_metadata[chat_request.model].get("profile_type", "UNKNOWN")
-                logger.info(
+                logger.debug(
                    f"Profile resolution: {chat_request.model} ({profile_type}) → {resolved}"
                )

        # convert OpenAI chat request to Bedrock SDK request


Suggestion: Several debug lines still use eager string concatenation / f-strings.

The PR description states the goal of converting to lazy %s formatting, but these lines within the isEnabledFor guard still use eager evaluation:

Line 366: logger.debug("Raw request: " + chat_request.model_dump_json())

Line 372: logger.debug(f"Profile resolution: {chat_request.model} ({profile_type}) → {resolved}")

Line 377: logger.debug("Bedrock request: " + json.dumps(str(args)))

While the isEnabledFor guard prevents this when debug is off, converting to %s would be consistent with the PR's stated approach:

logger.debug("Raw request: %s", chat_request.model_dump_json()) logger.debug("Profile resolution: %s (%s) -> %s", chat_request.model, profile_type, resolved) logger.debug("Bedrock request: %s", json.dumps(str(args)))

bcsdcox added 8 commits March 3, 2026 14:20

Nest profile resolution debug log inside existing isEnabledFor guard

9f848a8

The profile resolution block was outside the isEnabledFor(DEBUG) guard, causing unnecessary dictionary lookup and method call on every request even when debug logging is disabled.

Centralise TRACE_LEVEL in setting.py, import everywhere

752a1af

TRACE_LEVEL=5 was defined in three files (app.py, bedrock.py, chat.py). Move the definition and addLevelName() call into setting.py so it is registered once and imported by all consumers.

zxkane requested changes Mar 6, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/logging overhaul#226

Feature/logging overhaul#226
dcox761 wants to merge 8 commits into
aws-samples:mainfrom
bluecrystalsolutions:feature/logging-overhaul

dcox761 commented Mar 3, 2026

Uh oh!

zxkane left a comment

Uh oh!

zxkane Mar 6, 2026

Uh oh!

zxkane Mar 6, 2026

Uh oh!

zxkane Mar 6, 2026

Uh oh!

zxkane Mar 6, 2026

Uh oh!

zxkane Mar 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

dcox761 commented Mar 3, 2026

New environment variables

Uh oh!

zxkane left a comment

Choose a reason for hiding this comment

Uh oh!

zxkane Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

zxkane Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

zxkane Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

zxkane Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

zxkane Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants