Skip to content

feat(logging): stamp run/action context via LogRecordFactory#1038

Merged
EngHabu merged 7 commits into
mainfrom
logging-record-factory
May 8, 2026
Merged

feat(logging): stamp run/action context via LogRecordFactory#1038
EngHabu merged 7 commits into
mainfrom
logging-record-factory

Conversation

@EngHabu

@EngHabu EngHabu commented May 7, 2026

Copy link
Copy Markdown
Contributor

Summary

Replaces the per-logger logging.Filter approach for context stamping with a global LogRecordFactory. The factory runs on every LogRecord regardless of which logger created it, so any user-defined logger automatically picks up run_name, action_name, and is_flyte_internal attributes — no per-handler filter wiring required.

This also fixes a latent bug in main: the old ContextFilter and FlyteInternalFilter mutated record.msg in place, so a record passing through multiple handlers could get its [run][action] and [flyte] prefixes prepended multiple times. Stamping attributes on the record (and rendering them at format time) sidesteps that entirely.

What changed

  • LogRecordFactory (src/flyte/_logging.py) — stamps the active flyte action context on every record, lazily importing flyte._context to avoid import-time cycles.
  • ContextFormatter replaces ContextFilter + FlyteInternalFilter. The [flyte] internal-prefix marker is now a formatter flag (internal_prefix=True), so the same record can be formatted by multiple handlers without prefixes compounding.
  • ContextFormatter(inner=...) — optional kwarg to delegate base formatting to an existing logging.Formatter. Used to wrap pre-existing root-handler formatters without losing their layout.
  • initialize_logger(reset_root_logger=False) now wraps each existing root-handler formatter with ContextFormatter(inner=existing), preserving main's "third-party log lines through root get [run][action]" behavior. Idempotent on repeated init calls. Same temporal limitation as main: handlers added after flyte.init() won't be wrapped, but the factory still stamps the attrs so callers can format them themselves.
  • get_rich_handler(internal_prefix=True) — opt-out parameter; the user-logger call site passes False so user-emitted lines don't carry the [flyte] marker even under rich logging.

Behavior parity vs main

main this PR
[run][action] on flyte/flyte.user logs
[run][action] on third-party logs through root (when handler exists at init) ✅ (via filter) ✅ (via formatter wrap)
[run][action] on third-party logs through root (handler added after init) ❌ — but attrs still on record
[flyte] marker on flyte internal logs
Compounding prefixes when record passes through multiple handlers ❌ (latent bug) ✅ fixed
record.run_name / action_name available everywhere partial ✅ universal (factory)

Why this matters for users

flyte.logger and any child of flyte.user continue to work as before. New: anyone using a stdlib logger (logging.getLogger("myapp")) outside the flyte namespace also gets run_name / action_name / is_flyte_internal stamped on every record by the factory — they just need a formatter that references them (e.g. "[%(run_name)s][%(action_name)s] %(message)s").

Example

examples/basics/reuse_concurrent_logging.py exercises three logger configs with a reusable container running concurrent tasks, so log output from multiple actions interleaves on a single stderr — each line correctly prefixed by its own [run][action]:

  1. flyte.logger — canonical user-facing logger.
  2. logging.getLogger("flyte.user.myapp") — child of flyte.user, inherits the formatter via propagation.
  3. logging.getLogger("myapp") — fully independent logger with its own handler that references the factory-stamped attrs in its format string.

Test plan

  • pytest tests/user_api/test_logging.py tests/flyte/test_logging.py passes (23 tests)
  • test_user_logger_no_flyte_prefix updated to check the formatter flag instead of the removed FlyteInternalFilter
  • test_user_logger_no_flyte_prefix_after_rich_init — regression test for rich-handler internal prefix bleeding into user logger
  • test_initialize_logger_wraps_existing_root_handlers — covers wrap path + idempotency on repeated init
  • Manual: run examples/basics/reuse_concurrent_logging.py against a cluster and confirm all three worker variants emit [run][action]-prefixed log lines on the shared reused container

🤖 Generated with Claude Code

Switch from logging.Filter-based context stamping to a global
LogRecordFactory. The factory runs on every LogRecord regardless
of which logger created it, so any user-defined logger
(`logging.getLogger("myapp")`, child of `flyte.user`, or a fully
independent stdlib logger) automatically picks up `run_name`,
`action_name`, and `is_flyte_internal` attributes — no per-handler
filter wiring required.

Adds `register_log_context(name, var)` / `unregister_log_context`
so users can stamp their own contextvars onto every record. The
JSONFormatter emits them as top-level fields; console formatters
can reference them via `%(name)s`.

The `[flyte]` internal-prefix marker is now a ContextFormatter
flag instead of a filter, so the same record can be formatted by
multiple handlers without compounding prefixes.

- `examples/basics/reuse_concurrent_logging.py`: demonstrates four
  logger configs (flyte.logger, flyte.user.* child, independent
  stdlib logger, register_log_context contextvar) under reusable
  containers with concurrent tasks.
- `tests/user_api/test_logging.py`: updated detection to look for
  the formatter flag instead of the removed FlyteInternalFilter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com>
EngHabu and others added 2 commits May 6, 2026 23:06
`initialize_logger(enable_rich=True)` (called by flyte.init via
_initialize_logger) attaches a rich handler to flyte.user. The
handler builder always set `internal_prefix=True` on its formatter,
so user-side records were getting the [flyte] internal marker.
Surfaced as a CI failure in test_user_logger_no_flyte_prefix once
init ordering put a rich-enabled init ahead of the assertion.

- get_rich_handler now takes `internal_prefix: bool = True`; the
  user-logger call site passes False.
- Adds test_user_logger_no_flyte_prefix_after_rich_init that calls
  initialize_logger(enable_rich=True) explicitly so the regression
  is caught regardless of test ordering.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com>
Comment thread src/flyte/_logging.py Outdated
def _flyte_record_factory(*args: Any, **kwargs: Any) -> logging.LogRecord:
record = _orig_record_factory(*args, **kwargs)

for attr, var in _LOG_CONTEXT_VARS.items():

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we're iterating over _LOG_CONTEXT_VARS.items() with concurrent usage, can't this run into a race condition where one thread tried to update and another tries to read. Python wouldn't let you iterate over a dictionary that has changed its size.
This may not be too noticeable since registration is done before the task even starts, but if someone were to dynamically register mid run this could be an issue.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but, it happens on startup only right?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess not, but this is python and so it will work fine :( GIL, but we should make it cleaner, have a lock

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lock would be cleaner yes, or you could just snapshot by casting it as a list()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got rid of that code altogether. feature creep.. adding custom context vars is out of scope :D

Comment thread src/flyte/_logging.py Outdated
def _flyte_record_factory(*args: Any, **kwargs: Any) -> logging.LogRecord:
record = _orig_record_factory(*args, **kwargs)

for attr, var in _LOG_CONTEXT_VARS.items():

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for attr, var in _LOG_CONTEXT_VARS.items():
for attr, var in list(_LOG_CONTEXT_VARS.items()):

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

EngHabu and others added 3 commits May 8, 2026 11:08
…tion]

Restores parity with main's ContextFilter behavior: when
initialize_logger() is called with reset_root_logger=False (the
default), wrap any pre-existing root-handler formatter so
third-party log lines routed through root render with
[run][action]. Captures handlers registered before this call,
matching main's temporal limitation; the LogRecord factory still
stamps the attrs on every record so callers can format them
themselves for handlers added later.

ContextFormatter gains an `inner: logging.Formatter | None` kwarg.
When set, base formatting is delegated to the inner formatter
instead of ContextFormatter's own fmt string — this preserves the
original handler's layout while letting us prepend the prefix.

Also removes the unused register_log_context / unregister_log_context
machinery (no longer needed; users can stamp their own attrs
through their own formatters now that the factory exposes context).

- tests/user_api/test_logging.py: adds
  test_initialize_logger_wraps_existing_root_handlers covering the
  wrap path and idempotency on repeated init calls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com>
Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com>
@EngHabu EngHabu requested a review from AdilFayyaz May 8, 2026 18:18
Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com>
@EngHabu EngHabu merged commit 8eba296 into main May 8, 2026
36 checks passed
@EngHabu EngHabu deleted the logging-record-factory branch May 8, 2026 18:36
AdilFayyaz pushed a commit that referenced this pull request May 12, 2026
## Summary

Replaces the per-logger `logging.Filter` approach for context stamping
with a global `LogRecordFactory`. The factory runs on every `LogRecord`
regardless of which logger created it, so any user-defined logger
automatically picks up `run_name`, `action_name`, and
`is_flyte_internal` attributes — no per-handler filter wiring required.

This also fixes a latent bug in main: the old `ContextFilter` and
`FlyteInternalFilter` mutated `record.msg` in place, so a record passing
through multiple handlers could get its `[run][action]` and `[flyte]`
prefixes prepended multiple times. Stamping attributes on the record
(and rendering them at format time) sidesteps that entirely.

## What changed

- **LogRecordFactory** (`src/flyte/_logging.py`) — stamps the active
flyte action context on every record, lazily importing `flyte._context`
to avoid import-time cycles.
- **`ContextFormatter`** replaces `ContextFilter` +
`FlyteInternalFilter`. The `[flyte]` internal-prefix marker is now a
formatter flag (`internal_prefix=True`), so the same record can be
formatted by multiple handlers without prefixes compounding.
- **`ContextFormatter(inner=...)`** — optional kwarg to delegate base
formatting to an existing `logging.Formatter`. Used to wrap pre-existing
root-handler formatters without losing their layout.
- **`initialize_logger(reset_root_logger=False)`** now wraps each
existing root-handler formatter with `ContextFormatter(inner=existing)`,
preserving main's "third-party log lines through root get
`[run][action]`" behavior. Idempotent on repeated init calls. Same
temporal limitation as main: handlers added after `flyte.init()` won't
be wrapped, but the factory still stamps the attrs so callers can format
them themselves.
- **`get_rich_handler(internal_prefix=True)`** — opt-out parameter; the
user-logger call site passes `False` so user-emitted lines don't carry
the `[flyte]` marker even under rich logging.

## Behavior parity vs main

| | main | this PR |
|---|---|---|
| `[run][action]` on flyte/flyte.user logs | ✅ | ✅ |
| `[run][action]` on third-party logs through root (when handler exists
at init) | ✅ (via filter) | ✅ (via formatter wrap) |
| `[run][action]` on third-party logs through root (handler added after
init) | ❌ | ❌ — but attrs still on record |
| `[flyte]` marker on flyte internal logs | ✅ | ✅ |
| Compounding prefixes when record passes through multiple handlers | ❌
(latent bug) | ✅ fixed |
| `record.run_name` / `action_name` available everywhere | partial | ✅
universal (factory) |

## Why this matters for users

`flyte.logger` and any child of `flyte.user` continue to work as before.
New: anyone using a stdlib logger (`logging.getLogger("myapp")`) outside
the flyte namespace also gets `run_name` / `action_name` /
`is_flyte_internal` stamped on every record by the factory — they just
need a formatter that references them (e.g.
`"[%(run_name)s][%(action_name)s] %(message)s"`).

## Example

`examples/basics/reuse_concurrent_logging.py` exercises three logger
configs with a reusable container running concurrent tasks, so log
output from multiple actions interleaves on a single stderr — each line
correctly prefixed by its own `[run][action]`:

1. `flyte.logger` — canonical user-facing logger.
2. `logging.getLogger("flyte.user.myapp")` — child of `flyte.user`,
inherits the formatter via propagation.
3. `logging.getLogger("myapp")` — fully independent logger with its own
handler that references the factory-stamped attrs in its format string.

## Test plan

- [x] `pytest tests/user_api/test_logging.py
tests/flyte/test_logging.py` passes (23 tests)
- [x] `test_user_logger_no_flyte_prefix` updated to check the formatter
flag instead of the removed `FlyteInternalFilter`
- [x] `test_user_logger_no_flyte_prefix_after_rich_init` — regression
test for rich-handler internal prefix bleeding into user logger
- [x] `test_initialize_logger_wraps_existing_root_handlers` — covers
wrap path + idempotency on repeated init
- [ ] Manual: run `examples/basics/reuse_concurrent_logging.py` against
a cluster and confirm all three worker variants emit
`[run][action]`-prefixed log lines on the shared reused container

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: M. Adil Fayyaz <62440954+AdilFayyaz@users.noreply.github.com>
samhita-alla pushed a commit that referenced this pull request May 13, 2026
## Summary

Replaces the per-logger `logging.Filter` approach for context stamping
with a global `LogRecordFactory`. The factory runs on every `LogRecord`
regardless of which logger created it, so any user-defined logger
automatically picks up `run_name`, `action_name`, and
`is_flyte_internal` attributes — no per-handler filter wiring required.

This also fixes a latent bug in main: the old `ContextFilter` and
`FlyteInternalFilter` mutated `record.msg` in place, so a record passing
through multiple handlers could get its `[run][action]` and `[flyte]`
prefixes prepended multiple times. Stamping attributes on the record
(and rendering them at format time) sidesteps that entirely.

## What changed

- **LogRecordFactory** (`src/flyte/_logging.py`) — stamps the active
flyte action context on every record, lazily importing `flyte._context`
to avoid import-time cycles.
- **`ContextFormatter`** replaces `ContextFilter` +
`FlyteInternalFilter`. The `[flyte]` internal-prefix marker is now a
formatter flag (`internal_prefix=True`), so the same record can be
formatted by multiple handlers without prefixes compounding.
- **`ContextFormatter(inner=...)`** — optional kwarg to delegate base
formatting to an existing `logging.Formatter`. Used to wrap pre-existing
root-handler formatters without losing their layout.
- **`initialize_logger(reset_root_logger=False)`** now wraps each
existing root-handler formatter with `ContextFormatter(inner=existing)`,
preserving main's "third-party log lines through root get
`[run][action]`" behavior. Idempotent on repeated init calls. Same
temporal limitation as main: handlers added after `flyte.init()` won't
be wrapped, but the factory still stamps the attrs so callers can format
them themselves.
- **`get_rich_handler(internal_prefix=True)`** — opt-out parameter; the
user-logger call site passes `False` so user-emitted lines don't carry
the `[flyte]` marker even under rich logging.

## Behavior parity vs main

| | main | this PR |
|---|---|---|
| `[run][action]` on flyte/flyte.user logs | ✅ | ✅ |
| `[run][action]` on third-party logs through root (when handler exists
at init) | ✅ (via filter) | ✅ (via formatter wrap) |
| `[run][action]` on third-party logs through root (handler added after
init) | ❌ | ❌ — but attrs still on record |
| `[flyte]` marker on flyte internal logs | ✅ | ✅ |
| Compounding prefixes when record passes through multiple handlers | ❌
(latent bug) | ✅ fixed |
| `record.run_name` / `action_name` available everywhere | partial | ✅
universal (factory) |

## Why this matters for users

`flyte.logger` and any child of `flyte.user` continue to work as before.
New: anyone using a stdlib logger (`logging.getLogger("myapp")`) outside
the flyte namespace also gets `run_name` / `action_name` /
`is_flyte_internal` stamped on every record by the factory — they just
need a formatter that references them (e.g.
`"[%(run_name)s][%(action_name)s] %(message)s"`).

## Example

`examples/basics/reuse_concurrent_logging.py` exercises three logger
configs with a reusable container running concurrent tasks, so log
output from multiple actions interleaves on a single stderr — each line
correctly prefixed by its own `[run][action]`:

1. `flyte.logger` — canonical user-facing logger.
2. `logging.getLogger("flyte.user.myapp")` — child of `flyte.user`,
inherits the formatter via propagation.
3. `logging.getLogger("myapp")` — fully independent logger with its own
handler that references the factory-stamped attrs in its format string.

## Test plan

- [x] `pytest tests/user_api/test_logging.py
tests/flyte/test_logging.py` passes (23 tests)
- [x] `test_user_logger_no_flyte_prefix` updated to check the formatter
flag instead of the removed `FlyteInternalFilter`
- [x] `test_user_logger_no_flyte_prefix_after_rich_init` — regression
test for rich-handler internal prefix bleeding into user logger
- [x] `test_initialize_logger_wraps_existing_root_handlers` — covers
wrap path + idempotency on repeated init
- [ ] Manual: run `examples/basics/reuse_concurrent_logging.py` against
a cluster and confirm all three worker variants emit
`[run][action]`-prefixed log lines on the shared reused container

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants