@tushar00jain
Contributor

Summary:

  • Set up three basic structured logs:
    • quorums: logged every time a rank changes quorum id
    • commits: logged every time a rank commits a step
    • errors: logged every time a rank calls abort on the process group
  • Allow otel.py to initialize loggers in multiple namespaces, one for each structured logger
  • Pass replica_id to the process group so that it can be logged
  • Add a flag to enable or disable otel logging

Differential Revision: D84571482
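
The summary above can be illustrated with a minimal sketch. This is not the code from this pull request: it uses only the standard-library `logging` module as a stand-in for the OpenTelemetry setup in `otel.py`, and the logger names (`example.quorums` and so on), the environment variable `EXAMPLE_ENABLE_STRUCTURED_LOGS`, and the helpers `setup_loggers` and `log_event` are all hypothetical names chosen for illustration.

```python
import json
import logging
import os
from typing import Dict, List

# Hypothetical names: the PR summary does not spell out the namespaces or the flag.
NAMESPACES = ("quorums", "commits", "errors")
ENABLE_ENV = "EXAMPLE_ENABLE_STRUCTURED_LOGS"


class ListHandler(logging.Handler):
    """Collects formatted records in memory (stand-in for a real exporter)."""

    def __init__(self) -> None:
        super().__init__()
        self.records: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(self.format(record))


def setup_loggers(handler: logging.Handler) -> Dict[str, logging.Logger]:
    """Create one logger per namespace; attach the handler only if the flag is set."""
    enabled = os.environ.get(ENABLE_ENV, "0") == "1"
    loggers = {}
    for name in NAMESPACES:
        logger = logging.getLogger(f"example.{name}")
        logger.setLevel(logging.INFO)
        logger.propagate = False  # keep structured events out of the root logger
        logger.handlers.clear()
        logger.addHandler(handler if enabled else logging.NullHandler())
        loggers[name] = logger
    return loggers


def log_event(logger: logging.Logger, **fields: object) -> None:
    """Emit one structured event as a JSON object."""
    logger.info(json.dumps(fields, sort_keys=True))


# Usage: one event per namespace, each tagged with the replica id.
os.environ[ENABLE_ENV] = "1"
sink = ListHandler()
loggers = setup_loggers(sink)
log_event(loggers["quorums"], replica_id="replica_0", rank=0, quorum_id=3)
log_event(loggers["commits"], replica_id="replica_0", rank=0, step=10)
log_event(loggers["errors"], replica_id="replica_0", rank=0, error="abort called")
```

Keeping one logger per event type lets each stream be routed or disabled separately, and carrying `replica_id` in every event is what makes it possible to correlate quorum changes, commits, and aborts across replicas.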

@meta-cla meta-cla bot added the CLA Signed label Oct 14, 2025
@meta-codesync

meta-codesync bot commented Oct 14, 2025

@tushar00jain has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84571482.

tushar00jain added a commit to tushar00jain/torchft that referenced this pull request Oct 14, 2025
tushar00jain added a commit to tushar00jain/torchft that referenced this pull request Oct 14, 2025
tushar00jain added a commit to tushar00jain/torchft that referenced this pull request Oct 14, 2025
tushar00jain added a commit to tushar00jain/torchft that referenced this pull request Oct 14, 2025
tushar00jain added a commit to tushar00jain/torchft that referenced this pull request Oct 14, 2025
tushar00jain added a commit to tushar00jain/torchft that referenced this pull request Oct 15, 2025
tushar00jain added a commit to tushar00jain/torchft that referenced this pull request Oct 15, 2025
tushar00jain added a commit to tushar00jain/torchft that referenced this pull request Oct 16, 2025
Reviewed By: d4l3k

Differential Revision: D84571482
@meta-codesync

meta-codesync bot commented Oct 16, 2025

This pull request has been merged in b3be7ad.

Labels

CLA Signed, fb-exported, Merged, meta-exported
