fix backend listener #681

Merged
xutongNV merged 2 commits into main from xutongr/fic on Mar 11, 2026
Conversation

Contributor

@xutongNV xutongNV commented Mar 11, 2026

Description

Issue #None

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Summary by CodeRabbit

  • Bug Fixes
    • Improved time handling by switching internal timestamp parsing and timeout calculations to timezone-aware UTC. This makes container error timeout detection and related status reporting consistent and reliable across different system timezones, reducing false or inconsistent backend error markings.

@xutongNV xutongNV requested a review from a team as a code owner March 11, 2026 17:10
coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

Updated time handling to use timezone-aware UTC now() in runtime logic and corresponding tests: calculate_pod_status now computes time differences with UTC, and test helpers return UTC-aware datetimes.
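A minimal sketch of the distinction the walkthrough describes, not the code from this PR. It contrasts a naive local `datetime.now()` with a timezone-aware UTC value; the 45-minute offset is an illustrative assumption.

```python
import datetime

# Naive "now": no tzinfo attached, so the value follows the host's local timezone.
naive_now = datetime.datetime.now()

# Aware "now": carries an explicit UTC offset and does not depend on the host timezone.
utc_now = datetime.datetime.now(datetime.timezone.utc)

# Hypothetical condition timestamp (assumption): 45 minutes before utc_now, UTC-aware.
last_transition_time = utc_now - datetime.timedelta(minutes=45)

# Subtracting two aware datetimes yields an unambiguous elapsed duration.
elapsed = utc_now - last_transition_time

print(naive_now.tzinfo is None)   # the naive value has no timezone information
print(utc_now.utcoffset())        # zero offset for UTC
print(elapsed)
```

Because both operands carry an offset, the elapsed time is the same regardless of the timezone configured on the machine running the operator.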

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Time Zone Adjustment (runtime): src/operator/backend_listener.py | Replaced naive local datetime.now() usage with datetime.now(datetime.timezone.utc) in calculate_pod_status for CreateContainerConfigError waiting/timeout computation. |
| Time Zone Adjustment (tests): src/operator/tests/test_pod_status_calculator.py | Updated parse_time_string to produce timezone-aware UTC datetimes (utc_now = datetime.now(datetime.timezone.utc)) and use it for "now", relative parsing, and fallback cases; docstring updated accordingly. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 I hopped the clock from local to blue,
UTC now guides what I do,
One careful change, both test and core,
Time aligned — I twitched my whiskers more. 🕰️✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title is vague and does not specify what aspect of the backend listener was fixed or what problem was addressed. | Use a more descriptive title that explains the specific fix, such as 'Use UTC timezone for pod status timeout calculations' or 'Fix timezone handling in backend listener status checks'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/operator/backend_listener.py (2)

933-941: ⚠️ Potential issue | 🔴 Critical

Fix timezone mismatch causing pipeline failure.

The change to use datetime.datetime.now(datetime.timezone.utc) is correct in intent, but condition.last_transition_time from the Kubernetes API may be timezone-naive, causing the subtraction at line 936 to fail with TypeError: can't subtract offset-naive and offset-aware datetimes.

You need to ensure last_transition_time is timezone-aware before comparison.

🐛 Proposed fix to handle timezone mismatch
                     if condition.type == 'Ready' and condition.status == 'False':
                         now = datetime.datetime.now(datetime.timezone.utc)
                         last_transition_time = condition.last_transition_time
                         if last_transition_time:
+                            # Ensure timezone-aware comparison
+                            if last_transition_time.tzinfo is None:
+                                last_transition_time = last_transition_time.replace(
+                                    tzinfo=datetime.timezone.utc)
                             time_diff = now - last_transition_time

Note: The same issue likely exists at lines 952-955 for the ContainerStatusUnknown handling, which uses the same pattern. Consider applying the same fix there for consistency and safety.
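The failure mode and the proposed coercion can be reproduced in isolation. The following is a sketch only: `to_utc_aware` is a hypothetical helper name, not a function in this repository, and it assumes that a naive timestamp is already expressed in UTC.

```python
import datetime


def to_utc_aware(ts):
    """Return ts as a timezone-aware UTC datetime.

    Assumption: a naive value is already in UTC, so the UTC tzinfo is
    attached without shifting the clock time.
    """
    if ts.tzinfo is None or ts.utcoffset() is None:
        return ts.replace(tzinfo=datetime.timezone.utc)
    # Already aware: convert to UTC so all values share one offset.
    return ts.astimezone(datetime.timezone.utc)


now = datetime.datetime.now(datetime.timezone.utc)
naive = datetime.datetime(2026, 3, 11, 17, 0, 0)  # illustrative naive timestamp

# Mixing aware and naive operands raises TypeError.
try:
    now - naive
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# After coercion the subtraction is well defined.
time_diff = now - to_utc_aware(naive)
print(error_message)
print(time_diff)
```

Using `replace` only attaches the offset; if a naive value were actually in local time, it would need conversion instead, which is why the UTC assumption is stated explicitly.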

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/operator/backend_listener.py` around lines 933 - 941,
condition.last_transition_time from the k8s API can be timezone-naive, causing
TypeError when subtracting from datetime.now(timezone.utc); update the check
around now = datetime.datetime.now(datetime.timezone.utc) and time_diff = now -
last_transition_time (the block that sets
task.TaskGroupStatus.FAILED_BACKEND_ERROR and exit_code =
task.ExitCode.FAILED_BACKEND_ERROR.value) to first coerce last_transition_time
to an aware timestamp (e.g., if last_transition_time.tzinfo is None then treat
it as UTC by attaching datetime.timezone.utc or convert it) before computing
time_diff, and apply the same fix in the ContainerStatusUnknown handling that
uses the same pattern so both comparisons are timezone-aware and safe.

952-960: ⚠️ Potential issue | 🟠 Major

Same timezone mismatch vulnerability exists here.

This code block has the identical pattern and will fail with the same TypeError if last_transition_time is timezone-naive. Apply the same timezone handling fix for consistency.

🐛 Proposed fix
                     if condition.type == 'Ready' and condition.status == 'False':
                         now = datetime.datetime.now(datetime.timezone.utc)
                         last_transition_time = condition.last_transition_time
                         if last_transition_time:
+                            # Ensure timezone-aware comparison
+                            if last_transition_time.tzinfo is None:
+                                last_transition_time = last_transition_time.replace(
+                                    tzinfo=datetime.timezone.utc)
                             time_diff = now - last_transition_time
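One way to keep both timeout checks consistent is to route them through a single predicate. This is a hedged sketch: `has_timed_out` and `BACKEND_ERROR_TIMEOUT` are hypothetical names, and the 30-minute threshold is taken from the `datetime.timedelta(minutes=30)` comparison quoted in this review rather than from the source file.

```python
import datetime

# Assumed threshold, based on the 30-minute comparison mentioned in the review.
BACKEND_ERROR_TIMEOUT = datetime.timedelta(minutes=30)


def has_timed_out(last_transition_time, now, timeout=BACKEND_ERROR_TIMEOUT):
    """Return True when strictly more than `timeout` has elapsed.

    Both datetimes must be timezone-aware. A missing transition time is
    treated as "not timed out".
    """
    if last_transition_time is None:
        return False
    if last_transition_time.utcoffset() is None or now.utcoffset() is None:
        raise ValueError("expected timezone-aware datetimes")
    return now - last_transition_time > timeout


# Usage with a fixed clock, which keeps the check deterministic in tests.
fixed_now = datetime.datetime(2026, 3, 11, 18, 0, tzinfo=datetime.timezone.utc)
print(has_timed_out(fixed_now - datetime.timedelta(minutes=31), fixed_now))
print(has_timed_out(fixed_now - datetime.timedelta(minutes=30), fixed_now))
```

Passing `now` as an argument, rather than reading the clock inside the function, is a design choice that makes the boundary case straightforward to test.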
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/operator/backend_listener.py` around lines 952 - 960, The same
timezone-naive vulnerability appears when computing time_diff using now =
datetime.datetime.now(datetime.timezone.utc) and condition.last_transition_time;
ensure last_transition_time is made timezone-aware before subtracting: check
condition.last_transition_time.tzinfo (or use hasattr) and if it's naive, set
its tzinfo to datetime.timezone.utc (e.g., via replace) so the subsequent
time_diff > datetime.timedelta(minutes=30) comparison is between two aware
datetimes; update the block that sets status =
task.TaskGroupStatus.FAILED_BACKEND_ERROR and exit_code =
task.ExitCode.FAILED_BACKEND_ERROR.value to use this normalized
last_transition_time.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 044b4958-d17d-4d25-90f2-883927bd9d5d

📥 Commits

Reviewing files that changed from the base of the PR and between b53f053 and a380b4b.

📒 Files selected for processing (1)
  • src/operator/backend_listener.py

Collaborator

@ethany-nv ethany-nv left a comment

Please fix title and description

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/operator/tests/test_pod_status_calculator.py (1)

36-49: LGTM! Timezone-aware UTC datetime changes are consistent.

The refactoring correctly ensures all code paths return timezone-aware UTC datetimes, aligning with the production code changes in backend_listener.py.

One minor consideration: the fallback on line 49 silently returns utc_now for invalid format strings (e.g., "now-5d" or malformed input). Since this is a test helper, you might want to raise an error for unsupported formats to catch test configuration mistakes early.

💡 Optional: Add explicit error for unsupported formats
         elif parts.endswith('h'):
             hours = int(parts[:-1])
             return utc_now - datetime.timedelta(hours=hours)
-    return utc_now
+        else:
+            raise ValueError(f"Unsupported time format: {time_str}")
+    raise ValueError(f"Unsupported time format: {time_str}")
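For reference, a self-contained sketch of what a strict version of the helper could look like. It is not the repository's implementation: only the 'now', 'now-<n>m', and 'now-<n>h' formats come from the review text, and the optional `utc_now` parameter is an assumption added here so the example is deterministic.

```python
import datetime


def parse_time_string(time_str, utc_now=None):
    """Parse 'now', 'now-<n>m', or 'now-<n>h' into an aware UTC datetime.

    Any other input raises ValueError instead of silently returning "now".
    The utc_now parameter is hypothetical and exists for reproducible tests.
    """
    if utc_now is None:
        utc_now = datetime.datetime.now(datetime.timezone.utc)
    if time_str == 'now':
        return utc_now
    if time_str.startswith('now-'):
        parts = time_str[len('now-'):]
        amount, unit = parts[:-1], parts[-1:]
        if amount.isdecimal():
            if unit == 'm':
                return utc_now - datetime.timedelta(minutes=int(amount))
            if unit == 'h':
                return utc_now - datetime.timedelta(hours=int(amount))
    # Unknown unit, missing number, or unrelated string.
    raise ValueError(f"Unsupported time format: {time_str}")


base = datetime.datetime(2026, 3, 11, 12, 0, tzinfo=datetime.timezone.utc)
print(parse_time_string('now-5m', base))
print(parse_time_string('now-2h', base))
```

With this shape, an input such as "now-5d" fails the test immediately rather than being interpreted as the current time.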
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/operator/tests/test_pod_status_calculator.py` around lines 36 - 49, The
helper parse_time_string currently falls back to returning utc_now for unknown
or malformed inputs; instead change it to raise a clear ValueError for
unsupported formats so test misconfiguration fails fast. In parse_time_string,
after handling 'now', 'now-<n>m', and 'now-<n>h' cases, replace the final return
utc_now with raising ValueError(f"Unsupported time string: {time_str}"), and
ensure any int conversion/parsing paths propagate errors (or convert parse
errors into the same ValueError) so inputs like "now-5d" or malformed strings
trigger a test failure.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4b0e1931-4749-49a3-8702-0711632884ec

📥 Commits

Reviewing files that changed from the base of the PR and between a380b4b and 796ddf5.

📒 Files selected for processing (1)
  • src/operator/tests/test_pod_status_calculator.py

@xutongNV xutongNV enabled auto-merge (squash) March 11, 2026 20:02
@xutongNV xutongNV merged commit a484225 into main Mar 11, 2026
9 checks passed
@xutongNV xutongNV deleted the xutongr/fic branch March 11, 2026 20:29
4 participants