@pvditt
Contributor

@pvditt pvditt commented Nov 20, 2025

Why are the changes needed?

Pods can get stuck in Terminating essentially indefinitely because terminationGracePeriodSeconds is incorrectly cast from the config value.

What changes were proposed in this pull request?

Correctly cast the config value for terminationGracePeriodSeconds.

How was this patch tested?

This fix has been running in Union clusters for a while.

Labels

Please add one or more of the following labels to categorize your PR:

  • added: For new features.
  • changed: For changes in existing functionality.
  • deprecated: For soon-to-be-removed features.
  • removed: For features being removed.
  • fixed: For any bug fixed.
  • security: In case of vulnerabilities

This is important to improve the readability of release notes.

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

## Overview
Some v2 pods in demo were stuck in Terminating. Their deletionTimestamp was in the far future because terminationGracePeriodSeconds was set to 3600000000000, which matches a one-hour duration expressed in nanoseconds rather than the 3600 seconds the field expects.

## Test Plan
Ran locally.

## Rollout Plan (if applicable)
managed-all

## Upstream Changes
Should this change be upstreamed to OSS (flyteorg/flyte)? If not, please uncheck this box, which is used for auditing. Note, it is the responsibility of each developer to actually upstream their changes. See [this guide](https://unionai.atlassian.net/wiki/spaces/ENG/pages/447610883/Flyte+-+Union+Cloud+Development+Runbook/#When-are-versions-updated%3F).
- [x] To be upstreamed to OSS

## Issue
fixes: https://linear.app/unionai/issue/BB-6136/demo-pods-stuck-terminating

## Checklist
* [ ] Added tests
* [ ] Ran a deploy dry run and shared the terraform plan
* [ ] Added logging and metrics
* [ ] Updated [dashboards](https://unionai.grafana.net/dashboards) and [alerts](https://unionai.grafana.net/alerting/list)
* [ ] Updated documentation

Signed-off-by: Paul Dittamo <[email protected]>
@flyte-bot
Collaborator

Bito Automatic Review Skipped - Draft PR

Bito didn't auto-review because this pull request is in draft status.
No action is needed if you didn't intend for the agent to review it. Otherwise, to manually trigger a review, type /review in a comment and save.
You can change draft PR review settings here, or contact your Bito workspace admin at [email protected].

@codecov

codecov bot commented Nov 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.72%. Comparing base (bca4499) to head (7f9315a).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6753   +/-   ##
=======================================
  Coverage   59.71%   59.72%           
=======================================
  Files         929      929           
  Lines       58011    58012    +1     
=======================================
+ Hits        34642    34646    +4     
+ Misses      20214    20211    -3     
  Partials     3155     3155           
| Flag | Coverage Δ |
|---|---|
| unittests-datacatalog | 60.30% <ø> (ø) |
| unittests-flyteadmin | 57.86% <ø> (ø) |
| unittests-flytecopilot | 43.16% <ø> (ø) |
| unittests-flytectl | 65.36% <ø> (+0.05%) ⬆️ |
| unittests-flyteidl | 78.64% <ø> (ø) |
| unittests-flyteplugins | 62.05% <100.00%> (+<0.01%) ⬆️ |
| unittests-flytepropeller | 55.55% <ø> (ø) |
| unittests-flytestdlib | 64.02% <ø> (ø) |

