
fix(controller): direct api to avoid stale promo object #5754

Open
shamsalmon wants to merge 1 commit into akuity:release-1.9 from shamsalmon:5282_promo_staleness_fix_v2

@shamsalmon

Closes: #5282

I am sure there is a more elegant solution than this, but I wanted to get a fix out there. We are still testing this, but so far it is looking solid.

I noted on the issue why I believe this is occurring and I can confirm the "direct" api object is correct. However my initial assumptions were wrong.

Here is a timeline:

  1. Promotion starts
  2. "Running" - Status update sent
  3. Promotion works on something that requires a wait (git-wait-for-pr, argocd-sync) - Sends another status update
  4. An immediate reconciliation is triggered, but the informer does not yet have the latest status / metadata.
  5. Reconciliation runs against the stale / cached Promotion object, which has no metadata

Using the apiReader (which already exists), we can ensure that we are getting the real, uncached version of the object.
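
For illustration only, the timeline above can be modeled with a minimal, self-contained Go sketch. Everything in it (`Promotion`, `StepState`, `apiServer`, `cachedReader`, `runTimeline`) is a hypothetical stand-in and not Kargo's or controller-runtime's actual code; it only shows how a read through a lagging cache can miss a write that a direct read already sees.

```go
package main

import "fmt"

// Promotion is a hypothetical, stripped-down stand-in for the real resource.
type Promotion struct {
	Phase     string
	StepState string // hypothetical: state recorded by a step that has to wait
}

// apiServer models the source of truth.
type apiServer struct {
	objects map[string]Promotion
}

func (s *apiServer) Get(name string) Promotion       { return s.objects[name] }
func (s *apiServer) Update(name string, p Promotion) { s.objects[name] = p }

// cachedReader models an informer-style cache that only catches up
// with the server when Sync is called.
type cachedReader struct {
	server   *apiServer
	snapshot map[string]Promotion
}

func (c *cachedReader) Get(name string) Promotion { return c.snapshot[name] }

func (c *cachedReader) Sync() {
	for name, p := range c.server.objects {
		c.snapshot[name] = p
	}
}

// runTimeline replays the five steps and returns what a cached read and a
// direct read each observe during the immediate re-reconciliation.
func runTimeline() (stale, fresh Promotion) {
	server := &apiServer{objects: map[string]Promotion{}}
	cache := &cachedReader{server: server, snapshot: map[string]Promotion{}}

	// Steps 1-2: the Promotion starts and "Running" is written; the cache sees it.
	server.Update("promo-1", Promotion{Phase: "Running"})
	cache.Sync()

	// Step 3: a waiting step writes another status update.
	server.Update("promo-1", Promotion{Phase: "Running", StepState: "waiting-for-pr"})

	// Steps 4-5: reconciliation runs again before the cache has caught up.
	stale = cache.Get("promo-1")  // cached read: step state is missing
	fresh = server.Get("promo-1") // direct read: step state is present
	return stale, fresh
}

func main() {
	stale, fresh := runTimeline()
	fmt.Printf("cached read: %+v\n", stale)
	fmt.Printf("direct read: %+v\n", fresh)
}
```

In controller-runtime terms, the cached path roughly corresponds to the manager's default client and the direct path to the reader returned by `GetAPIReader()`. The trade-off is an extra round trip to the API server on each reconcile.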

@shamsalmon shamsalmon requested a review from a team as a code owner February 18, 2026 00:14
@netlify

netlify bot commented Feb 18, 2026

Deploy Preview for docs-kargo-io ready!

Name Link
🔨 Latest commit d8466f0
🔍 Latest deploy log https://app.netlify.com/projects/docs-kargo-io/deploys/69950d71ee3d220008240fdb
😎 Deploy Preview https://deploy-preview-5754.docs.kargo.io

Signed-off-by: sshannon <sshannon@Beeswax.com>
@shamsalmon shamsalmon force-pushed the 5282_promo_staleness_fix_v2 branch from 6141a4a to d8466f0 Compare February 18, 2026 00:53
@shamsalmon shamsalmon changed the title from "fix(api): direct api to avoid stale promo object" to "fix(controller): direct api to avoid stale promo object" Feb 18, 2026
@krancour
Member

@shamsalmon you lost me at step 4. How or why is the Promotion being re-reconciled immediately? I agree this could theoretically cause the problem you've been observing, but I can't quite see how or why immediate re-reconciliation would happen in the first place.

@krancour
Member

Hmmm. Is it from mashing refresh maybe?

@krancour
Member

Or I suppose the potential exists that hitting it even once could cause this.

@shamsalmon
Author

I am sure my users are being aggressive with the refresh button, but I've seen this happen just by chance after pressing it once, as you say.

@shamsalmon
Author

I think several factors probably make this worse, including latency from the Kargo controller to the targeted cluster in a sharded setup. We probably run 50-200 promotions a day (100ish users) and see this happen once or twice a day.

@krancour
Member

I'm fairly certain the refresh has something to do with this. For one, it could mean a promo currently being reconciled is added to the work queue, so that there is that immediate re-reconciliation you spoke of. Two, I believe handling of the refresh at the head end of the reconciliation process involves a programmatic immediate requeue. I don't have the code in front of me at the moment, but I plan to dig into whether or how these are, individually or in combination, contributing to this. We can potentially consider your proposed change as a stopgap, but I feel we're really close to the smoking gun here and may be able to do something more strategic.
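
One way to picture the first possibility above is a simplified model of a deduplicating work queue, loosely patterned on the dirty/processing bookkeeping in client-go's `workqueue` package. This is a hedged sketch, not Kargo's code: the `queue` type and `simulateRefreshDuringReconcile` are hypothetical names, and whether this path is what actually triggers the stale read is still the open question in this thread.

```go
package main

import "fmt"

// queue is a hypothetical, single-threaded model of a deduplicating work queue.
type queue struct {
	order      []string        // keys waiting to be handed out
	dirty      map[string]bool // keys that need (re)processing
	processing map[string]bool // keys currently being reconciled
}

func newQueue() *queue {
	return &queue{dirty: map[string]bool{}, processing: map[string]bool{}}
}

// Add marks a key as needing work. If the key is already being processed,
// it is not queued now; it is re-queued when Done is called.
func (q *queue) Add(key string) {
	if q.dirty[key] {
		return
	}
	q.dirty[key] = true
	if q.processing[key] {
		return
	}
	q.order = append(q.order, key)
}

// Get hands out the next key and marks it as being processed.
func (q *queue) Get() (string, bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key := q.order[0]
	q.order = q.order[1:]
	q.processing[key] = true
	delete(q.dirty, key)
	return key, true
}

// Done finishes processing. A key that was added again in the meantime
// goes straight back onto the queue.
func (q *queue) Done(key string) {
	delete(q.processing, key)
	if q.dirty[key] {
		q.order = append(q.order, key)
	}
}

func (q *queue) Len() int { return len(q.order) }

// simulateRefreshDuringReconcile adds a key while it is being processed and
// reports the queue length during and after that reconcile.
func simulateRefreshDuringReconcile() (during, after int) {
	q := newQueue()
	q.Add("promo-1")
	key, _ := q.Get() // reconcile starts
	q.Add("promo-1")  // a refresh arrives mid-reconcile
	during = q.Len()
	q.Done(key) // reconcile ends; the key is re-queued right away
	after = q.Len()
	return during, after
}

func main() {
	during, after := simulateRefreshDuringReconcile()
	fmt.Printf("queued during reconcile: %d, queued after Done: %d\n", during, after)
}
```

Under this model, a refresh that lands mid-reconcile makes the same key available again as soon as the current pass finishes, which leaves very little time for a cache to observe the status written during that pass.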

@shamsalmon
Author

Sounds good! I would dig into this more but I need to switch my focus onto some other priorities. If you do have another idea I would be happy to test it out for you.
