PR: Fix bootstrap_resource_assignments race condition on concurrent pod restart #5003
Open
rakdutta wants to merge 7 commits into
…ock to prevent race conditions (fixes #4993) Signed-off-by: Rakhi Dutta <rakhibiswas@yahoo.com>
Signed-off-by: Rakhi Dutta <rakhibiswas@yahoo.com>
Signed-off-by: Rakhi Dutta <rakhibiswas@yahoo.com>
… for resource assignments Signed-off-by: Rakhi Dutta <rakhibiswas@yahoo.com>
…se conflicts in per-row commits Signed-off-by: Rakhi Dutta <rakhibiswas@yahoo.com>
…leanup, and conflict tracking Signed-off-by: Rakhi Dutta <rakhibiswas@yahoo.com>
Signed-off-by: Rakhi Dutta <rakhibiswas@yahoo.com>
Summary
Closes #4993. Prevents race conditions when multiple gateway pods concurrently assign orphaned resources during startup, while preserving the PgBouncer compatibility fix from #4051.
Problem
When multiple replicas restart simultaneously (e.g., Kubernetes rolling deployment), they race when assigning orphaned resources to the admin team, causing `IntegrityError` crashes due to unique constraint violations.
Why not use advisory locks?
Issue #4051 identified that advisory locks on the fast path cause indefinite hangs with PgBouncer in transaction pooling mode. Session-scoped locks get orphaned across backend handoffs, causing pods to spin for ~10 minutes until timeout. This was fixed in commit e4b245f by skipping advisory locks on the fast path.
Solution
Use per-row commits with `IntegrityError` exception handling instead of locks.
Key insight: Let the database enforce uniqueness constraints. Handle exceptions gracefully instead of preventing them with locks.
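A minimal sketch of the per-row commit pattern, for reviewers who want to see the idea in isolation. It is not the code in `mcpgateway/bootstrap_db.py`: it uses the standard-library `sqlite3` module instead of the project's database session, and the `resource_assignments` table, its columns, and the `assign_orphans` helper are hypothetical names chosen for illustration. The second call stands in for a second pod that lost the race on one row.

```python
import sqlite3


def make_db():
    # Hypothetical schema: one assignment per resource, enforced by the database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resource_assignments ("
        "resource_id TEXT PRIMARY KEY, team_id TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def assign_orphans(conn, team_id, resource_ids):
    """Assign each resource to team_id, committing one row at a time.

    A uniqueness conflict on one row is rolled back and counted; it does
    not abort the rows that come after it.
    """
    assigned, conflicts = 0, 0
    for rid in resource_ids:
        try:
            conn.execute(
                "INSERT INTO resource_assignments (resource_id, team_id) "
                "VALUES (?, ?)",
                (rid, team_id),
            )
            conn.commit()  # per-row commit: earlier rows are already durable
            assigned += 1
        except sqlite3.IntegrityError:
            conn.rollback()  # another writer already assigned this resource
            conflicts += 1
    return assigned, conflicts


conn = make_db()
first = assign_orphans(conn, "admin", ["r1", "r2"])   # first pod wins both rows
second = assign_orphans(conn, "admin", ["r2", "r3"])  # second pod conflicts on r2
total = conn.execute("SELECT COUNT(*) FROM resource_assignments").fetchone()[0]
print(first, second, total)
```

With a single batch commit, the conflict on `r2` would have rolled back `r3` as well. Committing per row confines the failure to the one row another writer already claimed.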
Changes
- `mcpgateway/bootstrap_db.py`: Added `IntegrityError` import, changed batch commit to per-row commits with exception handling
- `tests/unit/mcpgateway/test_bootstrap_db.py`: Updated test to verify fast path remains lock-free
Testing
Concurrency test (`scripts/test_issue_4993_fix.sh`):
Unit tests:
Performance Impact
Deployment Impact
Related Issues