Enhance the SubnetPort requeue logic #1364
Open
heypnus wants to merge 1 commit into vmware-tanzu:v9.1 from
In some cases, when the subnetport_controller hits a RealizeStateError, it updates the SubnetPort status with the detailed NSX SubnetPort path. That status update triggers another reconcile in the subnetport_controller, so within a few seconds the requeue wait time can back off to an extremely long value. As a result, the SubnetPort may need many minutes (up to 1000s) to become ready.

This patch alleviates the issue with the following fine-tunings:

1. Shorten the SubnetPort GC interval from 600s to 60s, so that stale SubnetPorts are garbage-collected more quickly and do not block the realization of a new SubnetPort with the same IP.
2. Make the retry for RealizeStateError smoother, i.e. ResultRequeueAfter60sec no longer triggers the backoff of the wait time.

This patch also adds a custom RateLimiter to facilitate troubleshooting such issues in the future.

```
Testing done:

1. Reproduce the issue by recreating the SubnetPort with the same name during NCP downtime; it can be observed within about 6 attempts. We can see that the SubnetPort CR was realized only after a long time:

Events:
  Type  Reason  Age  From  Message
  ----  ------  ----  ----  -------
  Warning  FailUpdate  29m  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_stpqs realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/b741d0bf-7bce-4074-a4be-925b960494be] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/482d2c2f-41ff-4173-8a94-8b9f6a5b30ed]]
  Warning  FailUpdate  29m  subnetport-controller  nsx error code: 503638, message: Port attachment ID 65613934-3139-4130-ad62-3337382d3436 used by another segment port /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_stpqs.
  Warning  FailUpdate  29m  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_o3bkg realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/af0033c0-d8e4-4d3a-9d03-19a573f268ec] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/482d2c2f-41ff-4173-8a94-8b9f6a5b30ed]]
  Warning  FailUpdate  28m  subnetport-controller  nsx error code: 503638, message: Port attachment ID 65613934-3139-4130-ad62-3337382d3436 used by another segment port /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_o3bkg.
  Warning  FailUpdate  28m  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_gzjx9 realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/45134073-2421-4fbd-b5b7-360ab3dd3b2b] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/482d2c2f-41ff-4173-8a94-8b9f6a5b30ed]]
  Warning  FailUpdate  28m  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_bwkuf realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/f46e5d3b-05f9-47e0-b4b4-56a1222711ec] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/482d2c2f-41ff-4173-8a94-8b9f6a5b30ed]]
  Warning  FailUpdate  27m  subnetport-controller  nsx error code: 503638, message: Port attachment ID 65613934-3139-4130-ad62-3337382d3436 used by another segment port /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_bwkuf.
  Warning  FailUpdate  27m  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_gh3g9 realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/e356a984-ddbd-42d5-9b92-4c9bca83efe9] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/482d2c2f-41ff-4173-8a94-8b9f6a5b30ed]]
  Warning  FailUpdate  26m  subnetport-controller  nsx error code: 503638, message: Port attachment ID 65613934-3139-4130-ad62-3337382d3436 used by another segment port /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_gh3g9.
  Warning  FailUpdate  26m (x16 over 26m)  subnetport-controller  (combined from similar events): /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_fze2o realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/9cc95b5e-ab76-4f5b-9a4e-36fc1ffeca29] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/482d2c2f-41ff-4173-8a94-8b9f6a5b30ed]]
  Normal  SuccessfulUpdate  9m54s (x2 over 9m54s)  subnetport-controller  SubnetPort CR has been successfully updated

2. Apply the patch, then retry step 1; the port realization completes in a much shorter time:

Events:
  Type  Reason  Age  From  Message
  ----  ------  ----  ----  -------
  Warning  FailUpdate  88s  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_ewnmn realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/036977a1-1ec4-4ebb-af79-70619034d0f8] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/a49c8e4c-6159-495e-b428-d75d5b9ee036]]
  Warning  FailUpdate  84s  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_sbo0q realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/85810d82-f7fa-49fc-b32f-8aca152f194a] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/a49c8e4c-6159-495e-b428-d75d5b9ee036]]
  Warning  FailUpdate  81s  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_owafm realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/19ec74da-3695-4329-8152-4d2777f9be36] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/a49c8e4c-6159-495e-b428-d75d5b9ee036]]
  Warning  FailUpdate  79s  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_ec08b realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/8d5e495e-dd3c-4787-9a57-b18399414345] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/a49c8e4c-6159-495e-b428-d75d5b9ee036]]
  Warning  FailUpdate  77s  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_1i31y realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/3eb943ab-9d71-4c10-8ca4-75b41c217d74] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/a49c8e4c-6159-495e-b428-d75d5b9ee036]]
  Warning  FailUpdate  75s  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_q7uy0 realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/2638d24b-eee8-4106-8e4a-aae0fa08a39d] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/a49c8e4c-6159-495e-b428-d75d5b9ee036]]
  Warning  FailUpdate  74s  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_77b5w realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/bd738add-cc8d-46e9-a364-9106c7c71a76] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/a49c8e4c-6159-495e-b428-d75d5b9ee036]]
  Warning  FailUpdate  73s  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_88uti realized with errors: []
  Warning  FailUpdate  72s  subnetport-controller  /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_e73az realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/f7a12d09-f81d-47e5-8062-b79a86719569] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/a49c8e4c-6159-495e-b428-d75d5b9ee036]]
  Warning  FailUpdate  5s (x4 over 71s)  subnetport-controller  (combined from similar events): /orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ports/subnetport-03o-testsq_pcypd realized with errors: [IpAddressAllocation path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/8599c2b6-2356-4b58-9067-e5c9b27d04c7] cannot be created due to the requested IP has been allocated to path=[/orgs/default/projects/project-quality/vpcs/kube-system_5t2au/subnets/vm-default-d13d1cc5_tpv2k/ip-pools/static-ipv4-default/ip-allocations/a49c8e4c-6159-495e-b428-d75d5b9ee036]]
  Normal  SuccessfulUpdate  4s (x2 over 4s)  subnetport-controller  SubnetPort CR has been successfully updated
```

(cherry picked from commit 922e910)