Description
Is there an existing issue for this?
- I have searched the existing issues
Describe the bug
When bootstrapping multiple (e.g. 3-5) AWS accounts through the ADF account creation mechanism, the adf-account-bootstrapping Step Functions state machine (SFN) fails to bootstrap the accounts. The adf-bootstrapping-jump-role-manager Lambda function performs too many read operations against the Organizations service, which results in a TooManyRequestsException.
As a result, bootstrapping fails and must be re-triggered manually several times until it eventually succeeds.
Expected Behavior
The adf-account-bootstrapping SFN should bootstrap the accounts without errors or manual re-triggers.
Current Behavior
Bootstrapping multiple accounts at once results in the following error in the adf-account-bootstrapping SFN:
{
"error": "Task failed. Granting the ADF Account-Bootstrapping Jump Role privileged cross-account access failed due to an error: An error occurred (TooManyRequestsException) when calling the ListParents operation (reached max retries: 4): AWS Organizations can't complete your request because another request is already in progress. Try again later.."
}
Steps To Reproduce
- Have a relatively large AWS organization (in our case ~500 accounts)
- Add 3-5 accounts to the definition file for ADF account provisioning in the aws-deployment-framework-bootstrap repository
- Wait until aws-deployment-framework-bootstrap-pipeline triggers the adf-account-bootstrapping SFN
- The SFN fails with the TooManyRequestsException error shown under Current Behavior
Possible Solution
- Implement error handling and a retry mechanism for TooManyRequestsException
- Reduce the number of read operations against the Organizations service during the adf-bootstrapping-jump-role-manager Lambda execution (as far as I could see from the logs, the whole organization is traversed in each SFN execution, even though each execution relates to only a single account)
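A rough sketch of how the two ideas above could look in Python. This is not the actual ADF code. The helper names (`error_code`, `call_with_backoff`, `parent_chain`), the retry defaults, and the inclusion of `ThrottlingException` in the set of retried codes are assumptions made for illustration. The only AWS call assumed is the Organizations `list_parents(ChildId=...)` operation, which returns a `Parents` list whose entries carry `Id` and `Type`. The idea is to retry throttled calls with exponential backoff and jitter, and to walk upwards from the single account being bootstrapped instead of traversing the whole organization.

```python
import random
import time

# Error codes treated as throttling. TooManyRequestsException is the one from
# this report; ThrottlingException is included as an assumption.
THROTTLING_CODES = {"TooManyRequestsException", "ThrottlingException"}


def error_code(exc):
    """Return the AWS error code of a botocore-style exception, if it has one."""
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        return response.get("Error", {}).get("Code")
    return None


def call_with_backoff(func, *args, max_attempts=8, base_delay=0.5,
                      max_delay=20.0, sleep=time.sleep, **kwargs):
    """Call func, retrying throttling errors with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # Re-raise anything that is not throttling, or the final failure.
            if error_code(exc) not in THROTTLING_CODES or attempt == max_attempts:
                raise
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))  # full jitter spreads out parallel retries


def parent_chain(org_client, account_id, cache=None, **retry_options):
    """Walk from one account up to the root, one ListParents call per level.

    Returns the parent IDs from the closest OU up to the root. Results are
    stored in `cache` so repeated lookups do not call the API again.
    """
    cache = {} if cache is None else cache
    chain, child_id = [], account_id
    while True:
        if child_id not in cache:
            response = call_with_backoff(
                org_client.list_parents, ChildId=child_id, **retry_options
            )
            cache[child_id] = response["Parents"][0]
        parent = cache[child_id]
        chain.append(parent["Id"])
        if parent["Type"] == "ROOT":
            return chain
        child_id = parent["Id"]
```

With a real client this would be called as `parent_chain(boto3.client("organizations"), account_id)`. A cache only helps within a single Lambda invocation (or a warm container), so the backoff is still needed when several SFN executions run in parallel. Raising the retry limit through the client configuration, for example `botocore.config.Config(retries={"max_attempts": 10, "mode": "adaptive"})`, may be a simpler alternative to a hand-written retry loop.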
Additional Information/Context
No response
ADF Version
4.0.0
Contributing a fix?
- Yes, I am working on a fix to resolve this issue