[Backport 2025.1] fix(aws): Make cloud-init wait for the network device #977
Draft
scylladbbot wants to merge 1 commit into
On some AWS instances, a race condition can occur where cloud-init starts before the ENA network interface is fully initialized. This can cause cloud-init to fail when it tries to fetch metadata or perform other network-dependent tasks.

This commit addresses the issue by introducing a systemd override that forces the `cloud-init-local` service to wait until the `eth0` device is available. This ensures the network device is present before cloud-init runs.

Additionally, the `ena` kernel module is now preloaded in initramfs so that it is available as early as possible during the boot process.

Closes: SMI-243

(cherry picked from commit 232c519)
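The two changes described above can be sketched roughly as follows. This is a minimal illustration, not the contents of the actual commit: the drop-in file name `10-wait-for-network-device.conf`, the use of `Wants=` rather than `Requires=`, the `/etc/initramfs-tools/modules` location (Debian/Ubuntu initramfs-tools), and all helper function names are assumptions made for the example.

```python
from pathlib import Path


def device_unit(ifname: str) -> str:
    # systemd exposes network interfaces as device units with this naming
    # scheme. Names containing special characters (e.g. "-") would need
    # systemd-escape; plain "eth0" does not.
    return f"sys-subsystem-net-devices-{ifname}.device"


def render_dropin(ifname: str = "eth0") -> str:
    # Wants= pulls the device unit into the transaction and After= orders
    # cloud-init-local behind it. (Assumption: the real override may use
    # different directives.)
    unit = device_unit(ifname)
    return "\n".join(["[Unit]", f"Wants={unit}", f"After={unit}", ""])


def install_dropin(root: Path, ifname: str = "eth0") -> Path:
    # Hypothetical file name; only the "<unit>.d/*.conf" layout is systemd's.
    dropin_dir = root / "etc/systemd/system/cloud-init-local.service.d"
    dropin_dir.mkdir(parents=True, exist_ok=True)
    path = dropin_dir / "10-wait-for-network-device.conf"
    path.write_text(render_dropin(ifname))
    return path


def ensure_module(modules_text: str, module: str = "ena") -> str:
    # Append the module to an initramfs-tools style modules list
    # (one module name per line) unless it is already listed.
    lines = modules_text.splitlines()
    if module not in (line.strip() for line in lines):
        lines.append(module)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dropin("eth0"))
    print(ensure_module("# modules to include in the initramfs\n"))
```

On a real image, the drop-in would only take effect after `systemctl daemon-reload` (or a reboot), and a changed modules list would only take effect after the initramfs is regenerated, for example with `update-initramfs -u` on initramfs-tools based systems.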
Author
@fruch - This PR has conflicts, therefore it was moved to
Collaborator
@vponomaryov can you pick up this backport, fix the conflict, and give it a quick test with i3en, as in the original issue? Be in sync with @dimakr for help on building AMIs in repo
Testing:
🟢 https://jenkins.scylladb.com/job/scylla-staging/job/fruch/job/artifacts-ami-test/50/
🟢 https://jenkins.scylladb.com/job/scylla-staging/job/fruch/job/artifacts-ami-test/51/
🟢 https://jenkins.scylladb.com/job/scylla-staging/job/fruch/job/artifacts-ami-test/52/
🟢 - i8g.48xlarge https://jenkins.scylladb.com/job/scylla-staging/job/fruch/job/artifacts-ami-arm-test/13/
(cherry picked from commit 232c519)
Parent PR: #858