-
Notifications
You must be signed in to change notification settings - Fork 70
Fix NVMe raw_instance_storage device enumeration for all instance families #196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
neddp
wants to merge
8
commits into
master
Choose a base branch
from
fix-nvme-raw-instance-storage
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+68
−59
Draft
Changes from all commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
8040fef
Update NVMe device path calculation and adjust test expectations for …
neddp 60cbe25
Enhance NVMe storage tests to cover all instance families and improve…
neddp 5a9018a
Update block device mappings to exclude auto-attached NVMe instance s…
neddp 580ef4c
Fix integration test expectation for NVMe raw ephemeral device path
neddp 9d29a70
Remove unnecessary comments
neddp d7b36a3
Refactor raw ephemeral device handling
neddp 56e8ef0
Update NVMe device paths in tests to match simplified hints
neddp 5d94f59
Update raw ephemeral device path to match NVMe instance storage
neddp File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,9 +3,10 @@ class BlockDeviceManager | |
| DEFAULT_INSTANCE_STORAGE_DISK_MAPPING = { device_name: '/dev/sdb', virtual_name: 'ephemeral0' }.freeze | ||
| NVME_EBS_BY_ID_DEVICE_PATH_PREFIX = '/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_' | ||
|
|
||
| # Newer, nitro-based instances use NVMe storage volumes. | ||
| # Instance families that use NVMe device naming (/dev/nvme*). | ||
| # This includes Nitro-based instances and some Xen-based instances with NVMe storage (e.g., i3 family). | ||
| # https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#ec2-nitro-instances | ||
| NVME_INSTANCE_FAMILIES = %w[a1 c5 c5a c5ad c5d c5n c6a c6g c6gd c6gn c6i c6id c6in c7i c7a d3 d3en g4dn g4ad g5 g6 g6e i3en i4i inf1 m5 m5a m5ad m5d m5dn m5n m5zn m6a m6g m6gd m6i m6id m6idn m6in m7i m7a m7i-flex p3dn p4d p5 r5 r5a r5ad r5b r5d r5dn r5n r6a r6g r6gd r6i r6in r6id r6idn r7i r7a r7iz t3 t3a t4g z1d x2iezn].freeze | ||
| NVME_INSTANCE_FAMILIES = %w[a1 c5 c5a c5ad c5d c5n c6a c6g c6gd c6gn c6i c6id c6in c7i c7a d3 d3en g4dn g4ad g5 g6 g6e i3 i3en i4i inf1 m5 m5a m5ad m5d m5dn m5n m5zn m6a m6g m6gd m6i m6id m6idn m6in m7i m7a m7i-flex p3dn p4d p5 r5 r5a r5ad r5b r5d r5dn r5n r6a r6g r6gd r6i r6in r6id r6idn r7i r7a r7iz t3 t3a t4g z1d x2iezn].freeze | ||
|
|
||
| def initialize(logger, stemcell, vm_cloud_props) | ||
| @logger = logger | ||
|
|
@@ -59,7 +60,9 @@ def self.requires_nvme_device(instance_type) | |
|
|
||
| def mappings(info) | ||
| instance_type = @vm_cloud_props.instance_type.nil? ? 'unspecified' : @vm_cloud_props.instance_type | ||
| if instance_type =~ /^i3\./ || instance_type =~ /^i3en\./ | ||
| # For NVMe instances with instance storage, AWS auto-attaches the disks, | ||
| # so we don't include them in the block device mappings | ||
| if BlockDeviceManager.requires_nvme_device(instance_type) | ||
| info = info.reject { |device| device[:bosh_type] == 'raw_ephemeral' } | ||
| end | ||
|
|
||
|
|
@@ -150,34 +153,37 @@ def ephemeral_disk_mapping(instance_type, disk_info) | |
| result | ||
| end | ||
|
|
||
| def first_raw_ephemeral_device | ||
| instance_type = @vm_cloud_props.instance_type.nil? ? 'unspecified' : @vm_cloud_props.instance_type | ||
| case @virtualization_type | ||
|
|
||
| when 'hvm' | ||
| if instance_type =~ /^i3\./ || instance_type =~ /^i3en\./ | ||
| '/dev/nvme0n1' | ||
| else | ||
| '/dev/xvdba' | ||
| end | ||
| when 'paravirtual' | ||
| '/dev/sdc' | ||
| else | ||
| raise Bosh::Clouds::CloudError, "unknown virtualization type #{@virtualization_type}" | ||
| end | ||
| end | ||
|
|
||
| def raw_instance_mappings(num_of_devices) | ||
| next_device = first_raw_ephemeral_device | ||
|
|
||
| # Device hints for raw ephemeral disks: | ||
| # - NVMe instances: Agent discovers instance storage at runtime (enumeration order varies) | ||
| # by excluding EBS volumes identified via /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_* | ||
| # These device_name hints are informational only - agent ignores them and uses discovery. | ||
| # - Paravirtual/HVM instances: Agent trusts these device paths (enumeration is deterministic) | ||
| # | ||
| # In all cases, the count (num_of_devices) must be correct to trigger setup. | ||
|
|
||
| instance_type = @vm_cloud_props.instance_type || 'unspecified' | ||
| requires_nvme = BlockDeviceManager.requires_nvme_device(instance_type) | ||
|
|
||
| num_of_devices.times.map do |index| | ||
| result = { | ||
| { | ||
| virtual_name: "ephemeral#{index}", | ||
| device_name: next_device, | ||
| bosh_type: 'raw_ephemeral', | ||
| device_name: raw_ephemeral_device_name(index, requires_nvme), | ||
| bosh_type: 'raw_ephemeral' | ||
| } | ||
| next_device = next_raw_ephemeral_disk(next_device) | ||
| result | ||
| end | ||
| end | ||
|
|
||
| def raw_ephemeral_device_name(index, requires_nvme) | ||
| if requires_nvme | ||
| # Simple sequential hints - agent will discover actual devices via EBS symlink exclusion | ||
| "/dev/nvme#{index}n1" | ||
| elsif @virtualization_type == 'paravirtual' | ||
| "/dev/sd#{(99 + index).chr}" # 99 is 'c'.ord - starts at sdc, sdd, sde... | ||
| elsif @virtualization_type == 'hvm' | ||
| "/dev/xvdb#{(97 + index).chr}" # 97 is 'a'.ord - starts at xvdba, xvdbb... | ||
| else | ||
| raise Bosh::Clouds::CloudError, "unknown virtualization type #{@virtualization_type}" | ||
| end | ||
| end | ||
|
Comment on lines
+177
to
188
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🧹 Nitpick | 🔵 Trivial Consider using character literals instead of magic numbers for clarity. While the comments explain the values, using ♻️ Proposed refactor def raw_ephemeral_device_name(index, requires_nvme)
if requires_nvme
# Simple sequential hints - agent will discover actual devices via EBS symlink exclusion
"/dev/nvme#{index}n1"
elsif `@virtualization_type` == 'paravirtual'
- "/dev/sd#{(99 + index).chr}" # 99 is 'c'.ord - starts at sdc, sdd, sde...
+ "/dev/sd#{('c'.ord + index).chr}" # starts at sdc, sdd, sde...
elsif `@virtualization_type` == 'hvm'
- "/dev/xvdb#{(97 + index).chr}" # 97 is 'a'.ord - starts at xvdba, xvdbb...
+ "/dev/xvdb#{('a'.ord + index).chr}" # starts at xvdba, xvdbb...
else
raise Bosh::Clouds::CloudError, "unknown virtualization type #{`@virtualization_type`}"
end
end🤖 Prompt for AI Agents |
||
|
|
||
|
|
@@ -225,16 +231,6 @@ def root_device_name | |
| end | ||
| end | ||
|
|
||
| def next_raw_ephemeral_disk(current_disk) | ||
| if current_disk =~ /^\/dev\/nvme/ | ||
| disk_id = /^\/dev\/nvme(\d+)n.*/.match(current_disk)[1] | ||
| disk_id = disk_id.next | ||
| "/dev/nvme#{disk_id}n1" | ||
| else | ||
| current_disk.next | ||
| end | ||
| end | ||
|
|
||
| class DiskInfo | ||
| INSTANCE_TYPE_DISK_MAPPING = { | ||
| # previous generation | ||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
AWS EC2 Nitro instance types 2025💡 Result:
AWS EC2 Nitro instance types encompass all current-generation instances built on the AWS Nitro System, which powers the majority of modern EC2 offerings as of 2026. Xen-based instances are legacy and not Nitro-based. The official AWS documentation lists Nitro-based instances across categories. Nitro versions range from v2 to v6, with newer 2025 announcements on v6 (e.g., M8gn, M8gb, C8a, R8i, I7i, I8ge). Key lists from docs: Virtualized Nitro instances (Nitro v6 highlighted as latest): - General Purpose: M8a, M8azn, M8gb, M8gn, M8i, M8id, M8i-flex; earlier like M5*, M6*, M7*, T3/T3a/T4g - Compute Optimized: C8a, C8gb, C8gn, C8i, C8id, C8i-flex; earlier C5*, C6*, C7* - Memory Optimized: R8a, R8gb, R8gn, R8i, R8id, R8i-flex, X8aedz, X8i; earlier R5*, R6*, R7*, U*, X* - Storage Optimized: I8ge; earlier I3en, I4*, I7* - Accelerated: G7e, P6-B200, P6-B300; many others like G4*, G5*, G6*, Inf*, P4*, P5*, Trn* - HPC: Hpc8a; earlier Hpc6*, Hpc7* Bare metal Nitro instances mirror virtualized but dedicated hardware, e.g., M8i.metal, C8i.metal, etc. 2025 Nitro highlights: M8gn/M8gb (Dec 2025, Graviton4, up to 600Gbps net), C8a (Dec 2025, AMD EPYC), R8i/R8i-flex (Sep 2025, Intel Xeon6), I7i (Apr 2025, storage opt), I8ge (Aug 2025). Full current Nitro-based from UserGuide: M5/M5a/M5ad/M5d/M5dn/M5n/M5zn, M6a/M6g/M6gd/M6i/M6id/M6idn/M6in, M7a/M7g/M7gd/M7i/M7i-flex/M8*, T3/T3a/T4g (general); similar for other categories up to *8 series; storage D3/D3en/I3en/I4g/I4i/I7i/I7ie/I8g/I8ge; accel DL1/DL2q/G4ad/G4dn/G5/G5g/G6/G6e/G6f/Gr6/Gr6f/G7e/Inf1/Inf2/P4d/P4de/P5/P5e/P5en/P6-B200/P6-B300/P6e-GB200/Trn1/Trn1n/Trn2/Trn2u/VT1; HPC Hpc6a/Hpc6id/Hpc7a/Hpc7g/Hpc8a. All post-2017 launches are Nitro-based. For complete specs/sizes/regions, see AWS docs.
Citations:
Update NVME_INSTANCE_FAMILIES to include current Nitro instance families.
The list is missing multiple instance families released in 2024-2025. According to AWS documentation, missing Nitro-based families include:
c7g,c8a,c8gb,c8gn,c8i,c8id,c8i-flexm7g,m8a,m8azn,m8gb,m8gn,m8i,m8id,m8i-flexr7g,r8a,r8gb,r8gn,r8i,r8id,r8i-flexi4g,i7i,i7ie,i8g,i8geg7e,p6-b200,p6-b300trn2,trn2uWithout these families, the CPI will generate incorrect device paths on new instance types. Consider either adding these families to the list, or implementing a safer fallback mechanism for unknown instance families on Nitro (e.g., detect NVMe device naming at runtime).
🤖 Prompt for AI Agents