@herbertgoto

Issue #:
kubernetes-sigs/karpenter#1988

Description of changes:
This blueprint shows how to automatically resize EBS volumes based on the EC2 instance type that Karpenter provisions. EBS volume size requirements differ across instance types, and this pattern ensures that each node gets an appropriately sized root volume without manual intervention.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Contributor

@chrismld chrismld left a comment

hello Herbert, thanks for your contribution!! I left a few comments. For me, the blueprint didn't work, would you mind checking again? let me know if you'd like to troubleshoot with my setup :)

Contributor

@chrismld chrismld left a comment

thanks again Herb, the blueprint works for me now ... still, I'm requesting a few more changes to make some things even easier to understand. There's also one important change that we need in order to test the whole blueprint easily. And please don't resolve the conversations, as leaving them open helps me track what my feedback was :P

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
name: dynamic-disk-volume
Contributor

let's use different names for both the NodePool and the EC2NodeClass ... every time we make a big change to the repo we need to test each blueprint, and using different names will allow us to run and test all the commands in the blueprint without having to clean up or decide which variant to run.

Contributor

both EC2NodeClass still have the same name

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
name: dynamic-disk-volume
Contributor

same comment as al2023

Contributor

both EC2NodeClass still have the same name

## Purpose

This blueprint shows how to automatically resize EBS volumes based on the EC2 instance type that Karpenter provisions. EBS volume size requirements differ across instance types, and this pattern ensures that each node gets an appropriately sized root volume without manual intervention. Some use cases this pattern supports:
* Larger instances host more pods, so they need larger EBS volumes to store the corresponding container images.
Contributor

I'm still not quite convinced the problem we're solving is clear here. What I'd like to see is something like: "Let's say you have 10 pods with large storage needs that pull at least three different container images. With Karpenter you'll most likely end up with multiple node sizes, so you might be in trouble if the EBS volume doesn't have enough storage. This blueprint therefore helps you resize an EBS volume based on the node size. For instance, it assumes that if Karpenter launches a 4xlarge instance, the EBS volume size should be 500 GB, if it's a 6xlarge it should be 600 GB, etc."
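
To make that sizing assumption concrete: the base64 user data further down in this PR decodes to a script whose size lookup keys on the instance-size suffix. Below is a trimmed sketch of that lookup. The function name and the listed values come from the decoded script, but only a subset of its cases is reproduced, and the two calls at the bottom are illustrative usage rather than part of the PR.

```shell
#!/bin/bash
# Map an instance type (e.g. c6i.2xlarge) to a target EBS size in GB,
# keyed on the size suffix after the last dot.
# NOTE: only some of the cases from the PR's script are shown here, so
# sizes that the full script handles explicitly (e.g. 3xlarge) would hit
# the fallback in this trimmed version.
get_target_size_by_suffix() {
    local instance_type=$1
    local size_suffix
    size_suffix=$(echo "$instance_type" | sed 's/.*\.//')

    case $size_suffix in
        large)   echo 100 ;;
        xlarge)  echo 200 ;;
        2xlarge) echo 300 ;;
        4xlarge) echo 500 ;;
        6xlarge) echo 600 ;;
        *)       echo 100 ;;  # fallback for suffixes not matched above
    esac
}

# Illustrative calls (not part of the PR's script).
get_target_size_by_suffix c6i.2xlarge
get_target_size_by_suffix m5.4xlarge
```

Keying on the suffix means the instance family is ignored, so a `c6i.2xlarge` and an `r5.2xlarge` get the same target size.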

Contributor

You actually have this almost at the end:

"For example, if Karpenter provisioned a c6i.2xlarge instance, you should see that the /dev/xvda device has been automatically resized to 300GB (as per the sizing logic for 2xlarge instances), even though the initial EBS volume was created with only 20GB."

I like this, but by the time it shows up at the end it's already too late, don't you think?

Author

There are several issues around this:
aws/karpenter-provider-aws#2394
kubernetes-sigs/karpenter#1988

  • 180 👍

Contributor

yep, I'm aware, but my point here is more about the story line and not about the issues ... having this clarification before you start deploying will help users to understand what they're about to do

volumeType: gp3
deleteOnTermination: true
encrypted: true
userData: |
Contributor

Can we use the same placeholder (user-data = "<<BASE64_USER_DATA>>") as the Bottlerocket one?

Contributor

I'm not sure if the same script is going to work, but I think it helps to be consistent here and use the same pattern: either write a script and then inject it, or keep the whole script within the EC2NodeClass definition. Whichever approach you think works better long-term should be used in both. I personally think that having a separate script forces you to do a replacement every time you need to deploy a change, so I prefer the al2023 approach.

Author

I consolidated everything into one script, but it has to be passed differently to each OS.
The steps are the same in the guide.

Contributor

can you help me understand this bit: "but it has to be passed differently to each OS"?


</details>

Once you have deployed the `EC2NodeClass` and `NodePool` for Amazon Linux 2023 or Bottlerocket, proceed to deploy the test workload:
Contributor

can you talk about the needs of this workload? That will help confirm why it made sense to use 300 GB for a 2xlarge instance. For example: "each pod needs 30 GB, and only 10 can fit into a 2xlarge" or something like that.

Author

This is a test workload intended to show that the volume was actually resized.
Is it really relevant that the test complies with the use case?

Contributor

yes, it is, it helps to understand better what we're doing and showcasing here

apiVersion: apps/v1
kind: Deployment
metadata:
name: dynamic-disk-ebs-volume
Contributor

let's use a different workload per OS, as per my previous comment, to keep things separate and make it easier to test the whole blueprint :)

Author

I can duplicate the instructions in each section but it makes no sense to duplicate the deployment manifest since it is the same.

Contributor

the main concern is that at the moment we don't have an automated way to test all blueprints every time we need to make a major update, we might skip one or the other when doing a test ... perhaps we can just add a command to replace the nodeSelectors for each OS (or at least for one)?
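
One possible shape for such a command, as a sketch only: keep a placeholder in the shared Deployment manifest and substitute the NodePool name per OS with `sed`. The placeholder token, the manifest fragment, and both NodePool names below are hypothetical and not taken from this PR; `karpenter.sh/nodepool` is the label Karpenter sets on the nodes it launches.

```shell
#!/bin/bash
# Hypothetical manifest fragment with a placeholder for the NodePool name.
manifest='      nodeSelector:
        karpenter.sh/nodepool: <<NODEPOOL_NAME>>'

# Render the fragment for a given NodePool by replacing the placeholder.
render_for_nodepool() {
    printf '%s\n' "$manifest" | sed "s/<<NODEPOOL_NAME>>/$1/"
}

# Illustrative NodePool names, one per OS.
render_for_nodepool dynamic-disk-volume-al2023
render_for_nodepool dynamic-disk-volume-bottlerocket
```

With a full manifest in place of the fragment, the rendered output could be piped to `kubectl apply -f -`, so the same file serves both OS variants without duplicating the Deployment.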

@herbertgoto herbertgoto requested a review from chrismld August 16, 2025 22:24
Contributor

@chrismld chrismld left a comment

thanks for the new changes Herb, there are a few things that still need to be resolved and I added a comment

userData: |
[settings.bootstrap-containers.ebsresize]
mode = "once"
user-data = "IyEvYmluL2Jhc2gKc2V0IC1lCgpnZXRfaW1kc190b2tlbigpIHsKICAgIGN1cmwgLVggUFVUICJodHRwOi8vMTY5LjI1NC4xNjkuMjU0L2xhdGVzdC9hcGkvdG9rZW4iIFwKICAgICAgICAtSCAiWC1hd3MtZWMyLW1ldGFkYXRhLXRva2VuLXR0bC1zZWNvbmRzOiAyMTYwMCIgXAogICAgICAgIC0tbWF4LXRpbWUgMTAgLS1yZXRyeSAzCn0KCmdldF9tZXRhZGF0YSgpIHsKICAgIGxvY2FsIHBhdGg9JDEKICAgIGxvY2FsIHRva2VuPSQyCiAgICBjdXJsIC1IICJYLWF3cy1lYzItbWV0YWRhdGEtdG9rZW46ICR0b2tlbiIgXAogICAgICAgICJodHRwOi8vMTY5LjI1NC4xNjkuMjU0L2xhdGVzdC9tZXRhLWRhdGEvJHBhdGgiIFwKICAgICAgICAtLW1heC10aW1lIDEwIC0tcmV0cnkgMwp9CgpkZXRlY3Rfb3MoKSB7CiAgICBpZiBbIC1mIC9ldGMvYm90dGxlcm9ja2V0LXJlbGVhc2UgXTsgdGhlbgogICAgICAgIGVjaG8gImJvdHRsZXJvY2tldCIKICAgIGVsaWYgWyAtZiAvZXRjL29zLXJlbGVhc2UgXSAmJiBncmVwIC1xICJBbWF6b24gTGludXgiIC9ldGMvb3MtcmVsZWFzZTsgdGhlbgogICAgICAgIGVjaG8gImFsMjAyMyIKICAgIGZpCn0KCmdldF90YXJnZXRfc2l6ZV9ieV9zdWZmaXgoKSB7CiAgICBsb2NhbCBpbnN0YW5jZV90eXBlPSQxCiAgICBsb2NhbCBzaXplX3N1ZmZpeD0kKGVjaG8gJGluc3RhbmNlX3R5cGUgfCBzZWQgJ3MvLipcLi8vJykKCiAgICBjYXNlICRzaXplX3N1ZmZpeCBpbgogICAgICAgIG5hbm8pIGVjaG8gMjAgOzsKICAgICAgICBtaWNybykgZWNobyAzMCA7OwogICAgICAgIHNtYWxsKSBlY2hvIDQwIDs7CiAgICAgICAgbWVkaXVtKSBlY2hvIDYwIDs7CiAgICAgICAgbGFyZ2UpIGVjaG8gMTAwIDs7CiAgICAgICAgeGxhcmdlKSBlY2hvIDIwMCA7OwogICAgICAgIDJ4bGFyZ2UpIGVjaG8gMzAwIDs7CiAgICAgICAgM3hsYXJnZSkgZWNobyA0MDAgOzsKICAgICAgICA0eGxhcmdlKSBlY2hvIDUwMCA7OwogICAgICAgIDZ4bGFyZ2UpIGVjaG8gNjAwIDs7CiAgICAgICAgOHhsYXJnZXw5eGxhcmdlKSBlY2hvIDgwMCA7OwogICAgICAgIDEyeGxhcmdlKSBlY2hvIDEwMDAgOzsKICAgICAgICAxNnhsYXJnZXwxOHhsYXJnZSkgZWNobyAxMjAwIDs7CiAgICAgICAgMjR4bGFyZ2UpIGVjaG8gMTUwMCA7OwogICAgICAgIDMyeGxhcmdlfDQ4eGxhcmdlfDU2eGxhcmdlfDExMnhsYXJnZSkgZWNobyAyMDAwIDs7CiAgICAgICAgbWV0YWwpIGVjaG8gMTAwMCA7OwogICAgICAgICopIGVjaG8gMTAwIDs7CiAgICBlc2FjCn0KCnJlc2l6ZV9lYnNfZm9yX2luc3RhbmNlKCkgewogICAgbG9jYWwgb3NfdHlwZT0kKGRldGVjdF9vcykKICAgIGxvY2FsIGRldmljZQogICAgCiAgICBjYXNlICRvc190eXBlIGluCiAgICAgICAgYm90dGxlcm9ja2V0KQogICAgICAgICAgICBkZXZpY2U9Ii9kZXYveHZkYiIKICAgICAgICAgICAgOzsKICAgICAgICBhbDIwMjMpCiAgICAgICAgICAgIGRldmljZT0iL2Rldi94dmRhIgogICA
gICAgICAgICA7OwogICAgZXNhYwoKICAgIGxvY2FsIHRva2VuPSQoZ2V0X2ltZHNfdG9rZW4pCiAgICBpZiBbIC16ICIkdG9rZW4iIF07IHRoZW4KICAgICAgICBlY2hvICJGYWlsZWQgdG8gZ2V0IElNRFN2MiB0b2tlbiIKICAgICAgICByZXR1cm4gMQogICAgZmkKCiAgICBsb2NhbCBpbnN0YW5jZV9pZD0kKGdldF9tZXRhZGF0YSAiaW5zdGFuY2UtaWQiICIkdG9rZW4iKQogICAgbG9jYWwgcmVnaW9uPSQoZ2V0X21ldGFkYXRhICJwbGFjZW1lbnQvcmVnaW9uIiAiJHRva2VuIikKICAgIGxvY2FsIGluc3RhbmNlX3R5cGU9JChnZXRfbWV0YWRhdGEgImluc3RhbmNlLXR5cGUiICIkdG9rZW4iKQoKICAgIGlmIFsgLXogIiRpbnN0YW5jZV9pZCIgXSB8fCBbIC16ICIkcmVnaW9uIiBdIHx8IFsgLXogIiRpbnN0YW5jZV90eXBlIiBdOyB0aGVuCiAgICAgICAgZWNobyAiRmFpbGVkIHRvIGdldCByZXF1aXJlZCBtZXRhZGF0YSIKICAgICAgICByZXR1cm4gMQogICAgZmkKCiAgICBsb2NhbCB0YXJnZXRfc2l6ZT0kKGdldF90YXJnZXRfc2l6ZV9ieV9zdWZmaXggIiRpbnN0YW5jZV90eXBlIikKICAgIGVjaG8gIk9TOiAkb3NfdHlwZSwgSW5zdGFuY2U6ICRpbnN0YW5jZV90eXBlIC0+IFRhcmdldDogJHt0YXJnZXRfc2l6ZX1HQiIKCiAgICBsb2NhbCB2b2x1bWVfaWQ9JChhd3MgZWMyIGRlc2NyaWJlLWluc3RhbmNlcyAtLWluc3RhbmNlLWlkcyAkaW5zdGFuY2VfaWQgXAogICAgICAgIC0tcXVlcnkgJ1Jlc2VydmF0aW9uc1swXS5JbnN0YW5jZXNbMF0uQmxvY2tEZXZpY2VNYXBwaW5nc1s/RGV2aWNlTmFtZT09YCckZGV2aWNlJ2BdLkVicy5Wb2x1bWVJZCcgXAogICAgICAgIC0tb3V0cHV0IHRleHQgLS1yZWdpb24gJHJlZ2lvbikKCiAgICBpZiBbIC16ICIkdm9sdW1lX2lkIiBdIHx8IFsgIiR2b2x1bWVfaWQiID0gIk5vbmUiIF07IHRoZW4KICAgICAgICBlY2hvICJFcnJvcjogQ291bGQgbm90IGZpbmQgdm9sdW1lIGZvciBkZXZpY2UgJGRldmljZSIKICAgICAgICByZXR1cm4gMQogICAgZmkKCiAgICBsb2NhbCBjdXJyZW50X3NpemU9JChhd3MgZWMyIGRlc2NyaWJlLXZvbHVtZXMgLS12b2x1bWUtaWRzICR2b2x1bWVfaWQgXAogICAgICAgIC0tcXVlcnkgJ1ZvbHVtZXNbMF0uU2l6ZScgLS1vdXRwdXQgdGV4dCAtLXJlZ2lvbiAkcmVnaW9uKQoKICAgIGlmIFsgIiRjdXJyZW50X3NpemUiIC1nZSAiJHRhcmdldF9zaXplIiBdOyB0aGVuCiAgICAgICAgZWNobyAiVm9sdW1lIGFscmVhZHkgY29ycmVjdCBzaXplICgke2N1cnJlbnRfc2l6ZX1HQikiCiAgICAgICAgcmV0dXJuIDAKICAgIGZpCgogICAgZWNobyAiUmVzaXppbmcgdm9sdW1lIGZyb20gJHtjdXJyZW50X3NpemV9R0IgdG8gJHt0YXJnZXRfc2l6ZX1HQiIKICAgIGF3cyBlYzIgbW9kaWZ5LXZvbHVtZSAtLXZvbHVtZS1pZCAkdm9sdW1lX2lkIC0tc2l6ZSAkdGFyZ2V0X3NpemUgLS1yZWdpb24gJHJlZ2lvbgoKICAgICMgV2FpdCBmb3IgY29tcGxldGlvbgogICAgbG9jYWwgdGltZW9
1dD0zMDAKICAgIGxvY2FsIGVsYXBzZWQ9MAogICAgbG9jYWwgd2FpdF90aW1lPTIKCiAgICB3aGlsZSBbICRlbGFwc2VkIC1sdCAkdGltZW91dCBdOyBkbwogICAgICAgIGxvY2FsIHN0YXRlPSQoYXdzIGVjMiBkZXNjcmliZS12b2x1bWVzLW1vZGlmaWNhdGlvbnMgLS12b2x1bWUtaWRzICR2b2x1bWVfaWQgXAogICAgICAgICAgICAtLXF1ZXJ5ICdWb2x1bWVzTW9kaWZpY2F0aW9uc1swXS5Nb2RpZmljYXRpb25TdGF0ZScgLS1vdXRwdXQgdGV4dCAtLXJlZ2lvbiAkcmVnaW9uKQoKICAgICAgICBpZiBbWyAiJHN0YXRlIiA9ICJjb21wbGV0ZWQiIHx8ICIkc3RhdGUiID0gIm9wdGltaXppbmciIF1dOyB0aGVuCiAgICAgICAgICAgIGVjaG8gIlZvbHVtZSBtb2RpZmljYXRpb24gY29tcGxldGVkIgogICAgICAgICAgICBicmVhawogICAgICAgIGVsaWYgWyAiJHN0YXRlIiA9ICJmYWlsZWQiIF07IHRoZW4KICAgICAgICAgICAgZWNobyAiVm9sdW1lIG1vZGlmaWNhdGlvbiBmYWlsZWQiCiAgICAgICAgICAgIHJldHVybiAxCiAgICAgICAgZmkKCiAgICAgICAgc2xlZXAgJHdhaXRfdGltZQogICAgICAgIGVsYXBzZWQ9JCgoZWxhcHNlZCArIHdhaXRfdGltZSkpCiAgICAgICAgd2FpdF90aW1lPSQoKHdhaXRfdGltZSAqIDIpKQogICAgICAgIFsgJHdhaXRfdGltZSAtZ3QgMzAgXSAmJiB3YWl0X3RpbWU9MzAKICAgIGRvbmUKCiAgICBpZiBbICRlbGFwc2VkIC1nZSAkdGltZW91dCBdOyB0aGVuCiAgICAgICAgZWNobyAiVGltZW91dCB3YWl0aW5nIGZvciB2b2x1bWUgbW9kaWZpY2F0aW9uIgogICAgICAgIHJldHVybiAxCiAgICBmaQoKICAgICMgT1Mtc3BlY2lmaWMgZmlsZXN5c3RlbSByZXNpemUKICAgIGlmIFsgIiRvc190eXBlIiA9ICJhbDIwMjMiIF07IHRoZW4KICAgICAgICBncm93cGFydCAkZGV2aWNlIDEgfHwgewogICAgICAgICAgICBlY2hvICJGYWlsZWQgdG8gZXh0ZW5kIHBhcnRpdGlvbiIKICAgICAgICAgICAgcmV0dXJuIDEKICAgICAgICB9CgogICAgICAgIGxvY2FsIGZzX3R5cGU9JChsc2JsayAtZiAke2RldmljZX0xIHwgdGFpbCAtMSB8IGF3ayAne3ByaW50ICQyfScpCiAgICAgICAgY2FzZSAkZnNfdHlwZSBpbgogICAgICAgICAgICB4ZnMpCiAgICAgICAgICAgICAgICB4ZnNfZ3Jvd2ZzIC8gfHwgewogICAgICAgICAgICAgICAgICAgIGVjaG8gIkZhaWxlZCB0byByZXNpemUgWEZTIGZpbGVzeXN0ZW0iCiAgICAgICAgICAgICAgICAgICAgcmV0dXJuIDEKICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgICAgIDs7CiAgICAgICAgICAgIGV4dDQpCiAgICAgICAgICAgICAgICByZXNpemUyZnMgJHtkZXZpY2V9MSB8fCB7CiAgICAgICAgICAgICAgICAgICAgZWNobyAiRmFpbGVkIHRvIHJlc2l6ZSBleHQ0IGZpbGVzeXN0ZW0iCiAgICAgICAgICAgICAgICAgICAgcmV0dXJuIDEKICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgICAgIDs7CiAgICAgICAgICAgICopCiAgICAgICAgICAgICAgICBlY2hvICJVbnN
1cHBvcnRlZCBmaWxlc3lzdGVtOiAkZnNfdHlwZSIKICAgICAgICAgICAgICAgIHJldHVybiAxCiAgICAgICAgICAgICAgICA7OwogICAgICAgIGVzYWMKICAgIGZpCgogICAgZWNobyAiRUJTIHJlc2l6ZSBjb21wbGV0ZWQgc3VjY2Vzc2Z1bGx5IgogICAgcmV0dXJuIDAKfQoKIyBFeGVjdXRlIHJlc2l6ZQppZiAhIHJlc2l6ZV9lYnNfZm9yX2luc3RhbmNlOyB0aGVuCiAgICBlY2hvICJFQlMgcmVzaXplIGZhaWxlZCIKZmkKCiMgT1Mtc3BlY2lmaWMgYm9vdHN0cmFwCm9zX3R5cGU9JChkZXRlY3Rfb3MpCmlmIFsgIiRvc190eXBlIiA9ICJhbDIwMjMiIF07IHRoZW4KICAgIC91c3IvYmluL25vZGVhZG0gaW5pdApmaQo="
Contributor

please keep the same approach as AL2023 without encoding to base64
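
For anyone reviewing the encoded blob in the meantime, a base64 round trip makes the script readable again. The sketch below uses a stand-in script body, not the PR's actual script, and the variable names are only for illustration.

```shell
#!/bin/bash
# Stand-in script body; the real user data in this PR is much longer.
script='#!/bin/bash
echo "resize placeholder"'

# Encode to a single line, the same form the user-data value above is in.
encoded=$(printf '%s' "$script" | base64 | tr -d '\n')

# Decode it back to review the plain-text script.
decoded=$(printf '%s' "$encoded" | base64 --decode)

printf '%s\n' "$decoded"
```

The same `base64 --decode` step applied to the `user-data` value in the hunk above is what reveals the sizing table and the per-OS device selection.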

Contributor

chrismld commented Jan 8, 2026

@herbertgoto hello, are you still planning to update this PR?

@herbertgoto
Author

@chrismld ETA: 31/01/2026
