WIP: initial proposals of supporting in-place cpu resize #81

Open
sivanzcw wants to merge 1 commit into openkruise:master from sivanzcw:feature

@sivanzcw
Contributor

Ⅰ. Describe what this PR does

This enhancement proposes enabling in-place CPU resizing for sandboxes
allocated from the warm pool through a metadata-based approach.
When a sandbox is claimed via the E2B API, users can specify a CPU scale factor in the metadata
(e.g., e2b.agents.kruise.io/cpu-scale-factor: 2).
The sandbox manager will automatically resize the allocated sandbox's CPU resources
in-place using Kubernetes' pod resize subResource, allowing the warm pool to maintain
minimal resource configurations while enabling on-demand CPU scaling for claimed sandboxes.
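The metadata lookup described above can be sketched roughly as follows. This is only an illustration, not the PR's implementation: the metadata key comes from the description, while the function name `parse_cpu_scale_factor` and the validation rule (finite factors of at least 1, i.e. scale-up only) are assumptions.

```python
import math
from typing import Mapping, Optional

# Metadata key as given in the PR description.
CPU_SCALE_FACTOR_KEY = "e2b.agents.kruise.io/cpu-scale-factor"


def parse_cpu_scale_factor(metadata: Mapping[str, str]) -> Optional[float]:
    """Return the requested scale factor, or None when no resize is requested.

    Assumption (not stated in the PR): only finite factors >= 1 are accepted,
    so a claim can scale CPU up but never below the warm-pool baseline.
    """
    raw = metadata.get(CPU_SCALE_FACTOR_KEY)
    if raw is None:
        return None  # key absent: claim the sandbox without resizing
    try:
        factor = float(raw)
    except ValueError:
        raise ValueError(f"invalid {CPU_SCALE_FACTOR_KEY}: {raw!r}") from None
    if not math.isfinite(factor) or factor < 1:
        raise ValueError(f"{CPU_SCALE_FACTOR_KEY} must be a finite number >= 1, got {raw!r}")
    return factor
```

Rejecting a malformed value up front, rather than silently ignoring it, would let the claim request fail before any pod is touched; whether the PR intends that behavior is not stated.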

Ⅱ. Does this pull request fix one issue?

fixes #68

Ⅲ. Describe how to verify it

Ⅳ. Special notes for reviews

@codecov

codecov bot commented Jan 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 46.89%. Comparing base (acb07af) to head (af64dfc).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #81      +/-   ##
==========================================
+ Coverage   45.75%   46.89%   +1.13%     
==========================================
  Files          75       77       +2     
  Lines        4386     4506     +120     
==========================================
+ Hits         2007     2113     +106     
- Misses       2189     2198       +9     
- Partials      190      195       +5     
| Flag | Coverage Δ |
|---|---|
| unittests | 46.89% <ø> (+1.13%) ⬆️ |


1. **After Sandbox Claim**: Once a sandbox is successfully claimed from the pool
2. **Metadata Check**: Check if `cpu-scale-factor` metadata exists
3. **Current CPU Detection**: Read current CPU from pod spec or status
4. **Target Calculation**: Calculate target CPU = current * scaleFactor
Member

Please elaborate on the calculation for multi-container pods and Burstable QoS:

  1. For a multi-container pod, we can scale up the first non-sidecar container.
  2. For Burstable QoS, maybe we should scale up both the request and the limit.
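
One way to read the two suggestions in this comment, together with the `current * scaleFactor` step from the proposal, is sketched below. It is a hedged illustration only: the `Container` type, the helper names, and the idea that sidecars are identified by a caller-supplied set of names are all assumptions, and the quantity parser handles only plain decimals and the `m` (millicore) suffix.

```python
from dataclasses import dataclass
from typing import Collection, Optional, Sequence


def parse_cpu_millis(quantity: str) -> int:
    """Parse a CPU quantity such as "500m" or "1.5" into millicores."""
    q = quantity.strip()
    if q.endswith("m"):
        return int(q[:-1])
    return round(float(q) * 1000)


@dataclass
class Container:
    # Hypothetical, simplified view of a pod container's CPU resources.
    name: str
    cpu_request: Optional[str] = None
    cpu_limit: Optional[str] = None


def scale_target(containers: Sequence[Container], factor: float,
                 sidecar_names: Collection[str] = ()) -> Container:
    """Scale the first non-sidecar container; request and limit scale together."""
    target = next((c for c in containers if c.name not in sidecar_names), None)
    if target is None:
        raise ValueError("pod has no non-sidecar container to resize")

    def scale(quantity: Optional[str]) -> Optional[str]:
        if quantity is None:
            return None  # leave an unset request or limit unset
        return f"{round(parse_cpu_millis(quantity) * factor)}m"

    return Container(target.name, scale(target.cpu_request), scale(target.cpu_limit))
```

For example, with a `proxy` sidecar and a `sandbox` container at request `500m` and limit `1`, a factor of 2 would give request `1000m` and limit `2000m` for `sandbox` and leave `proxy` untouched. Scaling both values by the same factor keeps the request-to-limit ratio, so a Burstable pod stays Burstable and a Guaranteed pod stays Guaranteed.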

1. **Enable Metadata-Based CPU Scaling**: Allow users to specify CPU scale factor
via E2B API metadata when creating sandboxes
2. **In-Place Resize**: Leverage Kubernetes pod resize subResource to resize CPU without pod restart
3. **Early Return Support**: Optionally return sandbox immediately
Member

Shall we introduce an annotation or feature gate to enable this support?
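
A minimal sketch of what such gating could look like, combined with the body sent to the pod `resize` subresource named in the goals. Everything here is illustrative: the annotation key `example.kruise.io/enable-cpu-resize` is hypothetical (the PR defines none), requiring both the gate and the annotation is just one possible policy, and how the patch is actually sent (client library, patch type) is left out.

```python
import json
from typing import Mapping, Optional

# Hypothetical opt-in annotation; the PR does not define one.
ENABLE_RESIZE_ANNOTATION = "example.kruise.io/enable-cpu-resize"


def resize_enabled(feature_gate_on: bool, annotations: Mapping[str, str]) -> bool:
    """Assumed policy: a cluster-wide gate plus a per-object opt-in annotation."""
    return feature_gate_on and annotations.get(ENABLE_RESIZE_ANNOTATION) == "true"


def build_resize_patch(container_name: str, cpu_request: str,
                       cpu_limit: Optional[str] = None) -> str:
    """Build a JSON patch body for the pod resize subresource.

    Only the CPU fields of one container are included; a limit is added
    only when the container has one.
    """
    resources = {"requests": {"cpu": cpu_request}}
    if cpu_limit is not None:
        resources["limits"] = {"cpu": cpu_limit}
    patch = {"spec": {"containers": [{"name": container_name, "resources": resources}]}}
    return json.dumps(patch)
```

With this shape, a caller would skip the resize entirely when `resize_enabled` returns false, so existing claims behave exactly as before the feature is switched on.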

| Yes / \ No
| | |
| v v
| [Return Error] [Call Pod /resize]
Member

If the resize is infeasible, shall we continue and claim another sandbox? Please elaborate on the error-handling logic.
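
One possible answer to this question, sketched under stated assumptions: when a resize is reported infeasible, release that sandbox and try the next candidate, up to a bounded number of attempts. The `ResizeInfeasible` exception, the `resize` and `release` callbacks, and the `max_attempts` default are all hypothetical names chosen for illustration; the PR does not specify this behavior.

```python
from typing import Callable, Iterable


class ResizeInfeasible(Exception):
    """Raised by the (hypothetical) resize callback when the resize cannot fit."""


def claim_with_resize(candidates: Iterable[str],
                      resize: Callable[[str], None],
                      release: Callable[[str], None],
                      max_attempts: int = 3) -> str:
    """Return the first candidate whose resize succeeds.

    Candidates that fail with ResizeInfeasible are released and skipped.
    Any other exception propagates unchanged to the caller.
    """
    attempts = 0
    for name in candidates:
        if attempts >= max_attempts:
            break
        attempts += 1
        try:
            resize(name)
        except ResizeInfeasible:
            release(name)  # hand the sandbox back before trying the next one
            continue
        return name
    raise ResizeInfeasible(f"no sandbox could be resized after {attempts} attempt(s)")
```

Bounding the attempts keeps claim latency predictable when many pooled sandboxes sit on nodes without spare CPU; the alternative shown in the flow above, returning an error on the first infeasible resize, corresponds to `max_attempts=1`.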

Development

Successfully merging this pull request may close these issues.

[feature request] Enable Inplace-resizing upon claiming sandboxes

2 participants