| 1 | +--- |
| 2 | +layout: blog |
| 3 | +title: 'Kubernetes v1.32 Adds a New CPU Manager Static Policy Option for Strict CPU Reservation' |
| 4 | +date: 2024-12-16 |
| 5 | +slug: cpumanager-strict-cpu-reservation |
| 6 | +author: > |
| 7 | + [Jing Zhang](https://github.com/jingczhang) (Nokia) |
| 8 | +translator: > |
| 9 | + [Xin Li](https://github.com/my-git9) (DaoCloud) |
| 10 | +--- |
| 11 | +<!-- |
| 12 | +layout: blog |
| 13 | +title: 'Kubernetes v1.32 Adds A New CPU Manager Static Policy Option For Strict CPU Reservation' |
| 14 | +date: 2024-12-16 |
| 15 | +slug: cpumanager-strict-cpu-reservation |
| 16 | +author: > |
| 17 | + [Jing Zhang](https://github.com/jingczhang) (Nokia) |
| 18 | +--> |
| 19 | + |
| 20 | +<!-- |
| 21 | +In Kubernetes v1.32, after years of community discussion, we are excited to introduce a |
| 22 | +`strict-cpu-reservation` option for the [CPU Manager static policy](/docs/tasks/administer-cluster/cpu-management-policies/#static-policy-options). |
| 23 | +This feature is currently in alpha, with the associated policy hidden by default. You can only use the |
| 24 | +policy if you explicitly enable the alpha behavior in your cluster. |
| 25 | +--> |
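| 26 | +After years of community discussion, Kubernetes v1.32 introduces a `strict-cpu-reservation` option for the |
| 27 | +[CPU Manager static policy](/zh-cn/docs/tasks/administer-cluster/cpu-management-policies/#static-policy-options). |
| 28 | +This feature is currently in alpha, and the associated policy option is hidden by default. |
| 29 | +You can use it only if you explicitly enable the alpha behavior in your cluster. |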
| 30 | + |
| 31 | +<!-- |
| 32 | +## Understanding the feature |
| 33 | +
|
| 34 | +The CPU Manager static policy is used to reduce latency or improve performance. The `reservedSystemCPUs` setting defines an explicit CPU set for OS system daemons and Kubernetes system daemons. This option is designed for Telco/NFV type use cases where uncontrolled interrupts/timers may impact the workload performance. You can use this option to define the explicit cpuset for the system/Kubernetes daemons as well as the interrupts/timers, so the remaining CPUs on the system can be used exclusively for workloads, with less impact from uncontrolled interrupts/timers. More details of this parameter can be found on the [Explicitly Reserved CPU List](/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list) page. |
| 35 | +
|
| 36 | +If you want to protect your system daemons and interrupt processing, the obvious way is to use the `reservedSystemCPUs` option. |
| 37 | +--> |
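| 38 | +## Understanding the feature |
| 39 | + |
| 40 | +The CPU Manager static policy is used to reduce latency or improve performance. The `reservedSystemCPUs` |
| 41 | +setting defines an explicit CPU set for operating system daemons and Kubernetes system daemons. |
| 42 | +This option is designed for Telco/NFV use cases, where uncontrolled interrupts and timers may degrade workload performance. |
| 43 | +You can use it to define an explicit CPU set for the system and Kubernetes daemons as well as for interrupts and timers, |
| 44 | +so that the remaining CPUs on the system can be used exclusively for workloads, with less impact from uncontrolled interrupts and timers. |
| 45 | +For more details about this parameter, see the |
| 46 | +[Explicitly Reserved CPU List](/zh-cn/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list) |
| 47 | +page. |
| 48 | + |
| 49 | +If you want to protect your system daemons and interrupt processing, the obvious way is to use the `reservedSystemCPUs` option. |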
| 50 | + |
| 51 | +<!-- |
| 52 | +However, until the Kubernetes v1.32 release, this isolation was only implemented for guaranteed |
| 53 | +pods that made requests for a whole number of CPUs. At pod admission time, the kubelet only |
| 54 | +compares the CPU _requests_ against the allocatable CPUs. In Kubernetes, limits can be higher than |
| 55 | +the requests; the previous implementation allowed burstable and best-effort pods to use up |
| 56 | +the capacity of `reservedSystemCPUs`, which could then starve host OS services of CPU - and we |
| 57 | +know that people saw this in real life deployments. |
| 58 | +The existing behavior also made benchmarking (for both infrastructure and workloads) results inaccurate. |
| 59 | +
|
| 60 | +When this new `strict-cpu-reservation` policy option is enabled, the CPU Manager static policy will not allow any workload to use the reserved system CPU cores. |
| 61 | +--> |
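| 62 | +However, until the Kubernetes v1.32 release, this isolation was implemented only for Guaranteed |
| 63 | +pods that request a whole number of CPUs. At pod admission time, the kubelet compares only the CPU |
| 64 | +**requests** against the allocatable CPUs. In Kubernetes, limits can be higher than requests; |
| 65 | +the previous implementation allowed Burstable and BestEffort pods to use up the capacity of `reservedSystemCPUs`, |
| 66 | +which could starve host OS services of CPU, and we know that this happened in real-life deployments. |
| 67 | +The previous behavior also made benchmarking results inaccurate for both infrastructure and workloads. |
| 68 | + |
| 69 | +When the new `strict-cpu-reservation` policy option is enabled, the CPU Manager |
| 70 | +static policy does not allow any workload to use the reserved system CPU cores. |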
| 71 | + |
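The placement rules described above can be summarized in a small sketch. It only restates the behavior from this post; the function names and the returned labels are hypothetical and are not kubelet code, and the QoS class is taken as an input rather than derived from a Pod spec.

```python
def cpu_request_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "2", "1.5" or "500m" to millicores."""
    q = quantity.strip()
    if q.endswith("m"):
        return int(q[:-1])
    return round(float(q) * 1000)


def cpu_pool_for(qos_class: str, cpu_request: str, strict: bool) -> str:
    """Describe which CPUs a container may run on under the static policy.

    qos_class is one of "Guaranteed", "Burstable", "BestEffort".
    cpu_request is the container's CPU request ("" if none is set).
    strict mirrors the strict-cpu-reservation policy option.
    """
    millis = cpu_request_millicores(cpu_request) if cpu_request else 0
    # Only Guaranteed pods asking for a whole number of CPUs get exclusive cores.
    if qos_class == "Guaranteed" and millis >= 1000 and millis % 1000 == 0:
        return "exclusive CPUs outside the reserved set"
    # Everything else runs in the shared pool; the option decides whether
    # that pool still overlaps with reservedSystemCPUs.
    if strict:
        return "shared pool, reserved CPUs excluded"
    return "shared pool, reserved CPUs included"


if __name__ == "__main__":
    for qos, req in [("Guaranteed", "2"), ("Guaranteed", "1500m"),
                     ("Burstable", "500m"), ("BestEffort", "")]:
        print(qos, req or "-", "->", cpu_pool_for(qos, req, strict=True))
```

With `strict=False`, the last three cases report a shared pool that still includes the reserved CPUs, which is the starvation risk this option is meant to remove.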
| 72 | +<!-- |
| 73 | +## Enabling the feature |
| 74 | +
|
| 75 | +To enable this feature, you need to turn on both the `CPUManagerPolicyAlphaOptions` feature gate and the `strict-cpu-reservation` policy option. And you need to remove the `/var/lib/kubelet/cpu_manager_state` file if it exists and restart kubelet. |
| 76 | +
|
| 77 | +With the following kubelet configuration: |
| 78 | +--> |
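| 79 | +## Enabling the feature |
| 80 | + |
| 81 | +To enable this feature, turn on both the `CPUManagerPolicyAlphaOptions` feature gate and the |
| 82 | +`strict-cpu-reservation` policy option. If the `/var/lib/kubelet/cpu_manager_state` |
| 83 | +file exists, you also need to remove it and restart the kubelet. |
| 84 | + |
| 85 | +With the following kubelet configuration: |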
| 86 | + |
| 87 | +```yaml |
| 88 | +kind: KubeletConfiguration |
| 89 | +apiVersion: kubelet.config.k8s.io/v1beta1 |
| 90 | +featureGates: |
| 91 | +  ... |
| 92 | +  CPUManagerPolicyOptions: true |
| 93 | +  CPUManagerPolicyAlphaOptions: true |
| 94 | +cpuManagerPolicy: static |
| 95 | +cpuManagerPolicyOptions: |
| 96 | +  strict-cpu-reservation: "true" |
| 97 | +reservedSystemCPUs: "0,32,1,33,16,48" |
| 98 | +... |
| 99 | +``` |
| 100 | + |
| 101 | +<!-- |
| 102 | +When `strict-cpu-reservation` is not set or set to false: |
| 103 | +--> |
| 104 | +When `strict-cpu-reservation` is not set or is set to false: |
| 105 | + |
| 106 | +```console |
| 107 | +# cat /var/lib/kubelet/cpu_manager_state |
| 108 | +{"policyName":"static","defaultCpuSet":"0-63","checksum":1058907510} |
| 109 | +``` |
| 110 | + |
| 111 | +<!-- |
| 112 | +When `strict-cpu-reservation` is set to true: |
| 113 | +--> |
| 114 | +When `strict-cpu-reservation` is set to true: |
| 115 | + |
| 116 | +```console |
| 117 | +# cat /var/lib/kubelet/cpu_manager_state |
| 118 | +{"policyName":"static","defaultCpuSet":"2-15,17-31,34-47,49-63","checksum":4141502832} |
| 119 | +``` |
| 120 | + |
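The two `defaultCpuSet` values above differ exactly by the reserved CPUs. As a minimal sketch of that arithmetic, assuming a 64-CPU node with no exclusive allocations yet, the following helper functions (hypothetical names, not part of the kubelet) subtract the `reservedSystemCPUs` list from the full CPU set:

```python
def parse_cpuset(spec: str) -> set[int]:
    """Parse a cpuset list such as "0,32,1,33" or "2-15,17-31" into CPU ids."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpuset(cpus: set[int]) -> str:
    """Format CPU ids as a compact, sorted list of ranges."""
    ordered = sorted(cpus)
    ranges = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current range while the next id is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


def default_cpu_set(all_cpus: str, reserved: str, strict: bool) -> str:
    """Shared pool before any exclusive allocation has been made."""
    pool = parse_cpuset(all_cpus)
    if strict:
        pool -= parse_cpuset(reserved)
    return format_cpuset(pool)


if __name__ == "__main__":
    print(default_cpu_set("0-63", "0,32,1,33,16,48", strict=False))
    print(default_cpu_set("0-63", "0,32,1,33,16,48", strict=True))
```

Running it prints `0-63` and then `2-15,17-31,34-47,49-63`, matching the two state files shown above.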
| 121 | +<!-- |
| 122 | +## Monitoring the feature |
| 123 | +
|
| 124 | +You can monitor the feature impact by checking the following CPU Manager counters: |
| 125 | +- `cpu_manager_shared_pool_size_millicores`: report shared pool size, in millicores (e.g. 13500m) |
| 126 | +- `cpu_manager_exclusive_cpu_allocation_count`: report exclusively allocated cores, counting full cores (e.g. 16) |
| 127 | +--> |
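| 128 | +## Monitoring the feature |
| 129 | + |
| 130 | +You can monitor the impact of the feature by checking the following CPU Manager counters: |
| 131 | + |
| 132 | +- `cpu_manager_shared_pool_size_millicores`: reports the shared pool size, in millicores (for example, 13500m) |
| 133 | +- `cpu_manager_exclusive_cpu_allocation_count`: reports the exclusively allocated cores, counting full cores (for example, 16) |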
| 134 | + |
| 135 | +<!-- |
| 136 | +Your best-effort workloads may starve if the `cpu_manager_shared_pool_size_millicores` count is zero for a prolonged time. |
| 137 | +
|
| 138 | +We believe any pod that is required for operational purposes, such as a log forwarder, should not run as best-effort, but you can review and adjust the number of CPU cores reserved as needed. |
| 139 | +--> |
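| 140 | +If the `cpu_manager_shared_pool_size_millicores` count stays at zero for a prolonged time, |
| 141 | +your BestEffort workloads may starve. |
| 142 | + |
| 143 | +We believe that any pod required for operational purposes, such as a log forwarder, should not run as BestEffort, |
| 144 | +but you can review and adjust the number of reserved CPU cores as needed. |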
| 145 | + |
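One way to act on these counters is to read them from a metrics dump in the Prometheus text format and flag an empty shared pool. The sketch below is an assumption-laden illustration: the sample text is made up, the `kubelet_` prefix in it is an assumption about how the endpoint exposes the names, and the helper names are hypothetical. It matches metric names by suffix so that it works with or without a prefix.

```python
WATCHED = (
    "cpu_manager_shared_pool_size_millicores",
    "cpu_manager_exclusive_cpu_allocation_count",
)


def read_cpu_manager_counters(metrics_text: str) -> dict[str, float]:
    """Extract the two CPU Manager counters from Prometheus text-format output."""
    found: dict[str, float] = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "}" in line:
            # name{labels} value [timestamp]
            head, _, rest = line.rpartition("}")
            name = head.split("{", 1)[0]
        else:
            # name value [timestamp]
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        for watched in WATCHED:
            if name.endswith(watched):
                found[watched] = float(fields[0])
    return found


def shared_pool_is_empty(counters: dict[str, float]) -> bool:
    """True when the shared pool size was reported and is zero."""
    return counters.get("cpu_manager_shared_pool_size_millicores") == 0


if __name__ == "__main__":
    # Hypothetical sample; real output comes from the kubelet metrics endpoint.
    sample = """\
# TYPE kubelet_cpu_manager_shared_pool_size_millicores gauge
kubelet_cpu_manager_shared_pool_size_millicores 13500
kubelet_cpu_manager_exclusive_cpu_allocation_count 16
"""
    counters = read_cpu_manager_counters(sample)
    print(counters, "empty shared pool:", shared_pool_is_empty(counters))
```

A single sample only shows the current state; to detect a pool that stays at zero for a prolonged time, the check would need to be repeated over an interval or expressed as an alert rule in your monitoring system.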
| 146 | +<!-- |
| 147 | +## Conclusion |
| 148 | +
|
| 149 | +Strict CPU reservation is critical for Telco/NFV use cases. It is also a prerequisite for enabling the all-in-one type of deployments where workloads are placed on nodes serving combined control+worker+storage roles. |
| 150 | +
|
| 151 | +We want you to start using the feature, and we look forward to your feedback. |
| 152 | +--> |
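| 153 | +## Conclusion |
| 154 | + |
| 155 | +Strict CPU reservation is critical for Telco/NFV use cases. |
| 156 | +It is also a prerequisite for all-in-one deployments, where workloads are placed on nodes that serve combined control, worker, and storage roles. |
| 157 | + |
| 158 | +We would like you to start using the feature, and we look forward to your feedback. |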
| 159 | + |
| 160 | +<!-- |
| 161 | +## Further reading |
| 162 | +
|
| 163 | +Please check out the [Control CPU Management Policies on the Node](/docs/tasks/administer-cluster/cpu-management-policies/) |
| 164 | +task page to learn more about the CPU Manager, and how it fits in relation to the other node-level resource managers. |
| 165 | +--> |
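| 166 | +## Further reading |
| 167 | + |
| 168 | +See the [Control CPU Management Policies on the Node](/zh-cn/docs/tasks/administer-cluster/cpu-management-policies/) task page |
| 169 | +to learn more about the CPU Manager and how it relates to the other node-level resource managers. |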
| 170 | + |
| 171 | +<!-- |
| 172 | +## Getting involved |
| 173 | +
|
| 174 | +This feature is driven by [SIG Node](https://github.com/Kubernetes/community/blob/master/sig-node/README.md). If you are interested in helping develop this feature, sharing feedback, or participating in any other ongoing SIG Node projects, please attend the SIG Node meeting for more details. |
| 175 | +--> |
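| 176 | +## Getting involved |
| 177 | + |
| 178 | +This feature is driven by [SIG Node](https://github.com/kubernetes/community/blob/master/sig-node/README.md). |
| 179 | +If you are interested in helping develop this feature, sharing feedback, or participating in any other ongoing SIG Node projects, |
| 180 | +please attend the SIG Node meeting for more details. |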