ovs: increase CPU priority for Open vSwitch Forwarding Unit#5797
ovs: increase CPU priority for Open vSwitch Forwarding Unit#5797mkowalski wants to merge 1 commit intoopenshift:mainfrom
Conversation
Signed-off-by: Mat Kowalski <mko@redhat.com>
WalkthroughAdded a new systemd drop-in configuration file for Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mkowalski The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
templates/common/_base/units/ovs-vswitchd.service.yaml (1)
17-17:Nice=-10is ineffective withCPUSchedulingPolicy=fifo.On Line 17, niceness does not apply to realtime FIFO scheduling, so this setting is misleading unless a non-RT policy fallback is intended.
Suggested cleanup
- Nice=-10As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@templates/common/_base/units/ovs-vswitchd.service.yaml` at line 17, The unit file sets Nice=-10 while also using CPUSchedulingPolicy=fifo, which makes niceness ineffective; remove the misleading Nice=-10 entry from ovs-vswitchd.service (or, if you intended a non-realtime fallback, explicitly set/document an alternate CPUSchedulingPolicy such as other for that fallback) so the unit file's scheduling/prioritization settings are consistent and not confusing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@templates/common/_base/units/ovs-vswitchd.service.yaml`:
- Around line 15-16: The unit file sets CPUSchedulingPolicy=fifo and
CPUSchedulingPriority=10 which can starve node-critical services; modify the
ovs-vswitchd systemd unit (referencing CPUSchedulingPolicy and
CPUSchedulingPriority) to add a bounded real-time runtime by adding
LimitRTTIME=200ms in the service section so real-time execution is capped and
cannot monopolize CPU.
---
Nitpick comments:
In `@templates/common/_base/units/ovs-vswitchd.service.yaml`:
- Line 17: The unit file sets Nice=-10 while also using
CPUSchedulingPolicy=fifo, which makes niceness ineffective; remove the
misleading Nice=-10 entry from ovs-vswitchd.service (or, if you intended a
non-realtime fallback, explicitly set/document an alternate CPUSchedulingPolicy
such as other for that fallback) so the unit file's scheduling/prioritization
settings are consistent and not confusing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7d57adf1-1447-42c9-840c-bdd456906d15
📒 Files selected for processing (1)
templates/common/_base/units/ovs-vswitchd.service.yaml
| CPUSchedulingPolicy=fifo | ||
| CPUSchedulingPriority=10 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd "ovs-vswitchd.service" --type fRepository: openshift/machine-config-operator
Length of output: 131
🏁 Script executed:
cat -n templates/common/_base/units/ovs-vswitchd.service.yamlRepository: openshift/machine-config-operator
Length of output: 868
🏁 Script executed:
rg "CPUSchedulingPolicy" --type yamlRepository: openshift/machine-config-operator
Length of output: 162
🏁 Script executed:
rg "LimitRTTIME|LimitRT" --type yamlRepository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
rg "ovs-vswitchd" -t yaml -A 5 -B 5Repository: openshift/machine-config-operator
Length of output: 1684
Add runtime limit to prevent SCHED_FIFO starvation of node-critical services.
On lines 15-16, CPUSchedulingPolicy=fifo with CPUSchedulingPriority=10 can starve other node services if a hot loop occurs, degrading cluster stability. Add LimitRTTIME=200ms to bound real-time execution:
Suggested fix
[Service]
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=10
+ LimitRTTIME=200ms
Nice=-10📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| CPUSchedulingPolicy=fifo | |
| CPUSchedulingPriority=10 | |
| CPUSchedulingPolicy=fifo | |
| CPUSchedulingPriority=10 | |
| LimitRTTIME=200ms |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@templates/common/_base/units/ovs-vswitchd.service.yaml` around lines 15 - 16,
The unit file sets CPUSchedulingPolicy=fifo and CPUSchedulingPriority=10 which
can starve node-critical services; modify the ovs-vswitchd systemd unit
(referencing CPUSchedulingPolicy and CPUSchedulingPriority) to add a bounded
real-time runtime by adding LimitRTTIME=200ms in the service section so
real-time execution is capped and cannot monopolize CPU.
There was a problem hiding this comment.
@mkowalski, understood. I'll drop that suggestion.
There was a problem hiding this comment.
Analyse the whole repository. Make sure my unit is running with the highest priority. Make no mistakes.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find all unit/service yaml files with CPU scheduling or Nice settings
echo "=== CPUSchedulingPriority occurrences ==="
rg -n "CPUSchedulingPriority" --type yaml -A2 -B5
echo ""
echo "=== CPUSchedulingPolicy occurrences ==="
rg -n "CPUSchedulingPolicy" --type yaml -A2 -B5
echo ""
echo "=== Nice= occurrences ==="
rg -n "Nice=" --type yaml -A2 -B5
echo ""
echo "=== All unit yaml files ==="
fd -e yaml -e service . templates/ --type f | sort
echo ""
echo "=== Check for any .service files with RT scheduling ==="
rg -rn "CPUSchedulingPriority" templates/
rg -rn "CPUSchedulingPolicy" templates/
echo ""
echo "=== Check vendor or test data for any priority references ==="
rg -rn "CPUSchedulingPriority" --type go
Length of output: 15330
@mkowalski, here are the results of the full repository scan:
No other service in the entire repository uses CPUSchedulingPolicy or CPUSchedulingPriority, so there is zero competition from any other MCO-managed unit at the real-time scheduler level. That part is fine.
However, CPUSchedulingPriority=10 is not the highest priority. For SCHED_FIFO on Linux, the valid range is 1–99, and 99 is the maximum. systemd maps CPUSchedulingPriority directly to sched_param.sched_priority. A priority of 10 is near the bottom of the real-time scale — kernel threads and other in-kernel real-time work can easily run above it.
Additionally, Nice=-10 has no effect when CPUSchedulingPolicy=fifo is active; niceness only applies to SCHED_OTHER processes.
To run at the true highest SCHED_FIFO priority:
[Service]
CPUSchedulingPolicy=fifo
- CPUSchedulingPriority=10
- Nice=-10
+ CPUSchedulingPriority=99|
|
|
Looks to me like we can find the disruption in non upgrade job like periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn which makes sense if it is just cpu overload from conformance tests. Will use that for testing here since the run time is shorter. |
|
/payload-aggregate periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn 10 |
|
@neisw: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/76ac7b40-279d-11f1-845a-4f6da9dad00d-0 |
|
/payload-aggregate periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn 10 |
|
@neisw: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/498ed740-27c3-11f1-8af5-5faa0c693e9c-0 |
|
/payload-aggregate periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn 10 |
|
@neisw: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6ead5b80-2847-11f1-992c-de16c02ddc09-0 |
/hold