Tuning of individual kernel threads#628
Conversation
c2c554a to
1c03a83
Compare
It's OK for me, in long-term it's a candidate for rewrite/refactor, but there are other plugins using it as well. We will probably keep the idea and if we change the implementation, this could be then updated in all affected plugins the same way. |
|
Regarding the cgroups, there is support for cgroups v1 in the scheduler plugin and we would also like to add support for the v2 for completeness. It could be useful for somebody. It's OK if you are not using some plugin. We even wanted to add global configuration option allowing selective disablement of specific plugins in the stock profiles. |
1c03a83 to
826223f
Compare
I found the whole cgroup topic to be rather tricky, because in modern systems, SystemD is the "cgroup manager", and it owns (by convention) the cgroup tree. So any creation of new cgroups should happen via SystemD, and can then use Delegation to create further sub-groups. I've had some success with:
But simply having TuneD move processes around seems like it could have unwanted side-effects, and should be handled with care... |
826223f to
262305b
Compare
262305b to
f50ea9e
Compare
f50ea9e to
55f965d
Compare
55f965d to
3e91c01
Compare
3e91c01 to
97b2e60
Compare
|
Update:
@yarda did you already have a chance to review this PR? |
|
Sorry for the delay, I am back on it. The 5d28337 LGTM. Regarding the kthread plugin is there a specific need for the proposed syntax: Wouldn't be better to use one group per instance? I.e. the same affinity/sched_opts setting for individual instance, e.g.: Or: Then for the priority the builtin instance You could also specify multiple regexes per instance: |
I thought about this when I wrote the plugin, and didn't like it because
I find the current format in the
Also re-using the existing |
|
Here in one instance there can be multiple "devices" with multiple different tuning. I don't like that it breaks the used logic concept that in each instance there are "devices" with the same tuning. I am afraid that it could lead to multiple problems later (e.g. when moving "devices" between instances through the API). But maybe it's just my personal preference, @zacikpa, @jmencak what's your opinion? |
|
I'm personally fine with the implementation as it is now (since it's inspired by the |
|
No device plugin, like e.g. the sysfs plugin, this may work. |
One of the main points of the new plugin is to allow dynamic changes of the tuning (through the |
|
Huh, taking back what I said here, I say we should adjust IIUC, you can't and don't plan to use |
The change itself is probably simple, but I'm not sure what would happen for any of the existing non-hotplug Plugins if multiple instances were created. I expect those instances would just interfere with each other and break things.
Treating kthreads as "devices" lets me reuse a lot of nice infrastructure. But it would of course be possible to do this all within the plugin. |
97b2e60 to
4c924b9
Compare
@zacikpa I just pushed a draft that bases the plugin on |
|
@adriaan42 could you please squash the commits? Multiple commits each introducing separate logical change are OK, but multiple commits on the same file (e.g. Regarding the @zacikpa your opinion? |
|
Or maybe split this PR to two PRs. The commit adding switch for disabling kthreads can be merged immediately and removing it from this PR could make rebasing easier. |
I pushed the changes as separate commit as part of the discussion, to show required changes and maybe get some feedback (confirmation that this might be the way to go before I spend effort to finalize the imlementation). Will post an update soon.
I agree. I'm happy to look into better solutions and possible side-effects, if we agree that "allow dynamic instances of non-device plugins" is reasonable and has a chance to be integrated. Again, trying to avoid effort for something that might never get merged...
Sure! -> #734 |
4c924b9 to
502b280
Compare
|
Update:
|
502b280 to
a22a223
Compare
|
Update:
|
a22a223 to
545a307
Compare
545a307 to
f5eba3c
Compare
f5eba3c to
b068785
Compare
b068785 to
a4c0fb6
Compare
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a kthread plugin that discovers kernel threads, parses prioritized group. rules, assigns threads to plugin instances with priority-aware transfers, and applies/restores CPU affinity and scheduler policy/priority; optional python-perf integration provides delayed dynamic monitoring for newly spawned threads. ChangesKernel Thread Tuning Plugin
🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tuned/plugins/plugin_kthread.py`:
- Around line 331-353: The code reads and unions
other_instance._kthreads_assigned/_kthreads_processed and iterates those sets
without holding self._lock, causing concurrent-modification races; fix by
acquiring self._lock around every access/union/iteration of per-instance kthread
sets in _instance_release_kthreads and in the prior transfer loop (the block
using _get_matching_kthreads and calling _instance_transfer_kthread), e.g.
snapshot sets with set(...) while holding self._lock (using self._lock:), then
release the lock if you want to perform heavier work outside, or perform the
transfer while still holding the lock to prevent RuntimeError; ensure all
accesses to _kthreads_assigned, _kthreads_processed, _kthread_pids_unassigned
and calls to _instance_add_kthread/_instance_remove_kthread are protected by the
same self._lock.
- Around line 620-642: The method _apply_kthread_tuning currently assigns
kthread.sched_orig before performing changes, so if _set_affinity succeeds but
_set_schedopts later raises the PID becomes partially tuned and unrevertible; to
fix, capture current_affinity/current_policy/current_priority in local vars,
attempt to apply affinity and scheduling changes inside a try/except and only
set kthread.sched_orig = SchedOpts(...) and tuned_* flags after both operations
succeed, and on any exception revert any change already applied (call
_set_affinity(pid, original_affinity) and _set_schedopts(pid, original_policy,
original_priority) as needed), propagate the error, and ensure
affinity_changeable is set correctly based on whether revert succeeded; use the
existing helpers _set_affinity, _set_schedopts and _get_affinity/_get_schedopts
to locate where to implement this.
- Around line 137-142: When calling self._perf_setup() (guarded by have_perf)
wrap the call in a try/except to catch any exceptions (e.g., Exception as e)
raised during perf initialization and fall back to the non-perf path by calling
self._disable_perf(); also emit the same warning used when python-perf was not
found but include the caught exception details so the plugin degrades gracefully
instead of aborting. Target the call site where have_perf is checked and update
logic around self._perf_setup(), referencing the _perf_setup() and
_disable_perf() methods and the existing log.warning message.
- Around line 225-230: The current validation only checks SCHED_OTHER==0 and a
blanket 0..99 range; update the validation in the parsing/validation block that
uses variables policy and priority (in tuned/plugins/plugin_kthread.py) to use
policy-specific bounds: call self._scheduler_utils.get_priority_min(policy) and
get_priority_max(policy), ensure priority is between those bounds, and produce
policy-aware error messages (e.g., for SCHED_BATCH/SCHED_IDLE require 0, for
SCHED_FIFO/SCHED_RR disallow 0) instead of the generic 0..99 check so invalid
values (like b50, i1, f0) are rejected before _set_schedopts() is called.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: 0573457f-0de3-4f29-99b2-088e885424ba
📒 Files selected for processing (1)
tuned/plugins/plugin_kthread.py
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tuned/plugins/plugin_kthread.py`:
- Around line 348-356: When reassigning free_kthreads you must iterate other
instances in priority order instead of dict order; change the loop that
currently iterates self._instances.values() to iterate instances sorted by their
priority (highest first) and skip the releasing instance. Use each instance's
priority attribute (e.g., instance.priority or instance._priority) as the sort
key, then call self._get_matching_kthreads(...) and
self._instance_transfer_kthread(...) in that order so overlapping match sets go
to higher-priority instances first.
- Around line 139-144: The _perf_setup() failure path in KthreadPlugin.__init__
can leak partially initialized perf resources; update the except block so it
invokes self._perf_shutdown() (to close any evlist/evsel/fds created) before
setting the global have_perf = False and logging the warning, and ensure any
exceptions from _perf_shutdown() are caught/suppressed or logged to avoid
masking the original error; reference KthreadPlugin.__init__, _perf_setup(),
_perf_shutdown(), cleanup(), and the global have_perf.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: 63ae5d08-6824-4185-addc-db8e9f3a8741
📒 Files selected for processing (1)
tuned/plugins/plugin_kthread.py
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tuned/plugins/plugin_kthread.py (2)
292-303:⚠️ Potential issue | 🟠 Major | ⚡ Quick winIsolate per-thread failures so the perf monitor doesn't die.
_instance_add_kthread()only handlesThreadNoLongerExists. Any other_apply_kthread_tuning()failure bubbles through_kthread_internal_add()into_perf_monitor_thread(), which has no catch. One bad kthread can therefore terminate dynamic monitoring for all future kthreads until TuneD restarts.Suggested fix
while not self._terminate.is_set(): # process new threads when their delay has passed now = time.time() while len(new_threads) > 0 and now >= new_threads[0].ts: - self._kthread_internal_add(new_threads.pop(0).tid) + tid = new_threads.pop(0).tid + try: + self._kthread_internal_add(tid) + except Exception as e: + log.error("Failed to process new kthread %d: %s" % (tid, e)) @@ if event.type == perf.RECORD_COMM: new_threads.append(NewThread(now + thread_add_delay_s, event.tid)) elif event.type == perf.RECORD_EXIT: - self._kthread_internal_remove(event.tid) + try: + self._kthread_internal_remove(event.tid) + except Exception as e: + log.error("Failed to process kthread exit %d: %s" % (event.tid, e))Also applies to: 537-583
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tuned/plugins/plugin_kthread.py` around lines 292 - 303, _instance_add_kthread currently only catches ThreadNoLongerExists so any other exception from _get_instance_sched_options or _apply_kthread_tuning will propagate up into _kthread_internal_add and kill _perf_monitor_thread; wrap the work that fetches options and applies tuning (the call(s) to _get_instance_sched_options and _apply_kthread_tuning) in a broad try/except that catches Exception (while keeping the existing ThreadNoLongerExists handling separate), log the error, call _kthread_internal_remove(pid) to clean up, and only add pid to instance._kthreads_processed on successful completion; apply the identical defensive try/except pattern to the other similar block referenced (the second occurrence around the 537–583 region) so one bad kthread cannot terminate the perf monitor.
357-372:⚠️ Potential issue | 🟠 Major | ⚡ Quick winClear dead PIDs out of the unassigned pool.
_kthread_internal_remove()never discards the PID from_kthread_pids_unassigned, and_instance_release_kthreads()can add it back after removal. In a long-lived daemon that means dead PIDs accumulate in the unassigned set and get re-scanned on every later acquire/release cycle.Suggested fix
def _instance_release_kthreads(self, instance): @@ # the remaining ones go back to unassigned for pid in free_kthreads: self._instance_remove_kthread(instance, pid) - self._kthread_pids_unassigned.add(pid) + if pid in self._kthreads: + self._kthread_pids_unassigned.add(pid) @@ def _kthread_internal_remove(self, pid): @@ try: with self._lock: + self._kthread_pids_unassigned.discard(pid) del self._kthreads[pid] self._plugin_remove_kthread(pid)Also applies to: 467-477
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tuned/plugins/plugin_kthread.py` around lines 357 - 372, _dead_ PIDs are never removed from _kthread_pids_unassigned so unassigned set grows; update the removal logic so dead PIDs are purged: modify _kthread_internal_remove (or ensure callers like _instance_remove_kthread and _instance_release_kthreads) to also discard the pid from _kthread_pids_unassigned (and only add back to _kthread_pids_unassigned if the pid is still alive/valid). Use the symbols _kthread_internal_remove, _instance_remove_kthread, _instance_release_kthreads and _kthread_pids_unassigned to locate the code and implement the check/remove so dead pids are not left in the unassigned pool.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tuned/plugins/plugin_kthread.py`:
- Around line 142-149: The current fallback path swallows exceptions from
_perf_shutdown() after a failed _perf_setup(), forcing have_perf=False and
preventing cleanup() from retrying; change the except-pass behavior so that any
exception raised by _perf_shutdown() is not silently dropped—either re-raise the
shutdown exception (preferably chaining it to the original _perf_setup()
exception) or record it and raise a combined exception so callers see the
failure; apply the same change to the other similar blocks handling
_perf_setup/_perf_shutdown (the other occurrences around the same pattern) so
resource cleanup failures are propagated and cleanup() can retry.
---
Outside diff comments:
In `@tuned/plugins/plugin_kthread.py`:
- Around line 292-303: _instance_add_kthread currently only catches
ThreadNoLongerExists so any other exception from _get_instance_sched_options or
_apply_kthread_tuning will propagate up into _kthread_internal_add and kill
_perf_monitor_thread; wrap the work that fetches options and applies tuning (the
call(s) to _get_instance_sched_options and _apply_kthread_tuning) in a broad
try/except that catches Exception (while keeping the existing
ThreadNoLongerExists handling separate), log the error, call
_kthread_internal_remove(pid) to clean up, and only add pid to
instance._kthreads_processed on successful completion; apply the identical
defensive try/except pattern to the other similar block referenced (the second
occurrence around the 537–583 region) so one bad kthread cannot terminate the
perf monitor.
- Around line 357-372: _dead_ PIDs are never removed from
_kthread_pids_unassigned so unassigned set grows; update the removal logic so
dead PIDs are purged: modify _kthread_internal_remove (or ensure callers like
_instance_remove_kthread and _instance_release_kthreads) to also discard the pid
from _kthread_pids_unassigned (and only add back to _kthread_pids_unassigned if
the pid is still alive/valid). Use the symbols _kthread_internal_remove,
_instance_remove_kthread, _instance_release_kthreads and
_kthread_pids_unassigned to locate the code and implement the check/remove so
dead pids are not left in the unassigned pool.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: 1dff8677-9b1e-41be-87e2-933ea4f0cfa1
📒 Files selected for processing (1)
tuned/plugins/plugin_kthread.py
295cad9 to
8aeeb84
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tuned/plugins/plugin_kthread.py`:
- Around line 292-305: The code in _instance_add_kthread currently mutates
ownership/bookkeeping before calling _apply_kthread_tuning, risking permanent
loss on non-ESRCH failures; change the flow so the PID is not removed from its
current owner nor from global bookkeeping until _apply_kthread_tuning(kthread,
opts) completes successfully, or ensure a rollback that re-inserts the PID back
into the previous instance/set and global maps if tuning throws; specifically
wrap the sequence using _kthread_get, _get_instance_sched_options and
_apply_kthread_tuning in a try/except where any exception triggers restoration
via re-adding the PID to the source instance and avoiding call to
_kthread_internal_remove, and apply the same atomic/rollback fix to the
analogous transfer paths that use the same helpers.
- Around line 179-185: The current kthread assignment logic (in
_plugin_add_kthread and in the transfer code paths using
_instance_transfer_kthread / _get_matching_kthreads / _instance_add_kthread)
chooses an owner purely by instance priority and can assign kthreads to inactive
instances; change the selection so that you first scan for matching active
instances and only if none exist fall back to matching inactive instances: in
_plugin_add_kthread iterate self._instances.values() and prefer instance objects
with an active flag (e.g., instance.is_active or instance.active) when calling
_get_matching_kthreads and _instance_add_kthread, and when implementing
_instance_transfer_kthread update the transfer logic to skip handing a PID to an
inactive instance (or park it in _kthreads_assigned/_kthread_pids_unassigned)
until that instance becomes active; apply the same active-first selection change
to the other matching/transfer blocks referenced (the blocks around the other
ranges).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: d18d0240-aefe-4540-b064-3a74479c5c3d
📒 Files selected for processing (1)
tuned/plugins/plugin_kthread.py
Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
8aeeb84 to
9a4895a
Compare
|
@coderabbitai resume |
✅ Action performedReviews resumed. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tuned/plugins/plugin_kthread.py`:
- Around line 438-454: In _instance_unapply_static, avoid the TOCTOU by
computing whether shutdown is needed while holding self._lock: after
decrementing self._instance_count inside the with self._lock: block, set a local
flag (e.g., need_shutdown = (self._instance_count == 0)); then release the lock
and, if need_shutdown and have_perf, call self._perf_monitor_shutdown(). This
keeps the count/check atomic (refer to _instance_unapply_static,
self._instance_count, and _perf_monitor_shutdown) while avoiding holding the
lock during the actual shutdown call.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: 1f148f83-1207-4ee6-93d0-245687766a30
📒 Files selected for processing (1)
tuned/plugins/plugin_kthread.py
In combination with #596 and #580, this PR implements the third feature needed to dynamically tune all relevant aspects of a realtime application using a dedicated HW device (typically a NIC).
Two things might need some discussion:
schedulerplugin, but means we still need_has_dynamic_options, which is marked as a hack in plugins/base.py. The alternative would be to have one plugin instance per "group", which would make the profiles much longer.group.ktimers=0:f:2:*:^\[ktimersschedulerplugin and thekthreadplugin, we'd have two threads doing the same thing. For my applications that's not a problem, because I no longer use theschedulerplugin at all:schedulerhandles three things: IRQ affinities, kernel threads, and userland threadsirqpluginkthread