Skip to content

linuxkpi: Fix cancel_delayed_work_sync() return value#2268

Open
ryanfahy314 wants to merge 1 commit into
freebsd:mainfrom
ryanfahy314:lkpi/cancel-delayed-work-sync-contract
Open

linuxkpi: Fix cancel_delayed_work_sync() return value#2268
ryanfahy314 wants to merge 1 commit into
freebsd:mainfrom
ryanfahy314:lkpi/cancel-delayed-work-sync-contract

Conversation

@ryanfahy314

Copy link
Copy Markdown

Summary

This PR fixes the LinuxKPI workqueue delayed work sync behavior to better match the Linux workqueue API contract.

I found this while debugging Intel Arc A770/DG2 support on 15.1-STABLE which showed GuC context scheduling failures under hardware-accelerated Xorg workloads.

Problem

There's a subtle difference in behavior between the Linux cancel_delayed_work_sync() return value and the LinuxKPI equivalent.

Expected Linux Behavior

The return value behavior of cancel_delayed_work_sync() can be summarized as:

  1. Timer armed but callback not started -> Cancel timer and return true
  2. Timer fired and work queued, but callback not started -> remove queued work and return true
  3. Callback is already executing -> wait until callback finishes and return false
  4. Idle, no work is queued and no timer present -> return false

Reference: https://docs.kernel.org/core-api/workqueue.html

Old LinuxKPI Behavior

The return value behavior diverges from this contract in two cases. In linux_cancel_delayed_work_sync():

  1. Queued task removed before running -> remove queued work and return false
  2. Callback is already executing -> wait until callback finishes and return true

This causes problems in consumers which use this return value to distinguish between "pending work was canceled" and "work already ran/is running/was idle."

Changes

  1. Track the taskqueue pending count separately from the taskqueue_cancel() return value.
  2. Use the pending count for the public cancel_delayed_work_sync() return value.
  3. Use taskqueue_cancel() return value only to decide whether state needs to be re-checked.

This brings the public-facing API into alignment with the Linux workqueue return value semantics.

Testing

I performed A/B testing of this change using an Intel Arc A770 GPU with Xorg direct rendering to exercise the context scheduling pipeline.
To simulate a workload with heavy GuC context scheduling, I ran a script that opened/moved/resized/closed Firefox windows repeatedly.

This testing was performed with a modified driver implementing the changes described in this drm-kmod pull request: freebsd/drm-kmod#468

In addition to the above PR, I also used temporary instrumentation of the i915 GuC scheduler pipeline.

Before LinuxKPI update

With the kernel LinuxKPI layer untouched and the modified driver, under hardware-accelerated loads, I experienced a crash anywhere between 5 and 60 minutes.

The targeted GuC instrumentation observed context 4107 entering a state where it was marked both destroyed and enabled

ctx_id=4107 sched_state=0x12 destroyed=1 pend_en=0 pend_dis=0 enabled=1 registered=0 closed=0 invalid_guc_id=0

Within the same captured window, I observed the scheduled disable preparation updating that context to:

ctx_id=4107 sched_state=0x6 destroyed=1 pend_en=0 pend_dis=1 enabled=0 registered=0 closed=0 invalid_guc_id=0

Leading to:

GUC: Bad context sched_state 0x6, ctx_id 4107

Captured marker counts with the old behavior

Bad context sched_state                 8
Failed to handle HXG                    129
Failed to process CT message            134
Unsolicited response message            126
sched_state=0x12                        555
sched_state=0x6                         24
destroyed=1.*enabled=1.*registered=0    586

Note: These are not necessarily unique failure counts as they were captured from overlapping logs, but they show that the failure was reproducible and showed the same repeated pattern.

I was able to trace this to the i915 GuC submission path, which consumes the return value here:
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c

Simplified:

if (cancel_delayed_work_sync(&ce->guc_state.sched_disable_delay_work)) {
        intel_context_sched_disable_unpin(ce);
} else if (intel_context_is_closed(ce)) {
        if (wait_for(context_close_done(ce), 1500))
                guc_warn(guc, "timed out waiting on context sched close before realloc\n");
}

A nearby comment explains that this path races with context close, so it uses the return value of cancel_delayed_work_sync() to determine whether this path should handle cancellation and unpin, or if it should wait for context close to complete.

After LinuxKPI update

After aligning the behavior of the LinuxKPI with the Linux workqueue return value semantics, I rebuilt and booted a test kernel on the same 15.1-STABLE commit as my baseline:
FreeBSD 15.1-STABLE amd64 1501501
7fa638d6f7cf9ee471648a794756cb9d0aa289ba

and ensured that Xorg was using direct rendering for the desktop display

direct rendering: Yes
Accelerated: yes
OpenGL renderer string: Mesa Intel(R) Arc(tm) A770 Graphics (DG2)
glamor initialized
Initializing extension DRI3
AIGLX: Loaded and initialized iris

After the same Firefox window churning workload, there was no recurrence of the bad GuC/CT/context scheduling markers after approximately 3.5 hours:

== known bad marker count ==
0
== known bad marker lines ==
No known bad markers found

Contributor Note

For transparency, AI assistance was used in a limited scope during discovery and debugging. This included debugging, source navigation guidance, log/crashdump analysis, and temporary instrumentation of the i915 GuC context scheduler. All submitted code changes were written, reviewed, built, and tested by me.

Align behavior between the Linux cancel_delayed_work_sync()
function return value and the LinuxKPI equivalent.

Linux cancel_delayed_work_sync() returns whether delayed
work was pending, even if canceled before executing. This
includes the case where the timer fired and work was queued
but the callback had not yet started.

The LinuxKPI version used the return value from taskqueue_cancel()
as the return value of the public facing API, which inverted the
behavior of two cases, violating the Linux API contract.
Queued work which was removed before running would return false, and
work whose callback was already executing would return true.

Track the taskqueue pending count separately from the
taskqueue_cancel() return value.
Use the pending count for the public return value.
Use taskqueue_cancel() return value only to decide whether
state needs to be re-checked.

This fixes behavior in consumers which use the return value of
cancel_delayed_work_sync() to distinguish canceled pending
work from work that was already running or idle.

Signed-off-by: Ryan Fahy <ryan@rfahy.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant