
Move fence synchronisation into wgpu-hal #9475

Open

Vecvec wants to merge 5 commits into gfx-rs:trunk from Vecvec:unlocked-fences

Conversation

@Vecvec (Contributor) commented Apr 30, 2026

Connections
Fixes #9470

Description
Previously, queue.submit wrote to the fence in wgpu-core, while device.poll read from the same fence, meaning that queue.submit could not execute while device.poll was running. This could be a problem in a multi-threaded environment with long command buffer execution times. This PR moves the locks into wgpu-hal (where needed) so that they can be dropped while waiting. It also fixes a deadlock that the fence write exposed.
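The locking change can be illustrated with a minimal, self-contained sketch. The Fence type and wait_for_value below are hypothetical stand-ins, not wgpu-hal's actual API; the point is only that the guard is dropped before any blocking, so a concurrent submit can still acquire the lock:

```rust
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for a backend fence; in wgpu-hal the real type is
// backend-specific (e.g. a Vulkan fence or a glow::Fence).
struct Fence {
    last_completed: RwLock<u64>,
}

// Take the lock only long enough to read shared state, then drop the guard
// before sleeping, so other threads can write to the fence while we wait.
fn wait_for_value(fence: &Arc<Fence>, wait_value: u64) {
    loop {
        let completed = *fence.last_completed.read().unwrap();
        // The read guard is dropped here; a submit may now take the write lock.
        if completed >= wait_value {
            return;
        }
        thread::sleep(Duration::from_millis(1));
    }
}

fn main() {
    let fence = Arc::new(Fence { last_completed: RwLock::new(0) });
    let waiter = {
        let fence = Arc::clone(&fence);
        thread::spawn(move || wait_for_value(&fence, 3))
    };
    // Simulates queue.submit advancing the fence while a wait is in flight.
    for v in 1..=3 {
        *fence.last_completed.write().unwrap() = v;
    }
    waiter.join().unwrap();
    println!("wait completed");
}
```

Holding the lock across the wait instead would reproduce exactly the blocking this PR removes.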

One thing that I don't like is the requirement for an Arc&lt;RwLock&lt;Fence&gt;&gt; inside a RwLock&lt;Vec&lt;...&gt;&gt; for Vulkan and OpenGL, but I could not find any way to remove this. The inner RwLocks are there to stop the backend from resetting fences that are being waited on (the wait here should be minimal because the fences are already determined to be signalled), and the Arcs allow them to be kept while outside of the main lock (such as in wait).
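A rough sketch of the shape described above, with a hypothetical RawFence standing in for the backend fence type (this is an illustration of the layering, not the actual wgpu-hal types):

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for a raw backend fence (vk::Fence / glow::Fence).
#[derive(Clone, Copy)]
struct RawFence(u64);

// Outer lock over the pending list, plus an inner lock per fence.
// The inner RwLock stops a reset from touching a fence that a wait is
// currently blocked on; the Arc lets the waiter keep the fence alive
// after releasing the outer lock.
struct Pending {
    fences: RwLock<Vec<(u64, Arc<RwLock<RawFence>>)>>,
}

impl Pending {
    // Clone out a matching fence under the outer lock, then release it,
    // so the actual wait happens without holding the list lock.
    fn fence_for(&self, wait_value: u64) -> Option<Arc<RwLock<RawFence>>> {
        self.fences
            .read()
            .unwrap()
            .iter()
            .find(|(value, _)| *value >= wait_value)
            .map(|(_, fence)| Arc::clone(fence))
    }
}

fn main() {
    let pending = Pending {
        fences: RwLock::new(vec![
            (1, Arc::new(RwLock::new(RawFence(1)))),
            (2, Arc::new(RwLock::new(RawFence(2)))),
        ]),
    };
    // The outer lock is already released here; a waiter would now take the
    // inner read lock while a reset would need the inner write lock.
    let fence = pending.fence_for(2).expect("fence present");
    println!("found fence for value {}", fence.read().unwrap().0);
}
```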

The lock on the fence in core was not removed. It seems that present uses it to exclude other queue operations. @inner-daemons, is queue.submit the only operation this needs to exclude? I'm happy to complete that in this PR, or leave it for another PR.

Testing
Existing tests run the same as previously. Solves the issue.

Squash or Rebase?

Squash

Checklist

  • I self-reviewed and fully understand this PR.
  • WebGPU implementations built with wgpu may be affected behaviorally. (At least, in terms of speed)
  • Validation and feature gates are in place to confine behavioral changes.
  • Tests demonstrate the validation and altered logic works.
  • CHANGELOG.md entries for the user-facing effects of this change are present.
  • The PR is minimal, and doesn't make sense to land as multiple PRs. The deadlock is so common that I think the PR would be broken otherwise (I usually deadlocked within 10 frames; one submit per frame, while the other thread only submitted 20 times per second).
  • Commits are logically scoped and individually reviewable.
  • The PR description has enough context to understand the motivation and solution implemented.

@Vecvec (Contributor, Author) commented Apr 30, 2026

After rebasing, the code that was causing the deadlock appears to have been refactored, so ignore the comments about a deadlock.

Edit: was #9307

// should be done when destroying it)
//
// The arc should not be kept after a function has finished
sync: Arc<RwLock<glow::Fence>>,
Contributor:
Instead of layering additional locks here, I think Fence::pending could be RwLock<Vec<Arc<GLFence>>> (or maybe just a Mutex). Then, wait can do something like:

        // Find a matching fence
        let gl_fence = self
            .pending
            .read() // or `.lock()`, if a mutex
            .iter()
            // Greater or equal as an abundance of caution, but there should be one fence per value
            .find(|gl_fence| gl_fence.value >= wait_value)
            .map(Arc::clone);

i.e., drop the lock before actually waiting. For Metal, Retained is already like an Arc, so similar can be done with Retained::clone.

Contributor (Author):
I tried that, but I was worried about the possibility of destroying a fence while it was being waited on. The Arc would be unnecessary if there wasn't the RwLock as fences are Copy.

Contributor (Author):
For Metal, Retained is already like an Arc, so similar can be done with Retained::clone.

I believe Metal already does this (in this PR), but on Metal the resource is destroyed when the ref-count goes to zero, instead of having a destroy method.

// Lock ordering requires that the fence lock be acquired after the snatch lock and
// before the command index lock.
let fence = self.device.fence.write();
let fence = self.device.fence.read();
Contributor:
I don't particularly like that things are set up this way (or at least, I think it should be documented more clearly what is going on), but I believe that the fence RwLock is being used not just to protect &mut self methods on the fence, but also to ensure mutual exclusion between submit and other things that don't want a submit to happen concurrently. Which will no longer be the case if submit only acquires a read lock.

I also don't particularly like that there are separate locks for the fence and command indices (I feel like the Fence could also have responsibility for giving out command indices), nor do I like that the protection against concurrent submits (see https://github.com/gfx-rs/wgpu/pull/9307/changes#diff-150156a37cf3627465ceb22096ed995ee26ae640007c421e726134bafd499dbeR1679-R1683) is more pessimistic than necessary (the validate_command_buffers processing probably could be done concurrently). But looking for solutions that don't bite off too much refactoring, one strategy might be to switch to using the command indices lock rather than the fence lock to provide mutual exclusion with concurrent submits (and document this, since it's non-obvious). If we do that, then I think we could get rid of the fence lock in wgpu_core entirely and rely on the locking in hal.

Contributor (Author):
I believe that the fence RwLock is being used not just to protect &mut self methods on the fence, but also to ensure mutual exclusion between submit and other things that don't want a submit to happen concurrently. Which will no longer be the case if submit only acquires a read lock.

I had assumed (at least for submission) that the command indices lock was held for that purpose. The only thing which appeared to use the fence for exclusion was present, which should still be excluded since it takes a write lock.

@Vecvec (Contributor, Author) commented May 1, 2026

The other way for Vulkan & OpenGL that I can see would probably be to have a Fence wrapper that destroys (or, on Vulkan, resets) the fence when dropped. It couldn't be Clone, as I think it would have to be unsafe, but it would remove the need for the lock. However, for Vulkan at least, the free list would have to be wrapped in an Arc&lt;RwLock&lt;...&gt;&gt; to allow the Drop impl to add the item back.

Edit: If unsafe were fine, the lock guard could be forgotten and then force-unlocked. That wouldn't require the Arc, at least.
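A minimal sketch of that wrapper idea, using hypothetical types (RawFence and the free list here are stand-ins for the real backend objects, and a Mutex stands in for whichever lock the free list would actually use):

```rust
use std::sync::{Arc, Mutex};

// Hypothetical raw fence handle; the real one would be a backend object.
#[derive(Clone, Copy)]
struct RawFence(u32);

// A guard that gives the fence back to a shared free list when dropped.
// The free list needs an Arc<Mutex<...>> so the Drop impl can reach it,
// as noted above.
struct FenceGuard {
    raw: RawFence,
    free_list: Arc<Mutex<Vec<RawFence>>>,
}

impl Drop for FenceGuard {
    fn drop(&mut self) {
        // On Vulkan this is where the fence would be reset before recycling.
        self.free_list.lock().unwrap().push(self.raw);
    }
}

fn main() {
    let free_list = Arc::new(Mutex::new(vec![RawFence(7)]));
    {
        let raw = free_list.lock().unwrap().pop().unwrap();
        let _guard = FenceGuard { raw, free_list: Arc::clone(&free_list) };
        assert!(free_list.lock().unwrap().is_empty());
    } // guard dropped here: the fence returns to the free list
    println!("free list length after drop: {}", free_list.lock().unwrap().len());
}
```

Because the guard owns the raw handle until Drop runs, nothing can reset or destroy the fence while a wait holds the guard, which is the property the inner RwLocks currently provide.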



Development

Successfully merging this pull request may close these issues.

device.poll blocks queue.submit
