Move fence synchronisation into wgpu-hal #9475
base: trunk
Changes from all commits:
- 37d2803
- 7ca8c54
- a646f00
- 865af59
- 814b919
```diff
@@ -1,20 +1,29 @@
-use alloc::vec::Vec;
+use alloc::{sync::Arc, vec::Vec};
+use core::sync::atomic::Ordering;
+
+use parking_lot::RwLock;

 use glow::HasContext;

 use crate::AtomicFenceValue;

-#[derive(Debug, Copy, Clone)]
+#[derive(Debug)]
 struct GLFence {
-    sync: glow::Fence,
+    // Since a fence can be `Copy`ed, there can exist some
+    // cases where (without proper synchronisation),
+    // a fence could be destroyed while something else is
+    // still using it. Therefore, while a function is
+    // using this fence, it should read this. (write
+    // should be done when destroying it)
+    //
+    // The arc should not be kept after a function has finished
+    sync: Arc<RwLock<glow::Fence>>,
```
**Contributor:** Instead of layering additional locks here, I think […] i.e., drop the lock before actually waiting. For Metal, […]

**Contributor (author):** I tried that, but I was worried about the possibility of destroying a fence while it was being waited on. The `Arc` would be unnecessary if there wasn't the `RwLock`, as fences are […]

**Contributor (author):** I believe Metal already does this (in this PR), but on Metal the resource is destroyed when the ref-count goes to zero, instead of having a destroy method.
```diff
     value: crate::FenceValue,
 }

 #[derive(Debug)]
 pub struct Fence {
     last_completed: AtomicFenceValue,
-    pending: Vec<GLFence>,
+    pending: RwLock<Vec<GLFence>>,
     fence_behavior: wgt::GlFenceBehavior,
 }
```
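The comment on the new `sync` field describes the core pattern of this change: a `Copy`-able raw handle is wrapped in `Arc<RwLock<..>>` so that destroying it (a write) excludes using it (a read). A minimal std-only sketch of that pattern, with `RawFence` as a hypothetical stand-in for `glow::Fence`:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical stand-in for a Copy-able raw handle like `glow::Fence`.
#[derive(Debug, Copy, Clone, PartialEq)]
struct RawFence(u64);

fn main() {
    let sync = Arc::new(RwLock::new(RawFence(42)));

    // A user clones the Arc and holds only a *read* lock while the
    // raw handle is in use (e.g. while polling or waiting on it).
    let user = {
        let sync = Arc::clone(&sync);
        thread::spawn(move || {
            let fence = sync.read().unwrap();
            assert_eq!(*fence, RawFence(42));
        })
    };
    user.join().unwrap();

    // The destroyer takes the *write* lock, so it can only proceed
    // once no reader is still using the raw handle; only then would
    // something like `gl.delete_sync(fence)` run.
    let fence = *sync.write().unwrap();
    assert_eq!(fence, RawFence(42));
}
```

This mirrors the diff's invariant that the `Arc` is not kept after a function finishes, so the ref-count reliably reaches one by destroy time.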
```diff
@@ -29,24 +38,27 @@ impl Fence {
     pub fn new(options: &wgt::GlBackendOptions) -> Self {
         Self {
             last_completed: AtomicFenceValue::new(0),
-            pending: Vec::new(),
+            pending: RwLock::new(Vec::new()),
             fence_behavior: options.fence_behavior,
         }
     }

     pub fn signal(
-        &mut self,
+        &self,
         gl: &glow::Context,
         value: crate::FenceValue,
     ) -> Result<(), crate::DeviceError> {
         if self.fence_behavior.is_auto_finish() {
-            *self.last_completed.get_mut() = value;
+            self.last_completed.store(value, Ordering::Release);
             return Ok(());
         }

         let sync = unsafe { gl.fence_sync(glow::SYNC_GPU_COMMANDS_COMPLETE, 0) }
             .map_err(|_| crate::DeviceError::OutOfMemory)?;
-        self.pending.push(GLFence { sync, value });
+        self.pending.write().push(GLFence {
+            sync: Arc::new(RwLock::new(sync)),
+            value,
+        });

         Ok(())
     }
```
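The `signal` change above swaps `get_mut()` (which requires `&mut self`) for an atomic store, since the method now takes `&self`. A small sketch of that auto-finish path, assuming the fence value is an atomic 64-bit counter (as `AtomicFenceValue` is in `wgpu-hal`'s GLES backend):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

fn main() {
    // Stand-in for `last_completed: AtomicFenceValue`.
    let last_completed = AtomicU64::new(0);

    // With only `&self`, the completed value must be *published* with an
    // atomic store; `get_mut()` would need exclusive `&mut self` access.
    last_completed.store(7, Ordering::Release);

    // A concurrent reader pairs this with an Acquire load.
    assert_eq!(last_completed.load(Ordering::Acquire), 7);
}
```

Release/Acquire ordering ensures any writes made before the store are visible to a thread that observes the new value.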
```diff
@@ -62,12 +74,15 @@ impl Fence {
             return max_value;
         }

-        for gl_fence in self.pending.iter() {
+        let pending = self.pending.read();
+
+        for gl_fence in pending.iter() {
+            let fence = gl_fence.sync.read();
             if gl_fence.value <= max_value {
                 // We already know this was good, no need to check again
                 continue;
             }
-            let status = unsafe { gl.get_sync_status(gl_fence.sync) };
+            let status = unsafe { gl.get_sync_status(*fence) };
             if status == glow::SIGNALED {
                 max_value = gl_fence.value;
             } else {
```
```diff
@@ -82,20 +97,27 @@ impl Fence {
         max_value
     }

-    pub fn maintain(&mut self, gl: &glow::Context) {
+    pub fn maintain(&self, gl: &glow::Context) {
         if self.fence_behavior.is_auto_finish() {
             return;
         }

         let latest = self.get_latest(gl);
-        for &gl_fence in self.pending.iter() {
+        let mut pending = self.pending.write();
+        for gl_fence in pending.iter() {
+            // We don't need to keep around this lock until after the retain - we need to make
+            // sure nothing is using it by writing to it, but any new references must come
+            // from `self.pending`, which is write-locked, so nothing else can take a
+            // copy of this value
+            let sync = *gl_fence.sync.write();

             if gl_fence.value <= latest {
                 unsafe {
-                    gl.delete_sync(gl_fence.sync);
+                    gl.delete_sync(sync);
                 }
             }
         }
-        self.pending.retain(|&gl_fence| gl_fence.value > latest);
+        pending.retain(|gl_fence| gl_fence.value > latest);
     }

     pub fn wait(
```
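Setting the locking aside, the shape of the cleanup in `maintain` is a delete-then-retain pass over the pending list. A simplified sketch with plain tuples standing in for `GLFence` (the delete step is only a comment, since there is no GL context here):

```rust
fn main() {
    // Stand-ins for `GLFence { value, sync }` entries in `pending`.
    let mut pending: Vec<(u64, &str)> = vec![(1, "a"), (2, "b"), (3, "c")];
    let latest = 2; // stand-in for `self.get_latest(gl)`

    // First pass: delete every fence whose value is already reached.
    for &(value, name) in pending.iter() {
        if value <= latest {
            // `gl.delete_sync(..)` would run here for "a" and "b".
            let _ = name;
        }
    }

    // Second pass: keep only fences that are still pending.
    pending.retain(|&(value, _)| value > latest);
    assert_eq!(pending, vec![(3, "c")]);
}
```

The diff's extra subtlety is the momentary `sync.write()` before deletion: taking the per-fence write lock proves no reader still holds the raw handle, and the write lock on `pending` prevents anyone taking a fresh reference in between.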
```diff
@@ -115,9 +137,10 @@ impl Fence {
             return Ok(true);
         }

+        let pending = self.pending.read();
+
         // Find a matching fence
-        let gl_fence = self
-            .pending
+        let gl_fence = pending
             .iter()
             // Greater or equal as an abundance of caution, but there should be one fence per value
             .find(|gl_fence| gl_fence.value >= wait_value);
```
```diff
@@ -130,14 +153,20 @@ impl Fence {
         // We should have found a fence with the exact value.
         debug_assert_eq!(gl_fence.value, wait_value);

+        let sync = gl_fence.sync.clone();
+
+        drop(pending);
+
         let status = unsafe {
             gl.client_wait_sync(
-                gl_fence.sync,
+                *sync.read(),
                 glow::SYNC_FLUSH_COMMANDS_BIT,
                 timeout_ns.min(i32::MAX as u32) as i32,
             )
         };

+        drop(sync);
+
         let signalled = match status {
             glow::ALREADY_SIGNALED | glow::CONDITION_SATISFIED => true,
             glow::TIMEOUT_EXPIRED | glow::WAIT_FAILED => false,
```
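The locking order in `wait` is the interesting part of this hunk: clone the per-fence `Arc`, release the `pending` list lock, and only then block, so `signal`/`get_latest` on other threads are not stalled by a potentially long wait. A std-only sketch with `u64` as a stand-in for the raw fence handle:

```rust
use std::sync::{Arc, RwLock};

fn main() {
    // Stand-in for `pending: RwLock<Vec<GLFence>>`, with (value, sync) pairs.
    let pending: RwLock<Vec<(u64, Arc<RwLock<u64>>)>> =
        RwLock::new(vec![(1, Arc::new(RwLock::new(100)))]);

    // Find the matching fence while holding the list's read lock ...
    let guard = pending.read().unwrap();
    let (_, sync) = guard
        .iter()
        .find(|(value, _)| *value >= 1)
        .expect("one fence per value")
        .clone();

    // ... then release the list lock *before* blocking.
    drop(guard);

    // Hold only the per-fence read lock across the blocking wait,
    // standing in for `gl.client_wait_sync(*sync.read(), ..)`.
    let raw = *sync.read().unwrap();
    assert_eq!(raw, 100);

    // Per the field comment, the Arc is dropped before returning.
    drop(sync);
}
```

Holding only the per-fence read lock during the wait still blocks `maintain` from deleting that one fence (deletion needs the write lock), without freezing the whole pending list.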
```diff
@@ -159,9 +188,13 @@ impl Fence {
             return;
         }

-        for gl_fence in self.pending {
+        for gl_fence in self.pending.into_inner() {
             unsafe {
-                gl.delete_sync(gl_fence.sync);
+                gl.delete_sync(
+                    Arc::into_inner(gl_fence.sync)
+                        .expect("A function has failed to drop all its references to this")
+                        .into_inner(),
+                );
             }
         }
     }
```
I don't particularly like that things are set up this way (or at least, I think it should be documented more clearly what is going on), but I believe that the fence `RwLock` is being used not just to protect `&mut self` methods on the fence, but also to ensure mutual exclusion between `submit` and other things that don't want a submit to happen concurrently. That will no longer be the case if `submit` only acquires a read lock.

I also don't particularly like that there are separate locks for the fence and command indices (I feel like the `Fence` could also have responsibility for giving out command indices), nor do I like that the protection against concurrent submits (see https://github.com/gfx-rs/wgpu/pull/9307/changes#diff-150156a37cf3627465ceb22096ed995ee26ae640007c421e726134bafd499dbeR1679-R1683) is more pessimistic than necessary (the `validate_command_buffers` processing probably could be done concurrently). But looking for solutions that don't bite off too much refactoring, one strategy might be to switch to using the command indices lock, rather than the fence lock, to provide mutual exclusion with concurrent `submit`s (and document this, since it's non-obvious). If we do that, then I think we could get rid of the fence lock in `wgpu_core` entirely and rely on the locking in hal.
I had assumed (at least for submission) that the command indices lock was held for that purpose. The only thing which appeared to use fences for exclusion was `present`, which should still exclude due to using a write lock.
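The disagreement above comes down to `RwLock` semantics: read guards do not exclude each other, so a `submit` that only takes a read lock no longer serialises against other readers; only a writer (as `present` takes) excludes everything else. A minimal demonstration:

```rust
use std::sync::RwLock;

fn main() {
    let lock = RwLock::new(0u32);

    // Two readers coexist: a read-locking `submit` would not exclude
    // another read-locking caller.
    let a = lock.read().unwrap();
    let b = lock.try_read().expect("readers coexist");
    // But any writer is blocked while readers are active.
    assert!(lock.try_write().is_err());
    drop((a, b));

    // A writer (the `present` case) has exclusive access: readers
    // are blocked for its whole critical section.
    let mut w = lock.write().unwrap();
    *w += 1;
    assert!(lock.try_read().is_err());
    drop(w);

    assert_eq!(*lock.read().unwrap(), 1);
}
```

This is why moving the mutual-exclusion duty onto a dedicated lock (as the reviewer suggests) needs documenting: the exclusion guarantee is otherwise an invisible side effect of which lock mode each caller happens to take.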