Apply targeted MapBufferRange optimization for NoOverwrite #258
FireworkSky
commented
Dec 15, 2025
This patch reduces the scope of the MapBufferRange optimization:
- Only applies to NoOverwrite operations (not Discard)
- Uses ARB_map_buffer_range extension check for better portability
- Tested with good results on AMD Linux
- More conservative approach to avoid driver-specific issues
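The narrowed scope described above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the code in this patch: `UpdateOption`, `UploadPath`, and `choose_upload_path` are hypothetical names, and `PATH_DEFAULT` stands for whatever upload path the renderer used before the change.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the renderer's buffer update options. */
typedef enum {
    UPDATE_NONE,
    UPDATE_DISCARD,
    UPDATE_NO_OVERWRITE
} UpdateOption;

/* Hypothetical labels for the two upload strategies. */
typedef enum {
    PATH_DEFAULT,          /* the pre-existing upload path */
    PATH_MAP_BUFFER_RANGE  /* map the target range and write into it */
} UploadPath;

/*
 * Take the mapped path only for NoOverwrite updates, and only when the
 * driver reports ARB_map_buffer_range. Discard and plain updates keep
 * the default path, which is the conservative part of the approach.
 */
UploadPath choose_upload_path(UpdateOption option, bool supports_map_buffer_range)
{
    if (option == UPDATE_NO_OVERWRITE && supports_map_buffer_range) {
        return PATH_MAP_BUFFER_RANGE;
    }
    return PATH_DEFAULT;
}
```

With this shape, a Discard update never reaches the mapped path even on drivers that expose the extension, so any driver-specific behavior is limited to NoOverwrite uploads.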
flibitijibibo
left a comment
Looks good, we just need to make sure NVIDIA performance is still okay... @TheSpydog, will you have a chance to try this again soon?
Sure thing, I’ll take a look tomorrow morning.
Hi, I'm not from China and I'm trying to get the link for rotatingartlauncher.
Emm, please don't discuss this here.
So where can I discuss it?
Tested the sprite batch stress test again on Nvidia Tegra this morning. The performance with Deferred sprite batches is the same before vs. after the patch, but the Immediate path is more interesting. Instead of a consistent 14ms per frame, I'm now seeing this strange pattern: While the typical batch is more performant (14->10ms), every 6 batches, we pay some sort of >100ms penalty. I don't immediately have an idea of where that might be coming from. Maybe it's some sort of driver heuristic kicking in and forcing a sync point?
Possibly - I also wonder if it's the number of discards happening. Changing it to smaller, weirder batch sizes may have an impact too?
Merged but with an added environment variable on top: This gives all platforms the option, but apps/users need to ask for it explicitly first. Once we have a better grasp on how this really works we can see if it works as default behavior.
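For readers unfamiliar with this kind of opt-in, a rough sketch of gating a code path behind an environment variable follows. The variable name `EXAMPLE_MAP_BUFFER_RANGE`, the accepted value `"1"`, and both function names are assumptions made for illustration; the actual variable and lookup mechanism used in the merge may differ.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name, chosen only for this example. */
#define OPT_IN_VARIABLE "EXAMPLE_MAP_BUFFER_RANGE"

/* Treat only the exact string "1" as an explicit opt-in (an assumption). */
bool is_opt_in_value(const char *value)
{
    return value != NULL && strcmp(value, "1") == 0;
}

/*
 * The mapped path is enabled only when the extension is available AND
 * the app or user has asked for it. An unset variable leaves it off.
 */
bool map_buffer_range_enabled(bool supports_map_buffer_range)
{
    return supports_map_buffer_range
        && is_opt_in_value(getenv(OPT_IN_VARIABLE));
}
```

Keeping the default off means platforms that showed a regression, such as the Tegra result above, see no change unless someone sets the variable.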
Ideally, Nvidia Tegra should avoid enabling supports_ARB_map_buffer_range.