Releases: ConfettiFX/The-Forge
Release 1.63 - March 20th, 2025 - Advanced RTX Global Illumination Middleware | Quest Run-time switches to OpenXR | Triangle Visibility Buffer with Programmable MSAA | Ephemeris running on low end mobile devices | Particle System UT now runs on Adreno Devices with lower storage buffer limits
Advanced RTX Global Illumination Middleware
Over the last three years we have developed a Global Illumination solution based on the RTX / DXR interfaces. The development was driven by the games that will ship with it. We are adding it to our arsenal of graphics middleware, which is not publicly available. In case you want more information, please drop us a note.
Here is a feature list and screenshots:
- New sky shading model for more accurate GI
- Normal texture generation from depth
- Single-pass depth hierarchy from FSSR
- Screen-space ray marching for evaluation of screen-space GI
- Single-bounce hybrid ray tracing for GI
- Multi-bounce GI with probe volume cascades
- World and frustum GPU hash implementation for sample accumulation
- Denoising with reprojection and a weighted blur
- Batching rays for indirect dispatches as an optimization
- Spatiotemporal reservoir sampling
 
Following the trajectory of RTX-based Global Illumination approaches, this should be the most advanced and already production-proven system out there ...
Android Samsung S24 Xclipse 940

Quest Run-time switches to OpenXR
We made the internal switch to finally use OpenXR. A fun fact: we have used and helped to develop OpenXR in projects since about 2016, but never implemented it in our own code base ... only in customer code bases.
Triangle Visibility Buffer with Programmable MSAA
This is an oldie but goodie :-) ... I (Wolfgang) wrote several blog posts and gave conference talks about it, and now we have it in our code base. I remember being involved at Rockstar Games in developing a programmable MSAA approach on XBOX and PS around 2006 (?), and we have had examples running with it in our code base over the years. We finally found the time to bring it into our open-source code base as part of the Triangle Visibility Buffer.
Debug screenshots
Godray samples 4xMSAA

Ephemeris running on low end mobile devices
This was also overdue. Ephemeris is our skydome system that has shipped in game engines before. Most of the time we had it running on PS4-class hardware, but now it also supports low-end mobile phones. It is in our non-public middleware folder. Drop me a note if you want to know more.
Particle System UT now runs on Adreno Devices that do not support bindless Textures
There is now a fallback for devices with lower storage buffer limits. We also further optimized the size of the particle data to fit more particles into the particle buffer.
Release 1.62 - February 27th, 2025 - C99 Vulkan/DirectX rewrite | Scene resolution using GPUCfg | SRT updates | In-Flight Motion Vector
C99 Vulkan/DirectX API rewrite
Our quest to move as much code to C99 as possible is motivated by the idea that small teams deal better with a C99 code base. We are targeting this framework at small teams that need to be agile and quick.
We finished a first pass on the Vulkan and DirectX run-time. There is more to come.
Scene Resolution using GPUCfg
On mobile devices, the scene resolution often differs widely even within one device category such as Android phones. We now have a better system in place to define scene and screen resolution through the GPU config system.
FSL 2 improvements
After having shipped a Shader Resource Table based FSL language two weeks ago, we have done a bit more clean-up work and unified and simplified naming conventions.
Quest Support
We are making ongoing improvements to Quest support.
Triangle Visibility Buffer 2.0
We found and fixed several issues in TVB 2.0. The next step is another pass on the architecture to see how much we can improve the memory access patterns and, with them, performance.
In-Flight Motion Vectors
Many people still store motion vectors in render targets. For the last 15+ years that has not made much sense, because writing and then reading those motion vectors costs more memory bandwidth than simply calculating them on the fly.
This approach is based on Ben Padget's article in one of the ShaderX books ... he will smile about the fact that after all this time we are still quoting his article ...
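A minimal sketch of the idea, assuming a depth buffer plus the current inverse view-projection and previous view-projection matrices are available (the math types and helper below are made up for illustration and handle camera motion only; dynamic objects additionally need their previous transforms):

// Minimal math types for the sketch (not The Forge's math library).
typedef struct { float x, y; } float2;
typedef struct { float x, y, z, w; } float4;
typedef struct { float m[4][4]; } float4x4; // row-major for this sketch

static float4 mulMat4Vec4(float4x4 m, float4 v)
{
    float4 r;
    r.x = m.m[0][0] * v.x + m.m[0][1] * v.y + m.m[0][2] * v.z + m.m[0][3] * v.w;
    r.y = m.m[1][0] * v.x + m.m[1][1] * v.y + m.m[1][2] * v.z + m.m[1][3] * v.w;
    r.z = m.m[2][0] * v.x + m.m[2][1] * v.y + m.m[2][2] * v.z + m.m[2][3] * v.w;
    r.w = m.m[3][0] * v.x + m.m[3][1] * v.y + m.m[3][2] * v.z + m.m[3][3] * v.w;
    return r;
}

// uv in [0,1], depth sampled from the depth buffer (D3D-style [0,1] range).
static float2 motionVectorFromDepth(float2 uv, float depth,
                                    float4x4 invViewProjCurr, float4x4 viewProjPrev)
{
    // Current-frame clip-space position of this pixel.
    float4 clip = { uv.x * 2.0f - 1.0f, 1.0f - uv.y * 2.0f, depth, 1.0f };

    // Unproject to world space.
    float4 world = mulMat4Vec4(invViewProjCurr, clip);
    world.x /= world.w; world.y /= world.w; world.z /= world.w; world.w = 1.0f;

    // Reproject with the previous frame's view-projection.
    float4 prev = mulMat4Vec4(viewProjPrev, world);
    float2 prevUv = { (prev.x / prev.w) * 0.5f + 0.5f, 0.5f - (prev.y / prev.w) * 0.5f };

    // Motion vector in UV space: where the pixel is now minus where it was last frame.
    float2 mv = { uv.x - prevUv.x, uv.y - prevUv.y };
    return mv;
}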
Release 1.61 - February 13th, 2025 - FSL 2 | Browserstack | Android / Vulkan | DirectX 11 | Quest | flecs
FSL 2
We now enforce a better memory access pattern for root signatures. We unified root signature usage so that, in the best case, a game only needs one or two. To do this we added a unified shader resource table that is shared between FSL and C++.
We wrote more thorough documentation here:
https://github.com/ConfettiFX/The-Forge/wiki/FSL-Programming-Guide
This is a good example of how shader languages should evolve. Instead of mimicking the misguided efforts to write a C++-style shader language, a shader language should mimic the memory access patterns of a GPU and guide the user towards the "best and most performant" results.
From a practical standpoint, unreliable and non-functioning shader compilers are a bigger problem than any language syntax designed to please some twisted abstraction that has no performance benefit and happens for no good reason.
Browserstack
For testing mobile phones we integrate Browserstack more and more into our workflows.
Android / Vulkan
After finishing our more-than-four-year stint on the Warzone Mobile project, making and keeping the game running on Android phones, the same team is now making sure our internal Android / Vulkan run-time lives up to the same or higher expectations. Browserstack is now used to test a larger number of phones. Higher-end phones now support our Triangle Visibility Buffer unit tests.
It appears that many of our priorities in the game industry are shifting towards mobile and, to a lesser extent, consoles as the most important gaming platforms.
We are trying to find ways to make sure mobile is a first-class citizen.
DirectX 11
We removed DirectX 11 support with the retirement of Windows 10.
Quest Support
We have helped develop for the Quest since 2016. Somehow we missed taking care of our own Quest run-time :-) ... we are currently catching up on the missed opportunities here, updating and upgrading it, making it a better part of our test suite, and adding more unit test support.
flecs
We improved our flecs integration and upgraded to the latest version.
Release 1.60 - October 11th, 2024 - GPU Work Graphs | Filesystem Refactor | Window System Refactor Phase 1
GPU Work Graphs
We have been testing GPU Work Graphs for a while now. We see opportunities to implement more complex compute-driven interactions, which helps us move further towards GPU-driven rendering. The current example runs the buffer clears and triangle culling in a GPU Work Graph if GPU Work Graphs are supported; otherwise it takes the old path, so visually there is no change.
Filesystem Refactor
We data-drive everything in a game engine. For the file system, we finally managed to implement that in the public repository as well. We also removed unnecessary copies and symlinks. Please note that our file system actually consists of "two" file systems: one for the run-time and one for tools. You can't ship a tools file system in a game :-) I know this is well known, but people always want to check whether a path is right, a directory exists, or a file exists before they load that one file ... yeaahh ... so our run-time file system prevents that from happening.
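A minimal sketch of the run-time pattern this enforces, with a hypothetical stream API standing in for the real interface (the function and type names below are illustrative and backed by stdio just so the sketch is self-contained; they are not The Forge's file system API):

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical run-time stream API: there is deliberately no fileExists() or
// directoryExists() query. You ask for the resource you want and handle failure.
typedef struct { FILE* fp; } FileStream;

static bool fsOpenStream(const char* resourceName, FileStream* pOut)
{
    pOut->fp = fopen(resourceName, "rb");
    return pOut->fp != NULL;
}

static size_t fsReadStream(FileStream* pStream, void* pDst, size_t bytes)
{
    return fread(pDst, 1, bytes, pStream->fp);
}

static void fsCloseStream(FileStream* pStream) { fclose(pStream->fp); }

static bool loadConfig(void* pDst, size_t capacity)
{
    FileStream stream;

    // No exists-check beforehand: in a shipped game the data is either packed
    // and present, or failing to open it is an error we handle right here.
    if (!fsOpenStream("config/engine.cfg", &stream))
        return false;

    size_t read = fsReadStream(&stream, pDst, capacity);
    fsCloseStream(&stream);
    return read > 0;
}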
Window System Refactor Phase 1
With the ongoing challenges around upscaling and general window management on operating systems that support windows, we decided to refactor our window system to bring it up to current standards and make it easier to integrate upscalers. Currently most upscalers do not consider the pixel center; they were written by people who never studied Bresenham's algorithm. We wanted to make sure our upscaler actually works without introducing Moiré patterns or strong staircase effects like the ones currently promoted by hardware vendors.
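For reference, the half-texel convention this is about, as a tiny sketch (illustrative, not the actual window-system code): the center of pixel (x, y) sits at an offset of 0.5, and an upscaler that samples at the pixel corner instead is off by half a source texel, which is exactly what shows up as Moiré and staircase artifacts:

// Map an integer pixel coordinate to the UV of its center.
// Sampling at (x / width, y / height) - the pixel corner - is the classic
// half-texel mistake that turns into shimmering after upscaling.
static void pixelCenterUV(int x, int y, int width, int height, float* pU, float* pV)
{
    *pU = ((float)x + 0.5f) / (float)width;
    *pV = ((float)y + 0.5f) / (float)height;
}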
We will add more functionality in Phase 2 ...
Steamdeck
We are pushing forward in making the Steamdeck a first-class citizen. It represents our Linux run-time efforts, so we can switch to any Linux distro if necessary ...
Release 1.59 - September 6th, 2024 - STAR WARS™: Bounty Hunter™ | Replaced Gainput with our own Input library | Removed Vulkan from Windows Run-Time | Removed API Switch | Third-Party Integration
STAR WARS™: Bounty Hunter™
Bounty Hunter was ported with the help of The Forge Framework to all the platforms mentioned in the screenshot:

Replaced Gainput with our own Input library
We wrote a new input library from scratch in C. Its design follows the architecture of the rendering API: one high-level interface file, IInput.h, and then platform-specific files for each of the target devices. It has fewer lines of code than Gainput and is easier for a small team to maintain. We are still testing it as we speak. Let us know if you see any bugs.
Removed Vulkan from Windows Run-time
Over the last couple of years, in many of our projects, it became apparent that the best option for shipping a PC / Windows game is DirectX 12. The main reason is reduced QA effort and better reliability. The other reason is that on PC we are constantly forced to upgrade to newer versions, while mobile, which is the more important platform, stays far behind. So we no longer officially support the Vulkan run-time on Windows. We have an internal version that we test, and obviously we still support Vulkan on Android, Switch, and Steamdeck (native support).
Removed API Switch from the Run-time
Having concluded that Vulkan is no longer a good API for PC, we removed the switching functionality that let us switch between DirectX 12 and Vulkan on PC and between OpenGL ES 2.0 and Vulkan on Android.
On Android we used the switching functionality to switch between OpenGL ES 2.0 and Vulkan for a business-class application that we helped to build (the Facebook application framework). This was necessary to run on billions of mobile devices. Looking at the latest numbers, it no longer makes sense for us to support OpenGL ES 2.0, so we dropped that support and no longer need the switching either.
Removed Commercial Middleware from GitHub
We decided to remove our commercial middleware from GitHub. Development will now happen in our internal repositories only.
Commercial Console licenses
For the last seven years we offered TF for free to anyone asking. We are changing this now for the console platforms XBOX, Playstation, and Switch: from here on, you will need a commercial license to use those.
Removed unit tests
- 09a_HybridRaytracing
 
Third-Party Libraries
We are substantially improving our third-party library integration by integrating the ones we use more tightly into The Forge ecosystem. While doing so, we removed the ones we no longer use. Here is the list:
- soloud
- rmem
- cjson
- MTuner
- TinyXML
 
Release 1.58 - June 17th, 2024 - Behemoth | Compute-Driven Mega Particle System | Triangle Visibility Buffer 2.0
Announce trailer for Behemoth
We helped Skydance Interactive to optimize Behemoth last year. Click on the image below to see the announce trailer:
Compute-Based Mega Particle System
This unit test is based on some of our research into software rasterization and GPU-driven rendering: a particle system running completely in very few compute shaders, with one large buffer holding most of the data. As with all things GPU-driven, the trick is to execute one compute shader once on one buffer to reduce read/write memory bandwidth. Although this is not new wisdom, you would be surprised how many particle systems still get this wrong ... having a compute shader for each stage of the particle lifetime or, even worse, doing most of the particle work on the CPU.
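A hedged sketch of the host side of that idea, using The Forge-style command calls (resource creation and descriptor binding are omitted, and the buffer/pipeline setup shown is illustrative rather than the actual unit test code): the whole simulation lives in one compute pipeline working on one persistent particle buffer, so the per-frame cost is a single dispatch instead of one dispatch per lifetime stage.

// Illustrative only - assumes The Forge's IGraphics.h interface is included.
// One big GPU buffer holds all particle state, and one compute shader does
// emit + simulate + kill in a single pass over it.
#define MAX_PARTICLES     (4 * 1024 * 1024)
#define THREADS_PER_GROUP 256

typedef struct
{
    Buffer*   pParticleBuffer;   // persistent, device-local, MAX_PARTICLES entries
    Pipeline* pSimulatePipeline; // the single compute pipeline for the whole system
} ParticleSystemSketch;

static void particleSystemUpdate(Cmd* pCmd, ParticleSystemSketch* pSystem)
{
    // One dispatch over the one buffer: the shader decides per particle whether
    // to emit, advance or kill it, so the data is read and written exactly once.
    cmdBindPipeline(pCmd, pSystem->pSimulatePipeline);
    cmdDispatch(pCmd, MAX_PARTICLES / THREADS_PER_GROUP, 1, 1);
}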
This particle system was demoed last year in a few talks in September on a Samsung S22. Here are the slides:
http://www.conffx.com/WolfgangEngelParticleSystem.pptx
It is meant to be used to implement next-gen mega particle systems in which we always simulate hundreds of thousands or millions of particles at once, instead of the few dozen that contemporary systems simulate.
Android Samsung S22 1170x540 resolution
This screenshot shows 4 million firefly-like particles, with 10000 lights attached to them and a shadow for the directional light. Those numbers were previously thought to be impossible on mobile phones.

Android Samsung S23 1170x540 resolution
Same setting as above, but this time additionally with 8 shadows from point lights.

Android Samsung S24 1170x540 resolution
Same setting as above, but this time additionally with 8 shadows from point lights.

PS5 running at 4K
Windows with AMD RX 6400 at 1080p
Triangle Visibility Buffer 2.0
We now have the new compute-based TVB 2.0 approach running on all platforms (on Android only on the S22). You can download the slides from the I3D talk from
Release 1.57 - May 8th, 2024 Visibility Buffer 2.0 Prototype | Visibility Buffer 1.0 One Draw call
Visibility Buffer Research - I3D talk
We are giving a talk about our latest Visibility Buffer research at I3D. Here is a short primer on what it is about:
The original idea of the Triangle Visibility Buffer is based on an article by [burns2013]. [schied2015] and [schied16] extended what was described in the original article. Christoph Schied implemented a modern version with an early version of OpenGL (supporting MultiDrawIndirect) into The Forge rendering framework in September 2015.
We ported this code to all platforms and simplified and extended it in the following years by adding a triangle filtering stage following [chajdas] and [wihlidal17] and a new way of shading.
Our on-going improvements simplified the approach incrementally and the architecture started to resemble what was described in the original article by [burns2013] again, leveraging the modern tools of the newer graphics APIs.
In contrast to [burns2013], the actual storage of triangles in our implementation of a Visibility Buffer happens after the triangle removal and draw compaction step, with an optimally "massaged" data set.
With overdraw removed in the Visibility Buffer and Depth Buffer, we run a shading approach that shades everything with one regular draw call. We call the shading stage Forward++ due to its resemblance to forward shading and its use of a tiled light list for applying many lights. It is a step up from Forward+, which requires numerous draw calls.
We described all this in several talks at game industry conferences, for example on GDCE 2016 [engel16] and during XFest 2018, showing considerable performance gains due to reduced memory bandwidth compared to traditional G-buffer based rendering architectures.
A blog post that was updated over the years for what we call now Triangle Visibility Buffer 1.0 (TVB 1.0) can be found here [engel18].
Over the last few years we extended this original idea with an Order-Independent Transparency approach (it is more efficient to sort triangle IDs in a per-pixel linked list than to store layers of a G-Buffer) and software VRS, and then we developed a Visibility Buffer approach that doesn't require draw calls to fill the depth and Visibility Buffer at all, and, in parallel, one that requires far fewer draw calls.
This release offers what we call an updated Triangle Visibility Buffer 1.0 (TVB 1.0) and a prototype of the Triangle Visibility Buffer 2.0 (TVB 2.0).
The changes to TVB 1.0 are evolutionary. We used to map each mesh to an indirect draw element. This required the use of DrawID to map back to the per-mesh data. When working on a game engine with a very high number of draw calls, it imposed a limitation on the number of "draws" we could do, due to having only a limited number of bits available in the VB.
Additionally, instancing was implemented using a separate instanced draw for each instanced mesh. We refactored the data flow between the draws and the shade pass.
There is now no reliance on DrawID and instances are handled transparently using the same unified draw. This both simplifies the flow of data and allows us to draw more "instanced" meshes.
Apart from now being able to use a very high number of draw calls, performance didn't change.
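To make the bit-budget point concrete, here is a hedged sketch of the kind of packing a visibility buffer pixel uses (the bit split below is an example, not The Forge's actual encoding): every bit spent on a draw or instance ID is a bit taken away from the triangle ID, which is what capped the number of indirect draws in the old DrawID-based path.

#include <stdint.h>

// Illustrative 32-bit visibility buffer layout: high bits identify the draw
// (or, in the unified path, the mesh/instance record), low bits the triangle.
#define VB_DRAW_BITS     8u
#define VB_TRIANGLE_BITS (32u - VB_DRAW_BITS)

static inline uint32_t vbPack(uint32_t drawId, uint32_t triangleId)
{
    return (drawId << VB_TRIANGLE_BITS) | (triangleId & ((1u << VB_TRIANGLE_BITS) - 1u));
}

static inline void vbUnpack(uint32_t packed, uint32_t* pDrawId, uint32_t* pTriangleId)
{
    *pDrawId     = packed >> VB_TRIANGLE_BITS;
    *pTriangleId = packed & ((1u << VB_TRIANGLE_BITS) - 1u);
}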
The new TVB 2.0 approach is revolutionary in the sense that it doesn't use draw calls anymore to fill the depth and visibility buffer. There are two compute shader invocations that filter triangles and eventually fill the depth and visibility buffer.
Not using draw calls anymore makes the whole code base more consistent and less convoluted compared to TVB 1.0.
You can now find the new Visibility Buffer 2 approach in
The-Forge\Examples_3\Visibility_Buffer2
This is still in an early stage of development. We only support a limited number of platforms: Windows D3D12, PS4/5, XBOX, and macOS / iOS.
Sanitized initRenderer
We cleaned up the whole initRenderer code, merged GPUConfig into GraphicsConfig, and unified naming.
Metal run-time improvements
We improved the Metal Validation Support.
Art
Everything related to Art assets is now in the Art folder.
Bug fixes
Lots of fixes everywhere.
References:
[burns2013] Christopher A. Burns, Warren A. Hunt, "The Visibility Buffer: A Cache-Friendly Approach to Deferred Shading", 2013, Journal of Computer Graphics Techniques (JCGT) 2:2, Pages 55 - 69.
[schied2015] Christoph Schied, Carsten Dachsbacher, "Deferred Attribute Interpolation for Memory-Efficient Deferred Shading", 2015, KIT publication website: http://cg.ivd.kit.edu/publications/2015/dais/DAIS.pdf
[schied16] Christoph Schied, Carsten Dachsbacher, "Deferred Attribute Interpolation Shading", 2016, GPU Pro 7
[chajdas] Matthaeus Chajdas, GeometryFX, 2016, AMD Developer Website http://gpuopen.com/gaming-product/geometryfx/
[wihlidal17] Graham Wihlidal, "Optimizing the Graphics Pipeline with Compute", 2017, GPU Zen 1, Pages 277--320
[engel16] Wolfgang Engel, "4K Rendering Breakthrough: The Filtered and Culled Visibility Buffer", 2016, GDC Vault: https://www.gdcvault.com/play/1023792/4K-Rendering-Breakthrough-The-Filtered
[engel18] Wolfgang Engel, "Triangle Visibility Buffer", 2018, Wolfgang Engel's Diary of a Graphics Programmer Blog http://diaryofagraphicsprogrammer.blogspot.com/2018/03/triangle-visibility-buffer.html
Release 1.56 - April 4th, 2024 I3D | Warzone Mobile | Visibility Buffer | Aura on macOS | Ephemeris on Switch | GPU breadcrumbs | Swappy in Android | Screen-space Shadows | Metal Debug Markers improved
I3D
We are sponsoring I3D again. Come by and say hi! We will also be giving a talk on the new developments around the Triangle Visibility Buffer.
Warzone Mobile launched
We have worked on Warzone Mobile since August 2020. The game launched on March 21, 2024.
Visibility Buffer
We removed CPU cluster culling and simplified the animation data usage. Now triangle filtering only takes one dispatch per frame again.
Swappy frame pacer is now available in Android/Vulkan
We integrated the Swappy frame pacer into the Android / Vulkan eco system.
GPUCfg system improved with more IDs and fewer string compares
We did another pass on the GPUCfg system; we can now generate the vendor IDs and model IDs with a Python script to keep the *_gpu.data list easily up to date for each platform.
We removed most of the name comparisons and replaced them with ID comparisons, which speeds up parsing and is more specific.
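A minimal sketch of why the switch matters (the struct and field names are made up for illustration, not the actual GPUCfg code): matching a device by integer vendor/model ID is two compares, while name matching means a string search for every rule.

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

// Illustrative GPU identification record, as a generated *_gpu.data table might store it.
typedef struct
{
    uint32_t    mVendorId; // e.g. 0x10DE, 0x1002, 0x8086
    uint32_t    mModelId;  // device/model id reported by the driver
    const char* mName;     // kept only for logging
} GpuEntry;

// Old style: fragile and slow, one substring search per rule per device name.
static bool matchByName(const GpuEntry* pEntry, const char* ruleName)
{
    return strstr(pEntry->mName, ruleName) != NULL;
}

// New style: two integer compares, and unambiguous across driver name strings.
static bool matchById(const GpuEntry* pEntry, uint32_t vendorId, uint32_t modelId)
{
    return pEntry->mVendorId == vendorId && pEntry->mModelId == modelId;
}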
Screen-Space Shadows in UT9
We added screen-space shadows to the set of shadow approaches in that unit test. They are complementary to regular shadow mapping and add more detail. We also fixed a number of inconsistencies in the other shadow map approaches.
PS5 - Screen-Space Shadows off

GPU breadcrumbs on all platforms
Now you can get GPU crash reports on all platforms. We skipped OpenGL ES and DX11, though ...
A simple example of a crash report is this:
2024-04-04 23:44:08 [MainThread     ] 09a_HybridRaytracing.cp:1685   ERR| [Breadcrumb] Simulating a GPU crash situation (RAYTRACE SHADOWS)...
2024-04-04 23:44:10 [MainThread     ] 09a_HybridRaytracing.cp:2428  INFO| Last rendering step (approx): Raytrace Shadows, crashed frame: 2
We will extend the reporting a bit more over time.
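For context, a hedged sketch of the general breadcrumb idea behind such a report (not The Forge's actual implementation): markers are written before each major rendering step, and after a device loss the last marker that actually landed tells you approximately where the GPU crashed.

#include <stdint.h>
#include <stdio.h>

// Illustrative breadcrumb scheme. In a real renderer the mark would be a GPU
// write into a persistently mapped, CPU-readable buffer (e.g. a buffer-marker
// command); here it is a plain store to keep the sketch self-contained.
enum
{
    STEP_NONE = 0,
    STEP_SHADOW_PASS,
    STEP_RAYTRACE_SHADOWS,
    STEP_COMPOSITE,
};

static const char* const kStepNames[] = { "None", "Shadow Pass", "Raytrace Shadows", "Composite" };

static volatile uint32_t gLastStep = STEP_NONE;

static void breadcrumbMark(uint32_t step) { gLastStep = step; }

static void breadcrumbReport(uint32_t crashedFrame)
{
    // After a device removal, the last value written is the best guess for the
    // step that crashed - the same information the log lines above show.
    printf("Last rendering step (approx): %s, crashed frame: %u\n",
           kStepNames[gLastStep], crashedFrame);
}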
Ephemeris now also runs on Switch ...
Release 1.55 - March 1st, 2024 - Ephemeris | gpu.data | Many bug fixes and smaller improvements
Ephemeris 2.0 Update
We improved Ephemeris again and now support it on more platforms, updating some of the algorithms used and adding more features.
We now support PC, XBOXes, PS4/5, Android, Steamdeck, and iOS (requires iPhone 11 or higher); so far not Switch.
IGraphics.h
We changed the graphics interface for cmdBindRenderTargets
// old
DECLARE_RENDERER_FUNCTION(void, cmdBindRenderTargets, Cmd* pCmd, uint32_t renderTargetCount, RenderTarget** ppRenderTargets, RenderTarget* pDepthStencil, const LoadActionsDesc* loadActions, uint32_t* pColorArraySlices, uint32_t* pColorMipSlices, uint32_t depthArraySlice, uint32_t depthMipSlice)
// new
DECLARE_RENDERER_FUNCTION(void, cmdBindRenderTargets, Cmd* pCmd, const BindRenderTargetsDesc* pDesc)
Instead of a long list of parameters we now provide a struct that gives us enough flexibility to pack more functionality in there.
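A hedged usage sketch of the new call (the BindRenderTargetsDesc field names shown here are assumptions for illustration; check IGraphics.h for the actual struct layout):

// Illustrative only - binds one color target and a depth target, both cleared.
static void bindSceneTargets(Cmd* pCmd, RenderTarget* pColor, RenderTarget* pDepth)
{
    BindRenderTargetsDesc bindDesc = { 0 };
    bindDesc.mRenderTargetCount = 1;
    bindDesc.mRenderTargets[0].pRenderTarget = pColor;   // color attachment 0
    bindDesc.mRenderTargets[0].mLoadAction = LOAD_ACTION_CLEAR;
    bindDesc.mDepthStencil.pDepthStencil = pDepth;       // optional depth/stencil
    bindDesc.mDepthStencil.mLoadAction = LOAD_ACTION_CLEAR;

    // One struct instead of the old nine parameters.
    cmdBindRenderTargets(pCmd, &bindDesc);
}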
Variable Rate Shading
We added Variable Rate Shading to the Visibility Buffer OIT example test 15a. This way we have a better looking test scene with St. Miguel.
VRS allows rendering parts of the render target at different resolution based on the auto-generated VRS map, thus achieving higher performance with minimal quality loss. It is inspired by Michael Drobot's SIGGRAPH 2020 talk: https://docs.google.com/presentation/d/1WlntBELCK47vKyOTYI_h_fZahf6LabxS/edit?usp=drive_link&ouid=108042338473354174059&rtpof=true&sd=true
The key idea behind the software-based approach is to render everything in 4xMS targets and use a stencil buffer as a VRS map. The VRS map is automatically generated based on the local image gradients.
It can be used on a much wider range of platforms and devices than the hardware-based approach, since hardware VRS support is broken or missing on many platforms. Because this software approach utilizes 2x2 tiles, we can also achieve higher image quality compared to hardware-based VRS.
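A hedged sketch of how such a VRS map can be derived from image gradients (illustrative C, not the shipped shader, and the rate encoding is an assumption): per 2x2 tile, compare the local luminance deltas against a threshold and emit the stencil value that encodes the shading rate.

#include <math.h>
#include <stdint.h>

// Stencil values encoding the per-2x2-quad shading rate, mirroring the debug
// colors listed below (the actual encoding in the unit test may differ).
enum
{
    RATE_1_SAMPLE  = 0, // "White" - shade only the top-left sample
    RATE_2_HORIZ   = 1, // "Blue"  - 2 horizontal samples
    RATE_2_VERT    = 2, // "Red"   - 2 vertical samples
    RATE_4_SAMPLES = 3, // "Green" - shade all 4 samples
};

// lum holds the luminance of the source image; the rate for the 2x2 tile at
// (x, y) is chosen from the horizontal and vertical gradients inside it.
static uint8_t vrsRateForQuad(const float* lum, int width, int x, int y, float threshold)
{
    float tl = lum[y * width + x];
    float tr = lum[y * width + (x + 1)];
    float bl = lum[(y + 1) * width + x];
    float br = lum[(y + 1) * width + (x + 1)];

    float dx = fmaxf(fabsf(tr - tl), fabsf(br - bl)); // horizontal detail
    float dy = fmaxf(fabsf(bl - tl), fabsf(br - tr)); // vertical detail

    if (dx < threshold && dy < threshold) return RATE_1_SAMPLE;    // flat region
    if (dx >= threshold && dy >= threshold) return RATE_4_SAMPLES; // detail in both axes
    return (dx >= threshold) ? RATE_2_HORIZ : RATE_2_VERT;         // detail along one axis
}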
Shading rate view based on the color per 2x2 pixel quad:
- White – 1 sample (top left, always shaded);
- Blue – 2 horizontal samples;
- Red – 2 vertical samples;
- Green – all 4 samples;
 
Debug Output with the original Image on PC

Debug Output with the original Image on PC

Debug Output with the original Image on Android

Debug Output with the original Image on Android

UI description:
- Toggle VRS – enable/disable VRS
- Draw Cubes – enable/disable dynamic objects in the scene
- Toggle Debug View – shows the auto-generated VRS map if VRS is enabled
- Blur Kernel Size – change the kernel size of the blur applied to the background image to highlight the performance benefits of the solution by making the fragment shader heavy enough.
Limitations:
Relies on programmable sample locations support – not widely supported on Android devices. 
Supported platforms:
PS4, PS5, all XBOXes, Nintendo Switch, Android (Galaxy S23 and higher), Windows(Vulkan/DX12), macOS/iOS.
gpu.data
You will want to check out these files. There is now a dedicated one per supported platform, so it is easier for us to differentiate between different Playstations, XBOXes, Switches, Android, iOS, etc.
Unlinked Multi GPU
The Unlinked Multi GPU example was broken on AMD 7x GPUs with Vulkan. This looks like a bug. We had to disable DCC to make that work.
Vulkan
We now track GPU memory and will extend this to other platforms.
Vulkan mobile support
We now support the VK_EXT_ASTC_DECODE_MODE_EXTENSION_NAME extension.
Remote UI
Various bug fixes to make this more stable. Still alpha ... will crash.
Retired:
- 35 Variable Rate Shading ... this went into the Visibility Buffer OIT example 15a.
- Basis Library - after not having found any practical use case, we removed Basis again.
 
Release 1.54 - February 2nd, 2024 - Remote UI Control | Shader Server | Visibility Buffer | Asset Pipeline | GPU Config System | macOS/iOS | Lots more ...
Our last release was in October 2022. We were so busy that we lost track of time. In March 2023 we planned to make the next release; we started testing, fixing, and improving code up until today. The improvements coming back from the (most of the time) 8 - 10 projects we are working on were so many that it was hard to integrate, test, and then maintain all of it. To a certain degree our business has higher priority than making GitHub releases, but we realize that letting a lot of time pass makes it substantially harder for us to get the whole code base back into shape, even with a company of nearly 40 graphics programmers. So we cut down the functional and unit tests, so that we have fewer variables. We also restructured large parts of our code base so that it is easier to maintain. One of the constant maintenance challenges was the macOS / iOS run-time (more about that below).
We invested a lot in our testing environment. We have more consoles now for testing and we also have a much needed screenshot testing system. We outsource testing to external service providers more. We removed Linux as a stand-alone target but the native Steamdeck support should make up for this.
We tried to be conservative about increasing API versions because we know that on many platforms our target group will use older OS or API implementations. Nevertheless we were more adventurous this year than before, so we bumped up with a larger step than in previous years.
Our next release is planned in about four weeks' time. We still have work to do to bring up a few source code parts, but now the increments are much smaller.
In the meantime some of the games we worked on, or are still working on, shipped:
Forza Motorsport has launched in the meantime:
Starfield has launched:
No Man's Sky has launched on macOS:
Internal automated testing setup on our internal GitLab server
- Our automated testing setup, which tests all platforms, now takes 38 minutes for one run. At some point it was more. We have revamped it substantially since the last release, adding screenshot comparisons and a few extra steps for static code analysis.
 
Visibility Buffer
- The Visibility Buffer went through a lot of upgrades since October 2022. I think the most notable ones are:
- Refactored the whole code so that it is easier to reuse in all our examples; there is now a dedicated Visibility Buffer directory holding this code
- Animation of characters is now integrated
- Tangent and bi-tangent calculation moved to the pixel shader, and we removed the corresponding buffers
 
 
Software Variable Rate Shading
This unit test demonstrates a software-based variable rate shading (VRS) technique that allows rendering parts of the render target at different resolutions based on an auto-generated VRS map, thus achieving higher performance with minimal quality loss. It is inspired by Michael Drobot's SIGGRAPH 2020 talk: https://docs.google.com/presentation/d/1WlntBELCK47vKyOTYI_h_fZahf6LabxS/edit?usp=drive_link&ouid=108042338473354174059&rtpof=true&sd=true
The key idea behind the software-based approach is to render everything in 4xMS targets and use a stencil buffer as a VRS map. The VRS map is automatically generated based on the local image gradients.
The advantage of this approach is that it runs on a wider range of platforms and devices than the hardware-based approach since the hardware VRS support is broken or not supported on many platforms. Because this software approach utilizes 2x2 tiles we can also achieve higher image quality compared to hardware-based VRS.
Shading rate view based on the color per 2x2 pixel quad:
- White – 1 sample (top left, always shaded);
- Blue – 2 horizontal samples;
- Red – 2 vertical samples;
- Green – all 4 samples;
 
UI description:
- Toggle VRS – enable/disable VRS
- Draw Cubes – enable/disable dynamic objects in the scene
- Toggle Debug View – shows the auto-generated VRS map if VRS is enabled
- Blur Kernel Size – change the kernel size of the blur applied to the background image to highlight the performance benefits of the solution by making the fragment shader heavy enough.
Limitations:
Relies on programmable sample locations support – not widely supported on Android devices. 
Supported platforms:
PS4, PS5, all XBOXes, Nintendo Switch, Android (Galaxy S23 and higher), Windows(Vulkan/DX12).
Implemented on macOS/iOS, but it doesn't give the expected performance benefits due to an issue with stencil testing on that platform.
Shader Server
To enable recompilation of shaders at run-time, we implemented a cross-platform shader server that lets you recompile shaders by pressing CTRL-S or a button in a dedicated menu.
You can find the documentation in the Wiki in the FSL section.
Remote UI Control
When working remotely, on mobile, or on console, it can be cumbersome to control the development UI.
We added a remote control application in Common_3\Tools\UIRemoteControl which allows control of all UI elements on all platforms.
It works as follows:
- Build and launch the Remote Control App located in Common_3/Tools/UIRemoteControl
- When a unit test is started on the target platform (e.g. consoles), it starts listening for connections on a port (8889 by default)
- In the Remote Control App, enter the target IP address and click Connect
 
This is alpha software so expect it to crash ...
VK_EXT_device_fault support
This extension allows developers to query for additional information on GPU faults which may have caused device loss, and to generate binary crash dumps.
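A hedged sketch of how this extension is typically queried after a submit returns VK_ERROR_DEVICE_LOST (standard Vulkan usage, not The Forge's wrapper; error handling and the vendor binary dump are omitted, and the deviceFault feature must have been enabled at device creation):

#include <stdio.h>
#include <stdlib.h>
#include <vulkan/vulkan.h>

static void reportDeviceFault(VkDevice device)
{
    PFN_vkGetDeviceFaultInfoEXT pfnGetFault =
        (PFN_vkGetDeviceFaultInfoEXT)vkGetDeviceProcAddr(device, "vkGetDeviceFaultInfoEXT");
    if (!pfnGetFault)
        return;

    // First call: query how many address/vendor records are available.
    VkDeviceFaultCountsEXT counts = { VK_STRUCTURE_TYPE_DEVICE_FAULT_COUNTS_EXT };
    pfnGetFault(device, &counts, NULL);

    VkDeviceFaultInfoEXT info = { VK_STRUCTURE_TYPE_DEVICE_FAULT_INFO_EXT };
    info.pAddressInfos = calloc(counts.addressInfoCount, sizeof(VkDeviceFaultAddressInfoEXT));
    info.pVendorInfos  = calloc(counts.vendorInfoCount, sizeof(VkDeviceFaultVendorInfoEXT));

    // Second call: fetch the human-readable description and the fault records.
    pfnGetFault(device, &counts, &info);
    printf("GPU fault: %s\n", info.description);
    for (uint32_t i = 0; i < counts.addressInfoCount; ++i)
        printf("  fault address type %d at 0x%llx\n",
               (int)info.pAddressInfos[i].addressType,
               (unsigned long long)info.pAddressInfos[i].reportedAddress);

    free(info.pAddressInfos);
    free(info.pVendorInfos);
}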
Ray Queries in Ray Tracing
We switched to Ray Queries for the common Ray Tracing APIs on all the platforms we support. The current Ray Tracing APIs substantially increase the amount of memory necessary, decrease performance, and can't add much visually, because the whole game has to run at lower resolution, lower texture resolution, and lower graphics quality (to make up for this, upscalers were introduced that add new issues to the final image).
Because Ray Tracing became a marketing term valuable to GPU manufacturers, some game developers now support Ray Tracing to help increase hardware sales. So we are going with the flow here by offering those APIs.
iPhone 11 (Model A2111) at resolution 896x414

We do not have a denoiser for the Path Tracer.
GPU Configuration System
This is a cross-platform system that tracks GPU capabilities on all platforms and switches features of a game on and off per platform. To read a lot more about this, follow the link below.
New macOS / iOS run-time
We think the Metal API is a well-balanced graphics API that walks the line between low-level and high-level very well. But we ran into one general problem with the Metal API on both platforms: it is hard to maintain the code base. There is an architectural problem that was probably introduced due to a lack of experience in shipping games.
In essence what Apple decided to do is have calls like this:
https://developer.apple.com/documentation/swift/marking-api-availability-in-objective-c
Anything a hardware vendor describes as available and working might not be working with the next upgrade of the operating system, the hardware, or just the API or Xcode.
If you have a few hundred of those macros in your code, it becomes a lottery what works and what doesn't on a variety of hardware. On some hardware one thing is broken, on other hardware something else.
So there are two ways to deal with this: for every @available macro you start adding a #define to switch off or replace that code based on the underlying hardware and software platform. You then have to manually track whether what the macro says is true on a wide range of platforms, each with a different outcome.
For example, on macOS 10.13 running on a certain MacBook Pro (I am making this up) with an Intel GPU it is broken, but a very similar MacBook Pro that additionally has a dedicated GPU actually runs it. Now you have to track which "class of MacBook Pro" we are talking about and whether the MacBook Pro in question has an Intel or an AMD GPU.
We track all this data already so that is not a problem. We know exactly what piece of hardware we are looking at (see above GPU Config system).
The problem is that we have to guard every @available macro with som...