Update DebugBreak proposal langauge#871
Conversation
This is mostly clarifications that the behavior isn't actually dependent on a debugger being attached, but rather on runtime state (which will be set by a debugger, but may also be set separately). I've also renamed `dx::IsDebuggerAttached` to `dx::IsDebuggingEnabled` to more clearly reflect the change.
damyanp
left a comment
There was a problem hiding this comment.
@JoeCitizen can you file an issue to make the corresponding changes to the implementation in DXC please?
t-tye
left a comment
There was a problem hiding this comment.
I think the name change from debugger to debugging makes a lot of sense. The facilities are about controlling information intended for debugging. The control of when that information is produced could be due to a debugger being attached, or it could be controlled by the driver or runtime inspecting options the user has supplied when executing the application.
| Additionally, shader authors need a way to conditionally enable expensive debug | ||
| checks only when a debugger is attached, avoiding runtime overhead in production | ||
| scenarios. | ||
| checks only when debugging is enabled in the runtime, avoiding runtime overhead |
There was a problem hiding this comment.
Suggest s/in the runtime/at runtime/ . Do not require that the checking has to be done in a runtime, as it may be the device driver that does the checking. Both these cases are happening at runtime.
| Triggers a breakpoint if a graphics debugger is attached. If no debugger is | ||
| attached or the runtime does not support this operation, it is treated as a | ||
| Triggers a breakpoint if debugging is enabled in the runtime. If debugging is | ||
| not enabled or the runtime does not support this operation, it is treated as a |
There was a problem hiding this comment.
Suggest s/or the runtime does not support this operation/or the device does not support this operation/ . Do not require that the checking has to be done in a runtime, as it may be the device driver that does the checking. Both these cases are providing support for the specific device executing the code.
| void main(uint GI : SV_GroupIndex) { | ||
| // Conditional expensive debug checks | ||
| if (dx::IsDebuggerPresent()) { | ||
| if (dx::IsDebuggingEnabled()) { |
There was a problem hiding this comment.
The if statement has 2 spaces of indentation but should have 4 spaces to match the rest of the function body.
| motion. Should not be marked `readonly` or `readnone`. If no debugger is | ||
| attached, this is a no-op. | ||
| motion. Should not be marked `readonly` or `readnone`. If debugging is not | ||
| enabled, this is a no-op. |
There was a problem hiding this comment.
I suggest IsDebuggingEnabled/DebugBreak should not be marked convergent. There is no communication between the dynamic execution instances of a convergent version of IsDebuggingEnabled/DebugBreak — as long as each lane executes a IsDebuggingEnabled/DebugBreak, it does not matter if they are reported by the same instruction. As long as each lane that is active correctly executes the code to report a DebugBreak that is all that is required. Marking it convergent would limit code motion optimization unnecessarily. For example, the compiler may decide to clone basic blocks when optimizing complex IF conditions, resulting in the DebugBreak code being replicated. Each lane may execute a different path through the optimized code, but will still report the DebugBreak, and the active lane mask indicates which lane(s) are actually executing it. So instead of all lanes executing a single instruction to report the DebugBreak, some lanes may execute one instruction, and other lanes may report at a different cloned version of the instruction. Collectively, all the lanes are reporting. My view is that diagnostic code should not inhibit optimization as far as possible, provided it does execute under the correct conditions.
However, what I think is needed instead is to treat IsDebuggingEnabled/DebugBreak as "volatile read+write" in the sense that languages like C define it. It indicates communication with something outside the abstract machine model of the language, which is exactly what the debugger is here. A debugger can attach at any time, so each execution of the IsDebuggingEnabled/DebugBreak needs to happen using the current machine state (which could have been modified by the debugger).
This prevents code motion: each instance of IsDebuggingEnabled/DebugBreak in the source program needs to execute, so it cannot be moved out of loops (which would change the number of times it executes). Similarly, CSE is prevented. Generally, volatile and read + write semantics also inhibit other code motion so that IsDebuggingEnabled/DebugBreak will tend to remain where they are relative to other instructions that have observable effects, such as other load and stores. This is similar to how the time() builtin needs to be handled as the value it returns is also volatile and there is a desire to keep it in the same place that the source code implies.
Suggest replacing "Must be treated as convergent" with a description of volatile semantics. Possible change to this paragraph for IsDebuggingEnabled/DebugBreak:
Must be treated as a
volatilememory read and write to prevent code motion and to ensure the current runtime debugging state value is always used. The runtime debugging state can change through the execution of the application due to attaching a debugger and changing the setting. Should not be markedreadonlyorreadnonesince it is reading the current state of the debugging state.
I would delete the "If debugging is not enabled, this is a no-op." since the purpose of it is to determine is debugging is enabled, and it is a function that must always return a boolean value. Instead, perhaps:
If compilation requests that debugging code is eliminated, then must be treated as unconditionally returning
FALSE. The compiler is then permitted to remove all code guarded byIsDebuggingEnabledas dead code.
| Returns `true` (1) if a debugger is attached, `false` (0) otherwise. Marked | ||
| `readonly` as it only queries state. | ||
| Returns `true` (1) if debugging is enabled, `false` (0) otherwise. Marked | ||
| `readonly` as it only queries state. |
There was a problem hiding this comment.
As mentioned above, it is proposed that IsDebuggingEnabled should also have volatile read and write semantics rather than being marked readonly. This would ensure the current debugging status is queried each time the operation is executed. This is important to support the ability to attach a debugger to an already-executing shader — a common approach for debugging shaders that do not terminate due to synchronization errors. In these cases, TDR detection is often disabled and a debugger is attached to the running hung shader. Making the semantics include write to unknown memory address will help prevent the compiler optimization from reordering the calls with other source statements so it will stay where the user expected from a source language perspective.
The operation should also be marked as expected to return FALSE (e.g., the equivalent of __builtin_expect(expr, 0) in Clang). This will give the compiler the information to allow it to move any conditionally-executed debugging code out of the inline path as "cold" code to reduce instruction cache pressure and improve branch prediction. The compiler may choose to limit optimizations in the cold code to minimize the impact on the hot code. For example, limiting scheduling so that register pressure in the cold path does not exceed that in the hot path which could cause occupancy and other performance issues.
The desire is that debugging code can be left in production code and have negligible performance impact.
There was a problem hiding this comment.
See the discussion above about convergent and volatile.
| Behavioral changes may include: | ||
|
|
||
| - Breaking regardless of a debugger being attached | ||
| - Breaking regardless of a debugging enabled |
There was a problem hiding this comment.
Grammar: suggest "Breaking regardless of debugging being enabled" or "Breaking unconditionally".
| - Disabling debug break instructions entirely | ||
|
|
||
| It is expected that the driver compiler will alter behavior during lowering | ||
| It is expected that the driver compiler will alter behavior during lowering |
There was a problem hiding this comment.
I am not sure this section belongs in the HLSL compiler specification. Runtime finalization is not the only compilation model — shaders may be compiled offline, in which case it is not known at compilation time whether debugging will be enabled when the generated code is executed. The runtime could provided ways for the user to control when debugging is enabled independently of the compilation process. An advantage is that the developer does not have to wait for a re-finalization each time they change the setting: a single code object can execute in either mode.
If the intent is to allow the compiler to eliminate debugging code, that is reasonable but suggest it is specified differently: compile out all DebugBreak operations and treat IsDebuggingEnabled as the constant FALSE, then perform dead code elimination. This is distinct from the compiler needing to know about runtime behavior during lowering.
| supported by NVIDIA's NSight and can be safely ignored by Vulkan runtimes. | ||
|
|
||
| No SPIR-V lowering is defined for `dx::IsDebuggerPresent()`. | ||
| No SPIR-V lowering is defined for `dx::IsDebuggingEnabled()`. |
There was a problem hiding this comment.
Is there a plan to pursue adding an equivalent to IsDebuggingEnabled for SPIR-V? Could SPIR-V's Specialization Constants be used for this?
|
|
||
| * Consider introducing the `convergent` attribute to DXIL. | ||
| * This should be "cheap" and would potentially address pre-existing bugs. | ||
| * This would preserve the requirement that these operations not be moved |
There was a problem hiding this comment.
Adding convergent to DXIL may be a good thing for other primitives that cause communication between dynamic instances of an operation across the lanes of a wave, but IsDebuggingEnabled/DebugBreak are not such operations.
I thought we had already discussed and done this...
This is mostly clarifications that the behavior isn't actually dependent on a debugger being attached, but rather on runtime state (which will be set by a debugger, but may also be set separately).
I've also renamed
dx::IsDebuggerAttachedtodx::IsDebuggingEnabledto more clearly reflect the change.