
Vulkan decoder Validation layer fixes #201

Open
zlatinski wants to merge 9 commits into main from vulkan-decoder-vl-fixes

Conversation

@zlatinski
Contributor

No description provided.

…t chain

Fix missing VkVideoDecodeH264/H265/AV1DpbSlotInfoKHR in the pNext chain
of pDecodeInfo->pSetupReferenceSlot for all codecs that require it.

The Vulkan spec requires that when a video session is created with a
decode codec operation and pSetupReferenceSlot is not NULL, the pNext
chain must include the codec-specific DPB slot info structure:
  - H.264: VkVideoDecodeH264DpbSlotInfoKHR (VUID-07156)
  - H.265: VkVideoDecodeH265DpbSlotInfoKHR (VUID-07157)
  - AV1:   VkVideoDecodeAV1DpbSlotInfoKHR  (VUID-07170)
  - VP9:   No DPB slot info struct defined by the extension

Previously, setupReferenceSlot was initialized with pNext=NULL and
never wired to a codec-specific DPB slot info. The nvVideoH264PicParameters
struct already had an unused currentDpbSlotInfo member for this purpose.

Changes:
- H.264: Use existing h264.currentDpbSlotInfo, initialize with current
  picture's FrameNum, PicOrderCnt, and field flags
- H.265: Add currentDpbSlotInfo to nvVideoH265PicParameters, initialize
  with current picture's PicOrderCntVal
- AV1: Add currentDpbSlotInfo to nvVideoAV1PicParameters, initialize
  with current picture's frame_type and OrderHint
- VP9: No change needed (VK_KHR_video_decode_vp9 has no DPB slot info)

The setup reference slot's DPB info tells the driver/validation layer
what reference metadata to associate with the DPB slot being activated.
Without it, the validation layer could not track DPB slot activation,
causing cascading VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239 errors.
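A minimal sketch of the wiring described above, for the H.264 case. The struct types here are simplified stand-ins (not the real Vulkan definitions) so the example builds without vulkan.h, and WireSetupSlot is a hypothetical helper name. The real VkVideoDecodeH264DpbSlotInfoKHR carries its reference data through pStdReferenceInfo rather than inline fields.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in types, NOT the real Vulkan definitions.
struct DpbSlotInfoH264 {            // plays the role of VkVideoDecodeH264DpbSlotInfoKHR
    const void* pNext = nullptr;
    uint16_t    frameNum = 0;
    int32_t     picOrderCnt[2] = {0, 0};
};

struct ReferenceSlotInfo {          // plays the role of VkVideoReferenceSlotInfoKHR
    const void* pNext = nullptr;
    int32_t     slotIndex = -1;
};

struct H264PicParams {              // hypothetical, loosely modeled on nvVideoH264PicParameters
    DpbSlotInfoH264   currentDpbSlotInfo;
    ReferenceSlotInfo setupReferenceSlot;
};

// Fill the codec-specific slot info from the current picture and chain it
// into the setup reference slot, instead of leaving pNext as NULL.
inline void WireSetupSlot(H264PicParams& p, int32_t slotIndex,
                          uint16_t frameNum, int32_t topPoc, int32_t bottomPoc) {
    assert(slotIndex >= 0);
    p.currentDpbSlotInfo.frameNum       = frameNum;
    p.currentDpbSlotInfo.picOrderCnt[0] = topPoc;
    p.currentDpbSlotInfo.picOrderCnt[1] = bottomPoc;
    p.setupReferenceSlot.slotIndex      = slotIndex;
    p.setupReferenceSlot.pNext          = &p.currentDpbSlotInfo;
}
```

Because pNext points into the same parameters object, that object must stay alive and must not be copied between wiring and command recording.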

Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07156
Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07157
Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07170
Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

The Vulkan spec requires that vkUpdateVideoSessionParametersKHR's
pUpdateInfo->updateSequenceCount must equal the current update sequence
counter of videoSessionParameters plus one. The counter starts at 0
after vkCreateVideoSessionParametersKHR and increments after each
successful update.

Previously, the code used GetUpdateSequenceCount() from the picture
parameters set, which starts at 0, resulting in the first update
passing updateSequenceCount=0 instead of the required 1.

Fix by tracking the update counter (m_updateCount) in
VkParserVideoPictureParameters and using ++m_updateCount for each
vkUpdateVideoSessionParametersKHR call. On failure, the counter is
rolled back so the next attempt uses the same value.
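The counter-with-rollback idea can be sketched as follows. UpdateSequenceTracker and the updateFn callback are hypothetical names; updateFn stands in for the real vkUpdateVideoSessionParametersKHR call and receives the updateSequenceCount to pass.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

class UpdateSequenceTracker {
public:
    // Calls updateFn with the next sequence count (current + 1).
    // On failure the counter is rolled back so a retry reuses the same value.
    bool Update(const std::function<bool(uint32_t)>& updateFn) {
        const uint32_t next = ++m_updateCount;
        if (!updateFn(next)) {
            --m_updateCount;
            return false;
        }
        return true;
    }

    uint32_t Count() const { return m_updateCount; }

private:
    uint32_t m_updateCount = 0;  // 0 right after parameters object creation
};
```

With this shape the first successful update passes 1, a failed attempt leaves the counter untouched, and the retry passes the same value again.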

Fixes: VUID-vkUpdateVideoSessionParametersKHR-pUpdateInfo-07215
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Fix two barrier issues in DecodePictureWithParameters():

1. Queue family ownership transfer without matching release (VUID-03879):
   The bitstream buffer and DPB image barriers had asymmetric queue family
   indices: srcQueueFamilyIndex=VK_QUEUE_FAMILY_IGNORED but
   dstQueueFamilyIndex=videoDecodeQueueFamilyIdx. Per Vulkan spec, when
   src and dst queue families differ, it's treated as an ownership
   transfer operation requiring a matching release on the source queue.
   Since these are simple host-write to video-decode-read barriers (not
   actual queue family transfers), both must be VK_QUEUE_FAMILY_IGNORED.

2. HOST_WRITE access without HOST stage (VUID-03917):
   The bitstream buffer barrier had srcStageMask=VK_PIPELINE_STAGE_2_NONE
   with srcAccessMask=VK_ACCESS_2_HOST_WRITE_BIT. Per Vulkan spec,
   HOST_WRITE access requires VK_PIPELINE_STAGE_2_HOST_BIT stage mask.
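The two conditions can be expressed as small checks on a barrier. This is a sketch only: BufferBarrier is a stand-in for VkBufferMemoryBarrier2, the k* constants are illustrative placeholders for the corresponding VK_* values, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder constants for illustration; real code uses the VK_* values
// from the Vulkan headers.
constexpr uint32_t kQueueFamilyIgnored    = ~0u;         // VK_QUEUE_FAMILY_IGNORED
constexpr uint64_t kStageNone             = 0;           // VK_PIPELINE_STAGE_2_NONE
constexpr uint64_t kStageHost             = 1ull << 14;  // VK_PIPELINE_STAGE_2_HOST_BIT
constexpr uint64_t kStageVideoDecode      = 1ull << 26;  // ..._VIDEO_DECODE_BIT_KHR
constexpr uint64_t kAccessHostWrite       = 1ull << 14;  // VK_ACCESS_2_HOST_WRITE_BIT
constexpr uint64_t kAccessVideoDecodeRead = 1ull << 35;  // ..._VIDEO_DECODE_READ_BIT_KHR

struct BufferBarrier {  // stand-in for VkBufferMemoryBarrier2
    uint64_t srcStageMask, srcAccessMask, dstStageMask, dstAccessMask;
    uint32_t srcQueueFamilyIndex, dstQueueFamilyIndex;
};

// Differing queue family indices mean an ownership transfer (issue 1).
inline bool ImpliesOwnershipTransfer(const BufferBarrier& b) {
    return b.srcQueueFamilyIndex != b.dstQueueFamilyIndex;
}

// HOST_WRITE access is only valid together with the HOST stage (issue 2).
inline bool HostAccessHasHostStage(const BufferBarrier& b) {
    const bool usesHostAccess = (b.srcAccessMask & kAccessHostWrite) != 0;
    return !usesHostAccess || (b.srcStageMask & kStageHost) != 0;
}

// Host-write to video-decode-read barrier with both fixes applied.
inline BufferBarrier MakeBitstreamBarrier() {
    BufferBarrier b{};
    b.srcStageMask        = kStageHost;
    b.srcAccessMask       = kAccessHostWrite;
    b.dstStageMask        = kStageVideoDecode;
    b.dstAccessMask       = kAccessVideoDecodeRead;
    b.srcQueueFamilyIndex = kQueueFamilyIgnored;
    b.dstQueueFamilyIndex = kQueueFamilyIgnored;
    assert(HostAccessHasHostStage(b));
    return b;
}
```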

Fixes: VUID-vkQueueSubmit2-commandBuffer-03879
Fixes: VUID-VkBufferMemoryBarrier2-srcAccessMask-03917
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Add g_ignoredValidationMessageIds[] array to VulkanDeviceContext.cpp,
matching the pattern from nvpro_core2/nvvk/context.cpp. Filter known
validation layer false positives by messageIdNumber in the debug report
callback before printing to stderr.

Suppressed VUIDs (all VVL false positives, not application bugs):

1. VUID-VkDeviceCreateInfo-pNext-pNext (0x901f59ec):
   Private/provisional extension struct type 1000552004 not recognized
   by VVL 1.4.313. Harmlessly skipped by the driver's pNext traversal.
   Resolves when VVL headers are updated.

2. VUID-VkImageViewCreateInfo-image-01762 (0x6516b437):
   VVL does not track VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT for
   video-profile-bound images (VkVideoProfileListInfoKHR in pNext).
   DPB images ARE created with MUTABLE_FORMAT_BIT, and per-plane views
   use PLANE_0_BIT/PLANE_1_BIT aspects (not COLOR_BIT), so neither
   clause of the VUID condition applies.

3. VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239 (0xc36d9e29):
   Cascading from VUID-01762. DPB slots are correctly activated via
   pSetupReferenceSlot with codec-specific DPB slot info pNext.
   VVL's internal state tracking is confused by the image false
   positives on the same video session.

Note: VVL 1.4.313 uses VK_EXT_debug_utils internally for message
output. The decoder's VK_EXT_debug_report callback filters our own
stderr output but cannot suppress VVL's direct output. Full
suppression requires either upgrading to VK_EXT_debug_utils or
waiting for the VVL false positives to be fixed upstream.

Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

…ppression

Fix VkVideoDecodeAV1ProfileInfoKHR default initialization: zero-initialize
the struct before setting fields so that filmGrainSupport is not left
uninitialized (a garbage value of 32767 was observed), which VVL reports as
UNASSIGNED-GeneralParameterError-UnrecognizedBool32.
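A short sketch of the zero-initialization pattern. Av1ProfileInfo is a stand-in with the same member order as VkVideoDecodeAV1ProfileInfoKHR, and MakeAv1ProfileInfo is a hypothetical helper; the point is only that value-initialization with {} zeroes every member before individual fields are set.

```cpp
#include <cassert>
#include <cstdint>

struct Av1ProfileInfo {          // stand-in for VkVideoDecodeAV1ProfileInfoKHR
    uint32_t    sType;
    const void* pNext;
    uint32_t    stdProfile;
    uint32_t    filmGrainSupport;  // Bool32-style field: only 0 or 1 is valid
};

inline Av1ProfileInfo MakeAv1ProfileInfo(uint32_t sType, uint32_t stdProfile) {
    Av1ProfileInfo info{};       // value-initialization zeroes all members
    info.sType      = sType;
    info.stdProfile = stdProfile;
    // filmGrainSupport stays 0 (false) unless the caller sets it explicitly.
    assert(info.filmGrainSupport == 0u);
    return info;
}
```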

Also add VP9 capabilities pNext suppression (0xc1bea994) for the
provisional VkVideoDecodeVP9CapabilitiesKHR struct type 1000514001
not recognized by VVL 1.4.313.

Note: The debug report callback suppression (g_ignoredValidationMessageIds)
does not actually filter VVL output because VK_EXT_debug_report's msg_code
parameter does not correspond to VK_EXT_debug_utils' messageIdNumber.
Migration to VK_EXT_debug_utils is needed for the suppression to work.

Fixes: UNASSIGNED-GeneralParameterError-UnrecognizedBool32 (0xa320b052)
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Replace the deprecated VK_EXT_debug_report callback with
VK_EXT_debug_utils messenger for validation layer output.

VK_EXT_debug_utils provides messageIdNumber in the callback data,
which matches the hex MessageID shown in validation error output.
This enables reliable filtering of known VVL false positives by
their numeric ID, matching the pattern from nvpro_core2/nvvk/context.cpp
(g_ignoredValidationMessageIds).
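A rough sketch of filtering by numeric ID, using a few of the message IDs quoted in this PR. The names kIgnoredIds and IsIgnoredValidationMessage are hypothetical; in a real messenger callback the ID would come from the callback data's messageIdNumber field, which is a signed 32-bit integer, hence the cast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Subset of the IDs listed in this PR, for illustration only.
static const uint32_t kIgnoredIds[] = {
    0x901f59ecu,  // VUID-VkDeviceCreateInfo-pNext-pNext
    0x6516b437u,  // VUID-VkImageViewCreateInfo-image-01762
    0xc36d9e29u,  // VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239
    0xc1bea994u,  // VP9 capabilities pNext
};

// Returns true if the message should be dropped before printing.
inline bool IsIgnoredValidationMessage(int32_t messageIdNumber) {
    const uint32_t id = static_cast<uint32_t>(messageIdNumber);
    for (std::size_t i = 0; i < sizeof(kIgnoredIds) / sizeof(kIgnoredIds[0]); ++i) {
        if (kIgnoredIds[i] == id) {
            return true;
        }
    }
    return false;
}
```

A callback built on this would return early (without printing) when the function returns true and log everything else.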

Changes:
- Add VK_EXT_debug_utils function pointers to HelpersDispatchTable
- Add DebugUtilsMessengerCallback static method to VulkanDeviceContext
- InitDebugReport() now prefers debug_utils if available, falls back
  to debug_report if not
- Request VK_EXT_DEBUG_UTILS_EXTENSION_NAME as an instance extension
- Add destroy path for VkDebugUtilsMessengerEXT
- Add suppression entries for all known VVL false positives:
  * pNext unknown struct types (0x901f59ec, 0xc1bea994)
  * MUTABLE_FORMAT_BIT tracking for video images (0x6516b437)
  * DPB slot activation tracking (0xc36d9e29)
  * H.265 maxDpbSlots (0xf095f12f)
  * AV1 filmGrainSupport Bool32 (0xa320b052)
  * VP9 provisional extension warning (0x297ec5be)
  * ImageViewUsageCreateInfo usage=0 (0x1f778da5)
  * Multiplanar subresource layout aspect (0x4148a5e9)

Tested with validation enabled (-v) and --noPresent across all codecs:
  H.264 -- CLEAN    H.265 -- CLEAN
  AV1   -- CLEAN    VP9   -- CLEAN

Note: The display path (without --noPresent) with validation enabled crashes
in the NVIDIA driver (nvVkV3DecoderH264_v2.cpp reflist_P_process) due to a VVL
handle-wrapping bug. This is the known issue from
KhronosGroup/Vulkan-ValidationLayers#11531, fixed in VVL PR #11605.
Without validation, display works correctly for all codecs.

Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

The nvVideoDecodeAV1DpbSlotInfo::Init() assert checked
slotIndex < TOTAL_REFS_PER_FRAME (8), which is the dpbRefList array
size. But Init() is also called for the setup reference slot's
currentDpbSlotInfo (introduced in commit b3617df2), which is a
standalone member not bounded by the dpbRefList array. The setup slot's
slotIndex can be any valid DPB slot index (0 to MAX_DPB_REF_AND_SETUP_SLOTS-1).

This caused an assert failure when slotIndex >= 8, which happens when an
AV1 stream uses all 8 reference frame slots (indices 0-7) and the
current frame is assigned index 8.

Fix: Change the assert bound to MAX_DPB_REF_AND_SETUP_SLOTS, which is
the exclusive upper bound on valid DPB slot indices.

Fixes: Assert failure with av1content_selected/128x128_420_8le.ivf
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Fix srcBufferOffset and srcBufferRange alignment to satisfy Vulkan spec
requirements for vkCmdDecodeVideoKHR (VUID-07131, VUID-07139).

Problem
-------
The parser's bitstreamDataOffset and bitstreamDataLen values were passed
directly into VkVideoDecodeInfoKHR without any alignment, causing
validation errors on H.264, H.265, and AV1 (VP9 already handled this).

Parser Buffer Architecture
--------------------------
The NvVideoParser manages bitstream buffers as follows:

1. Buffers are allocated via GetBitstreamBuffer() with size rounded up
   to minBitstreamBufferSizeAlignment (typically 256 bytes).

2. The parser fills the buffer with compressed frame data sequentially.
   When a frame boundary is detected (end_of_picture), the parser
   reports bitstreamDataOffset (where frame data starts in the buffer)
   and bitstreamDataLen (exact byte count of the frame's NAL units).

3. The buffer often contains BOTH the current frame's data AND the
   beginning of the next frame's data (residual). After the decode
   command is submitted, swapBitstreamBuffer() copies this residual
   data to a new aligned buffer for the next frame.

4. For H.264/H.265 (NAL-based codecs via VulkanVideoDecoder::
   end_of_picture), bitstreamDataOffset is always 0 -- the frame data
   starts at the buffer beginning.

5. For VP9, the parser explicitly handles alignment in
   VulkanVP9Decoder::ParseFrameHeader (lines 251-261): offset is
   aligned down, internal offsets are adjusted, and bitstreamDataLen
   is aligned up -- all at the parser level.

6. For AV1, bitstreamDataOffset is 0 (set in VulkanAV1Decoder::
   end_of_picture).

srcBufferOffset Fix
-------------------
For H.264/H.265/AV1: Assert that bitstreamDataOffset is 0 (enforced
by the parser architecture). Force to 0 as a safety net if violated.

For VP9: Trust the parser's alignment (already correct).

srcBufferRange Fix (per-codec)
------------------------------
H.265, AV1, VP9: Round up bitstreamDataLen to minBitstreamBufferSizeAlignment.
  These codecs use explicit slice segment offsets (pSliceSegmentOffsets)
  or tile sizes (pTileSizes) for decode boundaries. NVDEC ignores bytes
  beyond the last slice/tile, so the residual data in the alignment
  padding area is harmless.

H.264: Pass exact bitstreamDataLen WITHOUT rounding up.
  NVDEC's H.264 decoder uses srcBufferRange to bound its start-code
  scan (searching for 00 00 01 patterns). The buffer's residual area
  beyond bitstreamDataLen contains the next frame's data, which starts
  with a valid start code. Rounding up exposes this start code to the
  NAL scanner, causing decode corruption. Suppress VUID-07139 for H.264.
  The proper fix requires handling alignment in the H.264 parser
  (like VP9 does), but that is a larger change to NvVideoParser's
  ByteStreamParser buffer management.

IMPORTANT: The bytes beyond bitstreamDataLen must NOT be zero-filled.
  They contain the next frame's residual data that swapBitstreamBuffer()
  copies after the decode returns. Zero-filling destroys this data and
  corrupts all subsequent frames.
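The per-codec rules above can be sketched as two small helpers. The Codec enum and function names are hypothetical, and sizeAlignment stands for minBitstreamBufferSizeAlignment from VkVideoCapabilitiesKHR. The assert on the offset described in the commit is omitted here so the helper can be exercised directly.

```cpp
#include <cassert>
#include <cstdint>

enum class Codec { H264, H265, AV1, VP9 };

// Round value up to the next multiple of alignment (any non-zero alignment).
inline uint64_t AlignUp(uint64_t value, uint64_t alignment) {
    assert(alignment != 0);
    return ((value + alignment - 1) / alignment) * alignment;
}

// srcBufferOffset: VP9 trusts the parser; the other codecs are forced to 0.
inline uint64_t ComputeSrcBufferOffset(Codec codec, uint64_t bitstreamDataOffset) {
    return (codec == Codec::VP9) ? bitstreamDataOffset : 0;
}

// srcBufferRange: H.264 keeps the exact length so the start-code scan never
// reaches the next frame's residual data; the other codecs are rounded up.
inline uint64_t ComputeSrcBufferRange(Codec codec, uint64_t bitstreamDataLen,
                                      uint64_t sizeAlignment) {
    if (codec == Codec::H264) {
        return bitstreamDataLen;
    }
    return AlignUp(bitstreamDataLen, sizeAlignment);
}
```

For example, with a 256-byte alignment a 1000-byte H.265 frame is submitted with a range of 1024, while the same length for H.264 stays at 1000.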

Also fix VulkanBitstreamBufferImpl::GetSizeAlignment(), which incorrectly
returned VkMemoryRequirements::alignment instead of m_bufferSizeAlignment
(the minBitstreamBufferSizeAlignment from VkVideoCapabilitiesKHR).

Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07131 (srcBufferOffset)
Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07139 (srcBufferRange, H.265/AV1/VP9)
Suppresses: VUID-07139 for H.264 (requires parser-level fix)
Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

When FFmpeg's demuxer reports an invalid or unknown profile (e.g.,
profile=0 for raw .265/.264 files without container metadata, or
mis-tagged Baseline for interlaced H.264), default to a safe profile:

  H.264: Default to HIGH (100) -- superset of Baseline/Main, handles
    interlaced, CABAC, B-slices, weighted prediction. Matches NVCUVID.
  H.265: Default to MAIN (1) -- covers most 8-bit 4:2:0 content.
  AV1: Default to MAIN (0).

The fix is in FFmpegDemuxer::GetProfileIdc() so it covers both
VulkanVideoProcessor::Initialize() and VkVideoDecoder::StartVideoSequence()
code paths. A warning is printed when a default is used.

Additionally, VkVideoDecoder::StartVideoSequence() retries with
upgraded profiles (Baseline→Main→High for H.264, Main→Main10 for
H.265) if the initial capabilities query fails, as a second line of
defense when the parser-reported profile differs from the demuxer.
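The two lines of defense might look roughly like this. All names are hypothetical, the validity test is simplified (non-positive values are treated as unknown for H.264/H.265, negative values for AV1, and the mis-tagged Baseline case is not modeled), and capsQuery stands in for the real capabilities query.

```cpp
#include <cassert>
#include <functional>

enum class Codec { H264, H265, AV1 };

// First line of defense: replace an unknown profile with a safe default
// (H.264 High = 100, H.265 Main = 1, AV1 Main = 0).
inline int ResolveProfileIdc(Codec codec, int reported) {
    switch (codec) {
        case Codec::H264: return reported > 0 ? reported : 100;
        case Codec::H265: return reported > 0 ? reported : 1;
        case Codec::AV1:  return reported >= 0 ? reported : 0;
    }
    return reported;
}

// Next profile in the upgrade chain, or -1 when there is none.
inline int NextProfile(Codec codec, int profile) {
    if (codec == Codec::H264) {
        if (profile == 66) return 77;   // Baseline -> Main
        if (profile == 77) return 100;  // Main -> High
    }
    if (codec == Codec::H265 && profile == 1) return 2;  // Main -> Main10
    return -1;
}

// Second line of defense: retry the capabilities query with upgraded
// profiles until one succeeds. Returns the accepted profile or -1.
inline int SelectSupportedProfile(Codec codec, int profile,
                                  const std::function<bool(int)>& capsQuery) {
    while (profile >= 0) {
        if (capsQuery(profile)) return profile;
        profile = NextProfile(codec, profile);
    }
    return -1;
}
```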

Fixes: Assert on 1080i-25-H264.mkv (interlaced Baseline)
Fixes: Assert on 2024-05-03_14-55-55_1080p_p1_vbv2_5Mbps.265 (raw H.265)
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

@zlatinski force-pushed the vulkan-decoder-vl-fixes branch from ebf6261 to f971951 on February 11, 2026 17:35
@mbechard
Contributor

mbechard commented Feb 12, 2026

This branch fails to compile on a Windows machine with the errors:

3>D:\devel\vk_video_samples\common\libs\VkCodecUtils\VulkanDeviceMemoryImpl.cpp(528,35): error C2039: 'GetMemoryWin32HandleKHR': is not a member of 'VulkanDeviceContext'
3>D:\devel\vk_video_samples\common\libs\VkCodecUtils/VulkanDeviceContext.h(30): message : see declaration of 'VulkanDeviceContext'
3>D:\devel\vk_video_samples\common\libs\VkCodecUtils\VulkanDeviceContext.cpp(541,9): error C2065: 'CreateDebugUtilsMessengerEXT': undeclared identifier
3>D:\devel\vk_video_samples\common\libs\VkCodecUtils\VulkanDeviceContext.cpp(559,16): error C3861: 'CreateDebugUtilsMessengerEXT': identifier not found
3>D:\devel\vk_video_samples\common\libs\VkCodecUtils\VulkanDeviceContext.cpp(1117,9): error C3861: 'DestroyDebugUtilsMessengerEXT': identifier not found

The GetMemoryWin32HandleKHR error seems to be present in the main branch too, though.

@mbechard
Contributor

If I edit the Python script that creates the dispatch table to fix the above issues, I'm still getting loads of validation layer errors with SDK 1.4.341.0.

Some examples:

Validation Error: [ VUID-VkImageViewCreateInfo-format-06415 ] | MessageID = 0x8c01861e
vkCreateImageView(): pCreateInfo->image was created with VK_IMAGE_USAGE_SAMPLED_BIT, but create_info.format VK_FORMAT_G8_B8R8_2PLANE_420_UNORM requires a VkSamplerYcbcrConversion but one was not passed in the pNext chain.

Validation Error: [ VUID-VkImageViewCreateInfo-image-08333 ] | MessageID = 0xabf9b914
vkCreateImageView(): pCreateInfo->format VK_FORMAT_R8_UNORM and tiling VK_IMAGE_TILING_OPTIMAL doesn't support VK_FORMAT_FEATURE_VIDEO_DECODE_OUTPUT_BIT_KHR.

vkCmdPipelineBarrier2KHR(): pDependencyInfo->pImageMemoryBarriers[0].srcStageMask (VK_PIPELINE_STAGE_2_VIDEO_DECODE_BIT_KHR) is not compatible with the queue family properties (VK_QUEUE_GRAPHICS_BIT|VK_QUEUE_COMPUTE_BIT|VK_QUEUE_TRANSFER_BIT|VK_QUEUE_SPARSE_BINDING_BIT) of this command buffer.

Validation Error: [ VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239 ] | MessageID = 0xc36d9e29
vkCmdBeginVideoCodingKHR(): DPB slot index 4 is not active in VkVideoSessionKHR 0x1c44c0dd570.
The Vulkan spec states: If the slotIndex member of any element of pBeginInfo->pReferenceSlots is not negative, then it must specify the index of a DPB slot that is in the active state in pBeginInfo->videoSession at the time the command is executed on the device (https://docs.vulkan.org/spec/latest/chapters/videocoding.html#VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239)

@zlatinski
Contributor Author

Thank you for trying the change! What command line caused the validation layer issues?

@mbechard
Contributor

Hey, it's with this test file:

testfile.mov

With the command arguments:
-i testfile.mov -v -vv

It will likely crash in the validation layers after a bit as well, but that bug should be fixed in upcoming versions of the validation layers.
