Vulkan validation layer and decoder errors and assert fixes#199
Vulkan validation layer and decoder errors and assert fixes#199
Conversation
|
dabrain34, @srinathkr-nv and @krajunv, please review these changes. |
|
|
||
| // For show existing frame, skip stale reference invalidation and return early. | ||
| // Show existing frames don't have a valid frameEncodeEncodeOrderNum since they | ||
| // don't represent a new encoded frame - they just display a previously encoded one. |
There was a problem hiding this comment.
Is this a new problem. I did not see any issue when encoding. Either it was holding correct state or initialized to some non-negative value.
There was a problem hiding this comment.
I just noticed that the assert() was added with commit#1a883255 and I think this check uncovered the issue in debug mode.
Your change looks good to me.
| // Show existing frames don't have a valid frameEncodeEncodeOrderNum since they | ||
| // don't represent a new encoded frame - they just display a previously encoded one. | ||
| if (pFrameInfo->bShowExistingFrame) { | ||
| pFrameInfo->stdPictureInfo.refresh_frame_flags = (uint8_t)m_dpbAV1->GetRefreshFrameFlags(pFrameInfo->bShownKeyFrameOrSwitch, pFrameInfo->bShowExistingFrame); |
There was a problem hiding this comment.
Adding this line is fine, but updating refresh_frame_flags isn’t required because it isn’t encoded when show_existing_frame is 1.
There was a problem hiding this comment.
Thank you for looking at the changes, @krajunv! Can you please be more specific on the change you want me to make?
There was a problem hiding this comment.
This line doesn’t affect the show_existing_frame=1 frame header encoding and can be safely removed.
There was a problem hiding this comment.
Yes, the refresh_frame_flags is not present in the bitstream when show_existing_frame=1. But we would still want to keep this line because:
Vulkan API validation - The driver may still validate the struct fields even if it doesn't encode them. Setting it to 0 ensures no validation warnings about unexpected non-zero values.
Internal DPB tracking - While the driver won't write it to the bitstream, it may still use the value internally to verify no reference buffer updates are being requested.
Defensive coding - If a future Vulkan implementation or validation layer starts checking this field, having it correctly set to 0 prevents issues.
Bottom line: It's technically unnecessary for bitstream generation, but harmless and follows the principle of keeping the struct consistent with what the encode operation semantically means (no reference updates).
Do you have a strong objection on this, @krajunv?
When creating combined image views for multi-planar YCbCr formats that have STORAGE_BIT in their image usage, the view must restrict its usage via VkImageViewUsageCreateInfo since multi-planar formats don't support storage operations (only per-plane R8/RG8 views do). This fixes VUID-VkImageViewCreateInfo-usage-02275 validation errors. Reference: https://gitlab.khronos.org/vulkan/vulkan/-/issues/4624 Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
…YCbCr Multi-planar formats require VkSamplerYcbcrConversion for sampling. Combined image views created without YCbCr conversion cannot have SAMPLED_BIT in their usage. Use VkImageViewUsageCreateInfo to exclude both STORAGE_BIT and SAMPLED_BIT from combined views. This fixes VUID-VkImageViewCreateInfo-format-06415 validation errors. Reference: https://gitlab.khronos.org/vulkan/vulkan/-/issues/4624 Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
…formats The samplerYcbcrConversion feature must be enabled during device creation to use VkSamplerYcbcrConversion objects with multi-planar YCbCr formats. This fixes VUID-vkCreateSamplerYcbcrConversion-None-01648 validation error. Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
1. Set layerCount=0 when rateControlMode is DEFAULT or DISABLED (fixes VUID-VkVideoEncodeRateControlInfoKHR-rateControlMode-08248) 2. Ensure idrPeriod >= gopFrameCount by setting idrPeriod=0 when gopFrameCount is larger (e.g., UINT32_MAX for infinite GOP) (fixes VUID-VkVideoEncodeH264RateControlInfoKHR-idrPeriod-08284) Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
When a new sequence starts (not a reconfigure), the DPB is reset via Deinit() which sets m_dpbMaxSize=0. However, m_pictureToDpbSlotMap and m_dpbSlotsMask still contained old values from the previous sequence. When ResetPicDpbSlots() tried to free old slots, it called FreeSlot() with slot indices that were valid in the old DPB but exceed the new m_dpbMaxSize, causing an assertion failure: 'assert((uint8_t)slot < m_dpbMaxSize)' failed The fix clears m_pictureToDpbSlotMap and m_dpbSlotsMask before initializing the new DPB when starting a fresh sequence. This fixes the H264 clip-c test which has multiple resolution changes (352x288 -> 176x144) within the same stream. Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
For SVC (scalable video coding) content with multiple spatial layers, the tile count across all layers in a single frame can exceed the 64-element limit of the tileOffsets[] and tileSizes[] arrays. Without bounds checking, writes beyond element 64 would overflow into the pStdSps pointer field, corrupting it with tile offset/size data. This caused segfaults when decoding SVC content like L2T1 (2 spatial layers, 1 temporal layer) streams. GDB analysis showed: - tileCount reached 70-170 across multi-layer frames - pStdSps was corrupted to values like 0x31d2000031d0 (tile offsets) - Crash occurred when dereferencing the corrupted pointer Fix: Add bounds check in ParseObuTileGroup() to reject frames with more than 64 tiles, preventing the buffer overflow. Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
…dling)
When encoding AV1 with B-frames (consecutiveBFrameCount > 0), the
encoder creates "show existing frame" P-frames that display previously
encoded frames. These frames have bShowExistingFrame=true.
The issue was in ProcessDpb():
- frameEncodeEncodeOrderNum is only assigned in StartOfVideoCodingEncodeOrder
when bShowExistingFrame=false
- ProcessDpb called InvalidateStaleReferenceFrames() with the uninitialized
frameEncodeEncodeOrderNum (UINT64_MAX) before checking bShowExistingFrame
- This triggered the assertion: frameEncodeEncodeOrderNum <= UINT32_MAX
Fix: Move the bShowExistingFrame early return check before the assertion
and InvalidateStaleReferenceFrames() call. Show existing frames don't need
stale reference invalidation since they don't encode a new frame.
Command line that triggered this crash:
vk-video-enc-test -i input.yuv -c av1 --inputWidth 352 --inputHeight 288 \
--inputChromaSubsampling 420 --numFrames 30 -o output.ivf \
--gopFrameCount 16 --consecutiveBFrameCount 3
This fixes the AV1_gop_16_b3 encoder test.
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
b68bce2 to
3ef4fd1
Compare
No description provided.