Encoder profiles json and drm format mod support #205

Open
zlatinski wants to merge 63 commits into main from
encoder-profiles-json-and-drm-format-mod

No description provided.

…t chain

Fix missing VkVideoDecodeH264/H265/AV1DpbSlotInfoKHR in the pNext chain
of pDecodeInfo->pSetupReferenceSlot for all codecs that require it.

The Vulkan spec requires that when a video session is created with a
decode codec operation and pSetupReferenceSlot is not NULL, the pNext
chain must include the codec-specific DPB slot info structure:
  - H.264: VkVideoDecodeH264DpbSlotInfoKHR (VUID-07156)
  - H.265: VkVideoDecodeH265DpbSlotInfoKHR (VUID-07157)
  - AV1:   VkVideoDecodeAV1DpbSlotInfoKHR  (VUID-07170)
  - VP9:   No DPB slot info struct defined by the extension

Previously, setupReferenceSlot was initialized with pNext=NULL and
never wired to a codec-specific DPB slot info. The nvVideoH264PicParameters
struct already had an unused currentDpbSlotInfo member for this purpose.

Changes:
- H.264: Use existing h264.currentDpbSlotInfo, initialize with current
  picture's FrameNum, PicOrderCnt, and field flags
- H.265: Add currentDpbSlotInfo to nvVideoH265PicParameters, initialize
  with current picture's PicOrderCntVal
- AV1: Add currentDpbSlotInfo to nvVideoAV1PicParameters, initialize
  with current picture's frame_type and OrderHint
- VP9: No change needed (VK_KHR_video_decode_vp9 has no DPB slot info)

The setup reference slot's DPB info tells the driver/validation layer
what reference metadata to associate with the DPB slot being activated.
Without it, the validation layer could not track DPB slot activation,
causing cascading VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239 errors.

Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07156
Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07157
Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07170
Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

The Vulkan spec requires that vkUpdateVideoSessionParametersKHR's
pUpdateInfo->updateSequenceCount must equal the current update sequence
counter of videoSessionParameters plus one. The counter starts at 0
after vkCreateVideoSessionParametersKHR and increments after each
successful update.

Previously, the code used GetUpdateSequenceCount() from the picture
parameters set, which starts at 0, resulting in the first update
passing updateSequenceCount=0 instead of the required 1.

Fix by tracking the update counter (m_updateCount) in
VkParserVideoPictureParameters and using ++m_updateCount for each
vkUpdateVideoSessionParametersKHR call. On failure, the counter is
rolled back so the next attempt uses the same value.
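
The increment-then-roll-back rule described above can be sketched as follows. This is a minimal illustration, not the project's code: UpdateSequenceTracker and TryUpdate are hypothetical names, and the callback stands in for the vkUpdateVideoSessionParametersKHR call.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the counter kept next to a video session
// parameters object; not the project's VkParserVideoPictureParameters.
class UpdateSequenceTracker {
public:
    // Calls `update` with the next required updateSequenceCount
    // (current counter + 1). The increment is kept only on success.
    bool TryUpdate(const std::function<bool(uint32_t)>& update) {
        const uint32_t next = ++m_updateCount;
        if (update(next)) {
            return true;
        }
        --m_updateCount;  // roll back so a retry passes the same value
        return false;
    }

    uint32_t Current() const { return m_updateCount; }

private:
    uint32_t m_updateCount = 0;  // 0 right after the parameters are created
};
```

With this shape the first successful update passes 1, and a failed attempt leaves the counter untouched.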

Fixes: VUID-vkUpdateVideoSessionParametersKHR-pUpdateInfo-07215
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Fix two barrier issues in DecodePictureWithParameters():

1. Queue family ownership transfer without matching release (VUID-03879):
   The bitstream buffer and DPB image barriers had asymmetric queue family
   indices: srcQueueFamilyIndex=VK_QUEUE_FAMILY_IGNORED but
   dstQueueFamilyIndex=videoDecodeQueueFamilyIdx. Per the Vulkan spec, when
   the src and dst queue family indices differ, the barrier is treated as an
   ownership transfer operation requiring a matching release on the source queue.
   Since these are simple host-write to video-decode-read barriers (not
   actual queue family transfers), both must be VK_QUEUE_FAMILY_IGNORED.

2. HOST_WRITE access without HOST stage (VUID-03917):
   The bitstream buffer barrier had srcStageMask=VK_PIPELINE_STAGE_2_NONE
   with srcAccessMask=VK_ACCESS_2_HOST_WRITE_BIT. Per the Vulkan spec,
   HOST_WRITE access requires VK_PIPELINE_STAGE_2_HOST_BIT in the stage mask.

Fixes: VUID-vkQueueSubmit2-commandBuffer-03879
Fixes: VUID-VkBufferMemoryBarrier2-srcAccessMask-03917
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Add g_ignoredValidationMessageIds[] array to VulkanDeviceContext.cpp,
matching the pattern from nvpro_core2/nvvk/context.cpp. Filter known
validation layer false positives by messageIdNumber in the debug report
callback before printing to stderr.

Suppressed VUIDs (all VVL false positives, not application bugs):

1. VUID-VkDeviceCreateInfo-pNext-pNext (0x901f59ec):
   Private/provisional extension struct type 1000552004 not recognized
   by VVL 1.4.313. Harmlessly skipped by the driver's pNext traversal.
   Resolves when VVL headers are updated.

2. VUID-VkImageViewCreateInfo-image-01762 (0x6516b437):
   VVL does not track VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT for
   video-profile-bound images (VkVideoProfileListInfoKHR in pNext).
   DPB images ARE created with MUTABLE_FORMAT_BIT, per-plane views
   use PLANE_0_BIT/PLANE_1_BIT aspects (not COLOR_BIT). Neither
   clause of the VUID condition applies.

3. VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239 (0xc36d9e29):
   Cascading from VUID-01762. DPB slots are correctly activated via
   pSetupReferenceSlot with codec-specific DPB slot info pNext.
   VVL's internal state tracking is confused by the image false
   positives on the same video session.

Note: VVL 1.4.313 uses VK_EXT_debug_utils internally for message
output. The decoder's VK_EXT_debug_report callback filters our own
stderr output but cannot suppress VVL's direct output. Full
suppression requires either upgrading to VK_EXT_debug_utils or
waiting for the VVL false positives to be fixed upstream.

Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

…ppression

Fix VkVideoDecodeAV1ProfileInfoKHR default initialization: zero-initialize
the struct before setting fields so that filmGrainSupport is not left as
garbage (32767), which VVL reports as
UNASSIGNED-GeneralParameterError-UnrecognizedBool32.

Also add VP9 capabilities pNext suppression (0xc1bea994) for the
provisional VkVideoDecodeVP9CapabilitiesKHR struct type 1000514001
not recognized by VVL 1.4.313.

Note: The debug report callback suppression (g_ignoredValidationMessageIds)
does not actually filter VVL output because VK_EXT_debug_report's msg_code
parameter does not correspond to VK_EXT_debug_utils' messageIdNumber.
Migration to VK_EXT_debug_utils is needed for the suppression to work.

Fixes: UNASSIGNED-GeneralParameterError-UnrecognizedBool32 (0xa320b052)
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Replace the deprecated VK_EXT_debug_report callback with
VK_EXT_debug_utils messenger for validation layer output.

VK_EXT_debug_utils provides messageIdNumber in the callback data,
which matches the hex MessageID shown in validation error output.
This enables reliable filtering of known VVL false positives by
their numeric ID, matching the pattern from nvpro_core2/nvvk/context.cpp
(g_ignoredValidationMessageIds).

Changes:
- Add VK_EXT_debug_utils function pointers to HelpersDispatchTable
- Add DebugUtilsMessengerCallback static method to VulkanDeviceContext
- InitDebugReport() now prefers debug_utils if available, falls back
  to debug_report if not
- Request VK_EXT_DEBUG_UTILS_EXTENSION_NAME as the instance extension
- Add destroy path for VkDebugUtilsMessengerEXT
- Add suppression entries for all known VVL false positives:
  * pNext unknown struct types (0x901f59ec, 0xc1bea994)
  * MUTABLE_FORMAT_BIT tracking for video images (0x6516b437)
  * DPB slot activation tracking (0xc36d9e29)
  * H.265 maxDpbSlots (0xf095f12f)
  * AV1 filmGrainSupport Bool32 (0xa320b052)
  * VP9 provisional extension warning (0x297ec5be)
  * ImageViewUsageCreateInfo usage=0 (0x1f778da5)
  * Multiplanar subresource layout aspect (0x4148a5e9)
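
The ID-based filtering can be sketched as a simple lookup. The IDs are the ones listed above; the function name IsIgnoredValidationMessage is an assumption made for illustration. messageIdNumber is a signed 32-bit field in VkDebugUtilsMessengerCallbackDataEXT, so the sketch casts it to unsigned before comparing against the hex IDs.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Message IDs copied from the suppression list in this commit message.
constexpr std::array<uint32_t, 9> kIgnoredValidationMessageIds = {
    0x901f59ecu, 0xc1bea994u,  // pNext unknown struct types
    0x6516b437u,               // MUTABLE_FORMAT_BIT tracking
    0xc36d9e29u,               // DPB slot activation tracking
    0xf095f12fu,               // H.265 maxDpbSlots
    0xa320b052u,               // AV1 filmGrainSupport Bool32
    0x297ec5beu,               // VP9 provisional extension warning
    0x1f778da5u,               // ImageViewUsageCreateInfo usage=0
    0x4148a5e9u,               // multiplanar subresource layout aspect
};

// Hypothetical helper: true when the message should not be printed.
inline bool IsIgnoredValidationMessage(int32_t messageIdNumber) {
    const auto id = static_cast<uint32_t>(messageIdNumber);
    return std::find(kIgnoredValidationMessageIds.begin(),
                     kIgnoredValidationMessageIds.end(),
                     id) != kIgnoredValidationMessageIds.end();
}
```

A messenger callback would call this first and return VK_FALSE without printing when it matches.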

Tested with validation enabled (-v) and --noPresent across all codecs:
  H.264 -- CLEAN    H.265 -- CLEAN
  AV1   -- CLEAN    VP9   -- CLEAN

Note: The display path (without --noPresent) with validation enabled crashes
in the NVIDIA driver (nvVkV3DecoderH264_v2.cpp reflist_P_process) due to a VVL
handle-wrapping bug. This is the known issue from
KhronosGroup/Vulkan-ValidationLayers#11531, fixed in VVL PR #11605.
Without validation, display works correctly for all codecs.

Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

The nvVideoDecodeAV1DpbSlotInfo::Init() assert checked
slotIndex < TOTAL_REFS_PER_FRAME (8), which is the dpbRefList array
size. But Init() is also called for the setup reference slot's
currentDpbSlotInfo (introduced in commit b3617df2), which is a
standalone member not bounded by the dpbRefList array. The setup slot's
slotIndex can be any valid DPB slot index (0 to MAX_DPB_REF_AND_SETUP_SLOTS-1).

This caused an assert failure when slotIndex >= 8, which happens with
AV1 streams that use all 8 reference frame slots (indices 0-7) and the
current frame gets assigned index 8.

Fix: Change the assert bound to MAX_DPB_REF_AND_SETUP_SLOTS, the number of
DPB slots, so any valid slot index (0 to MAX_DPB_REF_AND_SETUP_SLOTS-1) passes.

Fixes: Assert failure with av1content_selected/128x128_420_8le.ivf
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Fix srcBufferOffset and srcBufferRange alignment to satisfy Vulkan spec
requirements for vkCmdDecodeVideoKHR (VUID-07131, VUID-07139).

Problem
-------
The parser's bitstreamDataOffset and bitstreamDataLen values were passed
directly into VkVideoDecodeInfoKHR without any alignment, causing
validation errors on H.264, H.265, and AV1 (VP9 already handled this).

Parser Buffer Architecture
--------------------------
The NvVideoParser manages bitstream buffers as follows:

1. Buffers are allocated via GetBitstreamBuffer() with size rounded up
   to minBitstreamBufferSizeAlignment (typically 256 bytes).

2. The parser fills the buffer with compressed frame data sequentially.
   When a frame boundary is detected (end_of_picture), the parser
   reports bitstreamDataOffset (where frame data starts in the buffer)
   and bitstreamDataLen (exact byte count of the frame's NAL units).

3. The buffer often contains BOTH the current frame's data AND the
   beginning of the next frame's data (residual). After the decode
   command is submitted, swapBitstreamBuffer() copies this residual
   data to a new aligned buffer for the next frame.

4. For H.264/H.265 (NAL-based codecs via VulkanVideoDecoder::
   end_of_picture), bitstreamDataOffset is always 0 -- the frame data
   starts at the buffer beginning.

5. For VP9, the parser explicitly handles alignment in
   VulkanVP9Decoder::ParseFrameHeader (lines 251-261): offset is
   aligned down, internal offsets are adjusted, and bitstreamDataLen
   is aligned up -- all at the parser level.

6. For AV1, bitstreamDataOffset is 0 (set in VulkanAV1Decoder::
   end_of_picture).

srcBufferOffset Fix
-------------------
For H.264/H.265/AV1: Assert that bitstreamDataOffset is 0 (enforced
by the parser architecture). Force to 0 as a safety net if violated.

For VP9: Trust the parser's alignment (already correct).

srcBufferRange Fix (per-codec)
------------------------------
H.265, AV1, VP9: Round up bitstreamDataLen to minBitstreamBufferSizeAlignment.
  These codecs use explicit slice segment offsets (pSliceSegmentOffsets)
  or tile sizes (pTileSizes) for decode boundaries. NVDEC ignores bytes
  beyond the last slice/tile, so the residual data in the alignment
  padding area is harmless.

H.264: Pass exact bitstreamDataLen WITHOUT rounding up.
  NVDEC's H.264 decoder uses srcBufferRange to bound its start-code
  scan (searching for 00 00 01 patterns). The buffer's residual area
  beyond bitstreamDataLen contains the next frame's data, which starts
  with a valid start code. Rounding up exposes this start code to the
  NAL scanner, causing decode corruption. Suppress VUID-07139 for H.264.
  The proper fix requires handling alignment in the H.264 parser
  (like VP9 does), but that is a larger change to NvVideoParser's
  ByteStreamParser buffer management.

IMPORTANT: The bytes beyond bitstreamDataLen must NOT be zero-filled.
  They contain the next frame's residual data that swapBitstreamBuffer()
  copies after the decode returns. Zero-filling destroys this data and
  corrupts all subsequent frames.
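
The per-codec srcBufferRange rule above can be sketched as follows. The enum and function names are illustrative assumptions, and sizeAlignment stands for minBitstreamBufferSizeAlignment. The sketch only computes the range and writes nothing to the buffer, so the residual bytes stay intact.

```cpp
#include <cassert>
#include <cstdint>

enum class Codec { H264, H265, AV1, VP9 };  // illustrative, not the project's enum

// Rounds `value` up to the next multiple of `alignment` (alignment > 0).
constexpr uint64_t AlignUp(uint64_t value, uint64_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// H.264 keeps the exact length so the start-code scan never sees the next
// frame's residual data; the other codecs round up to the size alignment.
constexpr uint64_t SrcBufferRange(Codec codec, uint64_t bitstreamDataLen,
                                  uint64_t sizeAlignment) {
    return codec == Codec::H264 ? bitstreamDataLen
                                : AlignUp(bitstreamDataLen, sizeAlignment);
}
```

For example, with a 256-byte alignment a 1000-byte H.265 frame is submitted with a range of 1024, while the same length for H.264 is passed through unchanged.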

Also fix VulkanBitstreamBufferImpl::GetSizeAlignment() which incorrectly
returned VkMemoryRequirements::alignment instead of m_bufferSizeAlignment
(the minBitstreamBufferSizeAlignment from VkVideoCapabilitiesKHR).

Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07131 (srcBufferOffset)
Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07139 (srcBufferRange, H.265/AV1/VP9)
Suppresses: VUID-07139 for H.264 (requires parser-level fix)
Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: #183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

When FFmpeg's demuxer reports an invalid or unknown profile (e.g.,
profile=0 for raw .265/.264 files without container metadata, or
mis-tagged Baseline for interlaced H.264), default to a safe profile:

  H.264: Default to HIGH (100) -- superset of Baseline/Main, handles
    interlaced, CABAC, B-slices, weighted prediction. Matches NVCUVID.
  H.265: Default to MAIN (1) -- covers most 8-bit 4:2:0 content.
  AV1: Default to MAIN (0).

The fix is in FFmpegDemuxer::GetProfileIdc() so it covers both
VulkanVideoProcessor::Initialize() and VkVideoDecoder::StartVideoSequence()
code paths. A warning is printed when a default is used.

Additionally, VkVideoDecoder::StartVideoSequence() retries with
upgraded profiles (Baseline→Main→High for H.264, Main→Main10 for
H.265) if the initial capabilities query fails, as a second line of
defense when the parser-reported profile differs from the demuxer.
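
A minimal sketch of the two fallbacks described above, under stated assumptions: the function names are hypothetical, and "invalid or unknown" is approximated here by membership in a list of standard profile_idc / seq_profile values. The sketch does not model the mis-tagged interlaced Baseline case, which needs stream information beyond the profile number.

```cpp
#include <cassert>
#include <initializer_list>

enum class Codec { H264, H265, AV1 };  // illustrative, not the project's enum

inline bool OneOf(int value, std::initializer_list<int> known) {
    for (int k : known) {
        if (k == value) return true;
    }
    return false;
}

// Assumed validity check based on standard profile numbers.
inline bool IsKnownProfile(Codec codec, int profileIdc) {
    switch (codec) {
        case Codec::H264: return OneOf(profileIdc, {66, 77, 88, 100, 110, 122, 244});
        case Codec::H265: return OneOf(profileIdc, {1, 2, 3, 4});
        case Codec::AV1:  return OneOf(profileIdc, {0, 1, 2});
    }
    return false;
}

// First line of defense: replace an unknown profile with the safe default.
inline int ResolveProfileIdc(Codec codec, int reported) {
    if (IsKnownProfile(codec, reported)) return reported;
    switch (codec) {
        case Codec::H264: return 100;  // High
        case Codec::H265: return 1;    // Main
        case Codec::AV1:  return 0;    // Main
    }
    return reported;
}

// Second line of defense: next profile to retry after a failed
// capabilities query, or -1 when the ladder is exhausted.
inline int NextProfileToTry(Codec codec, int profileIdc) {
    if (codec == Codec::H264 && profileIdc == 66) return 77;   // Baseline -> Main
    if (codec == Codec::H264 && profileIdc == 77) return 100;  // Main -> High
    if (codec == Codec::H265 && profileIdc == 1) return 2;     // Main -> Main10
    return -1;
}
```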

Fixes: Assert on 1080i-25-H264.mkv (interlaced Baseline)
Fixes: Assert on 2024-05-03_14-55-55_1080p_p1_vbv2_5Mbps.265 (raw H.265)
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

…cripts

Add comprehensive documentation for the DRM format modifier test suite:

Design & Architecture:
- DESIGN.md: Test application architecture and data flow
- DRM_Format_Mod_Architecture.svg: Visual architecture diagram
- README.md: Usage guide, CLI options, environment variables,
  compression testing instructions

Validation Layer:
- VALIDATION_LAYER_CRASH_REPORT.md: Validation layer crash report with
  spec analysis (NVIDIA-specific, Intel unaffected)
- 0001-state_tracker-*.patch: NULL check fix for plane_info in
  UpdateBindImageMemoryInfo
- run_tests_with_patched_layer.sh: Script to build and run tests with
  the patched validation layer

…sync

New vulkan_video_encoder_ext.h adds:
- VkVideoEncoderConfig: Structured config (alternative to argc/argv)
- VkVideoEncodeInputFrame: External frame descriptor with VkImage,
  format, layout, frame ID, PTS, force-IDR, QP override, and
  wait/signal semaphore arrays for timeline semaphore synchronization
- VkVideoEncodeResult: Encoded frame result with bitstream pointer,
  size, picture type, IDR flag, and DTS
- VulkanVideoEncoderExt: Extended interface with InitializeExt(),
  SubmitExternalFrame(), GetEncodedFrame(), Flush(), Reconfigure()
- CreateVulkanVideoEncoderExt() factory function

This extends the existing VulkanVideoEncoder without breaking backward
compatibility. The base file-based interface remains for existing apps.
The extended interface is for cross-process encoder services that
receive frames via DMA-BUF import with timeline semaphore sync.

ENCODER_EXTERNAL_FRAME_INPUT_DESIGN.md documents:
- Current frame path analysis (LoadNextFrame -> StageInputFrame ->
  EncodeFrameCommon -> AssembleBitstreamData) with all internal types
- Three input paths slicing through the pipeline:
  A) Optimal YCbCr: zero-copy direct inject into srcEncodeImageResource
  B) Linear YCbCr: inject into srcStagingImageView, upload only
  C) RGBA (any tiling): inject + filter (RGBA->YCbCr via compute)
- Path selection logic based on format and tiling
- New VkVideoEncoder internal methods: SetExternalInputFrame(),
  SetExternalInputImage() with wait/signal semaphore arrays
- New VkVideoEncodeFrameInfo fields: externalInputImage (ref-counted),
  isExternalInput, inputWait/SignalSemaphores
- Async bitstream retrieval: RequestBitstreamBuffer() returns fence,
  PollBitstreamReady() non-blocking check, GetBitstreamData() read
- Per-frame timeline semaphore synchronization flow diagram
- Future thread pool design for bitstream assembly
- 4-phase implementation plan
- Backward compatibility analysis

Updated vulkan_video_encoder_ext.h with:
- PollEncodeComplete(frameId) for non-blocking completion check
- ReleaseEncodedFrame(frameId) for explicit buffer pool return
- GetEncodeFence(frameId) for external fence wait (thread pool)

Updated ENCODER_EXTERNAL_FRAME_INPUT_DESIGN.md with detailed analysis
of common operations embedded in each pipeline stage that are NOT
related to file I/O or staging but MUST be replicated by the new
external frame input interface:

LoadNextFrame() side-effects:
- frameInputOrderNum assignment (monotonic counter)
- lastFrame flag, QP map loading, pool acquisition

StageInputFrame() side-effects:
- srcEncodeImageResource pool acquisition
- inputCmdBuffer acquisition and fence reset
- Image layout transitions, row/col replication padding
- AQ subsampled image acquisition
- QP map staging in same command buffer
- CRITICAL: calls EncodeFrameCommon() at the end (line 519)

EncodeFrameCommon() side-effects:
- constQp, videoSession/Parameters, qualityLevel assignment
- GOP position calculation, codec-specific EncodeFrame()
- Rate control commands, QP map processing, intra-refresh
- Bitstream buffer acquisition, AQ processing, EnqueueFrame()

SubmitStagedInputFrame() side-effects:
- Binary semaphore + fence signal, queue submit
- Where external wait/signal semaphores must be injected

Added "Summary: What SetExternalInputFrame Must Do" pseudocode showing
exactly how the new method replicates these operations for each path.

Key design decisions:

1. EncodeFrameCommon() is NOT a problem - it stays unchanged as the
   common tail both file-based and external paths converge into.

2. The new SetExternalInputFrame() must replicate side-effects from
   LoadNextFrame() and StageInputFrame() only:
   - frameInputOrderNum/lastFrame/inputTimeStamp assignment
   - External QP map staging (from IPC, not file)
   - Wrapping external VkImage as VulkanVideoImagePoolNode

3. ALL input paths route through StageInputFrame(), even optimal YCbCr
   (Path A). Rationale:
   - DPB safety: encoder may hold srcEncodeImageResource as reference
     across multiple frames. Copying to internal pool image releases
     the external image immediately after the staging copy.
   - Minimal changes: reuse existing pool acquisition, cmd buffer
     management, and the chain to EncodeFrameCommon().
   - Single semaphore injection point: SubmitStagedInputFrame().

4. SubmitStagedInputFrame() is the ONLY function that needs modification:
   add external wait/signal semaphores from encodeFrameInfo into
   VkSubmitInfo2KHR.

5. New helper: WrapExternalImage() -> VulkanVideoImagePoolNode
   Creates non-owning wrapper around external VkImage/VkDeviceMemory.
   Requires new CreateFromExternalImage() on VkImageResource and
   CreateExternal() on VulkanVideoImagePoolNode.

Core implementation of the extended encoder interface for accepting
externally-provided VkImages with timeline semaphore synchronization.

VkImageResource:
- Add CreateFromExternal() static factory for non-owning wrappers
- Add m_ownsResources flag; Destroy() skips vkDestroyImage when false
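
The non-owning wrapper idea can be sketched with a plain ownership flag. This is an illustration only: ImageResourceSketch is a hypothetical class, an int stands in for VkImage, and the destroy callback stands in for vkDestroyImage.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

using ImageHandle = int;  // stand-in for VkImage in this sketch

class ImageResourceSketch {
public:
    using DestroyFn = std::function<void(ImageHandle)>;

    // Owning resource: the destructor destroys the image.
    static std::unique_ptr<ImageResourceSketch> Create(ImageHandle image, DestroyFn destroy) {
        return std::unique_ptr<ImageResourceSketch>(
            new ImageResourceSketch(image, std::move(destroy), true));
    }

    // Non-owning wrapper: the external producer keeps ownership.
    static std::unique_ptr<ImageResourceSketch> CreateFromExternal(ImageHandle image,
                                                                   DestroyFn destroy) {
        return std::unique_ptr<ImageResourceSketch>(
            new ImageResourceSketch(image, std::move(destroy), false));
    }

    ~ImageResourceSketch() {
        if (m_ownsResources && m_destroy) {
            m_destroy(m_image);  // skipped for external images
        }
    }

    ImageResourceSketch(const ImageResourceSketch&) = delete;
    ImageResourceSketch& operator=(const ImageResourceSketch&) = delete;

    ImageHandle GetImage() const { return m_image; }

private:
    ImageResourceSketch(ImageHandle image, DestroyFn destroy, bool owns)
        : m_image(image), m_destroy(std::move(destroy)), m_ownsResources(owns) {}

    ImageHandle m_image;
    DestroyFn m_destroy;
    bool m_ownsResources;
};
```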

VulkanVideoImagePoolNode:
- Add CreateExternal() static factory for non-owning pool nodes
- Sets up m_imageResourceView and m_pictureResourceInfo from external
  image, with no parent pool (won't return to pool on release)

VkVideoEncodeFrameInfo (VkVideoEncoder.h):
- Add isExternalInput flag
- Add inputWaitSemaphores/Values/DstStageMasks vectors
- Add inputSignalSemaphores/Values vectors
- Add ClearExternalInputSync() helper

VkVideoEncoder:
- Add SetExternalInputFrame(): replicates LoadNextFrame() bookkeeping
  (frameInputOrderNum, lastFrame, pts), wraps external image as
  non-owning pool node, stores sync semaphores, calls StageInputFrame()
- Add WrapExternalImage(): creates VkImageResource + VkImageResourceView
  + VulkanVideoImagePoolNode non-owning wrappers from raw VkImage
- Modify SubmitStagedInputFrame(): injects external wait semaphores
  into VkSubmitInfo2KHR waitSemaphoreInfos and external signal
  semaphores into signalSemaphoreInfos

All paths route through StageInputFrame() for DPB safety - even optimal
YCbCr gets copied to internal pool image so external frame can be
released immediately after staging copy completes.

Build verified: shared lib, static lib, and test app all compile clean.

Concrete implementation of the VulkanVideoEncoderExt public interface
that wraps VkVideoEncoder for cross-process encoder service use.

VulkanVideoEncoderExtImpl implements all methods:

InitializeExt(VkVideoEncoderConfig):
- Builds EncoderConfig from structured config (bridges to argc/argv
  internally until EncoderConfig supports direct field assignment)
- Initializes VulkanDeviceContext with encode+compute+transfer queues
- Creates VkVideoEncoder via CreateVideoEncoder()
- Sets streaming mode (numFrames=UINT32_MAX, no input file)

SubmitExternalFrame(VkVideoEncodeInputFrame):
- Gets available pool node from encoder
- Delegates to VkVideoEncoder::SetExternalInputFrame() with the
  external VkImage, sync semaphores, frame ID, and PTS
- Tracks submitted frames in m_pendingFrames deque for async retrieval
- Returns VK_NOT_READY if pool is full (caller retries)

PollEncodeComplete(frameId):
- Checks vkGetFenceStatus on the encode command buffer's fence

GetEncodedFrame(result):
- FIFO: checks oldest pending frame's fence
- Fills VkVideoEncodeResult with frame metadata
- Frame stays in pending queue until ReleaseEncodedFrame()

ReleaseEncodedFrame(frameId):
- Releases encodeFrameInfo ref (returns resources to pools)
- Removes from pending queue

GetEncodeFence(frameId):
- Returns the encode command buffer's fence for external wait

Flush():
- WaitForThreadsToComplete() + drain pending queue

Also implements backward-compatible Initialize()/EncodeNextFrame()
for file-based encoding via the base VulkanVideoEncoder interface.

Factory: CreateVulkanVideoEncoderExt() exported from shared library.

Build verified: shared lib, static lib, test app all compile clean.

When the encoder service also has a display window, both the encode
staging copy and the display blit read the same imported external
image. The release semaphore must only fire after BOTH operations
complete, otherwise the producer could overwrite the frame while
the display is still reading it.

Solution: SubmitExternalFrame returns the staging completion binary
semaphore via optional pStagingCompleteSemaphore output. The encoder
service chains its display submit to wait on this semaphore, then
signals the release semaphore from the display submit:

  Encode staging submit:
    wait: graphSemaphore (frame ready)
    cmd: copy imported -> encoder internal pool
    signal: stagingCompleteSem (binary, returned to caller)
    // NO release semaphore here

  Display submit (chained after):
    wait: stagingCompleteSem (encoder done reading)
    wait: imageAvailableSem (swapchain)
    cmd: blit imported -> swapchain
    signal: releaseSemaphore = frameId (NOW safe to reuse)
    signal: renderFinishedSem (for present)

This ensures the producer only gets the release signal after both
the encoder and display are done reading the external image.

For optimal YCbCr input (NV12/P010), the external image now goes
directly to vkCmdEncodeVideoKHR as srcEncodeImageResource — no staging
copy, no filter, zero intermediate processing. This is correct because
the encoder's DPB (reconstructed reference frames) is managed internally
in separate dpbImageResources[]; the input image is only read once as
the source picture for that frame.

SetExternalInputFrame() now has two paths:
- Path A (optimal YCbCr): WrapExternalImage -> srcEncodeImageResource,
  skip StageInputFrame, go directly to EncodeFrameCommon()
- Path B/C (linear or RGBA): WrapExternalImage -> srcStagingImageView,
  go through StageInputFrame as before

SubmitVideoCodingCmds() changes:
- Enable encodeCmdBuffer's binary semaphore for Path A (was VK_NULL_HANDLE)
- Inject external wait semaphores (graph sem) when isExternalInput &&
  !inputCmdBuffer (direct encode, no staging)
- Increase wait/signal max counts (4->8 wait, 1->4 signal)
- Fix duplicate signalSemaphoreInfo assignment

SubmitExternalFrame() pStagingCompleteSemaphore now returns:
- Path A: encodeCmdBuffer's semaphore (signaled when encode done reading)
- Path B/C: inputCmdBuffer's semaphore (signaled when staging done)

Sync chain for encoder service with display:
  Encode (encode queue):
    wait: graphSemaphore -> cmd: vkCmdEncodeVideoKHR -> signal: encodeInputDoneSem
  Display (graphics queue, chained after):
    wait: encodeInputDoneSem -> cmd: blit -> signal: releaseSemaphore

Updated design doc: Path A skips StageInputFrame, DPB is separate.

Critical:
- Query index (Path A external): When srcEncodeImageResource->GetImageIndex()
  is negative (external pool node, m_parentIndex == -1), use query slot 0
  instead of (uint32_t)-1 in SubmitVideoCodingCmds and in the result
  retrieval path. Avoids invalid query pool index.

- VkImageResource CreateFromExternal null deref:
  - In constructor: skip vulkanDeviceMemory->GetMemoryPropertyFlags() and
    host-visible layout block when vulkanDeviceMemory is null (external
    wrapper has no VulkanDeviceMemoryImpl).
  - GetDeviceMemory() / GetImageDeviceMemory(): return VK_NULL_HANDLE when
    m_vulkanDeviceMemory is null instead of dereferencing.

High:
- Output path: VkVideoEncoderConfig now has outputPath (const char*).
  BuildEncoderConfig() adds --output and path to argv when set so
  per-encoder output flows from ThreadedRenderingVk/encoder instance to
  encoder library. EncoderInstance sets encConfig.outputPath in
  ThreadedRenderingVk_Standalone repo.

…ernalInputFrame

- Add imageTiling to VkVideoEncodeInputFrame (default VK_IMAGE_TILING_OPTIMAL)
- SubmitExternalFrame uses frame.imageTiling for path selection (direct vs staging)
- Fixes wrong path when external frame is LINEAR or DRM_FORMAT_MODIFIER_EXT
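
A sketch of tiling-aware path selection, following the three input paths named earlier in this log (A: optimal YCbCr, B: linear YCbCr, C: RGBA with any tiling). The enums and function are illustrative stand-ins. Routing DRM-format-modifier YCbCr images to the staging path is an assumption drawn from the last bullet above.

```cpp
#include <cassert>

// Stand-ins for VK_IMAGE_TILING_OPTIMAL / LINEAR / DRM_FORMAT_MODIFIER_EXT.
enum class Tiling { Optimal, Linear, DrmFormatModifier };

enum class InputPath {
    DirectEncode,   // Path A: external image used as the encode source
    StagingUpload,  // Path B: copy through the staging image
    StagingFilter   // Path C: RGBA -> YCbCr conversion via compute
};

constexpr InputPath SelectInputPath(bool isYcbcr, Tiling tiling) {
    if (!isYcbcr) {
        return InputPath::StagingFilter;  // RGBA always needs the filter
    }
    return tiling == Tiling::Optimal ? InputPath::DirectEncode
                                     : InputPath::StagingUpload;
}
```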

CreateDebugUtilsMessengerEXT and DestroyDebugUtilsMessengerEXT are
extension functions and were not in the dispatch table. Load them in
InitDebugReport and store in m_createDebugUtilsMessengerEXT and
m_destroyDebugUtilsMessengerEXT; use these in InitDebugReport and
destructor.

DecoderConfig.h: parse deviceID and CRC init values with std::from_chars.
Helpers.h: parse UUID hex bytes with std::from_chars.
Avoids reliance on glibc strtoul/strtoull for portability.
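
A minimal sketch of std::from_chars-based parsing, assuming C++17. ParseUnsigned and ParseUuidHex are hypothetical helper names, and the 32-hex-digit UUID form without dashes is an assumed input format. Unlike strtoul, std::from_chars is locale-independent and does not accept a leading "0x", "+", or whitespace, so the sketch rejects such input.

```cpp
#include <array>
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Parses the whole of `text` as an unsigned integer in `base`.
// Returns nullopt on a parse error, overflow, or trailing characters.
template <typename T>
std::optional<T> ParseUnsigned(std::string_view text, int base) {
    T value{};
    const char* first = text.data();
    const char* last = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value, base);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }
    return value;
}

// Parses 32 hex digits (no dashes) into 16 UUID bytes.
inline std::optional<std::array<uint8_t, 16>> ParseUuidHex(std::string_view hex) {
    if (hex.size() != 32) return std::nullopt;
    std::array<uint8_t, 16> out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        const auto byte = ParseUnsigned<unsigned>(hex.substr(2 * i, 2), 16);
        if (!byte) return std::nullopt;
        out[i] = static_cast<uint8_t>(*byte);  // two hex digits fit in one byte
    }
    return out;
}
```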

…bs/json

- Add vk_video_encoder/json_config/ with schema, example, default JSON and defaults doc
- Add EncoderConfigJsonLoader (simdjson) in vk_video_encoder/libs/json/
- VkEncoderConfig: LoadFromJsonFile(), --encoderConfig; help/docs point to json_config/
- Add vk_video_encoder/json_config/nvidia/ with preset JSONs aligned to
  NVIDIA Video Codec SDK tuning (High Quality, Low Latency, Ultra-low
  Latency, Lossless) and P1–P7 presets.
- high_quality_p1..p7.json: VBR, 250 GOP, 3 B-frames, qualityPreset 1–7.
- low_latency_p1..p3.json: CBR, 0 B-frames, 30 GOP.
- ultra_low_latency_p1.json: CBR, minimal VBV, 15 GOP.
- lossless.json: CQP with QP 0 for I/P/B.
- README.md: documents tuning/preset model and usage.
- PreferredSettings_extracted.md: extracted reference from
  PreferredSettings (2).xlsx (sheet1–4: tuning params, preset names,
  per-tuning settings, legacy NVENC mapping).

On Intel (vendor 0x8086), re-importing a DMA-BUF that was exported from
a single-plane LINEAR image returns VK_ERROR_INVALID_EXTERNAL_HANDLE.
Multi-plane LINEAR (NV12, P010) export/import works. This is a known
driver limitation.

- Cache physical device vendor ID in init() (m_vendorID).
- In runExportImportTest(), when useLinear is true and the format is
  single-plane (planeCount == 1) and vendor is Intel, skip the test
  with message: "Intel: single-plane LINEAR DMA-BUF import returns
  VK_ERROR_INVALID_EXTERNAL_HANDLE (driver limitation)".
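
The skip condition above reduces to a small predicate, sketched here. The function name is a hypothetical choice; the vendor ID and the three conditions come from the bullets above.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kVendorIdIntel = 0x8086;

// True when the export/import test should be reported as SKIP:
// a single-plane LINEAR image on an Intel device.
constexpr bool ShouldSkipLinearExportImport(uint32_t vendorID, bool useLinear,
                                            uint32_t planeCount) {
    return useLinear && planeCount == 1 && vendorID == kVendorIdIntel;
}
```

Multi-plane LINEAR formats such as NV12 and P010 have planeCount > 1, so they still run on Intel.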

Result: 12 previously failing TC3_ExportImport_*_LINEAR tests become
SKIP on Intel; no failures. NV12/P010 LINEAR still pass.

Profile fixes (align with PreferredSettings / NVIDIA Video Codec SDK ToT):
- Set gopLength 250, idrPeriod 250 for high-quality presets (was 60).
- Map qualityPreset to SDK P1–P7 (e.g. high_quality_p4 → qualityPreset 3).
- Add tuningMode to preset JSONs (highquality, lowlatency, ultralowlatency, lossless).

Documentation:
- Expand vk_video_encoder/json_config/nvidia/README.md with tuning/preset
  tables, usage, and references to PreferredSettings_extracted.md.
- Add preset_review_report.md with preset-to-parameter review.
- Add scripts/run_encoder_profile_tests.py: runs all NVIDIA preset JSON configs
  (vk_video_encoder/json_config/nvidia/*.json) against vk-video-enc-test;
  supports local and SSH-remote (e.g. GPU VM), optional validation, codec/
  profile filters; output format mirrors run_encoder_tests.py.
- Add scripts/run_encoder_profile_tests.sh: shell wrapper for the profile runner.
- Add docs/VIDEO_TEST_SUITE.md: overview of decoder/encoder test scripts and
  codec support, including the new encoder profile sweep.
- Update common/libs/tests/drm_format_mod/docs/STATUS_REPORT.md and
  vk_video_encoder/json_config/nvidia/README.md with status and usage.

VIDEO_TEST_SUITE: add Encoder Test Content section note and example for
ThreadedRenderingVk_Standalone/scripts/generate_encoder_yuv.sh.
run_encoder_profile_tests.py: docstring points to that script for generating
YUV at required resolutions and names.

…0R10)

The RGBA2YCBCR shader generator always emitted separate outputImageY/Cb/Cr
writes which fails for packed formats (Y410 = A2B10G10R10_UNORM_PACK32)
since YcbcrVkFormatInfo() returns nullptr for non-multiplanar formats.

Fix: detect packed output (outputMpInfo == nullptr) in InitRGBA2YCBCR and
emit a single packed write to outputImageRGB with correct channel mapping
(A2B10G10R10: R=Cr, G=Y, B=Cb, A=1).

Also relax the planeNum assert in UpdateImageDescriptorSets: for packed
formats with VK_IMAGE_ASPECT_COLOR_BIT at curImageAspect==0, only the
combined view is used (GetImageView()), so plane count doesn't matter.

…410)

When outputMpInfo is null (packed format like A2B10G10R10), the dispatch
grid defaulted to chromaHorzRatio=2, chromaVertRatio=2 (4:2:0 assumption).
For Y410 this is wrong — it's 4:4:4, so the ratio should be 1:1.

Result was only 1/4 of the output image getting written (top-left quarter).

Fix: default to ratio 1 when outputMpInfo is null. All 4 dispatch sites
in the file had this bug (image→image, buffer→image, and AQ variants).
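
The effect of the default ratio can be sketched as follows. MpInfoSketch is a hypothetical stand-in for the multi-planar format info, and the assumption that the grid covers ceil(extent / ratio) invocations per axis is inferred from the "top-left quarter" symptom described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the multi-planar format description.
struct MpInfoSketch {
    uint32_t chromaHorzRatio;
    uint32_t chromaVertRatio;
};

struct DispatchRatios {
    uint32_t horz;
    uint32_t vert;
};

// Packed formats (no multi-planar info) are treated as 4:4:4, ratio 1:1.
constexpr DispatchRatios ChromaRatiosFor(const MpInfoSketch* mpInfo) {
    return mpInfo ? DispatchRatios{mpInfo->chromaHorzRatio, mpInfo->chromaVertRatio}
                  : DispatchRatios{1, 1};
}

// Invocations along one axis, assuming ceil(extent / ratio).
constexpr uint32_t GridExtent(uint32_t extent, uint32_t ratio) {
    return (extent + ratio - 1) / ratio;
}
```

With the old 2:2 default, a 1920x1080 packed image would get a 960x540 grid, which is one quarter of the pixels.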

CLI -c overrides JSON codec field, so each profile JSON serves as a base
config that gets tested with all 3 codecs. --codec flag filters to one.

Test names now: nvidia/high_quality_p4/h264, nvidia/high_quality_p4/h265, etc.
Output files: 1920x1080_420_8le_h265_nvidia_high_quality_p4.265
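
The naming scheme above can be sketched as string composition. The runner itself is a Python script; this C++ sketch only illustrates the pattern, with hypothetical function names. The .ivf extension for AV1 is an assumption based on the file types the roundtrip script discovers.

```cpp
#include <cassert>
#include <string>

// "nvidia/<profile>/<codec>", e.g. nvidia/high_quality_p4/h264
inline std::string TestName(const std::string& profile, const std::string& codec) {
    return "nvidia/" + profile + "/" + codec;
}

inline std::string BitstreamExtension(const std::string& codec) {
    if (codec == "h264") return ".264";
    if (codec == "h265") return ".265";
    return ".ivf";  // assumed container for AV1 output
}

// "<input stem>_<codec>_nvidia_<profile><ext>"
inline std::string OutputFileName(const std::string& inputStem,
                                  const std::string& profile,
                                  const std::string& codec) {
    return inputStem + "_" + codec + "_nvidia_" + profile + BitstreamExtension(codec);
}
```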

Tested: 13 profiles × 3 codecs; 33 runs passed (11 profiles × 3 codecs), 2 profiles skipped (P6/P7 unsupported).

run_decoder_roundtrip.py: decodes all bitstreams in a directory using
vulkan-video-dec-test to verify encode→decode roundtrip. Discovers .264,
.265, .ivf files and reports pass/fail for each.

Supports --filter, --verbose, --timeout, auto-detects decoder path.

Tested: 33 bitstreams (11 profiles × 3 codecs), all decoded successfully.

play_encoded_ffplay.py: plays .264/.265/.ivf files with ffplay.
Supports --filter, --play (single file loop), sequential batch playback.
Press Q to advance, Ctrl+C to stop.

README sections added/updated:
- §14: encoder profile tests with multi-codec, auto-detect, options table
- §15: decoder roundtrip (run_decoder_roundtrip.py) + visual playback
  (play_encoded_ffplay.py) with options tables
- §16: complete end-to-end workflow (generate → verify → encode → decode → play)
Add reference to verify_yuv_ffplay.sh and encoder_yuv_generation.md docs.
Partial fix for encoder instance IPC mode where frames come from the
renderer via DMA-BUF import (no input file):

VkEncoderConfig.cpp:
  - Skip input file requirement when no -i specified (external frame mode)
  - Skip numFrames clamping to file frame count in external mode
  - Replaced assert with error return + args dump for debugging

vulkan_video_encoder_ext.cpp:
  - Pass -i /dev/null as workaround (doesn't work - mmap fails)
  - Only pass --qpI/--qpP when > 0 (avoid unnecessary CQP mode)
  - Use 1000000 frames instead of UINT32_MAX for streaming mode
  - Added debug argv dump

KNOWN ISSUE: ParseArguments still fails because the input file handler
tries to mmap the file. A proper refactor is needed to separate the
file-based and external-frame input paths in EncoderConfig initialization.
Three bugs fixed:

1. vulkan_video_encoder_ext.cpp: Wrong CLI arg names in BuildEncoderConfig.
   --averageBitRate/--maxBitRate → --averageBitrate/--maxBitrate (lowercase 'r')
   --frameRateNum/--frameRateDen removed (no such CLI args in ParseArguments)
   These were treated as unknown positional args, so DoParseArguments returned -1.

2. VkEncoderConfig.cpp: Skip input file requirement for external frame mode.
   When no -i is given, skip file handler mmap/validation, use --inputWidth/
   --inputHeight directly. Skip numFrames clamping to file frame count.

3. VkEncoderConfig.cpp: Replace assert with error return + args dump for
   debugging when ParseArguments fails.

Status: ParseArguments now succeeds for IPC mode. EncoderInstance initializes
and receives frame FDs. Next step: encoder frame import + encode pipeline.
Identifies that InitVulkanDevice (Vulkan instance + device creation)
is the failing step when the encoder library runs in the child process.
BuildEncoderConfig succeeds, but InitVulkanDevice never returns.
Step-by-step stderr+fflush traces identify InitPhysicalDevice as the
failing call in the encoder child process. LoadVk, InitVkInstance,
InitDebugReport all succeed, but InitPhysicalDevice never returns.
VulkanVideoEncoderExt:
- Fix NULL pointer crash in DeviceUuidUtils ctor (memcpy from nullptr)
- Fix deviceId=0 filtering out GPUs (use -1 for auto-select)
- Always request compute queue (VulkanFilter needs it)
- Add GetVkDevice/GetVkPhysicalDevice/GetVkInstance to ext API
- Correctly strip video bits from DRM modifier format query

VkVideoEncoder:
- Add DRM_FORMAT_MODIFIER_EXT to isDirectlyEncodable check
- Add STORAGE_BIT to WrapExternalImage usage
- Fix WrapExternalImage layout to TRANSFER_SRC_OPTIMAL for staging

VulkanDeviceContext:
- Add fprintf traces in InitPhysicalDevice for debugging

drm_format_mod_test:
- Add --video-encode and --video-decode flags
- Enable VK_KHR_video_queue/encode_queue/decode_queue extensions
- Add VIDEO_ENCODE_SRC/VIDEO_DECODE_DST to image creation
- Skip LINEAR modifier for video (NVDEC/NVENC require tiled)
- Add drm-video-issue-readme.txt documenting the driver issue
TC5 (runVideoFormatQueryTest):
- Queries vkGetPhysicalDeviceVideoFormatPropertiesKHR separately
  for encode (VIDEO_ENCODE_SRC) and decode (VIDEO_DECODE_DST)
- Uses H.264 High 4:2:0 8-bit profile for the query
- Verifies the target format is in the returned list
- Prints format properties (tiling, usage, flags) in verbose mode

TC6 (runPlaneLayoutTest):
- Creates exportable image (tiled or linear)
- Queries plane layouts (offset, size, rowPitch, arrayPitch, depthPitch)
- Exports DMA-BUF, imports with same parameters
- Queries imported image layouts
- Compares export vs import: offset, rowPitch, size must match
- Validates plane offsets are increasing for multi-planar formats
- Reports mismatches as FAIL

Both tests are dispatched in runAllTests when --video-encode or
--video-decode is specified.
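
The TC6 comparison rules can be sketched as two small predicates. This is only an illustration of the checks listed above; `PlaneLayoutSketch` and both function names are hypothetical, and the struct carries just the three fields that TC6 requires to match.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical subset of a per-plane subresource layout.
struct PlaneLayoutSketch {
    uint64_t offset;
    uint64_t size;
    uint64_t rowPitch;
};

// Export vs import: offset, rowPitch and size must match for every plane.
inline bool PlaneLayoutsMatch(const std::vector<PlaneLayoutSketch>& exported,
                              const std::vector<PlaneLayoutSketch>& imported)
{
    if (exported.size() != imported.size()) {
        return false;
    }
    for (size_t i = 0; i < exported.size(); ++i) {
        if (exported[i].offset != imported[i].offset ||
            exported[i].rowPitch != imported[i].rowPitch ||
            exported[i].size != imported[i].size) {
            return false;
        }
    }
    return true;
}

// Multi-planar formats: each plane must start after the previous one.
inline bool PlaneOffsetsIncreasing(const std::vector<PlaneLayoutSketch>& planes)
{
    for (size_t i = 1; i < planes.size(); ++i) {
        if (planes[i].offset <= planes[i - 1].offset) {
            return false;
        }
    }
    return true;
}
```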
importDmaBufImage was destroying the imported image and returning
VK_SUCCESS with a null outImage. This caused TC6 (PlaneLayoutTest)
to segfault when querying the imported image's plane layouts.

Fix: wrap the imported image+memory in VkImageResource::CreateFromExternal
and return it to the caller. Also add null check for dstImage before
accessing it in TC6.
importDmaBufImage creates raw VkImage + VkDeviceMemory handles and
wraps them in VkImageResource::CreateFromExternal (non-owning). The
wrapper's destructor doesn't destroy the raw handles.

Fix: track raw handles in m_importedHandles vector. The new
destroyImportedImage() method looks up the handles by VkImage and
destroys both image and memory. The destructor also cleans up any
remaining handles.

Previously leaked 8 allocations (6304 bytes), now 0 leaks.
Add VkImageResource::CreateFromImport() that takes ownership of both
the VkImage and VkDeviceMemory handles. When the last VkSharedBaseObj
reference drops, the handles are destroyed automatically via the
existing ref-counted RAII pattern (VkVideoRefCountBase).

VulkanDeviceMemoryImpl: add public constructor from pre-allocated
VkDeviceMemory handle. Deinitialize() calls vkFreeMemory as usual.

drm_format_mod_test: replace manual handle tracking (m_importedHandles
vector + destroyImportedImage) with CreateFromImport. The imported
images are now cleaned up automatically when VkSharedBaseObj goes
out of scope. Zero leaks, zero manual cleanup code.
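
The ownership change can be illustrated without Vulkan. The sketch below uses `std::shared_ptr` in place of VkSharedBaseObj and integer handles in place of VkImage/VkDeviceMemory; `ImportedImageSketch` and `CreateFromImportSketch` are hypothetical names, and the two callbacks stand in for vkDestroyImage and vkFreeMemory.

```cpp
#include <functional>
#include <memory>
#include <utility>

using HandleSketch = int;  // placeholder for VkImage / VkDeviceMemory

// Owns both handles; they are released when the last reference drops.
class ImportedImageSketch {
public:
    using Deleter = std::function<void(HandleSketch)>;

    ImportedImageSketch(HandleSketch image, HandleSketch memory,
                        Deleter destroyImage, Deleter freeMemory)
        : m_image(image), m_memory(memory),
          m_destroyImage(std::move(destroyImage)),
          m_freeMemory(std::move(freeMemory)) {}

    ImportedImageSketch(const ImportedImageSketch&) = delete;
    ImportedImageSketch& operator=(const ImportedImageSketch&) = delete;

    ~ImportedImageSketch() {
        // Destroy the image before freeing the memory bound to it.
        if (m_destroyImage) m_destroyImage(m_image);
        if (m_freeMemory) m_freeMemory(m_memory);
    }

private:
    HandleSketch m_image;
    HandleSketch m_memory;
    Deleter m_destroyImage;
    Deleter m_freeMemory;
};

inline std::shared_ptr<ImportedImageSketch> CreateFromImportSketch(
    HandleSketch image, HandleSketch memory,
    ImportedImageSketch::Deleter destroyImage,
    ImportedImageSketch::Deleter freeMemory)
{
    return std::make_shared<ImportedImageSketch>(
        image, memory, std::move(destroyImage), std::move(freeMemory));
}
```

Callers only hold references; no lookup table of raw handles is needed, which is the point of replacing m_importedHandles.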
Y4M C420p16 output writes the decoder's P010 (MSB-aligned) samples as-is. No conversion needed.
VkVideoFrameToFile wrote a hardcoded F24:1 in the Y4M header regardless
of the actual stream frame rate. This caused ffmpeg PSNR comparisons to
fail due to timebase mismatch (e.g. 30fps stream vs 24fps Y4M).

Changes:
- VkVideoFrameOutput.h: add virtual SetFrameRate(num, den) with empty
  default so existing consumers are unaffected
- VkVideoFrameToFile.cpp: store m_frameRateNum/m_frameRateDen (default
  30/1), use them in Y4M header instead of hardcoded F24:1
- VulkanVideoProcessor.cpp: call SetFrameRate() with the stream's
  frame_rate from VkParserDetectedVideoFormat before each OutputFrame()
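
A minimal sketch of a Y4M stream header that takes the frame rate from the stream instead of a hardcoded F24:1. `BuildY4mHeader` is a hypothetical helper, not the VkVideoFrameToFile code; the interlace and aspect fields (`Ip`, `A1:1`) are assumptions, and the 30/1 fallback mirrors the default mentioned above.

```cpp
#include <cstdint>
#include <string>

// Builds the single-line YUV4MPEG2 stream header. The F field carries the
// frame rate as a num:den ratio, which is what tools use as the timebase.
inline std::string BuildY4mHeader(uint32_t width, uint32_t height,
                                  uint32_t frameRateNum, uint32_t frameRateDen,
                                  const std::string& chroma = "420")
{
    // Fall back to 30/1 when the stream did not report a usable rate.
    if (frameRateNum == 0 || frameRateDen == 0) {
        frameRateNum = 30;
        frameRateDen = 1;
    }
    return "YUV4MPEG2 W" + std::to_string(width) +
           " H" + std::to_string(height) +
           " F" + std::to_string(frameRateNum) + ":" + std::to_string(frameRateDen) +
           " Ip A1:1 C" + chroma + "\n";
}
```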
When encoding externally-imported frames (e.g. from DMA-BUF), the
StageInputFrame() transition used VK_IMAGE_LAYOUT_UNDEFINED as the old
layout. This discards image contents per the Vulkan spec and produces
scrambled encoded output.

Changes:
- VkVideoEncodeFrameInfo: add srcExternalImageLayout field to track the
  layout the producer left the image in (e.g. GENERAL for compute output)
- SetExternalInputFrame(): accept srcImageCurrentLayout parameter and
  store it in encodeFrameInfo
- StageInputFrame(): use srcExternalImageLayout (not UNDEFINED) as the
  old layout for external inputs, preserving image contents through the
  transition to TRANSFER_SRC_OPTIMAL
- SubmitExternalFrame(): forward frame.currentLayout to the new parameter

This fixes scrambled encode output when the renderer's PostProcessFilter
writes frames in VK_IMAGE_LAYOUT_GENERAL and exports them via DMA-BUF.
…aces

Adds ability to create encoder input images with specific DRM format
modifier tiling instead of VK_IMAGE_TILING_OPTIMAL. This reproduces the
scrambled output bug seen in the renderer-encoder pipeline.

Changes:
- VkEncoderConfig: add drmFormatModifierIndex (-1=disabled) and
  selectedDrmFormatModifier fields, --drmFormatModifierIndex CLI param
- VkVideoEncoder: add SelectDrmFormatModifier() that queries available
  modifiers with VIDEO_ENCODE_SRC usage, skips LINEAR, prints decoded
  NVIDIA modifier info (compression, GOB height, pageKind, etc.)
- VulkanVideoImagePool::Configure: add optional drmFormatModifier param
  that creates images with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT and
  proper VUID-compliant pNext chain (format list + modifier list)

Test results on RTX 5080 (dev driver 610.01):
- Without --drmFormatModifierIndex: PSNR ~40 dB (correct, OPTIMAL)
- With --drmFormatModifierIndex 0 (compressed BL): PSNR ~11 dB (broken)
- With --drmFormatModifierIndex 5 (uncompressed BL): PSNR ~11 dB (broken)
- Both compressed and uncompressed block-linear produce identical garbled
  output, confirming the bug is in DRM modifier tiling + video encode,
  not specific to compression.
…xternal input

Fix Bug 3 (WrapExternalImage multiplanar view assert):
- UpdateImageDescriptorSets: trim validImageAspects to match the view's
  actual plane count. The default m_inputImageAspects includes all 3 plane
  bits (PLANE_0|PLANE_1|PLANE_2), but 2-plane formats like NV12/P010 only
  have 2 planes. The loop iterated into PLANE_2 causing the assert
  planeNum < imageView->GetNumberOfPlanes() to fire (2 < 2).
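
The trimming step can be sketched as a mask operation. The constants below mirror the values of VK_IMAGE_ASPECT_PLANE_0/1/2_BIT so the example compiles without Vulkan headers; `TrimPlaneAspects` is a hypothetical helper, not the function added by this change.

```cpp
#include <cstdint>

// Local copies of the VkImageAspectFlagBits plane values.
constexpr uint32_t kAspectPlane0 = 0x10;  // VK_IMAGE_ASPECT_PLANE_0_BIT
constexpr uint32_t kAspectPlane1 = 0x20;  // VK_IMAGE_ASPECT_PLANE_1_BIT
constexpr uint32_t kAspectPlane2 = 0x40;  // VK_IMAGE_ASPECT_PLANE_2_BIT

// Drops plane bits the view cannot provide, so a per-plane loop over the
// mask never reaches planeNum == numPlanes. Non-plane bits are kept.
inline uint32_t TrimPlaneAspects(uint32_t validImageAspects, uint32_t numPlanes)
{
    const uint32_t allPlaneBits = kAspectPlane0 | kAspectPlane1 | kAspectPlane2;
    uint32_t allowed = 0;
    if (numPlanes >= 1) allowed |= kAspectPlane0;
    if (numPlanes >= 2) allowed |= kAspectPlane1;
    if (numPlanes >= 3) allowed |= kAspectPlane2;
    return (validImageAspects & ~allPlaneBits) | (validImageAspects & allowed);
}
```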

WrapExternalImage: add MUTABLE_FORMAT_BIT, EXTENDED_USAGE_BIT, and
planeUsageOverride for multiplanar per-plane storage views needed by
the compute filter.

StageInputFrame: skip the encoder's preprocess compute filter when
isExternalInput is true. External DMA-BUF frames from the renderer
are already in the target NV12 format. The compute filter would do
storage reads on the imported DRM modifier image which causes GPU
faults on compressed block-linear memory.

TransitionImageLayout: add GENERAL->TRANSFER_SRC_OPTIMAL,
UNDEFINED->GENERAL, and GENERAL->VIDEO_ENCODE_SRC_KHR transitions
needed by the external input staging path.
Each passed profile now reports total time, ms/frame, and enc-fps
inline. Summary includes a timing table for all passed profiles.
The encoder library's VkDevice was missing VK_KHR_external_semaphore
and VK_KHR_external_semaphore_fd extensions. When the encoder service
runs in headless mode, it reuses the encoder library's VkDevice for
semaphore export (release semaphore sent to parent via READY handshake).
Without these extensions, vkGetSemaphoreFdKHR was NULL after
volkLoadDevice(), crashing the child process before READY was sent.
Extend the decoder library to support streaming decoded YCbCr surfaces
to external consumers (presenters, encoders) via DMA-BUF export:

- SemSyncTypeIdx: add PRESENTER/ENCODER tokens, shift 2→4
- VulkanVideoFrameBuffer: external consumer semaphore array,
  CPU wait before slot reuse, AddExternalConsumer(),
  ExportFrameCompleteSemaphoreFd()
- VulkanDisplayFrame: numExternalConsumers + doneValues tracking
- VkVideoDecoder: ENABLE_EXTERNAL_CONSUMER_EXPORT flag
- VulkanVideoProcessor: wire enableExternalConsumerExport from config
- vulkan_video_decoder: GetDevice()/GetPhysicalDevice()/GetInstance()
  getters, auto-create VkVideoFrameOutput from config outputFileName
…s used

Set enableExternalConsumerExport=true in the --remotePresent handler so
decoded images get SAMPLED_BIT + TRANSFER_SRC_BIT usage flags, required
for DMA-BUF export to external presenter/encoder consumers.
Add VkExportSemaphoreCreateInfo to the frameComplete timeline semaphore
creation chain so external consumers can import it via opaque FD for
cross-process GPU synchronization.
Add forwarding methods through the interface chain so external
decoder services can register consumer release semaphores:

- VulkanVideoDecoder (interface): AddExternalConsumer, ExportFrameCompleteSemaphoreFd
- VulkanVideoDecoderImpl: forward to VulkanVideoProcessor
- VulkanVideoProcessor: forward to VulkanVideoFrameBuffer

The frame buffer CPU-waits on registered external consumer semaphores
before reusing decoded frame slots, preventing slot overwrite when
the decoder runs faster than consumers can display.
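
The slot-reuse condition can be expressed as a predicate over timeline counters. This is a sketch under assumptions: `CanReuseSlot` is a hypothetical name, `consumerCounters` holds each consumer's current timeline semaphore value, and `doneValues` holds the value each consumer signals when it has finished with the frame in that slot. The real code blocks on the semaphores rather than polling.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// A decoded-frame slot may be overwritten only after every registered
// external consumer has reached its "done" value for that frame.
inline bool CanReuseSlot(const std::vector<uint64_t>& consumerCounters,
                         const std::vector<uint64_t>& doneValues)
{
    if (consumerCounters.size() != doneValues.size()) {
        return false;  // inconsistent bookkeeping: refuse to reuse
    }
    for (size_t i = 0; i < consumerCounters.size(); ++i) {
        if (consumerCounters[i] < doneValues[i]) {
            return false;  // consumer i is still using the frame
        }
    }
    return true;  // also true when no consumers are registered
}
```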
…nabled

When enableExternalConsumerExport is set, use VkImageResource::CreateExportable()
instead of Create() for decoded frame images. This sets up:
- VkExternalMemoryImageCreateInfo with DMA_BUF handle type
- VkExportMemoryAllocateInfo for proper DMA-BUF export
- VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT with selected modifier
- Memory plane layouts queryable via GetMemoryPlaneLayout()

Applies to DPB images in coincide mode, output images in distinct mode,
and both for AV1 film grain.

ImageSpec: add exportHandleTypes + exportDrmModifier fields.
VulkanVideoFrameBuffer::CreateImage: use CreateExportable() when set.
Add two options for DRM modifier selection when exporting decoded
surfaces to external consumers:

1. Block height: prefer smallest (default) or largest GOB height
2. Compression: prefer compressed (c>0) or uncompressed (c=0, default)

If an explicit DRM modifier index is specified (--drmModifierIndex),
it is used unconditionally, bypassing the preference logic.

Log all available modifiers with decoded NVIDIA parameters
(compression, block height, plane count, features).

DecoderConfig: exportPreferCompressed, exportPreferSmallestBlockHeight
VkVideoDecoder: SetExportPreferences(), m_exportDrmModifierIndex
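
One way the two preferences and the explicit index could interact is sketched below. All names are hypothetical, and the priority order (compression preference first, block height second) is an assumption, not taken from the decoder source.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct ModifierCandidateSketch {
    uint64_t modifier;
    uint32_t blockHeightLog2;  // NVIDIA 'h' field
    uint32_t compression;      // NVIDIA 'c' field, 0 = uncompressed
    bool     isLinear;
};

struct ExportPreferencesSketch {
    int  explicitIndex = -1;               // >= 0 bypasses the preferences
    bool preferCompressed = false;         // default: uncompressed
    bool preferSmallestBlockHeight = true; // default: smallest
};

// True when 'a' fits the preferences strictly better than 'b'.
inline bool IsBetter(const ModifierCandidateSketch& a,
                     const ModifierCandidateSketch& b,
                     const ExportPreferencesSketch& prefs)
{
    const bool aMatches = (a.compression > 0) == prefs.preferCompressed;
    const bool bMatches = (b.compression > 0) == prefs.preferCompressed;
    if (aMatches != bMatches) {
        return aMatches;
    }
    if (a.blockHeightLog2 != b.blockHeightLog2) {
        return prefs.preferSmallestBlockHeight
                   ? a.blockHeightLog2 < b.blockHeightLog2
                   : a.blockHeightLog2 > b.blockHeightLog2;
    }
    return false;  // tie: keep the earlier candidate
}

// Returns the index of the chosen modifier, or -1 if none qualifies.
inline int SelectModifierIndex(const std::vector<ModifierCandidateSketch>& mods,
                               const ExportPreferencesSketch& prefs)
{
    if (prefs.explicitIndex >= 0) {
        return static_cast<size_t>(prefs.explicitIndex) < mods.size()
                   ? prefs.explicitIndex : -1;
    }
    int best = -1;
    for (size_t i = 0; i < mods.size(); ++i) {
        if (mods[i].isLinear) {
            continue;  // assumption: linear is not picked by preference
        }
        if (best < 0 || IsBetter(mods[i], mods[static_cast<size_t>(best)], prefs)) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```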
Consolidate DRM format modifier handling that was duplicated across the
encoder (VkVideoEncoder.cpp), decoder (VkVideoDecoder.cpp), and the
ThreadedRenderingVk pipeline (ExternalMemory.h) into a single shared
header-only utility class.

New file: common/libs/VkCodecUtils/VkDrmFormatModifierUtils.h

The class provides:

  Static methods (all platforms):
  - Vendor-aware modifier decoding (NVIDIA, AMD, Intel, ARM, QCOM)
  - NVIDIA block-linear field extraction: block height, page kind,
    generation, sector layout, compression type
  - PrintModifierInfo() and ModifierToString() for debug output
  - IsLinear(), IsCompressed() with vendor-aware detection

  Instance methods (Linux only, guarded with #ifdef __linux__):
  - QueryModifiers(): enumerate DRM modifiers via
    vkGetPhysicalDeviceFormatProperties2 with
    VkDrmFormatModifierPropertiesListEXT
  - SelectModifier(): select best modifier with configurable preferences
    for block height (smallest/largest), compression (prefer/avoid),
    explicit index override, and linear fallback
  - DumpAvailableModifiers(): debug dump with per-modifier feature flags
    and NVIDIA field decoding

Encoder changes (VkVideoEncoder.cpp):
  - Remove duplicated PrintNvidiaDrmModifierInfo() static function
  - Replace SelectDrmFormatModifier() body with VkDrmFormatModifierUtils
    calls (DumpAvailableModifiers + SelectModifier + PrintModifierInfo)
  - Guard function body with #ifdef __linux__

Decoder changes (VkVideoDecoder.cpp):
  - Remove duplicated inline lambdas (getNvModCompression,
    getNvModBlockHeight), ModCandidate struct, and sorting logic
  - Replace ~120 lines of modifier selection with VkDrmFormatModifierUtils
    calls (~20 lines)
  - Guard external consumer export block with #ifdef __linux__

Tested on NVIDIA GeForce RTX 5080 (VM):
  - DRM format modifier tests: 123 passed, 0 failed (default)
  - DRM format modifier tests: 131 passed, 0 failed (compression)
  - Decoder service tests: 50/50 PASS (H.264, H.265, AV1, all resolutions)
  - DRM modifier cycling (--cycle-drm-modifiers): 8/8 PASS
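
For reference, the NVIDIA block-linear field extraction mentioned above can be sketched as plain bit slicing. The bit positions follow the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(c, s, g, k, h) macro in drm_fourcc.h as I understand it; the struct and function names are hypothetical and are not the API of VkDrmFormatModifierUtils.h.

```cpp
#include <cstdint>

// Fields packed into an NVIDIA block-linear 2D modifier.
struct NvBlockLinearFieldsSketch {
    uint32_t blockHeightLog2;  // h: bits 0..3, log2 of block height in GOBs
    uint32_t pageKind;         // k: bits 12..19
    uint32_t generation;       // g: bits 20..21
    uint32_t sectorLayout;     // s: bit 22
    uint32_t compression;      // c: bits 23..25, 0 = uncompressed
};

constexpr uint64_t kDrmVendorNvidia = 0x03;  // vendor id lives in bits 56..63

// Bit 4 (0x10) marks the block-linear 2D family within the NVIDIA vendor space.
inline bool IsNvidiaBlockLinear2D(uint64_t modifier)
{
    return (modifier >> 56) == kDrmVendorNvidia && (modifier & 0x10) != 0;
}

inline NvBlockLinearFieldsSketch DecodeNvidiaBlockLinear(uint64_t modifier)
{
    NvBlockLinearFieldsSketch f{};
    f.blockHeightLog2 = static_cast<uint32_t>(modifier & 0xF);
    f.pageKind        = static_cast<uint32_t>((modifier >> 12) & 0xFF);
    f.generation      = static_cast<uint32_t>((modifier >> 20) & 0x3);
    f.sectorLayout    = static_cast<uint32_t>((modifier >> 22) & 0x1);
    f.compression     = static_cast<uint32_t>((modifier >> 23) & 0x7);
    return f;
}

// Block height in GOBs is 2^h.
inline uint32_t GobsPerBlock(const NvBlockLinearFieldsSketch& f)
{
    return 1u << f.blockHeightLog2;
}
```

A caller would check IsNvidiaBlockLinear2D() first, since the field layout is meaningless for linear or other vendors' modifiers.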
VkVideoDecoder's constructor initialises m_enableExportPreferCompressed
to VK_TRUE, expressing the intent that L2-compressed DRM modifiers are
preferred for DMA-BUF export.  However, VulkanVideoProcessor always
calls SetExportPreferences(programConfig.exportPreferCompressed, ...)
immediately after construction, overriding that default.  Because
DecoderConfig::Reset() initialised exportPreferCompressed to false, and
no command-line argument ever sets it to true, m_enableExportPreferCompressed
was silently forced to VK_FALSE on every run.

Result: SelectModifier picked the c=0 (uncompressed) block-linear
modifier even though compressed variants with identical block height
were available and supported — confirmed by the dump output showing
modifier [5] (c=0,h=1) selected instead of [0] (c=1,h=1).

Fix: align the DecoderConfig default with the VkVideoDecoder
constructor intent by initialising exportPreferCompressed = true.