@ayushdg ayushdg commented Jan 14, 2026

Reverts #1202 because of issues we see with Xenna's GPU usage compared with the resources allocated via Ray.

@ayushdg ayushdg requested a review from abhinavg4 January 14, 2026 20:58

greptile-apps bot commented Jan 14, 2026

Greptile Summary

This PR reverts #1202 to restore NVENC/NVDEC hardware encoder/decoder support for video transcoding stages.

  • Dependency Downgrade: Reverts cosmos-xenna from 0.1.8 to 0.1.2 to use the older resource allocation API
  • Resource Model: Re-adds nvdecs and nvencs fields to the Resources dataclass for fine-grained GPU hardware unit allocation
  • GPU Partitioning: Restores dynamic GPU resource calculation in ClipTranscodingStage that queries actual GPU capabilities via _get_local_gpu_info() and _make_gpu_resources_from_gpu_name()
  • Backend Compatibility: Updates Xenna adapter to pass nvdec/nvenc/entire_gpu fields to XennaResources; adds validation in Ray Data adapter to reject unsupported nvdec/nvenc requests
  • Import Path Changes: Updates imports from cosmos_xenna.pipelines.private.resources to cosmos_xenna.ray_utils.resources
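
As a rough illustration of the Resource Model and Backend Compatibility points above, the sketch below shows a resources dataclass that carries `nvdecs`/`nvencs` and a Ray Data-style check that rejects them. The field defaults, the `validate_for_ray_data` name, and the error message are assumptions made for illustration; they are not taken from the NeMo Curator source.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """Illustrative stand-in for the Resources dataclass (defaults are assumed)."""

    cpus: float = 1.0
    gpus: float = 0.0
    gpu_memory_gb: float = 0.0
    nvdecs: int = 0  # hardware decoder units requested
    nvencs: int = 0  # hardware encoder units requested
    entire_gpu: bool = False


def validate_for_ray_data(resources: Resources) -> None:
    """Hypothetical check: fail early on requests this backend cannot schedule."""
    if resources.nvdecs > 0 or resources.nvencs > 0:
        msg = "nvdecs/nvencs are not supported by the Ray Data backend"
        raise ValueError(msg)
```

Failing at validation time, rather than silently ignoring the fields, keeps a stage from running with fewer hardware units than it asked for.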

The revert addresses GPU resource allocation issues observed with Xenna's GPU usage vs allocated resources via Ray.

Confidence Score: 4/5

  • This is a clean revert of a previous PR with no custom modifications. The dependency downgrade is the primary risk factor.
  • Score of 4 because: (1) This is a straightforward revert with no manual conflict resolution, (2) The changes are well-scoped to GPU resource handling, (3) Tests were appropriately updated. Minor concerns: dependency downgrade may affect other features added in cosmos-xenna 0.1.8, and the MockGpuResources class appears unused.
  • nemo_curator/stages/video/clipping/clip_extraction_stages.py - uses cosmos_xenna internal APIs for GPU detection which may fail without GPUs present
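
One possible way to soften the GPU-detection concern is to wrap the probe and fall back to a CPU encoder when no GPU is found. This is only a sketch: `detect_gpu_name`, `choose_encoder`, and the `libx264` fallback are hypothetical names chosen for the example, and the real failure mode of the cosmos_xenna helpers is not known here.

```python
from typing import Callable, Optional


def detect_gpu_name(probe: Callable[[], str]) -> Optional[str]:
    """Return the GPU name reported by `probe`, or None if detection fails."""
    try:
        return probe()
    except Exception:  # broad on purpose: the real probe's exceptions are unknown
        return None


def choose_encoder(requested: str, gpu_name: Optional[str]) -> str:
    """Fall back to a CPU encoder (assumed name) when NVENC has no GPU to run on."""
    if requested == "h264_nvenc" and gpu_name is None:
        return "libx264"
    return requested
```

Passing the probe in as a callable keeps the sketch testable on machines without GPUs.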

Important Files Changed

| Filename | Overview |
| --- | --- |
| nemo_curator/stages/resources.py | Re-adds nvdecs and nvencs fields to Resources dataclass for NVIDIA hardware encoder/decoder support. Removes the entire_gpu auto-conversion to gpus=1.0 logic. Updates docstrings accordingly. |
| nemo_curator/stages/video/clipping/clip_extraction_stages.py | Restores GPU resource partitioning logic using _get_local_gpu_info() and _make_gpu_resources_from_gpu_name() from cosmos_xenna. When hardware encoding is enabled, dynamically calculates nvencs and gpu_memory_gb based on available GPU resources. |
| nemo_curator/backends/xenna/adapter.py | Changes import paths from cosmos_xenna.pipelines.private.resources to cosmos_xenna.ray_utils.resources for compatibility with cosmos-xenna 0.1.2. Re-adds nvdecs, nvencs, and entire_gpu fields to XennaResources mapping. |
| pyproject.toml | Downgrades cosmos-xenna dependency from 0.1.8 to 0.1.2 to revert to older API that supports nvdecs/nvencs resource allocation. |
| tests/stages/video/clipping/test_clip_transcoding_stage.py | Adds MockGpuResources class (currently unused). Removes two hwaccel tests that validated the old gpus-based resource allocation. |

Sequence Diagram

sequenceDiagram
    participant User as User
    participant Pipeline as Pipeline
    participant ClipTranscodingStage as ClipTranscodingStage
    participant Resources as Resources
    participant XennaAdapter as XennaStageAdapter
    participant Xenna as XennaResources

    User->>Pipeline: run()
    Pipeline->>ClipTranscodingStage: __post_init__()
    
    alt h264_nvenc or use_hwaccel
        ClipTranscodingStage->>ClipTranscodingStage: _get_local_gpu_info()
        ClipTranscodingStage->>ClipTranscodingStage: _make_gpu_resources_from_gpu_name()
        ClipTranscodingStage->>Resources: Resources(nvencs=..., gpu_memory_gb=...)
    else CPU encoding
        ClipTranscodingStage->>Resources: Resources(cpus=...)
    end
    
    Pipeline->>XennaAdapter: required_resources
    XennaAdapter->>Resources: Get cpus, gpus, nvdecs, nvencs, entire_gpu
    XennaAdapter->>Xenna: XennaResources(cpus, gpus, nvdecs, nvencs, entire_gpu)
    Xenna-->>XennaAdapter: Resource allocation
    XennaAdapter-->>Pipeline: Execute stage with resources
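
The `alt` branch in the diagram can be sketched roughly as follows. The `GpuCapabilities` type, the `plan_resources` function, and the `streams_per_gpu` and `encoder_threads` parameters are assumptions for illustration; the actual partitioning arithmetic in ClipTranscodingStage may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GpuCapabilities:
    """Hypothetical result of looking up a GPU model's hardware units."""

    nvencs: int
    gpu_memory_gb: float


@dataclass
class Resources:
    """Trimmed stand-in for the Resources dataclass (defaults are assumed)."""

    cpus: float = 1.0
    nvencs: int = 0
    gpu_memory_gb: float = 0.0


def plan_resources(
    encoder: str,
    use_hwaccel: bool,
    gpu: Optional[GpuCapabilities],
    streams_per_gpu: int = 1,
    encoder_threads: int = 1,
) -> Resources:
    """Pick GPU-unit resources for hardware encoding, CPU resources otherwise."""
    if encoder == "h264_nvenc" or use_hwaccel:
        if gpu is None:
            raise RuntimeError("hardware encoding requested but no GPU was detected")
        if streams_per_gpu < 1:
            raise ValueError("streams_per_gpu must be at least 1")
        # Assumed policy: split the GPU's encoders and memory evenly across streams.
        return Resources(
            nvencs=gpu.nvencs // streams_per_gpu,
            gpu_memory_gb=gpu.gpu_memory_gb / streams_per_gpu,
        )
    return Resources(cpus=float(encoder_threads))
```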

@greptile-apps greptile-apps bot left a comment

Additional Comments (1)

  1. tests/stages/video/clipping/test_clip_transcoding_stage.py, lines 31-42

    style: Both MockGpuInfo and MockGpuResources classes are defined but never used in any tests. Consider removing them or adding tests that utilize these mocks.
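
    If the mocks are kept, a test along these lines could exercise them. This is a minimal sketch: the mock fields, the `nvencs_for_local_gpu` function under test, and the injected lookups are all hypothetical stand-ins, not the definitions from the test file or from cosmos_xenna.

```python
from dataclasses import dataclass
from unittest import mock


@dataclass
class MockGpuInfo:
    """Hypothetical shape: only the GPU name is modeled."""

    name: str = "MOCK_GPU"


@dataclass
class MockGpuResources:
    """Hypothetical shape: hardware units for one GPU model."""

    nvencs: int = 2
    gpu_memory_gb: float = 40.0


def nvencs_for_local_gpu(get_gpu_info, make_gpu_resources) -> int:
    """Hypothetical code under test: NVENC count of the first local GPU."""
    gpu = get_gpu_info()[0]
    return make_gpu_resources(gpu.name).nvencs


def test_nvencs_come_from_mocked_gpu() -> None:
    get_gpu_info = mock.Mock(return_value=[MockGpuInfo()])
    make_gpu_resources = mock.Mock(return_value=MockGpuResources(nvencs=3))

    assert nvencs_for_local_gpu(get_gpu_info, make_gpu_resources) == 3
    # The lookup should be keyed on the name reported by the mocked GPU.
    make_gpu_resources.assert_called_once_with("MOCK_GPU")
```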

7 files reviewed, 1 comment

3 participants