Contributor

@mzient mzient commented Dec 30, 2025

Category:

Bug fix (non-breaking change which fixes an issue)
Other (dependencies)

Description:

This PR fixes the handling of DALI enums in NDD tensors and batches.
Prior to this change, when a dtype was passed to the Tensor constructor, DALI enums were not properly converted and an error was raised. This PR fixes that and adds tests for Tensor and Batch.

This PR also adds numpy as an explicit DALI dependency (we require it anyway to perform basic tasks).

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

The tests check that:
a) enums can be converted with automatic type deduction
b) enums can be converted with explicit dtype
c) integers can be converted to enums with explicit dtype

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Signed-off-by: Michał Zientkiewicz <[email protected]>

greptile-apps bot commented Dec 30, 2025

Greptile Summary

Fixed enum handling in the NDD (NVIDIA DALI Dynamic) Tensor and Batch constructors so that DALI enum types are converted properly when an explicit dtype parameter is provided.

Key Changes:

  • In _tensor.py: Added logic to detect enum dtypes (dtype.kind == DType.Kind.enum), convert data to int32, create TensorCPU/GPU storage, then reinterpret with correct enum type_id
  • In _batch.py: Added dtype normalization to convert non-DType objects to DType instances before tensor processing
  • In setup.py.in: Added numpy as explicit dependency
  • Moved numpy import to module-level in _tensor.py for consistency
  • Added comprehensive tests covering three scenarios: auto enum detection, explicit enum dtype, and integer-to-enum conversion with explicit dtype
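The enum path described in these key changes can be sketched without DALI at all. What follows is a minimal, NumPy-only illustration, not the code from `_tensor.py` or `_batch.py`: `ToyDType`, `Kind`, `InterpType`, `normalize_dtype`, and `make_storage` are hypothetical names invented for this example, and the real change reinterprets backend storage with `dtype.type_id` rather than returning a tuple.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import Optional

import numpy as np


class Kind(Enum):
    # Hypothetical stand-in for DType.Kind
    signed = "signed"
    enum = "enum"


class InterpType(IntEnum):
    # Hypothetical enum used only for this illustration
    NEAREST = 0
    LINEAR = 1
    CUBIC = 2


@dataclass(frozen=True)
class ToyDType:
    # Hypothetical, heavily simplified model of a dtype descriptor
    name: str
    kind: Kind
    numpy_type: type
    enum_cls: Optional[type] = None


def normalize_dtype(dtype):
    """Accept either a ToyDType or an IntEnum class and return a ToyDType."""
    if isinstance(dtype, ToyDType):
        return dtype
    if isinstance(dtype, type) and issubclass(dtype, IntEnum):
        return ToyDType(dtype.__name__, Kind.enum, np.int32, dtype)
    raise TypeError(f"unsupported dtype: {dtype!r}")


def make_storage(data, dtype):
    """Build the raw array and return it together with the normalized dtype."""
    dtype = normalize_dtype(dtype)
    if dtype.kind is Kind.enum:
        # Enum values are stored as int32; the enum identity travels in the dtype.
        return np.array(data, dtype=np.int32), dtype
    return np.array(data, dtype=dtype.numpy_type), dtype


# Explicit enum dtype with enum inputs
raw, dt = make_storage([InterpType.LINEAR, InterpType.CUBIC], InterpType)
print(raw.dtype, raw.tolist(), dt.kind.name)  # int32 [1, 2] enum

# Explicit enum dtype with plain integer inputs
raw_int, dt_int = make_storage([0, 2], InterpType)
print([dt_int.enum_cls(v).name for v in raw_int.tolist()])  # ['NEAREST', 'CUBIC']
```

The point of the two-step shape is that the array layer only ever sees a plain integer type, while the enum identity is carried separately and applied afterwards.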

Confidence Score: 5/5

  • This PR is safe to merge with no identified issues
  • The fix is well-implemented with proper enum type handling, comprehensive test coverage for all three scenarios (auto detection, explicit dtype, int-to-enum conversion), and follows existing code patterns. The numpy dependency addition is appropriate given its usage.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| dali/python/nvidia/dali/experimental/dynamic/_tensor.py | Added enum handling for dtype parameter - converts enums to int32, creates TensorCPU, then reinterprets with correct type_id. Moved numpy import to top-level. |
| dali/python/nvidia/dali/experimental/dynamic/_batch.py | Added dtype normalization converting non-DType objects to DType using _dtype() before processing tensors. |
| dali/python/setup.py.in | Added numpy as explicit dependency (previously optional/implicit). |

Sequence Diagram

sequenceDiagram
    participant User
    participant Tensor/Batch
    participant Constructor
    participant DType
    participant numpy
    participant Backend
    
    User->>Tensor/Batch: Create with dtype=enum
    Tensor/Batch->>Constructor: __init__(data, dtype=enum)
    
    alt dtype is not DType instance
        Constructor->>DType: _dtype(enum)
        DType-->>Constructor: DType object
    end
    
    alt dtype.kind == DType.Kind.enum
        Constructor->>numpy: Convert data to int32
        numpy-->>Constructor: int32 array
        Constructor->>Backend: Create TensorCPU/GPU(int32_array)
        Backend-->>Constructor: Storage with int32 dtype
        Constructor->>Backend: storage.reinterpret(dtype.type_id)
        Backend-->>Constructor: Storage with enum type_id
    else dtype is regular type
        Constructor->>numpy: Convert data to numpy_type
        numpy-->>Constructor: typed array
        Constructor->>Backend: Create TensorCPU/GPU(typed_array)
        Backend-->>Constructor: Storage with correct dtype
    end
    
    Constructor-->>Tensor/Batch: Initialized object
    Tensor/Batch-->>User: Tensor/Batch with enum dtype

@dali-automaton
Collaborator

CI MESSAGE: [40970165]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [40970165]: BUILD PASSED


  self._storage = _backend.TensorCPU(
-     np.array(data, dtype=nvidia.dali.types.to_numpy_type(dtype.type_id)),
+     np.array(data, dtype=numpy_type),
Collaborator

Suggested change
- np.array(data, dtype=numpy_type),
+ np.asarray(data, dtype=numpy_type),

Contributor Author

I think I actually need this copy.
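For readers weighing the suggestion against this reply, the difference between the two calls is standard NumPy behavior: `np.array` copies its input by default, while `np.asarray` returns the input itself when it is already an `ndarray` of the requested dtype. The thread does not say why the copy is needed here, so the snippet below only illustrates the general semantics with a made-up `source` array.

```python
import numpy as np

source = np.arange(4, dtype=np.int32)

copied = np.array(source, dtype=np.int32)     # new buffer (copy=True is the default)
aliased = np.asarray(source, dtype=np.int32)  # no conversion needed, so no copy is made

# Mutate the original after constructing both views of the data.
source[0] = 99

print(copied[0])                          # 0
print(aliased[0])                         # 99
print(aliased is source)                  # True
print(np.shares_memory(source, copied))   # False
```

If the input is a Python list or needs a dtype conversion, both calls allocate a new array, so the distinction only matters when the caller passes an existing array of the matching type.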

@rostan-t rostan-t self-assigned this Dec 30, 2025
@dali-automaton
Collaborator

CI MESSAGE: [40997933]: BUILD STARTED

@mzient mzient merged commit 69d499e into NVIDIA:main Dec 31, 2025
6 of 7 checks passed
@dali-automaton
Collaborator

CI MESSAGE: [40997933]: BUILD PASSED

4 participants