no support for OR in partition predicates #1923

@echai58

Environment

Delta-rs version: 0.13.0

Binding: python

Environment:

  • Cloud provider: n/a
  • OS: ubuntu 22.04
  • Other: testing locally via jupyter notebook

Bug

What happened:
The Python type hints, the docstring comments, and the Rust implementation for partition predicates are inconsistent with each other.

In table.py, we see partitions: Optional[List[Tuple[str, str, Any]]] (https://github.com/delta-io/delta-rs/blob/main/python/deltalake/table.py#L704). The docstring there also references DeltaTable.files_by_partitions, which seems to be a deprecated method, so I'm going to assume that comment is out of date.

In _internal.pyi, we see Optional[FilterType] (https://github.com/delta-io/delta-rs/blob/main/python/deltalake/_internal.pyi#L86), which is defined as either a DNF or a conjunction (https://github.com/delta-io/delta-rs/blob/main/python/deltalake/_internal.pyi#L717-L720).
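For reference, the shape described above can be sketched roughly as follows. The alias names are written from memory of the stub file and should be treated as assumptions, not a verbatim copy of _internal.pyi.

```python
from typing import Any, List, Tuple, Union

# Alias names are assumptions; check _internal.pyi for the exact definitions.
FilterLiteralType = Tuple[str, str, Any]         # (column, operator, value)
FilterConjunctionType = List[FilterLiteralType]  # literals ANDed together
FilterDNFType = List[FilterConjunctionType]      # conjunctions ORed together
FilterType = Union[FilterConjunctionType, FilterDNFType]

# A plain conjunction: year = 2023 AND month = 11
conjunction: FilterType = [("year", "=", "2023"), ("month", "=", "11")]

# A DNF filter: (year = 2022) OR (year = 2023 AND month = 11)
dnf: FilterType = [
    [("year", "=", "2022")],
    [("year", "=", "2023"), ("month", "=", "11")],
]
```

The only structural difference is one extra level of list nesting, which is what would carry the OR.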

In lib.rs, we see PartitionFilterValue (https://github.com/delta-io/delta-rs/blob/main/python/src/lib.rs#L655), which is defined as a string, or a vector of strings (https://github.com/delta-io/delta-rs/blob/main/python/src/lib.rs#L60-L63).

To my understanding, these three should all define the same partition filter type, yet each is typed differently. The ones in table.py and lib.rs are relatively congruent with each other if we treat Any as Union[str, List[str]], which sort of makes sense.
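To make the "Any as Union[str, List[str]]" reading concrete, here is a minimal sketch of what a narrowed Python hint could look like, with a small runtime check. The names PartitionFilterValue, PartitionFilter, and is_partition_filter are hypothetical and only mirror the description of lib.rs above; they are not taken from the Python package.

```python
from typing import List, Tuple, Union

# Hypothetical narrowed aliases: the value is a string or a list of strings,
# mirroring the "string, or a vector of strings" description of lib.rs.
PartitionFilterValue = Union[str, List[str]]
PartitionFilter = Tuple[str, str, PartitionFilterValue]


def is_partition_filter(obj: object) -> bool:
    """Return True if obj looks like (column, op, str | list[str])."""
    if not (isinstance(obj, tuple) and len(obj) == 3):
        return False
    column, op, value = obj
    if not (isinstance(column, str) and isinstance(op, str)):
        return False
    if isinstance(value, str):
        return True
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


filters: List[PartitionFilter] = [
    ("year", "=", "2023"),
    ("month", "in", ["10", "11"]),
]
```

With a hint like this, a static type checker would flag a non-string value such as ("year", "=", 2023), which Any currently lets through.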

The bigger difference is with _internal.pyi, which presents a whole different form of filter. This DNF style would allow for OR in the predicate filter, but that seems to be impossible to express with the typing in table.py or lib.rs. Would it be difficult to add support for OR in partition predicates via this DNF style or another?
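To illustrate the semantics being asked for, here is a minimal pure-Python sketch of evaluating a DNF filter against partition values. It does not use delta-rs at all: matches_literal and matches_dnf are hypothetical helpers, partitions are modelled as plain dicts of strings, and the operator set is illustrative only.

```python
from typing import Dict, List, Tuple, Union

Literal = Tuple[str, str, Union[str, List[str]]]
Conjunction = List[Literal]
DNF = List[Conjunction]


def matches_literal(partition: Dict[str, str], literal: Literal) -> bool:
    """Check one (column, op, value) literal against a partition's values."""
    column, op, value = literal
    actual = partition.get(column)
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op in ("in", "not in"):
        if not isinstance(value, list):
            raise TypeError(f"{op!r} expects a list of strings")
        return (actual in value) if op == "in" else (actual not in value)
    raise ValueError(f"unsupported operator: {op!r}")


def matches_dnf(partition: Dict[str, str], dnf: DNF) -> bool:
    """OR across conjunctions, AND across the literals inside each one."""
    return any(
        all(matches_literal(partition, lit) for lit in conjunction)
        for conjunction in dnf
    )


partitions = [
    {"year": "2021", "month": "5"},
    {"year": "2022", "month": "1"},
    {"year": "2023", "month": "11"},
    {"year": "2023", "month": "3"},
]

# (year = 2022) OR (year = 2023 AND month = 11)
dnf: DNF = [
    [("year", "=", "2022")],
    [("year", "=", "2023"), ("month", "=", "11")],
]

selected = [p for p in partitions if matches_dnf(p, dnf)]
```

Here selected contains the 2022 partition and the 2023/11 partition, a result that a single conjunction cannot express.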

What you expected to happen:
At a minimum, given the code as it currently stands, I am not sure what the correct typing for predicate filters is. If the Python one is incorrect, it should be fixed so that the type hints are accurate and static type checking works correctly when using the Python bindings.

Perhaps this should be raised in a Feature Request, but I'm also wondering why there isn't support for OR in predicate filters?

Labels: enhancement (New feature or request)
