[Data] from_daft() PyArrow version check is obsolete, blocks all modern Daft installations #63883

@jiangxt2

Problem

ray.data.from_daft() raises RuntimeError when PyArrow >= 14.0.0:

# read_api.py line 3256
if pyarrow_version >= parse_version("14.0.0"):
    raise RuntimeError("`from_daft` only works with PyArrow 13 or lower.")

This check was added because Daft previously had compatibility issues with PyArrow >= 14 (see #53278). Daft has since fixed them: as of March 2026, Daft requires PyArrow >= 15.0.0.

Impact

Any user with a modern Daft installation (which requires PyArrow >= 15) will hit this error and cannot use from_daft() at all. The previous advice to "just upgrade pyarrow" (#53278) is now inverted — users cannot downgrade PyArrow below 14 without also breaking Daft.
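The conflict can be illustrated with a small standalone sketch. It does not import Ray, Daft, or PyArrow. The helper names (`version_tuple`, `passes_ray_guard`, `satisfies_daft`) and the candidate version list are hypothetical and exist only for illustration; the two bounds are the ones quoted in this issue.

```python
from __future__ import annotations


def version_tuple(version: str) -> tuple[int, ...]:
    """Hypothetical helper: turn '15.0.0' into (15, 0, 0) for comparison."""
    return tuple(int(part) for part in version.split(".")[:3])


# from_daft() raises RuntimeError at or above this version (per the check above).
RAY_GUARD_UPPER_BOUND = version_tuple("14.0.0")
# Daft's PyArrow requirement as stated in this issue.
DAFT_MINIMUM = version_tuple("15.0.0")


def passes_ray_guard(pyarrow_version: str) -> bool:
    """True if the existing from_daft() check would NOT raise."""
    return version_tuple(pyarrow_version) < RAY_GUARD_UPPER_BOUND


def satisfies_daft(pyarrow_version: str) -> bool:
    """True if a modern Daft installation accepts this PyArrow version."""
    return version_tuple(pyarrow_version) >= DAFT_MINIMUM


# Illustrative sample of versions, not an exhaustive list of releases.
candidates = ["13.0.0", "14.0.2", "15.0.0", "17.0.0"]
usable = [v for v in candidates if passes_ray_guard(v) and satisfies_daft(v)]
print(usable)
```

Because the guard needs a version below 14 and Daft needs one at or above 15, no candidate can satisfy both, so `usable` is empty.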

Proposed Fix

  1. Remove the PyArrow version check and RuntimeError from from_daft()
  2. Remove the unused get_pyarrow_version and parse_version imports from read_api.py
  3. Remove the test_from_daft_raises_error_on_pyarrow_14 test
  4. Remove the @pytest.mark.skipif from test_daft_round_trip

The function body simply calls df.to_ray_dataset(), which works fine with modern Daft + PyArrow combinations.
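A minimal sketch of what the function might look like after step 1, assuming the body only delegates to `to_ray_dataset()` as described above. `FakeDaftDataFrame` is a hypothetical stand-in so the snippet runs without Daft or Ray installed; the real function in read_api.py has a typed signature and docstring that are not reproduced here.

```python
class FakeDaftDataFrame:
    """Hypothetical stand-in for daft.DataFrame, used only for this sketch."""

    def __init__(self, rows):
        self.rows = rows

    def to_ray_dataset(self):
        # The real method returns a Ray Dataset; a list stands in for it here.
        return list(self.rows)


def from_daft(df):
    # No PyArrow version guard: delegate straight to Daft's own conversion.
    return df.to_ray_dataset()


result = from_daft(FakeDaftDataFrame([{"a": 1}, {"a": 2}]))
print(result)
```

With the guard gone, any version incompatibility would surface from Daft's own conversion path rather than from a hard-coded check in Ray.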
