[SPARK-56612][PYTHON] Unify verify_result and container-type checks into verify_return_type helper #55532

Open
Yicong-Huang wants to merge 12 commits into apache:master from Yicong-Huang:SPARK-56612
Yicong-Huang commented Apr 24, 2026

What changes were proposed in this pull request?

Introduce verify_return_type(result, expected_type) in worker.py:

  • Concrete type (e.g. pa.Table): perform an isinstance check and return result unchanged.
  • Iterator[T]: check that result is an Iterator, then return a lazy iterator that checks each element against T as it is consumed.
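The two branches above can be sketched roughly as follows. This is an illustrative approximation, not the code in this PR: it raises a plain TypeError where the real helper reports UDF_RETURN_TYPE, and the dispatch via typing.get_origin is an assumption about how Iterator[T] might be detected.

```python
import collections.abc
from typing import Any, Iterator, get_args, get_origin


def verify_return_type(result: Any, expected_type: Any) -> Any:
    """Illustrative sketch only; the actual helper in worker.py may differ."""
    if get_origin(expected_type) is collections.abc.Iterator:
        # Iterator[T]: the container must be a real iterator, not just an iterable.
        if not isinstance(result, collections.abc.Iterator):
            raise TypeError(f"expected an iterator, got {type(result).__name__}")
        args = get_args(expected_type)
        element_type = args[0] if args else object

        def checked() -> Iterator[Any]:
            # Elements are verified lazily, one at a time, as the caller consumes them.
            for element in result:
                if not isinstance(element, element_type):
                    raise TypeError(
                        f"expected {element_type.__name__}, "
                        f"got {type(element).__name__}"
                    )
                yield element

        return checked()

    # Concrete type: a single isinstance check, then pass the value through.
    if not isinstance(result, expected_type):
        raise TypeError(
            f"expected {expected_type.__name__}, got {type(result).__name__}"
        )
    return result
```

Because the iterator branch returns a generator, a wrong element type surfaces only when that element is pulled, not when verify_return_type is called.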

Apply it to:

  • SQL_MAP_ARROW_ITER_UDF and SQL_SCALAR_ARROW_ITER_UDF (replacing the old verify_result factory).
  • verify_arrow_table / verify_arrow_batch — inlined at their 3 call sites and removed.

Runtime now strictly requires an Iterator (was: any iterable), matching the declared contract ArrowMapIterFunction = Callable[[Iterator[...]], Iterator[...]] and the _name == "Iterator" convention in pyspark/sql/pandas/typehints.py.
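The distinction between an iterable and an Iterator can be seen with the standard collections.abc classes. The snippet below is a plain-Python illustration; the integers are hypothetical stand-ins for Arrow record batches, and no PySpark code is involved.

```python
from collections.abc import Iterable, Iterator

batches = [1, 2, 3]  # hypothetical stand-in for a list of record batches

# A list supports iteration but has no __next__, so it is not an Iterator.
list_is_iterable = isinstance(batches, Iterable)
list_is_iterator = isinstance(batches, Iterator)

# iter(...) and generator expressions both produce real Iterators.
iter_is_iterator = isinstance(iter(batches), Iterator)
gen_is_iterator = isinstance((b for b in batches), Iterator)

print(list_is_iterable, list_is_iterator, iter_is_iterator, gen_is_iterator)
```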

Why are the changes needed?

Part of SPARK-55388. One well-named helper unifies the iterator factory and scattered inline isinstance checks, enabling further eval-type refactors.

Does this PR introduce any user-facing change?

No API change. One minor behavior tightening: a UDF that returns a non-Iterator iterable (e.g. a list) is now rejected with a UDF_RETURN_TYPE error, aligning runtime behavior with the documented Iterator[...] signatures.
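For UDF authors, adapting to the stricter check would likely mean yielding results or wrapping a list in iter(). The functions below are hypothetical examples in plain Python, with integers standing in for Arrow batches; they only show which return shapes satisfy an Iterator check.

```python
from collections.abc import Iterator as AbcIterator
from typing import Iterator, List


def returns_list(batches: Iterator[int]) -> List[int]:
    # Returns a list: iterable, but not an Iterator, so it would now be rejected.
    return [b * 2 for b in batches]


def returns_generator(batches: Iterator[int]) -> Iterator[int]:
    # A generator function returns an Iterator and processes batches lazily.
    for b in batches:
        yield b * 2


def returns_wrapped(batches: Iterator[int]) -> Iterator[int]:
    # Wrapping an existing list in iter() also yields a real Iterator.
    return iter([b * 2 for b in batches])


for func in (returns_list, returns_generator, returns_wrapped):
    result = func(iter([1, 2, 3]))
    print(func.__name__, isinstance(result, AbcIterator))
```

The generator form is usually preferable because it keeps processing lazy, whereas iter() over a list still materializes every result first.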

How was this patch tested?

Existing integration coverage: test_arrow_map.py::test_other_than_recordbatch_iter exercises the iterator path; other verify_arrow_* callers (mapInArrow, grouped-map Arrow UDFs) exercise the concrete-type path.

Was this patch authored or co-authored using generative AI tooling?

No

@Yicong-Huang

@gaogaotiantian please kindly share your thoughts!

cc @zhengruifeng @HyukjinKwon
