The meta passed to map_partitions is not used during graph optimization #571

@chuyuanliu

When map_partitions is called with an explicit meta, the function is still evaluated with a typetracer if optimize_graph is enabled. The example below prints "test_func called with typetracer" once.

import awkward as ak
import dask_awkward as dak


def test_func(array):
    if ak.backend(array) == "typetracer":
        print("test_func called with typetracer")
    else:
        print("test_func called with array")
    return array


array = ak.Array({"test": [[1, 2, 3], [4, 5], [6, 7, 8]]})
test = dak.from_awkward(array, npartitions=1)
meta = ak.Array(array.layout.to_typetracer(forget_length=True))
result = test.map_partitions(test_func, meta=meta)
result.compute(optimize_graph=True)

Is this behavior expected? Would it be possible to store a copy of meta and return it during optimization instead of calling the function? This would be useful when some operations inside the function do not accept typetracers as arguments but the structure of the returned array is known in advance.
