Description
Using Structures and the nanobind Python/C++ binding library, we can outline a path to support tuples, dictionaries, and lists natively in SDFGs. The use of nanobind for Python-called functions may also reduce the overhead of CompiledSDFG calls, as a corollary.
A potential plan can be:
First, refactor dace/data.py into a dace/data/... folder that contains pydata.py (as well as tensor.py and other files).
In pydata, implement the following data container types:
- class PythonList(Array): Represented as a 1D array but implemented/code-generated as nb::list.
- class PythonTuple(Array): Same as list.
- class PythonClass(Structure): Represents a general Python class. Accessing fields in the class acts similarly to Structure fields: to access one, simply add a connector with the field's name. This also solves an issue where scalar fields in objects cannot be updated when using DaCe.
- class PythonDict(Structure): Similarly to classes, dictionary keys can be encoded as connectors. Control flow structures could iterate over items/keys/values (a PythonDictIterator data container, a subclass of PythonGenerator, might be introduced for this purpose).
  - Alternatively, and more generally (preferred!), PythonDict could extend a general KeyValueStore data container type that we will generically introduce to DaCe.
- class PythonGenerator(Stream): General stateful memory that, upon accessing with a read memlet, will generate a different value every time. The semantics are similar to DaCe streams (FIFO queues). Edges that write into a generator (without a set connector) are disallowed.
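To make the proposed hierarchy concrete, here is a minimal Python sketch. It is only an illustration: Data, Array, Structure, and Stream below are simplified stand-ins written for this example, not the real dace.data classes, and the connectors(), read(), and write() methods are hypothetical names chosen to mirror the semantics described above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Tuple


# Simplified stand-ins for illustration only. They are NOT the real
# dace.data classes; only the names and the inheritance mirror the proposal.
@dataclass
class Data:
    name: str


@dataclass
class Array(Data):
    dtype: type
    shape: Tuple[int, ...]


@dataclass
class Structure(Data):
    members: Dict[str, Data] = field(default_factory=dict)

    def connectors(self) -> List[str]:
        # Hypothetical helper: one connector per member, named after it.
        return sorted(self.members)


@dataclass
class Stream(Data):
    dtype: type


class PythonList(Array):
    """1D array; code generation would emit nb::list."""


class PythonTuple(Array):
    """Treated the same way as PythonList."""


class PythonClass(Structure):
    """General Python class; fields are accessed through connectors."""


class PythonGenerator(Stream):
    """Stateful memory: each read produces the next value."""

    def __init__(self, name: str, dtype: type, source: Iterator[Any]):
        super().__init__(name, dtype)
        self._source = source

    def read(self) -> Any:
        # Every read consumes one element, like popping from a FIFO queue.
        return next(self._source)

    def write(self, value: Any) -> None:
        # Writing without a `set` connector is disallowed in the proposal.
        raise TypeError(f"cannot write into generator '{self.name}'")
```

In this sketch, a PythonClass with members x and y would expose the connectors x and y, and two consecutive reads from a PythonGenerator return two different elements of its source.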
The data container classes are strongly typed (i.e., PythonDict has a specific Data entry for its key and for its value). To deal with Pythonic weak typing, a PythonUnion data container class might be introduced, but its use would be discouraged. nanobind should be good at throwing exceptions if we evaluate the wrong type at runtime.
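The strong-typing idea can be sketched in plain Python as follows. This is an assumption-laden illustration, not a proposed implementation: the PythonDict and PythonUnion classes here are standalone stand-ins (they do not extend Structure), the check() and accepts() methods are hypothetical, and the TypeError only imitates the kind of runtime failure that nanobind would be expected to report.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class PythonUnion:
    """Hypothetical: accepts a value matching any of the listed types."""
    options: Tuple[type, ...]

    def accepts(self, value: Any) -> bool:
        return isinstance(value, self.options)


def _matches(expected: Any, value: Any) -> bool:
    # `expected` is either a plain Python type or a PythonUnion.
    if isinstance(expected, PythonUnion):
        return expected.accepts(value)
    return isinstance(value, expected)


@dataclass(frozen=True)
class PythonDict:
    """Hypothetical descriptor with one fixed key type and one value type."""
    key: Any
    value: Any

    def check(self, obj: dict) -> None:
        # Reject the first entry whose key or value has the wrong type.
        for k, v in obj.items():
            if not _matches(self.key, k):
                raise TypeError(f"key {k!r} does not match {self.key}")
            if not _matches(self.value, v):
                raise TypeError(f"value {v!r} does not match {self.value}")


# A dictionary from strings to numbers, where "number" needs a union.
desc = PythonDict(key=str, value=PythonUnion((int, float)))
desc.check({"a": 1, "b": 2.5})  # passes silently
```

Passing {"a": "text"} to desc.check would raise a TypeError in this sketch, since a string matches neither option of the union.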
Code generation will be adapted to emit nb::dict/nb::ndarray/nb::list/nb::object etc. Classes will contain fields that are captured at marshalling time.
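One way to read "captured at marshalling time" is sketched below in Python. The helper names marshal_fields and unmarshal_fields and the Particle class are hypothetical, and the in-place increment merely stands in for whatever the compiled program would do; the sketch also assumes objects that store their attributes in an instance `__dict__` (no `__slots__`).

```python
from typing import Any, Dict


def marshal_fields(obj: Any) -> Dict[str, Any]:
    """Snapshot the instance attributes that exist at call time."""
    return dict(vars(obj))


def unmarshal_fields(obj: Any, fields: Dict[str, Any]) -> None:
    """Write the (possibly updated) field values back to the object."""
    for name, value in fields.items():
        setattr(obj, name, value)


class Particle:
    def __init__(self, mass: float, steps: int):
        self.mass = mass
        self.steps = steps


p = Particle(mass=2.0, steps=0)
fields = marshal_fields(p)

# Stand-in for the compiled program updating a scalar field.
fields["steps"] += 1

# Attributes added after marshalling are not part of the snapshot.
p.later = "added after marshalling"

unmarshal_fields(p, fields)
```

Writing the snapshot back is what lets a scalar field such as steps reflect the update on the Python side, while attributes created after the snapshot are left untouched.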
Note that this solution is not intended to generate the highest-performing code, but to create useful shims to/from existing Python code.