The current pipeline for orchestration suffers from two major issues for production:
1/ very long build times
2/ inability to properly cache the result of the full-program optimizer.
On 1/, @romanc has done some preliminary profiling work, picking off some repeat offenders. The conclusion seems to be that everything is "kinda slow", as Python code tends to be. Moving to Cython or C is not an option considering the complexity of the DaCe parsing & compiling framework.
On 2/, the issue is due to parsing. The only reliable way to get a hash would be to hash the SDFG coming out of parsing, but that requires parsing first, which is itself slow. A quick brainstorm didn't lead to any other idea on how to avoid that parse, short of bookkeeping the file edit times recorded during the original parsing, which sounds brittle.
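To make the bookkeeping idea concrete, here is a minimal sketch, assuming the parser can report which source files it touched. The helper names (`snapshot_mtimes`, `cache_key`, `is_stale`) are hypothetical and not part of DaCe or gt4py.

```python
import hashlib
import os
from pathlib import Path
from typing import Dict, Iterable


def snapshot_mtimes(paths: Iterable[str]) -> Dict[str, int]:
    """Record the modification time (in ns) of every file seen while parsing."""
    return {str(Path(p).resolve()): os.stat(p).st_mtime_ns for p in paths}


def cache_key(snapshot: Dict[str, int]) -> str:
    """Fold the snapshot into one stable hex digest usable as a cache key."""
    digest = hashlib.sha256()
    # Sort so the key does not depend on the order files were recorded in.
    for path, mtime in sorted(snapshot.items()):
        digest.update(f"{path}:{mtime}\n".encode())
    return digest.hexdigest()


def is_stale(snapshot: Dict[str, int]) -> bool:
    """True if any recorded file is missing or was modified since the snapshot."""
    for path, mtime in snapshot.items():
        try:
            if os.stat(path).st_mtime_ns != mtime:
                return True
        except FileNotFoundError:
            return True
    return False
```

The brittleness shows up quickly: a fresh checkout or a `touch` changes the edit time without changing the content, and any file the parser did not record is invisible to the check.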
Next:
- profile with better control, so the analysis is separated into parse, optimize and compile
- clean up the `DebugInfo` fix and look for more
- reconvene with SPCL on a way to make parsing structurally faster
- check if freezing SDFGs can be done differently / more efficiently
- double-check the oir optimization pipeline: are there unnecessary passes that we could skip (in orchestration)?
- in the stencil oir, we lose `elif` and `else` branches (they are all converted to separate `if` statements). With many branches, this could lead to overly complicated code to parse down the line (e.g. for the `else` branches we have to parse the negated condition of the corresponding `if` statement; if we retained that information, we wouldn't have to do that)
- loose reference to an old issue with partly outdated, partly related issues in the (previous) gt4py/dace bridge.
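
For the first item in the list above, a minimal sketch of per-phase timing could look like the following. The functions `parse`, `optimize` and `compile_program` are hypothetical stand-ins for the real pipeline stages, and the `phase` helper is an assumption, not an existing DaCe or gt4py utility.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per phase name.
phase_seconds = defaultdict(float)


@contextmanager
def phase(name):
    """Add the wall-clock time spent inside the block to phase_seconds[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        phase_seconds[name] += time.perf_counter() - start


# Hypothetical stand-ins for the real pipeline stages.
def parse():
    time.sleep(0.01)


def optimize():
    time.sleep(0.01)


def compile_program():
    time.sleep(0.01)


def build():
    """Run the three stages, timing each one separately."""
    with phase("parse"):
        parse()
    with phase("optimize"):
        optimize()
    with phase("compile"):
        compile_program()


if __name__ == "__main__":
    build()
    for name, seconds in phase_seconds.items():
        print(f"{name}: {seconds:.3f}s")
```

Because the timer accumulates per name, the same `phase("parse")` block can wrap nested or repeated parsing calls and still report a single total per stage.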