PyKokkos team-level (nested) parallel dispatches do not support argument parsing: extra arguments passed to a nested dispatch are not handled during translation. For example, consider https://github.com/kokkos/pykokkos/blob/main/examples/kokkos-tutorials/standalone/team_policy.py.
If we change the workunit to
    @pk.workunit
    def yAx(team_member, acc, cols, y_view, x_view, A_view):
        j: int = team_member.league_rank()

        def inner_reduce(i: int, inner_acc: pk.Acc[float], c: float):
            inner_acc += A_view[j][i] * x_view[i] + c

        temp2: float = pk.parallel_reduce(
            pk.TeamThreadRange(team_member, cols), inner_reduce, c=0.5
        )
        if team_member.team_rank() == 0:
            acc += y_view[j] * temp2
that is, if we pass a keyword argument to inner_reduce, compilation of the generated code fails with:
/work/09661/gkk345/vista/pykokkos/pk_cpp/examples/kokkos-tutorials/standalone/team_policy/team_policy_yAx/types_a5d3098b014ebac4e68e6114dd8b833e/Cuda/../functor.hpp(29): error: identifier "inner_acc" is undefined
inner_acc += ((((A_view(j, i)) * (x_view(i)))) + (c));
^
/work/09661/gkk345/vista/pykokkos/pk_cpp/examples/kokkos-tutorials/standalone/team_policy/team_policy_yAx/types_a5d3098b014ebac4e68e6114dd8b833e/Cuda/../functor.hpp(29): error: identifier "c" is undefined
inner_acc += ((((A_view(j, i)) * (x_view(i)))) + (c));
^
2 errors detected in the compilation of "/work/09661/gkk345/vista/pykokkos/pk_cpp/examples/kokkos-tutorials/standalone/team_policy/team_policy_yAx/types_a5d3098b014ebac4e68e6114dd8b833e/Cuda/bindings.cpp".
gmake[2]: *** [CMakeFiles/kernel.cpython-313-aarch64-linux-gnu.dir/build.make:79: CMakeFiles/kernel.cpython-313-aarch64-linux-gnu.dir/bindings.cpp.o] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:87: CMakeFiles/kernel.cpython-313-aarch64-linux-gnu.dir/all] Error 2
gmake: *** [Makefile:136: all] Error 2
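This is independent of how argument parsing works for the outer (top-level) dispatch. It should be fixed either by supporting extra arguments in nested dispatches or by reporting a clear error during translation, before the generated C++ reaches the compiler.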
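As a rough illustration of the "better error message" option, the sketch below uses Python's standard ast module to find dispatch calls inside a workunit body that pass keyword arguments. It is not PyKokkos code: the helper find_nested_dispatch_kwargs, the DISPATCH_NAMES set, and the message wording are all hypothetical, and it assumes the check is run on the source of the workunit body only (not on host-side dispatch calls).

```python
import ast
import textwrap

# Hypothetical, not part of PyKokkos: dispatch call names to inspect.
DISPATCH_NAMES = {"parallel_for", "parallel_reduce", "parallel_scan"}


def find_nested_dispatch_kwargs(workunit_source: str) -> list:
    """Return (line, dispatch name, keyword names) for every dispatch call
    in the given workunit source that passes keyword arguments."""
    tree = ast.parse(textwrap.dedent(workunit_source))
    findings = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both pk.parallel_reduce(...) and a bare parallel_reduce(...).
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name in DISPATCH_NAMES and node.keywords:
            # kw.arg is None for a **kwargs expansion.
            kwargs = [kw.arg or "**" for kw in node.keywords]
            findings.append((node.lineno, name, kwargs))
    return findings


# The modified workunit body from this report (decorator omitted).
SOURCE = '''
def yAx(team_member, acc, cols, y_view, x_view, A_view):
    j: int = team_member.league_rank()

    def inner_reduce(i: int, inner_acc: pk.Acc[float], c: float):
        inner_acc += A_view[j][i] * x_view[i] + c

    temp2: float = pk.parallel_reduce(
        pk.TeamThreadRange(team_member, cols), inner_reduce, c=0.5
    )
    if team_member.team_rank() == 0:
        acc += y_view[j] * temp2
'''

for line, name, kwargs in find_nested_dispatch_kwargs(SOURCE):
    print(f"line {line}: nested {name} does not support keyword arguments {kwargs}")
```

A check like this could run before code generation, so the user sees the offending call in their Python source instead of an "identifier is undefined" error from the C++ compiler.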