This issue documents a known problem on systems with the Slingshot 11 network. Legion runs fail with the following error when launched on only 1 node:
*** FATAL ERROR (proc 0): in gasnetc_ofi_init() at .../gasnet_ofi.c:946: fi_domain failed: -38(Function not implemented)
I have been told that this is an issue with the SLURM integration, and therefore is not something that Legion/GASNet are in a position to directly address.
In the meantime, I'm aware of two workarounds:
- Use 2 or more nodes
- Run with srun --network=single_node_vni
I will update this issue when the workarounds are no longer required.
Edit: I understand that the issue is related to SLURM settings at OLCF, not necessarily to Slingshot 11 per se.
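For job scripts that are used at varying node counts, the second workaround could be applied conditionally. The following is only a minimal sketch, not a tested recipe for OLCF. It assumes the script runs inside a Slurm allocation where SLURM_JOB_NUM_NODES is set, and it treats an unset value as a single node. The helper name single_node_network_flag and the application path ./my_legion_app are hypothetical.

```shell
#!/bin/sh
# Sketch: emit the single-node workaround flag only when the job has 1 node.
# single_node_network_flag is a hypothetical helper name, not part of Slurm or Legion.
# The node count is taken from the first argument if given, otherwise from
# SLURM_JOB_NUM_NODES, and is assumed to be 1 if neither is available.
single_node_network_flag() {
    nodes="${1:-${SLURM_JOB_NUM_NODES:-1}}"
    if [ "$nodes" -eq 1 ]; then
        printf '%s\n' '--network=single_node_vni'
    fi
}

# Example usage inside a batch script (./my_legion_app is a placeholder):
#   srun $(single_node_network_flag) ./my_legion_app
```

On 2 or more nodes the helper prints nothing, so the srun command line is left unchanged, which matches the first workaround.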