Description
Chip-to-chip links can be blacklisted for two reasons:
- the link itself is misbehaving,
- the chip at one end of the link is dead (or blacklisted).
Links on blacklists are always associated with a chip, so the two ends of a link are independent and can be treated differently. To avoid an inconsistent state, both ends of a link should be blacklisted. This is possible only if the two connected chips are on the same board, because blacklists are local to a board and do not cross board boundaries. When the chips are on different boards, the broken link should be blacklisted only on the board where the broken end lives. It should not be blacklisted on the other board, because boards can be moved and that board could later be connected to a board whose link works.
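The blacklisting rule described above can be sketched in Python. This is only an illustration of the rule, not the existing blacklist tooling: the LinkEnd type, the board identifiers, and the function name are hypothetical, and the per-board blacklist is modelled as a plain set of (chip, link) pairs.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

Chip = Tuple[int, int]  # (x, y) chip coordinates


@dataclass(frozen=True)
class LinkEnd:
    """One end of a chip-to-chip link (hypothetical helper type)."""
    board: str  # identifier of the board the chip lives on (assumed)
    chip: Chip  # chip coordinates
    link: int   # link number on that chip


def blacklist_entries(broken: LinkEnd, other: LinkEnd) -> Dict[str, Set[Tuple[Chip, int]]]:
    """Return the (chip, link) entries to add to each board's blacklist.

    The broken end is always blacklisted on its own board. The other end
    is blacklisted only when it lives on the same board, because a
    blacklist cannot cross board boundaries.
    """
    entries = {broken.board: {(broken.chip, broken.link)}}
    if other.board == broken.board:
        entries[broken.board].add((other.chip, other.link))
    return entries


# Both chips on the same board: both ends are blacklisted.
same = blacklist_entries(LinkEnd("board-A", (0, 0), 0), LinkEnd("board-A", (1, 0), 3))

# Chips on neighbouring boards: only the broken end is blacklisted.
cross = blacklist_entries(LinkEnd("board-A", (7, 3), 0), LinkEnd("board-B", (8, 3), 3))
```

In the cross-board case the other board receives no entry, which is what leaves the far end of the link in an inconsistent state unless something else disables it.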
One way to handle the case where the two connected chips are on neighbouring boards is to disable the link in the FPGA on the board where the broken end of the blacklisted link lives, e.g., where the dead chip lives. This does not blacklist the other end, but it ensures that SCAMP disables the link in the chip at the other end during link probing. The result is a consistent disabled link state at both ends.
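The effect of the FPGA-level disable can be illustrated with a toy model. It is a deliberate simplification and does not reproduce the actual SCAMP probing or FPGA logic: it assumes that an end is enabled only if it is not blacklisted locally and its probe succeeds, and that the probe fails whenever the FPGA has disabled the link. The function name and parameters are hypothetical.

```python
from typing import Tuple


def link_states(broken_end_blacklisted: bool,
                other_end_blacklisted: bool,
                fpga_disabled: bool) -> Tuple[bool, bool]:
    """Toy model: return (broken_end_enabled, other_end_enabled).

    Assumption: link probing fails at both ends when the FPGA on the
    broken end's board has disabled the link.
    """
    probe_ok = not fpga_disabled
    broken_enabled = (not broken_end_blacklisted) and probe_ok
    other_enabled = (not other_end_blacklisted) and probe_ok
    return broken_enabled, other_enabled


# Blacklisting only the broken end leaves the two ends inconsistent.
without_fpga = link_states(True, False, fpga_disabled=False)

# Disabling the link in the FPGA as well makes both ends agree.
with_fpga = link_states(True, False, fpga_disabled=True)
```

Under these assumptions the first call yields one disabled end and one enabled end, while the second yields two disabled ends, without adding any entry to the other board's blacklist.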
The mechanism to disable links is already implemented in the FPGAs (see command xreg). Support for blacklist management needs to be added to the BMP API.
This issue complements issues:
- SCAMP: neighbouring chip link checks need a better approach #142
- SpiNNakerManchester/SpiNNaker_hardware_tests#2