Route generated-script tool calls through a resolver [CI 2/9]#1600
Merged
Conversation
FINN's transformations write bash scripts that invoke vivado, v++, vitis_hls, and vitis-run by bare name. Route these calls through a resolver so a site that wraps the tools (for example, a HPC cluster or LSF bsub dispatcher that feeds a farm) can intercept them through a single override. Add resolve_xilinx_tool(tool_name): with FINN_TOOL_DIR_OVERRIDE set it resolves to <override>/<tool_name>, otherwise the bare name, and raises FileNotFoundError if the result is not found. Route the vivado, v++, vitis_hls, and vitis-run call sites in CreateStitchedIP, MakeZynqProject, the Alveo build and the FINN loop RTL backend through it. The wrapping site then needs only one shim directory whose filenames mathc the tool names. With the override unset the embedded command is the bare tool name, same as before. The four vivado and v++ sites additionally gain a fail-fast check they previously didn't have, so a missing tool surfaces at transform time with a clear message instead of as a later bash failure. Signed-off-by: Marco Blackwell <mblackwe@amd.com>
879aa5f to
7283e4c
Compare
Contributor
Author
|
Changed to stop using tmp_path |
Collaborator
|
Looks good, thank you @merkelmarrow ! |
merkelmarrow
added a commit
to merkelmarrow/finn
that referenced
this pull request
Jun 12, 2026
The XSI rtlsim compile step floats SystemVerilog packages ahead of the
modules that import them, because xelab aborts with VRFC 10-2989
("<pkg> is not declared") and writes no xsimk.so when a package is
analysed too late. It did this with a hardcoded list of package names
that has had to grow reactively: ca70a78 ("[finn_xsi] Sort list of
hdl sources to make sure configs come first") floated swg_pkg, then
240dd53 ("[finnxsi] Add mvu_pkg to be loaded first") added mvu_pkg
when a second finn-rtllib package hit the same thing. Each new package
means another red CI run and another manual edit.
Replace the name list with a stable partition that floats any
*_pkg.{sv,v} to the front by basename and otherwise preserves the
caller's order, so new finn-rtllib packages need no further changes.
The helper lives in a small srcutil module with no xsi C-estension
import so it can be unit tested without a built finn_xsi.
Route the xelab call through resolve_xilinx_tool (the resolver added
in Xilinx#1600) so it can be delegated like the other heavy tools.
Add a check flag to launch_process_helper so a non-zero xelab exit
raises immediately. Previously, a failed xelab returned quietly and
surfaced one line later as a FileNotFoundError on the missing xsimk.so,
hiding the real elaboration error. Also decode with errors="replace"
so noisy tool output cannot abort the run with a decode error.
Signed-off-by: Marco Blackwell <mblackwe@amd.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is PR 2 of 9 of a series intended to make CI faster and more robust.
This patch is the first building block of a new testing infrastructure model. Most of FINN's tests complete relatively quickly on a single machine. The slow parts are almost exclusively the calls to xilinx tools such as Vivado, v++, and vitis-run/hls. This PR introduces a tool resolver hook in basic.py that allows a full testing suite to be centred on a single machine, while intercepting heavy tool calls and delegating to a compute cluster through user-written shims.
I chose this approach over a pure PATH shim entirely on the deploy-side because in practice it introduced obscure problems and unintended shell/subprocess interactions and is less user friendly for those wanting to set up their own testing. This model is explicit for the end user who wants to run their own infrastructure:
Behaviour is mostly identical to before if FINN_TOOL_DIR_OVERRIDE env var isn't set aside from clear fail-fast error if tool is missing.
Xelab note
Xelab is added to the resolver in a separate PR as it requires special handling.
Maintenance note
New heavy tool calls should be added to and called through the resolver. Every time a tool is added (which should be quite rare), it will fail CI tests that use the tool until a corresponding shim is written on the deployment side (assuming FINN_TOOL_DIR_OVERRIDE is set on that deployment). This is the consequence of an explicit model and preferable to a silent fallback that would eventually cause OOMs or load saturation on the single test machine. The tool calls now provide a clear fail-fast error if they are missing.