[AIEX] Postpipeliner Solver #939
we should keep --aie-postpipeliner-target-ii
Add postpipeliner solver tests for 9-instruction and 10-instruction MaxPool2D inner loops. Both loops currently settle at II=7 when the solver uses a computed NS that is too small to find a valid schedule at II=4.
…unt)
Previously the post-pipeliner only retried with NS+1 at the boundary case where NS happened to equal MinTripCount. Drop that guard so the solver always gets two extra attempts (SEF on, then SEF off) when the NS computed from MinLength/II is too small. Solver runtime is bounded by the existing per-call timeout.
…arget-ii
Introduce --aie-postpipeliner-solver as a simple boolean flag that runs the solver as a fallback after heuristics fail at every II. Keep the existing --aie-postpipeliner-target-ii option (and the loop-pragma initiation interval) with stricter semantics: when TargetII is set the solver runs only at that II, heuristics are skipped for it, and the --aie-postpipeliner-maxii cap is bypassed so the requested II is always attempted (one-shot, no fallback to other IIs).
Precedence (highest first):
1. --aie-postpipeliner-target-ii=N (CLI)
2. --aie-postpipeliner-solver (CLI)
3. Loop-pragma initiation interval
The hail mary and SEF solver paths are preserved unchanged.
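The precedence rules in this commit message can be sketched as a small resolution function. This is only an illustration under stated assumptions: `SolverMode`, `SolverPlan` and `resolveSolverPlan` are hypothetical names, not identifiers from the patch, and the sketch assumes that a loop-pragma II behaves like a TargetII when neither CLI option is given.

```cpp
#include <optional>

// Hypothetical names for illustration; they do not mirror the actual patch.
enum class SolverMode {
  HeuristicsOnly,   // no solver requested
  SolverFallback,   // heuristics first, solver after they fail at every II
  SolverAtTargetII  // one-shot: solver only, at exactly the requested II
};

struct SolverPlan {
  SolverMode Mode;
  std::optional<unsigned> TargetII; // set only for SolverAtTargetII
};

// Resolve the three inputs in the stated order, highest precedence first.
SolverPlan resolveSolverPlan(std::optional<unsigned> CliTargetII,
                             bool CliSolverFlag,
                             std::optional<unsigned> PragmaII) {
  if (CliTargetII) // 1. --aie-postpipeliner-target-ii=N
    return {SolverMode::SolverAtTargetII, CliTargetII};
  if (CliSolverFlag) // 2. --aie-postpipeliner-solver
    return {SolverMode::SolverFallback, std::nullopt};
  if (PragmaII) // 3. loop-pragma initiation interval (assumed to act as TargetII)
    return {SolverMode::SolverAtTargetII, PragmaII};
  return {SolverMode::HeuristicsOnly, std::nullopt};
}
```

Keeping the decision in one place makes it easy to see that the boolean flag overrides a pragma II but never a CLI TargetII.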
Extend genConflict to take per-instruction cycle offsets (OffsetA, OffsetB) and add SWPSolver::resourceConflicts(Data) which derives Required/Reserved FU uses from each instruction's itinerary and forbids any pair from co-occupying the same modular cycle. Reserved-Reserved pairs are intentionally not constrained, matching FuncUnitWrapper::conflict() semantics. Memory-bank conflicts keep working unchanged via the offset=0 default.
The producer in PostPipeliner::createSolverData walks each MI's InstrItineraryData stages with cumulative cycle offsets and registers every (FU, offset) into SolverData. AIE redefines FUNCUNIT_REPRESENTATION(x) to (x) in MCTargetDesc/AIEMCTargetDesc.cpp, so InstrStage::getUnits() returns the FU's bit position rather than a mask; the producer lifts the position to a one-bit mask once before storing.
CHECK lines for the 3 solver-driven tests whose schedules legitimately shifted under the new constraints are regenerated in the same commit.
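A minimal sketch of the constraint described here, assuming 64-bit masks instead of BitVector and hypothetical names (`FUUse`, `liftToMask`, `usesClash`, `resourceConflict`) that are not taken from the patch. It checks, for two instructions at given issue cycles, whether any pair of FU uses lands on the same cycle modulo II, skipping Reserved-Reserved pairs as the commit message states.

```cpp
#include <cstdint>
#include <vector>

// One functional-unit use of an instruction, relative to its issue cycle.
// Hypothetical stand-in for the per-(FU, offset) records in SolverData.
struct FUUse {
  int CycleOffset;
  std::uint64_t Required; // one bit per FU
  std::uint64_t Reserved; // one bit per FU
};

// Lift an FU bit position to a one-bit mask (done once, before storing).
std::uint64_t liftToMask(unsigned BitPos) { return std::uint64_t{1} << BitPos; }

// Two uses clash when they share an FU, unless both sides only reserve it.
bool usesClash(const FUUse &A, const FUUse &B) {
  return (A.Required & B.Required) != 0 || (A.Required & B.Reserved) != 0 ||
         (A.Reserved & B.Required) != 0;
}

// Non-negative remainder, so negative cycles wrap correctly.
int wrapCycle(int Cycle, int II) { return ((Cycle % II) + II) % II; }

// True if instructions A and B, issued at CycleA and CycleB, co-occupy an FU
// in the same modular cycle of a loop with initiation interval II.
bool resourceConflict(const std::vector<FUUse> &A, int CycleA,
                      const std::vector<FUUse> &B, int CycleB, int II) {
  for (const FUUse &UA : A)
    for (const FUUse &UB : B)
      if (wrapCycle(CycleA + UA.CycleOffset, II) ==
              wrapCycle(CycleB + UB.CycleOffset, II) &&
          usesClash(UA, UB))
        return true;
  return false;
}
```

With all offsets left at zero this degenerates to a plain same-slot check, which is consistent with the note that memory-bank conflicts keep working through the offset=0 default.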
    // II == TargetII is handled inside the post-pipeliner.
    BS.FixPoint.II = PostSWP.isTargetIIHardLimit()
                         ? PostSWP.getTargetII()
                         : PostSWP.getResMII(*BS.TheBlock);
I think we should simplify. When we drive an example from the command line, we want to say where we start and where we stop. Orthogonal to that, we want to say which approaches to enable. In my branch I have introduced a MinII CLI option.
    if (++BS.FixPoint.II <= PostPipelinerMaxII &&
        ++BS.FixPoint.IITries <= PostPipelinerMaxTryII) {
      return SchedulingStage::Pipelining;
    }
So, a one-shot attempt can be built from MinII and MaxII and by enabling the algorithms that you want to act on it.
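To illustrate the suggestion, here is a rough sketch of an II search driven only by a start, a stop and a try limit. `IIRange`, `firstFeasibleII` and the `TrySchedule` callback are hypothetical names; the callback stands in for whichever algorithms are enabled. Setting MinII equal to MaxII gives the one-shot behaviour without a dedicated mode.

```cpp
#include <functional>
#include <optional>

// Hypothetical II window: where the search starts and where it stops.
struct IIRange {
  unsigned MinII;
  unsigned MaxII;
};

// Try each II in [MinII, MaxII], at most MaxTries times, and return the first
// II for which TrySchedule succeeds. TrySchedule represents the enabled
// approaches (heuristics, solver, ...) acting on one II.
std::optional<unsigned>
firstFeasibleII(IIRange Range, unsigned MaxTries,
                const std::function<bool(unsigned)> &TrySchedule) {
  unsigned Tries = 0;
  for (unsigned II = Range.MinII; II <= Range.MaxII && Tries < MaxTries;
       ++II, ++Tries) {
    if (TrySchedule(II))
      return II;
  }
  return std::nullopt;
}
```

In this shape the choice of II window and the choice of algorithms stay independent, which is the orthogonality the comment asks for.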
    if (!Itin || Itin->isEmpty()) {
      return;
    }
    const unsigned SchedClass = MI->getDesc().getSchedClass();
No. TII->getSchedClass().
We use the current pattern all across the codebase.
Should I change it everywhere in the AIE target where TII is available?
        }
        Cycle += IS.getNextCycles();
      }
    }
And I'm not too happy about this parallel implementation of resource queries.
Perhaps we can publish anyStage() in AIEHazardRecognizer.
      int CycleOffset; // relative to the instruction's issue cycle
      BitVector Required;
      BitVector Reserved;
    };
You're reimplementing one component of FuncUnitWrapper. anyStage() in AIEHazardRecognizer yields the entire FuncUnitWrapper, which has a conflict() (overlap()?) method.
Even more direct, in your instruction x instruction matrix, you could generate a full scoreboard for one instruction (II + pipeline depth) and check each other instruction for the modulo distances at which they conflict in this scoreboard.
Talked offline: we do not need to worry too much about adding too many constraints, since they are cheap. We can reuse the AIEHazardRecognizer and add any conflict().
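The scoreboard idea from this thread could look roughly like the following. It is a simplified sketch, not the AIEHazardRecognizer API: `Footprint`, `foldScoreboard` and `conflictingDistances` are hypothetical names, each cycle carries a single FU mask (no Required/Reserved split), and the scoreboard is folded to II rows rather than kept at II plus pipeline depth.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// FU masks of one instruction, indexed by cycle offset from its issue cycle.
using Footprint = std::vector<std::uint64_t>;

// Fold an instruction's footprint into a modulo scoreboard with II rows.
std::vector<std::uint64_t> foldScoreboard(const Footprint &F, unsigned II) {
  std::vector<std::uint64_t> Rows(II, 0);
  for (std::size_t C = 0; C < F.size(); ++C)
    Rows[C % II] |= F[C];
  return Rows;
}

// Return every modulo distance D in [0, II) such that issuing B exactly
// D cycles (mod II) after A makes B collide with A's scoreboard.
std::vector<unsigned> conflictingDistances(const Footprint &A,
                                           const Footprint &B, unsigned II) {
  const std::vector<std::uint64_t> Board = foldScoreboard(A, II);
  std::vector<unsigned> Result;
  for (unsigned D = 0; D < II; ++D) {
    for (std::size_t C = 0; C < B.size(); ++C) {
      if ((Board[(D + C) % II] & B[C]) != 0) {
        Result.push_back(D);
        break; // one collision is enough to forbid this distance
      }
    }
  }
  return Result;
}
```

Each returned distance would become one forbidden (instruction pair, modulo distance) constraint for the solver. Since the thread concludes that such constraints are cheap, over-generating them is not a concern.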
This PR shifts the solver towards a prototyping tool that gives a best-effort estimate without requiring prior knowledge about the problem.
Increase Solver capabilities by allowing