Skip to content

Conversation

@katrinafyi
Copy link
Member

Replaces every line of IrreducibleLoops.scala. This reimplements the algorithm in a way much closer to the paper. This new code does much less explicit bookkeeping work. It replaces some old code that went through the dfsp_pos with new code that does some kind of topologically-ordered dynamic programming. In doing so, it is my hope that it is more obviously correct.

Also:

  • fixes certain bugs in the transform relating to topological order
  • removes state machine
  • adds mutability in data structures to make it closer to the paper
  • fixes analysis and transform for nested irreducible loops

Closes #410

youre gonan need a final there
- topological sorting based on dfspos to build nodes set
- construct edge sets?? find out what the transform needs. maybe just
  needs to identify internal vs external edges into a header node.
- in transform, what of assume conditions?? if a header has assumes,
  do we need to "hoist" that to the new reducible header?
subsequent loops after transforming earlier loops? also check assume
statements
i think we have to redirect them via N.

but only for those which originate at nodes which are /not/
the header???????????????????????????????????????????????
outermost loop where they remain valid. that is, they should be
hoisted to the outermost loop containing "to" where "from" is not
internal to the loop.

i think to do this we need some notion of "self nodes", the nodes which
appear in /only/ a particular cycle and no other subcycles. is a "self
node" simply defined by iloop_header??????

much to think about
This reverts commit ef4b8ec.
@katrinafyi
Copy link
Member Author

katrinafyi commented Oct 1, 2025

For those curious, here is an example run through the new pipeline. This is Fig 3 from the paper (there's a copy in Zulip).

image

It has an obvious irreducible loop in {a,b,c,d}, but it also has sub-loops. The subsets {b,c,d}, {c,d} are also their own loops because they each form a strongly-connected component. These loops are nested within each other, as depicted by the loop nesting forest. In the paper's tree diagram, the round loop nodes denote (primary) loop headers, and the square nodes are blocks within a loop which are not headers. To read off which nodes are "internal" to a particular loop or sub-loop, you would simply look at all descendants of the loop's header.

This figure is in the IrreducibleLoop test suite in the fig3 test case. You can run it with

./mill -w test.testOnly IrreducibleLoop -- -z 'paper fig3'

and this will print the BlockLoopInfo for each block in the CFG:

IrreducibleLoop:
- paper fig3
  + BlockLoopInfo(%S,None,1,Set(),Set()) 
  + BlockLoopInfo(%a,None,2,Set(%a, %d),Set(%a, %b, %c, %d)) 
  + BlockLoopInfo(%b,Some(%a),3,Set(%b, %d),Set(%b, %c, %d)) 
  + BlockLoopInfo(%c,Some(%b),4,Set(%c, %d),Set(%c, %d)) 
  + BlockLoopInfo(%d,Some(%c),5,Set(),Set()) 
  + BlockLoopInfo(%E,None,5,Set(),Set()) 

The fields in BlockLoopInfo, in order, are:

  • the block,
  • optional parent node in the loop nesting forest,
  • (less important rn) DFS order number, used in topological ordering,
  • set of loop headers, including secondary (irreducible) headers, and
  • set of loop nodes.

You can see that the information printed here matches the paper's loop nesting forest, with the small addition of irreducible loop headers which aren't in the paper diagram. Note that irreducible entries can enter any of the sub-loops of the irreducible header.

Now, we will run the transform and show the result in fig3.dot.pdf. This is harder to reason about because the paper doesn't cover the transform.

However, you can try to follow certain valid paths in the original graph and observe that they remain possible in the transformed graph. You can also try invalid paths and observe that they are still not possible.

  • a -> b -> c -> d is still possible. It just goes through all the normalised headers first.
  • a -> d is still impossible. Although there are edges to facilitate this path, upon reaching a you would set b_loop_from: int := 0x0; which is not an allowed value in the assume statements of d.

Running the loop analysis again gives us that all the loops are now reducible; the header sets have at most one element.

  + BlockLoopInfo(%S,None,1,Set(),Set()) 
  + BlockLoopInfo(%a_loop_N,None,2,Set(%a_loop_N),HashSet(%d, %b, %a_loop_N, %b_loop_N, %a, %c, %c_loop_N)) 
  + BlockLoopInfo(%a,Some(%a_loop_N),3,Set(),Set()) 
  + BlockLoopInfo(%b_loop_N,Some(%a_loop_N),4,Set(%b_loop_N),HashSet(%d, %b, %b_loop_N, %c, %c_loop_N)) 
  + BlockLoopInfo(%b,Some(%b_loop_N),5,Set(),Set()) 
  + BlockLoopInfo(%c_loop_N,Some(%b_loop_N),6,Set(%c_loop_N),Set(%c_loop_N, %d, %c)) 
  + BlockLoopInfo(%E,None,6,Set(),Set()) 
  + BlockLoopInfo(%c,Some(%c_loop_N),7,Set(),Set()) 
  + BlockLoopInfo(%d,Some(%c_loop_N),8,Set(),Set()) 

Finally, as a comparison with the old code, this is the same example run through the old loop detector:

IrreducibleLoop:
- paper fig3
  + %a: Header: block.a, Body: HashSet((block.a, block.b)), Nodes: HashSet(%a, %b), Reentries: HashSet((block.S, block.d)) 
  + %b: Header: block.b, Body: HashSet(), Nodes: HashSet(), Reentries: HashSet((block.S, block.d)) 
  + %c: Header: block.c, Body: HashSet(), Nodes: HashSet(), Reentries: HashSet((block.S, block.d)) 
  + AFTER 
  + %c_loop_N: Header: block.c_loop_N, Body: HashSet((block.c_loop_N, block.c), (block.c, block.b_loop_N), (block.c_loop_N, block.d)), Nodes: HashSet(%c_loop_N, %b_loop_N, %c, %d), Reentries: HashSet() 

This has some problems. The biggest problem is that the sets of internal nodes and body edges are mostly incomplete. They seem to be missing edges and nodes which appear in multiple loops. As a result of this (or other bugs), applying the transform leads to a broken CFG. After the transform, blocks a and b have no predecessors, and the exit block E is entirely disconnected from the rest graph ;-; This is rather surprising. I think the old algorithm was mostly broken when loops became nested.

This is the patch to add this test case on top of the old loop detector: 0001-add-irred-fig3-to-old-loop-detetcor.patch.

Comment on lines 160 to 167
case class BlockLoopState(
val b: Block,
var iloop_header: Option[Block],
var dfsp_pos: Int,
var dfsp_pos_max: Int,
var is_traversed: Boolean,
var headers: Set[Block]
) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is a case class instead of a class? It doesn't seem to use any case class features and having mutability in a case class can cause unintuitive behaviour so it's generally best avoided.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was probably because I wanted a nice tostring while I was working on it. I think this is nice to keep for debugging in future. I think the only problem is that hashcode may change, and that is only a problem if using it in a hashed collection.

I can add a comment warning against that if you want. Would that be enough, or should it be uncased?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanest approach is probably to just write a simple toString method to have that for debugging purposes and then make it a class?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be done, loath as I am to add more lines :')

@l-kent
Copy link
Contributor

l-kent commented Nov 3, 2025

This seems generally fine, apart from the one question I have about the mutable case class.

@katrinafyi katrinafyi merged commit 2a864c9 into main Nov 4, 2025
20 of 21 checks passed
@katrinafyi katrinafyi deleted the fused-irred-loop-fix branch November 4, 2025 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect irreducible loop llvm_transform loops sharing nodes

3 participants