-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add shared basic block library #18497
base: main
Are you sure you want to change the base?
Conversation
1c9e7cc
to
5bfeff6
Compare
7112872
to
b313a48
Compare
b313a48
to
8b20b0d
Compare
I think we should do that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work, great to see even more shared code 🎉
Thanks for the review and comments. I've addressed all but one. |
* Gets an `id` of `node`. This is used to order the predecessors of a join | ||
* basic block. | ||
*/ | ||
bindingset[node] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the bindingset
here appropriate? My thinking was that we don't need IDs of all nodes, only the few that end up as the first in a basic block prior to a join block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the actual equivalenceRelation
based hacks will be able to benefit, so we can remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, no. I don't think there's any way to implement a numbering scheme that can make use of a pre-bound node. So you're only risking inlining and recomputation.
bindingset[node] |
I'll do that in a follow up PR 👍 |
@@ -134,6 +134,10 @@ private module Implementation implements CfgShared::InputSig<Location> { | |||
SuccessorType getAMatchingSuccessorType(Completion c) { result = c.getAMatchingSuccessorType() } | |||
|
|||
predicate isAbnormalExitType(SuccessorType t) { none() } | |||
|
|||
int idOfAstNode(AstNode node) { none() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that getJoinBlockPredecessor
isn't implemented in the Actions basic block library, so this instantiation is actually fine. And we could move Actions to the shared BB library in a future PR.
@@ -64,6 +80,7 @@ module Make<InputSig Input> { | |||
BasicBlock getAPredecessor(SuccessorType t) { result.getASuccessor(t) = this } | |||
|
|||
/** Gets the control flow node at a specific (zero-indexed) position in this basic block. */ | |||
cached |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to be defined inside the Cached
module, and then just referenced here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be possible to reference them with the
BasicBlocks:: prefix
?
Yes, that works. I didn't think of that 🙈
* arbitrary order. | ||
*/ | ||
cached | ||
JoinPredecessorBasicBlock getJoinBlockPredecessor(JoinBasicBlock jb, int i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be possible to reference them with the BasicBlocks::
prefix?
// if x or y | ||
// foo | ||
// else | ||
// bar | ||
// ``` | ||
// | ||
// we have a CFG that looks like | ||
// | ||
// x --false--> [false] x or y --false--> bar | ||
// \ | | ||
// --true--> y --false-- | ||
// \ | ||
// --true--> [true] x or y --true--> foo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This CFG is for x and y
- the true/false edges are mixed up. The easiest fix is probably just to replace x or y
with x and y
.
private predicate bbIndex(Node bbStart, Node cfn, int i) = | ||
shortestDistances(startsBB/1, intraBBSucc/2)(bbStart, cfn, i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned the other day, I don't think shortestDistances
is optimal. IIRC it can be beat with a QL recursion, but it's something that'll need fresh measurements on some large examples. I expect the C++ implementation to be in the lead in terms of performance on this particular predicate, and C++ uses a straightforward QL recursion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since that requires some benchmarking I'd prefer to leave that for a future PR.
* Gets an `id` of `scope`. This is used to order the predecessors of a join | ||
* basic block. | ||
*/ | ||
bindingset[scope] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bindingset[scope] |
if this instanceof EntryBasicBlock | ||
then result = this.getFirstNode().getScope() | ||
else result = this.getAPredecessor().getScope() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this recursive and not just the following?
if this instanceof EntryBasicBlock | |
then result = this.getFirstNode().getScope() | |
else result = this.getAPredecessor().getScope() | |
result = this.getFirstNode().getScope() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been wondering the same, it's hard to imagine how they would not be equivalent for correct implementations of getScope
and EntryBasicBlock
, but I simply reproduced the existing code (which used overloading but was otherwise the same) thinking that probably was some reason 😅
final class BasicBlock extends BasicBlockImpl::BasicBlock { | ||
// We extend `BasicBlockImpl::BasicBlock` to add the `getScope`. | ||
/** Gets the scope of this basic block. */ | ||
CfgScope getScope() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on putting Node.getScope
through the BasicBlock input, such that getScope
could be added to BasicBlock directly in its definition inside shared/controlflow/codeql/controlflow/BasicBlock.qll
? That way this could just be an alias rather than having to copy all the member predicates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would definitely save some boiler plate, but could make the BasicBlock
module less flexible for instantiations that don't want a getScope
.
predicate immediatelyControls(BasicBlock succ, SuccessorType s) { | ||
succ = this.getASuccessor(s) and | ||
forall(BasicBlock pred | pred = succ.getAPredecessor() and pred != this | | ||
succ.dominates(pred) | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Java, this predicate is called dominatingEdge
and it includes bbIDominates(this, succ)
as an additional conjunct. I think that ought to be semantically identical, but it might offer a better join-order since immediate dominance joined with the successor relation yields a much stronger context for the forall
.
Also, in this general library, it's probably worth it to expand the qldoc a bit on the relationship between the concept of controls and dominance. In short: controls equals edge dominance. See also the qldoc for dominatingEdge
in Java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the concept of dominating edge aka immediatelyControls
might very well be applicable beyond ConditionBasicBlock
s, so it should possibly be available as a top-level predicate between basic blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, this concept can be put into BasicBlocks.qll
as it's independent of .isCondition()
.
Adds a shared basic block library in the
controlflow
pack and modifies the basic block implementation of Swift, Ruby, C#, and Rust to use it.A few notes:
There's two ways to use the basic block library. Either through the new
codeql.controlflow.BasicBlock
or through the existingcodeql.controlflow.Cfg
. The former is suitable for languages that don't use the control graph library and the latter for those that do. In this PR all languages use the latter, but I've tried instantiatingcodeql.controlflow.BasicBlock
for Go and that seems to work fine as well.The
BasicBlock
class for C# now has both the currentgetASuccessorByType/1
method and a newgetASuccessor/1
method that it inherits from the basic block library and which is the name used in Ruby, Rust, and Swift. We could consider deprecatinggetASuccessorByType/1
in order to not have two methods doing the same and to increase consistency between language libraries.For the
BasicBlock
subclasses inCfg.qll
I've changed the current names a bit, such that they are all consistently of the form${name}BasicBlock
. For instanceJoinBlock
is insteadJoinBasicBlock
. For the existing language-level basic block libraries I've kept the current names for backwards compatibility, so only Rust use the new names.