Document path statements #71

Open
wants to merge 10 commits into
base: master
147 changes: 147 additions & 0 deletions docs/concepts/CLQL.md
@@ -277,6 +277,153 @@ method(depth = any):

<br />

# Edge

Facts in AST lexicons refer to nodes in an AST, and the parent/child relationship between facts mirrors the parent/child relationship between those nodes. Nodes can also be related in ways that are orthogonal to the AST, such as calls. These relationships can be queried with the `edge` keyword.

The following query finds function calls at the top level of a file and follows the `calls` edge to their definition:

```
common.func_call:
  edge("calls"):
    common.func
```

<br />

# Path

Path statements encapsulate CLQL trees. These subtrees can be repeated with a single argument, allowing succinct repetition of complex patterns. Branched paths can rejoin, allowing a fact to match nodes with different kinds of parents.

## Linear

Say we wanted to find triply nested if statements. Our query would look like the following:

```clql
common.if_stmt:
  common.if_stmt:
    common.if_stmt
```

With paths, we can express the same thing like so:

```clql
path(repeat = 3):
  common.if_stmt:
    pathcontinue
```

Once a query reaches a `pathcontinue` statement, it continues from the `path` statement until the path has been repeated the specified number of times.
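
The unrolling described above can be sketched in Python. This is only a conceptual model, not how the CLQL evaluator is implemented: the `(name, children)` tuple representation, the `PATHCONTINUE` marker, and the `expand_path` helper are all assumptions made for illustration.

```python
# Conceptual model only: a CLQL tree is a (name, children) tuple, and the
# string "pathcontinue" marks where the next repetition is spliced in.
PATHCONTINUE = "pathcontinue"


def expand_path(template, repeat):
    """Unroll `repeat` copies of `template`, nesting each copy at pathcontinue."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")

    def splice(node, remaining):
        name, children = node
        new_children = []
        for child in children:
            if child == PATHCONTINUE:
                # Continue from the top of the path while repetitions remain;
                # on the last repetition the marker is simply dropped.
                if remaining > 1:
                    new_children.append(splice(template, remaining - 1))
            else:
                new_children.append(splice(child, remaining))
        return (name, new_children)

    return splice(template, repeat)


# Hypothetical encoding of the body of `path(repeat = 3)` from the query above.
template = ("common.if_stmt", [PATHCONTINUE])
expanded = expand_path(template, 3)
```

Under this model, `expanded` has the same shape as the hand-written triply nested `common.if_stmt` query.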

## Repeat range

Some queries cannot be written without `path` statements. Say we wanted to find all functions called by `someFunc()`, either directly or through an arbitrarily long chain of calls. Without paths, our query would have to explicitly match directly called functions, then functions with 1, 2, 3, etc. intermediaries, to infinity:

```clql
common.func:
  name == "someFunc"
  any_of:
    common.func_call(depth = any):
      edge("calls"):
        common.func
    common.func_call(depth = any):
      edge("calls"):
        common.func:
          common.func_call(depth = any):
            edge("calls"):
              common.func
    ...
    common.func_call(depth = any):
      edge("calls"):
        common.func:
          common.func_call(depth = any):
            edge("calls"):
              common.func:
                ...
```

With paths the same query is trivial:

```clql
common.func:
  name == "someFunc"
  path(repeat = 1:):
    common.func_call(depth = any):
      edge("calls"):
        common.func:
          pathcontinue
```

`repeat = 1:` is a range specifying that the path should be repeated one or more times.
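
To illustrate what "one or more times" means for this query, here is a rough Python sketch that collects every function reachable from a starting function through at least one call. The `call_graph` dictionary and the `called_transitively` helper are hypothetical, and the cycle handling shown here is a choice made for the sketch, not documented CLQL behaviour.

```python
from collections import deque


def called_transitively(call_graph, start):
    """Return every function reachable from `start` through one or more calls."""
    seen = set()
    # Seed the queue with direct callees, which corresponds to one repetition.
    queue = deque(call_graph.get(start, ()))
    while queue:
        func = queue.popleft()
        if func in seen:
            continue  # already visited, so cycles terminate
        seen.add(func)
        # Each further hop corresponds to one more repetition of the path.
        queue.extend(call_graph.get(func, ()))
    return seen


# Hypothetical call graph: function name -> names of the functions it calls.
call_graph = {
    "someFunc": ["a", "b"],
    "a": ["c"],
    "c": ["a"],
    "b": [],
    "unrelated": ["d"],
}
result = called_transitively(call_graph, "someFunc")
```

Here `result` contains `a` and `b` (called directly) and `c` (called through `a`), but not `d`, which is only reachable from `unrelated`.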

## Complex subtrees

Say we wanted to match triply nested if statements that all check the same value. Our query would look like the following:

```clql
common.if_stmt:
  common.condition:
    common.var:
      name as varName
  common.if_stmt:
    common.condition:
      common.var:
        name == varName
    common.if_stmt:
      common.condition:
        common.var:
          name == varName
```

With paths, our query has much less repetition:

```clql
common.func:
  path(repeat = 3):
    common.if_stmt:
      common.condition:
        common.var:
          name as varName
      pathcontinue
```

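Note that every CLQL element that is a child of `path` is repeated, not just the `if_stmt`. Also note that `name as varName` defines `varName` only on the first repetition; on later repetitions the definition is replaced with the assertion `name == varName`.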
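
The define-then-assert behaviour can be modelled with a small Python sketch. It assumes the variable names used in each nested condition have already been extracted into a list, outermost first; the `same_variable_checked` helper and that input format are hypothetical and say nothing about how CLQL resolves variables internally.

```python
def same_variable_checked(condition_vars):
    """True if every nested if statement's condition uses the same variable name."""
    bindings = {}
    for name in condition_vars:
        if "varName" not in bindings:
            # First repetition: behaves like `name as varName` (definition).
            bindings["varName"] = name
        elif bindings["varName"] != name:
            # Later repetitions: behave like `name == varName` (assertion).
            return False
    return True


matches = same_variable_checked(["x", "x", "x"])
rejected = same_variable_checked(["x", "y", "x"])
```

With these inputs, `matches` is `True` because all three conditions check `x`, and `rejected` is `False` because the second condition checks `y`.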

## Pathend

Suppose we wanted to match triply nested if statements with a function call inside the innermost if statement. Without paths our query looks like:

```clql
common.if_stmt:
  common.if_stmt:
    common.if_stmt:
      common.func_call
```

With paths, our query looks like:

```clql
path(repeat = 3):
common.if_stmt:
pathcontinue
pathend:
common.func_call
```
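
As a rough illustration of the equivalence between the two queries above, the following Python sketch builds the tree that the `path` version is meant to match: the repeated fact nested the given number of times, with the `pathend` children placed inside the innermost copy. It models only the resulting shape. The `(name, children)` tuples and the `expand_with_end` helper are assumptions for illustration, and how the evaluator actually processes `pathend` is not described here.

```python
def expand_with_end(fact, repeat, end_children):
    """Nest `fact` `repeat` times and put `end_children` inside the innermost copy."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    # Build from the inside out: the innermost copy holds the pathend children.
    node = (fact, list(end_children))
    for _ in range(repeat - 1):
        node = (fact, [node])
    return node


tree = expand_with_end("common.if_stmt", 3, [("common.func_call", [])])
```

The resulting `tree` has three nested `common.if_stmt` nodes with a single `common.func_call` inside the innermost one, matching the query written without paths.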

## Caveats

Branching, where a `path` statement has multiple `pathcontinue` statements, is currently not supported.

Nested paths are not supported.

Using `any_of` inside a path statement is not supported.

@mullikine, Dec 17, 2018

Does this mean that I must change this tenet to use a nested any_of? https://github.com/mullikine/codelingo/blob/psr/tenets/codelingo/psr-1/uppercase-class-constants/codelingo.yaml Will the path syntax used in the tenet be supported in the future? Also, is path expanded into a nested any_of before it is evaluated? Knowing this would help me to understand how I'm supposed to use it. For example, I'm questioning if I should put the depth = any inside the path or the path's children. This raises a couple of questions. Can I have multiple arguments to the path fact i.e depth and repeat? Can I have path at the root of the CLQL query?

Contributor Author

Yes it will be supported in the future, that's a key use case.

Can you give an example of where you would be choosing between putting a depth = any inside the path or the path's children?

The path element only takes a repeat argument, facts inside path can take depth arguments. Element, by the way, is the generic term for fact, property, path, any_of, etc.

Yes you can have a path at the root.

@mullikine, Dec 17, 2018

The case I had in mind when asking the question was the tenet I linked to above. Here I have "depth = any" specified for both children. But if a repeat was specified in the path then each nested child would have a "depth = any". I'm unsure what this would do to performance, but I'd imagine you might get a tetration thing going. Also, it kind of makes sense to place the "depth = any" within the path fact because then you're only specifying it once. Either that or place the "depth = any" on a zero-width parent to the path fact. Should this go into discuss?

Contributor Author

I still don't know what you're suggesting. I can't think of any repeat argument that you could pass to path that would replace the need for a depth = any. Can you write up some CLQL?

I made a topic on discuss to continue the conversation.
https://discuss.codelingo.io/t/clql-syntax-new-features/87


## Decorators

Some decorators, such as `@review comment`, can only be used once per query. Using them in a repeated path will cause an error.

<br />

# Variables

Facts that do not have a parent-child relationship can be compared by assigning their properties to variables. A query with a variable will only match a pattern in the code if all properties representing that variable are equal.