Commit dabc439

Use tree-sitter fork with QueryMatch.finished field
This resolves a bug that can happen when a query contains multiple captures which may be returned greedily. With upstream tree-sitter, there is no way to detect whether a `(match, capture_index)` pair returned from the `QueryCaptures` iterator is the last capture in a match. This is known within the query runtime in tree-sitter though, so my fork exposes it as a new `finished` field on `QueryMatch`. We can use this information to reliably un-nest when a pattern is finished matching.

An example where the code of the parent commit is buggy is the query

    (map "#" @Rainbow "{" @Rainbow "}" @Rainbow)

applied to the Erlang map "#{}". The `#` and `{`/`}` are children of different nodes in the grammar, and `{`/`}` may be captured without the `#` (in a `map_update` node in which the `#` acts as an operator). Therefore the query API returns these captures separately and greedily:

    // capture_index == 0: #
    QueryMatch { captures: [ QueryCapture { node: "#", .. } ] }
    // capture_index == 1: {
    QueryMatch { captures: [ QueryCapture { node: "#", .. }, QueryCapture { node: "{", .. } ] }
    // capture_index == 2: }
    QueryMatch { captures: [ QueryCapture { node: "#", .. }, QueryCapture { node: "{", .. }, QueryCapture { node: "}", .. } ] }

So there's no indication whether `capture_index` 1 or 2 is the final capture in the match. We need to know when this pattern is finished in order to decrement the rainbow nesting level counter. It also helps us reduce memory usage, since we can remove the entry in the HashMap tracking highlights for matches.

A possible workaround would be to iterate on QueryMatches rather than QueryCaptures. This would allow us to reliably check for the terminating QueryCapture with

    capture_index == match_.captures.len() - 1

However, it comes at a performance penalty: we would need to consume the entire QueryMatches Iterator in order to sort the `QueryMatch`es by `match_.captures[0].node.start_byte()`. This is because the QueryMatches Iterator returns patterns as they finish. For example:

    // sample: #{list = []}
    QueryMatch { captures: [ QueryCapture { node: "[", .. }, QueryCapture { node: "]", .. } ] }
    QueryMatch { captures: [ QueryCapture { node: "#", .. }, QueryCapture { node: "{", .. }, QueryCapture { node: "}", .. } ] }

The second QueryMatch returned is at a lower nesting level, so sorting by the start_byte of the initially captured node is necessary. This sorting can be avoided altogether as long as we can reliably detect when a pattern is done being matched.
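The cost of the QueryMatches workaround can be illustrated with a minimal sketch. `ToyCapture`, `ToyMatch` and `sorted_by_first_capture` are hypothetical stand-ins invented for this illustration, not tree-sitter types or Helix code; the byte offsets are those of the sample `#{list = []}`.

```rust
/// Hypothetical, simplified stand-in for tree-sitter's `QueryCapture`.
#[derive(Debug, Clone, PartialEq)]
struct ToyCapture {
    text: &'static str,
    start_byte: usize,
}

/// Hypothetical, simplified stand-in for tree-sitter's `QueryMatch`.
#[derive(Debug, Clone, PartialEq)]
struct ToyMatch {
    captures: Vec<ToyCapture>,
}

/// Matches for `#{list = []}` in the order they finish: the inner `[]`
/// pattern completes before the outer `#{ }` pattern does.
fn sample_matches() -> Vec<ToyMatch> {
    vec![
        ToyMatch {
            captures: vec![
                ToyCapture { text: "[", start_byte: 9 },
                ToyCapture { text: "]", start_byte: 10 },
            ],
        },
        ToyMatch {
            captures: vec![
                ToyCapture { text: "#", start_byte: 0 },
                ToyCapture { text: "{", start_byte: 1 },
                ToyCapture { text: "}", start_byte: 11 },
            ],
        },
    ]
}

/// Drains the whole iterator before anything can be highlighted, then orders
/// matches by the start of their first capture.
/// Assumes every match has at least one capture.
fn sorted_by_first_capture(matches: impl Iterator<Item = ToyMatch>) -> Vec<ToyMatch> {
    let mut all: Vec<ToyMatch> = matches.collect();
    all.sort_by_key(|m| m.captures[0].start_byte);
    all
}

fn main() {
    let sorted = sorted_by_first_capture(sample_matches().into_iter());
    for m in &sorted {
        let texts: Vec<&str> = m.captures.iter().map(|c| c.text).collect();
        println!("{:?}", texts);
    }
}
```

The `collect` call is the penalty described above: nothing can be emitted lazily, because a match at a lower nesting level may arrive after the matches nested inside it.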
1 parent 49d08aa commit dabc439

4 files changed: 6 additions, 5 deletions

Cargo.lock

Lines changed: 1 addition & 2 deletions
Some generated files are not rendered by default.

helix-core/Cargo.toml

Lines changed: 2 additions & 1 deletion
@@ -25,7 +25,8 @@ unicode-width = "0.1"
 unicode-general-category = "0.5"
 # slab = "0.4.2"
 slotmap = "1.0"
-tree-sitter = "0.20"
+# tree-sitter = "0.20"
+tree-sitter = { git = "https://github.com/the-mikedavis/tree-sitter", rev = "44bef5de7027f44972277b3739c77d0b5afcc887" }
 once_cell = "1.13"
 arc-swap = "1"
 regex = "1"

helix-core/src/syntax.rs

Lines changed: 1 addition & 1 deletion
@@ -2115,7 +2115,7 @@ impl<'a> Iterator for RainbowIter<'a> {
                 self.match_highlights.insert(match_.id(), next_highlight);
 
                 Some(next_highlight)
-            } else if capture_index == match_.captures.len() - 1 {
+            } else if match_.finished {
                 // Final capture in the match, remove the entry and decrement
                 // nesting level, wrapping around to the last rainbow color.
                 self.rainbow_nesting_level = self
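To see why the old condition misfires, here is a rough, self-contained model of the branch structure in this hunk. It is a sketch under assumptions, not the Helix implementation: `Event`, `hash_brace_events` and `replay` are hypothetical names, the first branch ("match not seen yet, so nest one level deeper") is inferred from the context lines rather than shown in the diff, and the `finished` values assume the fork sets the flag only on the last capture of a match. Color wrap-around is left out.

```rust
use std::collections::HashMap;

/// Toy stand-in for one `(match, capture_index)` pair from a `QueryCaptures`-like
/// iterator. Hypothetical type; only `finished` mirrors the fork's new field.
struct Event {
    match_id: u32,
    capture_index: usize,
    /// Number of captures the match holds *so far* (it grows greedily).
    captures_len: usize,
    finished: bool,
}

/// Events for the Erlang map `#{}`, following the commit message's example.
fn hash_brace_events() -> Vec<Event> {
    vec![
        Event { match_id: 0, capture_index: 0, captures_len: 1, finished: false },
        Event { match_id: 0, capture_index: 1, captures_len: 2, finished: false },
        Event { match_id: 0, capture_index: 2, captures_len: 3, finished: true },
    ]
}

/// Replays the events and returns (final nesting level, leftover map entries).
fn replay(events: &[Event], use_finished: bool) -> (usize, usize) {
    let mut nesting_level = 0usize;
    let mut match_highlights: HashMap<u32, usize> = HashMap::new();

    for ev in events {
        let is_final = if use_finished {
            ev.finished
        } else {
            // Old check: true for *every* greedily returned capture.
            ev.capture_index == ev.captures_len - 1
        };

        if !match_highlights.contains_key(&ev.match_id) {
            // First capture seen for this match: assign a color and nest deeper.
            match_highlights.insert(ev.match_id, nesting_level);
            nesting_level += 1;
        } else if is_final {
            // Final capture: drop the entry and un-nest.
            match_highlights.remove(&ev.match_id);
            nesting_level = nesting_level.saturating_sub(1);
        }
        // Otherwise: a middle capture, which reuses the stored highlight.
    }

    (nesting_level, match_highlights.len())
}

fn main() {
    let events = hash_brace_events();
    println!("old check: {:?}", replay(&events, false));
    println!("finished:  {:?}", replay(&events, true));
}
```

In this model the old check un-nests on `{`, then treats `}` as the start of a new match, so the replay ends one level too deep with a stale map entry. With `finished`, the same events end at level 0 with an empty map.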

helix-loader/Cargo.toml

Lines changed: 2 additions & 1 deletion
@@ -18,7 +18,8 @@ anyhow = "1"
 serde = { version = "1.0", features = ["derive"] }
 toml = "0.5"
 etcetera = "0.4"
-tree-sitter = "0.20"
+# tree-sitter = "0.20"
+tree-sitter = { git = "https://github.com/the-mikedavis/tree-sitter", rev = "44bef5de7027f44972277b3739c77d0b5afcc887" }
 once_cell = "1.13"
 log = "0.4"
 
