Description
Did you check existing issues?
- I have read all the tree-sitter docs if it relates to using the parser
- I have searched the existing issues of tree-sitter-typescript
Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)
No response
Describe the bug
For a given TSX template,
a["b"] = <C d="e">
<F></F>
{ g() }
</C>;
the nested jsx_opening_element on a different line is captured together with all of its leading whitespace, as \n <F> instead of just <F>.
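To make the discrepancy concrete, here is a minimal sketch of how the extra leading whitespace in a reported range could be measured and skipped. The helper names `leading_whitespace_len` and `trimmed_range` are hypothetical and not part of tree-sitter; this only illustrates the symptom (or a possible consumer-side workaround) and is not a fix for the grammar.

```rust
/// Number of leading whitespace bytes in `text`.
/// (Hypothetical helper, not part of the tree-sitter API.)
fn leading_whitespace_len(text: &str) -> usize {
    text.len() - text.trim_start().len()
}

/// Given a reported byte range into `source`, return the same range with
/// leading whitespace skipped. Assumes `start..end` lies on char boundaries.
/// (Hypothetical helper, not part of the tree-sitter API.)
fn trimmed_range(source: &str, start: usize, end: usize) -> (usize, usize) {
    let skipped = leading_whitespace_len(&source[start..end]);
    (start + skipped, end)
}
```

For a source such as `"<C>\n <F></F></C>"`, a reported range of `3..8` covers `"\n <F>"`, and `trimmed_range` narrows it to `5..8`, which is the range one would expect for `<F>`.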
Steps To Reproduce/Bad Parse Tree
The Parse Tree is correct in both cases, but tree elements' ranges are not.
I have not found a way to include ranges inside the node-based tests with *.txt files, so I've created a Rust test draft:
#[cfg(test)]
mod tests_f_node {
    use tree_sitter::Node;

    use super::*;

    #[test]
    fn tsx_tag_parse_ranges() {
        let code = r#"
a["b"] = <C d="e">
<F></F>
{ g() }
</C>;
"#;
        let mut parser = tree_sitter::Parser::new();
        parser
            .set_language(&super::language_tsx())
            .expect("Error loading TypeScript TSX grammar");
        let tree = parser.parse(code, None).unwrap();
        let root_node = tree.root_node();
        let f_node = get_f_node(root_node, code).expect("<F> node not found");

        // Assert the ranges. Modify these values according to the actual positions in your code.
        let start_byte = f_node.start_byte();
        let end_byte = f_node.end_byte();
        assert_eq!(start_byte, 36); // Replace with the correct start byte
        assert_eq!(end_byte, 39); // Replace with the correct end byte

        let start_position = f_node.start_position();
        let end_position = f_node.end_position();
        assert_eq!(start_position.row, 2); // Line number containing <F>
        assert_eq!(start_position.column, 16); // Column where <F> starts
        assert_eq!(end_position.row, 2);
        assert_eq!(end_position.column, 19); // Column where <F> ends
    }

    fn get_f_node<'a>(node: Node<'a>, code: &'a str) -> Option<Node<'a>> {
        for child in node.children(&mut node.walk()) {
            if child.kind() == "jsx_opening_element"
                && dbg!(child.utf8_text(code.as_bytes()).unwrap()) == "<F>"
            {
                return Some(child);
            }
            if let Some(found) = get_f_node(child, code) {
                return Some(found);
            }
        }
        None
    }
}
which outputs
---- tests_f_node::tsx_tag_parse_ranges stdout ----
[bindings/rust/lib.rs:118:20] child.utf8_text(code.as_bytes()).unwrap() = "<C d=\"e\">"
[bindings/rust/lib.rs:118:20] child.utf8_text(code.as_bytes()).unwrap() = "\n <F>"
thread 'tests_f_node::tsx_tag_parse_ranges' panicked at bindings/rust/lib.rs:97:50:
<F> node not found
stack backtrace:
on current master.
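The hard-coded offsets in the draft test are placeholders. One way to avoid them would be to derive the expected range from the source text itself. The sketch below assumes that tree-sitter rows and columns are zero-based and that columns are counted in bytes; `ExpectedSpan`, `expected_span`, and `point_at` are hypothetical names introduced only for illustration.

```rust
/// Expected byte range and zero-based (row, column) points of a match.
/// (Hypothetical type; it mirrors the fields the draft test asserts on.)
#[derive(Debug, PartialEq)]
struct ExpectedSpan {
    start_byte: usize,
    end_byte: usize,
    start_point: (usize, usize),
    end_point: (usize, usize),
}

/// Converts a byte offset into a zero-based (row, column) pair, with the
/// column counted in bytes from the start of the line (assumed convention).
fn point_at(source: &str, byte: usize) -> (usize, usize) {
    let before = &source[..byte];
    let row = before.matches('\n').count();
    let column = match before.rfind('\n') {
        Some(newline) => byte - newline - 1,
        None => byte,
    };
    (row, column)
}

/// Locates the first occurrence of `needle` in `source` and returns the
/// span a node covering exactly that text would be expected to report.
fn expected_span(source: &str, needle: &str) -> Option<ExpectedSpan> {
    let start_byte = source.find(needle)?;
    let end_byte = start_byte + needle.len();
    Some(ExpectedSpan {
        start_byte,
        end_byte,
        start_point: point_at(source, start_byte),
        end_point: point_at(source, end_byte),
    })
}
```

In the test, the values returned by `expected_span(code, "<F>")` could then be compared against `f_node.start_byte()`, `f_node.end_byte()`, and the row and column of `f_node.start_position()` and `f_node.end_position()`, so the assertions stay correct if the template's indentation changes.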
Expected Behavior/Parse Tree
I've bisected the regression to:
37ced086ad8bb4fa67e8c53711e9f30e869bb78f is the first bad commit
commit 37ced086ad8bb4fa67e8c53711e9f30e869bb78f (HEAD)
Author: Amaan Qureshi
Date: Fri Jul 5 23:13:15 2024 -0400
chore: generate
tsx/src/grammar.json | 370 +-
tsx/src/node-types.json | 843 +-
tsx/src/parser.c | 552504 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------------------------------------------------
typescript/src/grammar.json | 366 +-
typescript/src/node-types.json | 847 +-
typescript/src/parser.c | 530546 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------------------
6 files changed, 440659 insertions(+), 644817 deletions(-)
and before this commit everything works fine:
[bindings/rust/lib.rs:118:20] child.utf8_text(code.as_bytes()).unwrap() = "<C d=\"e\">"
[bindings/rust/lib.rs:118:20] child.utf8_text(code.as_bytes()).unwrap() = "<F>"
thread 'tests_f_node::tsx_tag_parse_ranges' panicked at bindings/rust/lib.rs:103:9:
assertion `left == right` failed
// this failure occurs because my test is only a draft, but it already exposes the issue, so it is useful in its current state
Repro
See the test above