Description
Discussion
Background
Currently if you have code like:
{%
10
# Foo
20
30
# Bar
40
%}
{%
50
60
%}
The parser is able to properly attribute location information to each number, taking into account the newlines and comments:
{value: "10", line_number: 2}
{value: "20", line_number: 6}
{value: "30", line_number: 7}
{value: "40", line_number: 11}
{value: "50", line_number: 14}
{value: "60", line_number: 15}
However, when used as part of a MacroVerbatim
node like:
macro finished
{% verbatim do %}
{%
10
# Foo
20
30
# Bar
40
%}
{%
50
60
%}
{% end %}
end
NOTE: The
macro finished
isn't really needed to reproduce this problem, but is included for demonstration purposes to make generating the output easier.
The results are now not totally accurate:
{value: "10", line_number: 2}
{value: "20", line_number: 3}
{value: "30", line_number: 4}
{value: "40", line_number: 5}
{value: "50", line_number: 7}
{value: "60", line_number: 8}
This is because the line numbers for each node are now based on the to_s
representation of the MacroExpression
s within the MacroVerbatim
versus the actual source code. E.g.
# There's a newline here
{% 10
20
30
40
%}
{% 50
60
%}
The Problem
#15305 makes things a little bit more accurate by retaining the newline after the {%
is present. E.g.
{value: "10", line_number: 3}
{value: "20", line_number: 4}
{value: "30", line_number: 5}
{value: "40", line_number: 6}
{value: "50", line_number: 9}
{value: "60", line_number: 10}
# There's a newline here
{%
10
20
30
40
%}
{%
50
60
%}
However things are still not quite 100% correct. We can make things a bit better by summing the node's location with the location of the MacroVerbatim
node itself (removing the 0 ||
on like 592 in interpreter.cr
in the appendix diff):
{value: "10", line_number: 4}
{value: "20", line_number: 5}
{value: "30", line_number: 6}
{value: "40", line_number: 7}
{value: "50", line_number: 10}
{value: "60", line_number: 11}
The number 10
now at least has the proper location, but all the others are still not correct. The gist of why the other numbers are off is because notice in the stringified output, all the whitespace and comments in the macro expression have been stripped. Thus they are not taken into consideration when the code is re-parsed as part of the macro expansion logic.
Not having the nodes have the proper location information in this context makes #14880 much harder as it relies on having proper line numbers to map to the source code when generating the coverage report.
Proposed Solution
So far the most robust solution I can think of is to have the parser include MacroLiteral
nodes when parsing macro expressions that represent comments and extra newlines. This would ensure that the stringified macro expression includes these newlines so the re-parsed code is able to map to the line numbers of the source.
WDYT?
EDIT: Is this maybe something we could leverage location pragmas for?
Appendix
Diff used to generate the outputs
diff --git a/src/compiler/crystal/macros/interpreter.cr b/src/compiler/crystal/macros/interpreter.cr
index 978c57470..d7541a8af 100644
--- a/src/compiler/crystal/macros/interpreter.cr
+++ b/src/compiler/crystal/macros/interpreter.cr
@@ -82,6 +82,10 @@ module Crystal
@last = Nop.new
end
+ def is_test_file?
+ @location.try &.original_filename.as?(String).try &.ends_with? "test.cr"
+ end
+
def define_var(name : String, value : ASTNode) : Nil
@vars[name] = value
end
@@ -584,6 +588,10 @@ module Crystal
end
def visit(node : Nop | NilLiteral | BoolLiteral | NumberLiteral | CharLiteral | StringLiteral | SymbolLiteral | RangeLiteral | RegexLiteral | MacroId | TypeNode | Def)
+ if self.is_test_file?
+ pp({value: node.to_s, line_number: (node.location.try(&.line_number) || 0) + (0 || node.location.try(&.macro_location.try(&.line_number)) || 0)})
+ end
+
@last = node.clone_without_location
false
end
diff --git a/src/compiler/crystal/macros/macros.cr b/src/compiler/crystal/macros/macros.cr
index a5d5714f1..1b68e3b53 100644
--- a/src/compiler/crystal/macros/macros.cr
+++ b/src/compiler/crystal/macros/macros.cr
@@ -27,7 +27,18 @@ class Crystal::Program
interpreter = MacroInterpreter.new self, scope, path_lookup || scope, node.location, def: a_def, in_macro: false
interpreter.free_vars = free_vars
node.accept interpreter
- {interpreter.to_s, interpreter.macro_expansion_pragmas}
+ output_str = interpreter.to_s
+
+ if interpreter.is_test_file?
+ puts
+ puts
+
+ puts output_str
+
+ puts
+ puts
+ end
+ {output_str, interpreter.macro_expansion_pragmas}
end
def parse_macro_source(generated_source, macro_expansion_pragmas, the_macro, node, vars, current_def = nil, inside_type = false, inside_exp = false, mode : Parser::ParseMode = :normal, visibility : Visibility = :public)