Function definitions do not parse if their body contains an unclosed block #138

Open
@Diomendius

The following Rust code

fn foo() {
    if true {}
}

produces the following parse tree (as formatted by nvim-treesitter/playground):

function_item [0, 0] - [2, 1]
  name: identifier [0, 3] - [0, 6]
  parameters: parameters [0, 6] - [0, 8]
  body: block [0, 9] - [2, 1]
    expression_statement [1, 1] - [1, 11]
      if_expression [1, 1] - [1, 11]
        condition: boolean_literal [1, 4] - [1, 8]
        consequence: block [1, 9] - [1, 11]

If the closing brace of the if block is removed, the parser no longer recognizes the enclosing function_item at all:

fn foo() {
    if true {
}

identifier [0, 3] - [0, 6]
parameters [0, 6] - [0, 8]
expression_statement [1, 1] - [2, 1]
  if_expression [1, 1] - [2, 1]
    condition: boolean_literal [1, 4] - [1, 8]
    consequence: block [1, 9] - [2, 1]

Parsing incorrect syntax usefully is futile in the general case, but it should still be feasible to recognize the function definition here, even though there is no objective way to say whether the missing closing brace belongs to the if expression or to the function.

In practical terms this impacts detecting indent level based on the parse tree, as the function body no longer exists to provide the outer indent level. I'm sure there are other consequences.
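
To make the indent problem concrete, here is a minimal sketch of one way an indent level could be derived from the two Rust trees above. The `Node` struct, the `node` helper, and `indent_level` are hypothetical stand-ins, not the tree-sitter or nvim-treesitter API, and counting enclosing `block` nodes is only an assumed heuristic. The node kinds and ranges are copied from the trees in this issue.

```rust
// Hypothetical, simplified syntax node; NOT the tree-sitter API.
struct Node {
    kind: &'static str,
    start: (usize, usize), // (row, column), inclusive
    end: (usize, usize),   // (row, column), exclusive
    children: Vec<Node>,
}

// Small constructor to keep the tree literals below readable.
fn node(
    kind: &'static str,
    start: (usize, usize),
    end: (usize, usize),
    children: Vec<Node>,
) -> Node {
    Node { kind, start, end, children }
}

// Assumed heuristic: the indent level at `pos` is the number of `block`
// nodes whose range contains `pos`. Tuples compare lexicographically,
// so (row, column) ordering works directly.
fn indent_level(nodes: &[Node], pos: (usize, usize)) -> usize {
    nodes
        .iter()
        .filter(|n| n.start <= pos && pos < n.end)
        .map(|n| usize::from(n.kind == "block") + indent_level(&n.children, pos))
        .sum()
}

// Tree reported for the well-formed function.
fn well_formed_tree() -> Vec<Node> {
    vec![node("function_item", (0, 0), (2, 1), vec![
        node("identifier", (0, 3), (0, 6), vec![]),
        node("parameters", (0, 6), (0, 8), vec![]),
        node("block", (0, 9), (2, 1), vec![
            node("expression_statement", (1, 1), (1, 11), vec![
                node("if_expression", (1, 1), (1, 11), vec![
                    node("boolean_literal", (1, 4), (1, 8), vec![]),
                    node("block", (1, 9), (1, 11), vec![]),
                ]),
            ]),
        ]),
    ])]
}

// Tree reported once the if block's closing brace is removed:
// no function_item and no function body block at the root.
fn broken_tree() -> Vec<Node> {
    vec![
        node("identifier", (0, 3), (0, 6), vec![]),
        node("parameters", (0, 6), (0, 8), vec![]),
        node("expression_statement", (1, 1), (2, 1), vec![
            node("if_expression", (1, 1), (2, 1), vec![
                node("boolean_literal", (1, 4), (1, 8), vec![]),
                node("block", (1, 9), (2, 1), vec![]),
            ]),
        ]),
    ]
}
```

Under these assumptions, `indent_level(&well_formed_tree(), (1, 10))` gives 2, while `indent_level(&broken_tree(), (1, 10))` gives 1 for the same position just after the if block's opening brace: the level contributed by the function body is lost.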

It is also strange that the parse tree contains no ERROR nodes, even though the tree itself could not possibly represent valid Rust code: how can an identifier or parameters node exist at the root of the parse tree?

For comparison, the C parser produces this parse tree for a similar function definition and if statement:

void foo() {
    if(true) {}
}

function_definition [0, 0] - [2, 1]
  type: primitive_type [0, 0] - [0, 4]
  declarator: function_declarator [0, 5] - [0, 10]
    declarator: identifier [0, 5] - [0, 8]
    parameters: parameter_list [0, 8] - [0, 10]
  body: compound_statement [0, 11] - [2, 1]
    if_statement [1, 4] - [1, 15]
      condition: parenthesized_expression [1, 6] - [1, 12]
        true [1, 7] - [1, 11]
      consequence: compound_statement [1, 13] - [1, 15]

Removing the closing brace of the if statement's block causes the parser to reinterpret the function block's closing brace as the closing brace of the if statement's block, but otherwise leaves the parse tree unchanged:

void foo() {
    if(true) {
}

function_definition [0, 0] - [2, 1]
  type: primitive_type [0, 0] - [0, 4]
  declarator: function_declarator [0, 5] - [0, 10]
    declarator: identifier [0, 5] - [0, 8]
    parameters: parameter_list [0, 8] - [0, 10]
  body: compound_statement [0, 11] - [2, 1]
    if_statement [1, 4] - [2, 1]
      condition: parenthesized_expression [1, 6] - [1, 12]
        true [1, 7] - [1, 11]
      consequence: compound_statement [1, 13] - [2, 1]

There are no ERRORs, which I suppose makes some sense (what region represents the closing brace that doesn't exist?), but the function is still parsed usefully.
