Refactor: decouple statement splitting from AST tokenizer #756


  ### Background

  `SqlParser::parse_statements` currently relies on the AST tokenizer
  to find `;` boundaries. This creates a tight coupling: any change
  in tokenizer error-recovery behaviour (e.g. the 0.2.5 change to
  `/*` handling) can silently break statement splitting.

  A pre-scan (`unclosed_block_comment_start`) was added as a
  workaround, but it reimplements a subset of the tokenizer's lexical
  rules (`'`, `"`, `$$`, `--`) and cannot cover all literal forms
  the tokenizer accepts (backtick-quoted identifiers, `@` literals,
  backslash escapes).

  ### Proposal

  Replace the tokenizer-based splitting with a standalone state
  machine whose only job is to find `;` outside of:

  - single-, double-, and backtick-quoted strings and identifiers (with escape handling)
  - dollar-quoted strings (`$$...$$`)
  - block comments (`/* ... */`)
  - line comments (`-- ...\n`)

  The tokenizer is then used only downstream, for syntax validation,
  highlighting, and formatting, and never for splitting.

  This also unblocks multi-statement support: the splitter produces
  N statements, and the upper layer decides whether to execute them
  sequentially or in batch.
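
A minimal sketch of what such a splitter could look like, to make the proposal concrete. Everything here is hypothetical: `split_statements`, `State`, and `push` are illustrative names, not existing code. It assumes that a backslash escapes the next byte inside any quoted form and that a doubled quote character is an escaped quote. It recognizes only the untagged `$$` dollar quote and does not handle nested block comments; both would need extra state.

```rust
/// Lexical state of the splitter (hypothetical names, not from the codebase).
#[derive(Clone, Copy)]
enum State {
    Normal,
    Quoted(u8),   // inside '...', "..." or `...`; holds the opening quote byte
    DollarQuoted, // inside $$ ... $$
    BlockComment, // inside /* ... */
    LineComment,  // inside -- ... up to the next newline
}

/// Push a trimmed statement, skipping pieces that are empty after trimming.
fn push<'a>(out: &mut Vec<&'a str>, piece: &'a str) {
    let piece = piece.trim();
    if !piece.is_empty() {
        out.push(piece);
    }
}

/// Split `sql` on every `;` that lies outside quotes and comments.
/// Statements are returned without their terminating `;`.
pub fn split_statements(sql: &str) -> Vec<&str> {
    let bytes = sql.as_bytes();
    let mut state = State::Normal;
    let mut out = Vec::new();
    let mut start = 0; // byte offset where the current statement begins
    let mut i = 0;
    while i < bytes.len() {
        let b = bytes[i];
        let next = bytes.get(i + 1).copied();
        match state {
            State::Normal => match (b, next) {
                (b';', _) => {
                    // `;` is ASCII, so `i` is always a valid char boundary.
                    push(&mut out, &sql[start..i]);
                    start = i + 1;
                }
                (b'\'' | b'"' | b'`', _) => state = State::Quoted(b),
                (b'$', Some(b'$')) => {
                    state = State::DollarQuoted;
                    i += 1;
                }
                (b'/', Some(b'*')) => {
                    state = State::BlockComment;
                    i += 1;
                }
                (b'-', Some(b'-')) => {
                    state = State::LineComment;
                    i += 1;
                }
                _ => {}
            },
            State::Quoted(q) => {
                if b == b'\\' {
                    i += 1; // assumed backslash escape: skip the next byte
                } else if b == q {
                    if next == Some(q) {
                        i += 1; // doubled quote stays inside the literal
                    } else {
                        state = State::Normal;
                    }
                }
            }
            State::DollarQuoted => {
                if b == b'$' && next == Some(b'$') {
                    state = State::Normal;
                    i += 1;
                }
            }
            State::BlockComment => {
                if b == b'*' && next == Some(b'/') {
                    state = State::Normal;
                    i += 1;
                }
            }
            State::LineComment => {
                if b == b'\n' {
                    state = State::Normal;
                }
            }
        }
        i += 1;
    }
    // Whatever follows the last `;` (including unterminated input) is the final piece.
    push(&mut out, &sql[start..]);
    out
}
```

In this sketch an unterminated quote or comment never causes an error: the rest of the input becomes the last statement, and reporting the problem is left to the tokenizer downstream. Scanning bytes rather than chars is safe here because every delimiter is ASCII, so slices are only ever cut at `;` positions.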
