RFC: Add intermediate representation for format syntax #1062

@chhoumann

We're still working with raw strings in the higher layers. I think it would be a good idea to abstract that away, so that higher layers don't operate directly on strings.

Problem

Format handling is still regex-driven string replacement spread across the formatter, preflight, and other layers. Each new per-token option (e.g., type:multiline) has to be parsed in multiple layers, which makes it easy to lose metadata (labels/defaults/type) or to regress scripted VALUE injections.

Proposal

Introduce a lightweight Format IR (intermediate representation) parsed once from format strings. The formatter layer would operate on parsed tokens instead of raw strings, and metadata would flow consistently to sequential prompts, one-page modal, and preview formatters.

Goals

  • Single parse/validation step for format syntax and options.
  • Preserve token metadata across prompt flows (sequential + one-page) and display formatting.
  • Reduce regex re-scans and duplication.
  • Make future extensions (e.g., type:single, type:number, per-capture defaults) easier.

Non-goals (initial)

  • Not rewriting all format logic at once; aim for an incremental migration.
  • Not changing user-facing syntax or behavior in v1.

Sketch of IR

  • FormatAst: list of Segment nodes (Text, Token).
  • Token includes kind (VALUE, VDATE, FIELD, etc.), raw, name, options, span (start/end).
  • ValueOptions: label/default/type/custom/options list, normalized.
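
To make the sketch above concrete, here is one possible shape for the IR as TypeScript types. This is a minimal sketch, not a settled design: every identifier (`FormatAst`, `TokenSegment`, `Span`, `tokensOf`, the `defaultValue` field name) is an assumption for illustration, and the set of token kinds is deliberately incomplete.

```typescript
// Hypothetical IR types. All names here are illustrative assumptions.
type TokenKind = "VALUE" | "VDATE" | "FIELD" | "UNKNOWN";

interface Span {
  start: number; // inclusive offset into the format string
  end: number;   // exclusive offset
}

// Normalized per-token options (label/default/type/custom/options list).
interface ValueOptions {
  label?: string;
  defaultValue?: string;
  type?: string;      // e.g. "multiline"
  custom?: boolean;
  options: string[];  // always an array, possibly empty
}

interface TextSegment {
  type: "text";
  value: string;
  span: Span;
}

interface TokenSegment {
  type: "token";
  kind: TokenKind;
  raw: string;        // exact source text of the token
  name: string | undefined;
  options: ValueOptions;
  span: Span;
}

type Segment = TextSegment | TokenSegment;

interface FormatAst {
  source: string;
  segments: Segment[];
}

// Convenience accessor so higher layers never re-scan the string.
function tokensOf(ast: FormatAst): TokenSegment[] {
  return ast.segments.filter((s): s is TokenSegment => s.type === "token");
}

// Hand-built AST for the format string "Title: {{VALUE:title}}".
const exampleAst: FormatAst = {
  source: "Title: {{VALUE:title}}",
  segments: [
    { type: "text", value: "Title: ", span: { start: 0, end: 7 } },
    {
      type: "token",
      kind: "VALUE",
      raw: "{{VALUE:title}}",
      name: "title",
      options: { options: [] },
      span: { start: 7, end: 22 },
    },
  ],
};
```

Keeping `raw` and `span` on every token means a consumer can always fall back to the original text, which should help with the incremental migration described below.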

Incremental Plan

  1. Add parser that returns AST but continue to stringify existing replacements.
  2. Switch VALUE handling (base and named) to use AST + ValueOptions for prompting.
  3. Switch one-page preflight to use AST directly.
  4. Migrate remaining tokens.
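
Step 1 of the plan could look roughly like the sketch below: parse once into segments, and keep a `stringify` that reproduces the original string so existing regex-based replacements continue to work unchanged. The `{{KIND:name|key:value}}` shape, the function names `parseFormat` and `stringify`, and the flat `options` record are assumptions for illustration; the real grammar has more cases than this handles.

```typescript
// Simplified, self-contained segment type (hypothetical names).
type Segment =
  | { type: "text"; value: string; start: number; end: number }
  | {
      type: "token";
      kind: string;
      raw: string;
      name: string | undefined;
      options: Record<string, string>;
      start: number;
      end: number;
    };

// Single pass over the format string. Assumes tokens look like
// {{KIND}}, {{KIND:name}} or {{KIND:name|key:value|...}}.
function parseFormat(source: string): Segment[] {
  const segments: Segment[] = [];
  const tokenRe = /\{\{([A-Za-z]+)(?::([^}]*))?\}\}/g;
  let cursor = 0;
  let m: RegExpExecArray | null;
  while ((m = tokenRe.exec(source)) !== null) {
    const start = m.index;
    const raw = m[0];
    const end = start + raw.length;
    if (start > cursor) {
      // Literal text between the previous token and this one.
      segments.push({ type: "text", value: source.slice(cursor, start), start: cursor, end: start });
    }
    const parts = (m[2] ?? "").split("|");
    const name = parts[0] ? parts[0] : undefined;
    const options: Record<string, string> = {};
    for (const part of parts.slice(1)) {
      const colon = part.indexOf(":");
      if (colon === -1) options[part.trim()] = "";
      else options[part.slice(0, colon).trim()] = part.slice(colon + 1).trim();
    }
    segments.push({ type: "token", kind: m[1].toUpperCase(), raw, name, options, start, end });
    cursor = end;
  }
  if (cursor < source.length) {
    segments.push({ type: "text", value: source.slice(cursor), start: cursor, end: source.length });
  }
  return segments;
}

// Round-trips the AST back to the exact input string, so legacy
// string-based code paths can keep running during the migration.
function stringify(segments: Segment[]): string {
  return segments.map((s) => (s.type === "text" ? s.value : s.raw)).join("");
}
```

Because `stringify(parseFormat(s))` returns `s` unchanged, the parser can be introduced without altering any output, and later steps can switch individual token kinds over to the parsed `name` and `options`.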

Open Questions

  • Best location for parser (utils vs new parser module).
  • How to preserve compatibility with existing regex-based tokens, both for legacy behavior and for speed.
  • How to encode per-token warnings (unknown type, invalid combos).
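
On the last question, one option is to return warnings as data alongside the AST rather than throwing, with each warning carrying the token's span. The sketch below assumes that approach. The names (`Diagnostic`, `validateValueToken`, `KNOWN_TYPES`) are hypothetical, the list of known types is an assumption, and the "multiline plus option list" rule is an invented example of an invalid combination, not an existing rule.

```typescript
// Hypothetical diagnostic record attached to a token's source span.
interface Diagnostic {
  code: "unknown-type" | "invalid-combo";
  message: string;
  start: number;
  end: number;
}

// Minimal view of a parsed VALUE token needed for validation.
interface ValueTokenLike {
  raw: string;
  start: number;
  end: number;
  type?: string;
  hasOptionList: boolean;
}

// Assumption: only "multiline" is recognized in this sketch.
const KNOWN_TYPES: readonly string[] = ["multiline"];

function validateValueToken(token: ValueTokenLike): Diagnostic[] {
  const diagnostics: Diagnostic[] = [];
  if (token.type !== undefined && KNOWN_TYPES.indexOf(token.type) === -1) {
    diagnostics.push({
      code: "unknown-type",
      message: "Unknown type '" + token.type + "' in " + token.raw,
      start: token.start,
      end: token.end,
    });
  }
  // Hypothetical combo rule, purely for illustration.
  if (token.type === "multiline" && token.hasOptionList) {
    diagnostics.push({
      code: "invalid-combo",
      message: "type:multiline cannot be combined with an option list in " + token.raw,
      start: token.start,
      end: token.end,
    });
  }
  return diagnostics;
}
```

Returning a list keeps parsing non-fatal: the formatter can still fall back to the raw token text, while the UI decides whether and where to surface each warning.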
