Pipe in terminal regex not working as expected #1414

Open
@sidhiadkoli

What is your question?

I'm facing an issue with a pipe (`|`) inside a terminal regex.

Here is a subset of the grammar in question:

from lark import Lark

grammar = r"""
start: START

START: QUARTER [WS+ YEAR]
QUARTER: /q[1-4]/
WS: /\s/
YEAR: /(19[0-9]{2})|(20[0-3][0-9])/
"""

print(Lark(grammar).parse("q1 1923"))    # works
print(Lark(grammar).parse("q1 2023"))    # doesn't work

However, when we add parentheses around the entire YEAR regex, both example strings are parsed correctly.

This works:

from lark import Lark

grammar = r"""
start: START

START: QUARTER [WS+ YEAR]
QUARTER: /q[1-4]/
WS: /\s/
YEAR: /((19[0-9]{2})|(20[0-3][0-9]))/
"""

print(Lark(grammar).parse("q1 1923"))    # works
print(Lark(grammar).parse("q1 2023"))    # works now
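
For comparison, the same symptom can be reproduced with plain `re` when the sub-patterns are concatenated without grouping. This is only a sketch: it assumes (not confirmed here) that Lark joins the terminal patterns textually, and the `ungrouped` and `grouped` patterns below are hand-built for illustration, not taken from Lark. Since `|` has the lowest precedence in a regex, pasting YEAR in without its own group makes the alternation split "whitespace followed by 19xx" from "20xx with no whitespace".

```python
import re

# The same sub-patterns as in the grammar above.
QUARTER = r"q[1-4]"
WS = r"\s"
YEAR = r"(19[0-9]{2})|(20[0-3][0-9])"

# Hypothetical hand-built pattern: YEAR is pasted in without its own group,
# so the top-level "|" separates "\s+19xx" from "20xx".
ungrouped = re.compile(f"{QUARTER}(?:(?:{WS})+{YEAR})?")

# Same pattern, but YEAR is wrapped in a non-capturing group first.
grouped = re.compile(f"{QUARTER}(?:(?:{WS})+(?:{YEAR}))?")

for text in ("q1 1923", "q1 2023"):
    print(text, bool(ungrouped.fullmatch(text)), bool(grouped.fullmatch(text)))
```

With the ungrouped pattern only "q1 1923" matches in full, while the grouped pattern matches both strings, which mirrors the behavior seen with the two grammars.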

What am I missing here?
