Skip whitespace in SeqN grammar #1626

DavidLegg · 2025-02-07T02:16:56Z

Adds a skip expression for whitespace to the SeqN grammar, rather than dealing with whitespace by hand.
This should simplify the grammar, without significantly changing its meaning.

All grammar regression tests for legal SeqN examples passed unchanged.
A few examples of invalid SeqN parse slightly differently, but the new parse tree is still reasonable.
For these examples, I rebaselined the test to the new behavior.

DavidLegg · 2025-02-07T02:22:58Z

One more note to add on this PR: I think there's an opportunity to go farther, and skip newlines and maybe comments as well. I didn't do that here because I think it might expand what counts as legal SeqN, by effectively making all mandatory newlines optional, and allowing you to put comments anywhere. I don't know if that's an acceptable trade-off for having a simpler grammar. From a quick code search, it also looks like comments may be meaningful in some contexts, so skipping them may be completely off the table.

Add a skip expression for whitespace to the SeqN grammar, rather than dealing with whitespace by hand. This should simplify the grammar, without significantly changing its meaning. All grammar regression tests for legal SeqN examples pass unchanged. A few examples of invalid SeqN parse slightly differently, but the new parse tree is still reasonable. For these examples, rebaseline the test to the new behavior.

Condense similar rules together and in-line some very simple rules.

sonarqubecloud · 2025-02-07T02:32:15Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code


goetzrrGit

I think this was a good idea, but we are using whitespace as a delimiter. We want the user to put a space between certain elements, such as the time tag, command stem, and arguments. I believe this change should not be merged.

ex.

A2025-001T10:00:00ECHO"arg1"#test <--- Now valid with the new grammar change

Also, you might get ambiguity when parsing arguments, specifically number arguments.

ex.

C CMD_STEM 1 10
C CMD_STEM 110

goetzrrGit · 2025-02-11T21:29:00Z

src/utilities/sequence-editor/grammar.test.ts

-  ],
-  [
-    'Stem with disallowed characters',
-    `FSW_CMD%BAR$BAZ`,
-    `Sequence(Commands(
-Command(Stem,⚠),
-Command(Stem,⚠),
-Command(Stem,Args)
-))`,
+    `Sequence(Commands(Command(TimeTag(TimeComplete),⚠(Number),Stem,Args),Command(Stem,⚠,Args(Enum))))`,
  ],
+  ['Stem with disallowed characters', `FSW_CMD%BAR$BAZ`, `Sequence(Commands(Command(Stem,⚠,Args(Enum,⚠,Enum))))`],


Just curious: why was this test changed? I don't think it needed to be changed.

I think the new grammar just parsed it slightly differently, so I changed the expected output to match. It winds up parsing as a single command with some errors and enum arguments, rather than multiple commands with errors in between.
The formatting was changed by the formatting and linting tools. I'm not sure if those were being applied regularly to this file... they might be doing more harm than good here.

DavidLegg · 2025-02-12T18:39:34Z

Re: the way skipping whitespace works, it's true that this would allow that "smashed together" syntax. If that's unacceptable, we could perhaps try adding "word boundary" elements into the tokens. That would demand that there are non-word characters (whitespace being the only choice that would match the rest of the grammar) between the various tokens, without actually including those non-word characters in the token we match.
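To make the word-boundary idea concrete, here is a minimal sketch in plain TypeScript regexes rather than in the grammar itself. The patterns `TIME_TAG` and `STEM` and the helper `matchTimeThenStem` are hypothetical stand-ins, not the real SeqN token rules; the only point is that a trailing `\b` rejects a time tag that runs straight into a stem, while whitespace can still be skipped between tokens.

```typescript
// Hypothetical token patterns, NOT the real SeqN grammar rules.
// The trailing \b requires the token to end at a word boundary.
const TIME_TAG = /^A\d{4}-\d{3}T\d{2}:\d{2}:\d{2}\b/;
const STEM = /^[A-Z][A-Z0-9_]*\b/;

// Match a time tag, skip spaces and tabs, then match a stem.
// Returns null if either token fails to match.
function matchTimeThenStem(input: string): { time: string; stem: string } | null {
  const time = TIME_TAG.exec(input);
  if (time === null) {
    return null;
  }
  // The "skip" step: whitespace is dropped between tokens, not matched by them.
  const rest = input.slice(time[0].length).replace(/^[ \t]+/, '');
  const stem = STEM.exec(rest);
  if (stem === null) {
    return null;
  }
  return { time: time[0], stem: stem[0] };
}

const spaced = matchTimeThenStem('A2025-001T10:00:00 ECHO');
const smashed = matchTimeThenStem('A2025-001T10:00:00ECHO');
console.log(spaced, smashed);
```

In this sketch the spaced input matches and the smashed one does not, because there is no word boundary between `0` and `E`. Note that a regex `\b` does hold between `ECHO` and a following `"`, so in this regex form a boundary check alone would not force a space before a string argument; that case would need a separate rule.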

As for this example:

C CMD_STEM 1 10
C CMD_STEM 110

That's actually not ambiguous, because the "skip" happens after the tokenizer runs. That is, this gets parsed as the following tokens:

"C ", "CMD_STEM", whitespace, "1", whitespace, "10", whitespace,
"C ", "CMD_STEM", whitespace, "110", whitespace

Then the whitespace is dropped, and finally the grammar is applied, resulting in this parse tree (ignore the line numbers; I just tacked this onto another example I was working on):

    ┣━  Command [9:0..10:0]
    ┃   ┣━  TimeTag [9:0..9:2]
    ┃   ┃   ┗━  TimeComplete [9:0..9:2]: "C "
    ┃   ┣━  Stem [9:2..9:10]: "CMD_STEM"
    ┃   ┗━  Args [9:11..9:15]
    ┃       ┣━  Number [9:11..9:12]: "1"
    ┃       ┗━  Number [9:13..9:15]: "10"
    ┗━  Command [10:0..11:0]
        ┣━  TimeTag [10:0..10:2]
        ┃   ┗━  TimeComplete [10:0..10:2]: "C "
        ┣━  Stem [10:2..10:10]: "CMD_STEM"
        ┗━  Args [10:11..10:14]
            ┗━  Number [10:11..10:14]: "110"
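The tokenize, then skip, then parse order described above can be sketched outside the grammar as follows. This is an illustration under assumptions, not the actual parser: the `Token` type, the `PATTERNS` table, and the `tokenize` and `dropWhitespace` helpers are all hypothetical, and the patterns only cover the two example lines.

```typescript
type TokenKind = 'timeTag' | 'stem' | 'number' | 'whitespace';
type Token = { kind: TokenKind; text: string };

// Hypothetical patterns covering only the example lines, tried in order.
const PATTERNS: { kind: TokenKind; pattern: RegExp }[] = [
  { kind: 'timeTag', pattern: /^C / },
  { kind: 'whitespace', pattern: /^[ \t]+/ },
  { kind: 'number', pattern: /^\d+/ },
  { kind: 'stem', pattern: /^[A-Z][A-Z0-9_]*/ },
];

// Step 1: split one line into tokens, whitespace included.
function tokenize(line: string): Token[] {
  const tokens: Token[] = [];
  let rest = line;
  while (rest.length > 0) {
    let matched = false;
    for (const { kind, pattern } of PATTERNS) {
      const m = pattern.exec(rest);
      if (m !== null) {
        tokens.push({ kind, text: m[0] });
        rest = rest.slice(m[0].length);
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error(`Unexpected input: ${rest}`);
    }
  }
  return tokens;
}

// Step 2: drop whitespace tokens; the grammar would only see what remains.
function dropWhitespace(tokens: Token[]): Token[] {
  return tokens.filter((t) => t.kind !== 'whitespace');
}

const twoNumbers = dropWhitespace(tokenize('C CMD_STEM 1 10')).map((t) => t.text);
const oneNumber = dropWhitespace(tokenize('C CMD_STEM 110')).map((t) => t.text);
console.log(twoNumbers, oneNumber);
```

Because the number tokens are fixed before the whitespace is discarded, `1 10` stays two tokens and `110` stays one, which is why the two lines are not ambiguous.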

All of that said, I think we're going to go a different route for sequence templates, that doesn't need to modify the SeqN grammar. So, feel free to take this if you think it'll help with maintenance, or close it. It won't interfere with the sequence templates either way.

cartermak · 2025-04-22T21:14:41Z

@DavidLegg I'm going to close this PR in favor of a PR on the new aerie-sequence-languages repo

DavidLegg requested a review from a team as a code owner February 7, 2025 02:16

DavidLegg requested review from dandelany, AaronPlave and joswig February 7, 2025 02:16

DavidLegg temporarily deployed to test-workflow February 7, 2025 02:17 — with GitHub Actions Inactive

David Legg added 2 commits February 6, 2025 18:29

refactor: SeqN grammar

d91e442

Condense similar rules together and in-line some very simple rules.

DavidLegg force-pushed the feature/seqn-skip-ws branch from d21b30b to d91e442 Compare February 7, 2025 02:31

DavidLegg temporarily deployed to test-workflow February 7, 2025 02:31 — with GitHub Actions Inactive

goetzrrGit requested changes Feb 11, 2025


goetzrrGit added the DON'T MERGE Do Not Merge This Branch label Feb 11, 2025

cartermak closed this Apr 22, 2025
