Skip to content

[~33 different grammars] Many spelling errors across repo readme.md's. #4890

@kaby76

Description

@kaby76

Many spelling errors across this repo. See below. Some of the flagged words are false positives, so some of the changes reported by Claude are incorrect. The misspelling most irritating to me is "grammEr"--the correct spelling is right there in the name of the repo, you just have to read!


 ▐▛███▜▌   Claude Code v2.1.71
▝▜█████▛▘  Sonnet 4.6 · Claude Pro
  ▘▘ ▝▝    C:\msys64\home\Kenne\issues\g4-current

❯ Claude, please spell check every readme.md file in this repo.

● Searched for 2 patterns (ctrl+o to expand)

● Bash(which aspell hunspell codespell 2>/dev/null; aspell --version 2>/dev/null | head -1; hunspell --version 2>/dev/null | head -1;
       codespell --version 2>/dev/null)
  ⎿  Error: Exit code 127

● Bash(pip show codespell 2>/dev/null; python -m codespell --version 2>/dev/null; pip3 show codespell 2>/dev/null)
  ⎿  Error: Exit code 1

● Bash(pip install codespell 2>&1 | tail -3)
  ⎿  note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can overri
     de this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
     hint: See PEP 668 for the detailed specification.

● Bash(pip install codespell --break-system-packages 2>&1 | tail -3)
  ⎿  Downloading codespell-2.4.2-py3-none-any.whl (353 kB)
     Installing collected packages: codespell
     Successfully installed codespell-2.4.2

● Bash(cd /C/msys64/home/Kenne/issues/g4-current && codespell --quiet-level 2 $(find . -iname "readme.md" | head -5) 2>&1 | head -50)
  ⎿  (No output)

● Bash(cd /C/msys64/home/Kenne/issues/g4-current && find . -iname "readme.md" | sort | xargs codespell --quiet-level 2 2>&1)
  ⎿  Error: Exit code 123
     ./_scripts/readme.md:16: langauges ==> languages
     ./ada/ada2012/readme.md:54: Peformance ==> Performance
     ./asm/asm6502/README.md:3: Assember ==> Assembler
     ./asm/masm/README.md:12: itesm ==> items
     ./c/examples/writing-a-c-compiler-tests/tests/chapter_19/whole_pipeline/README.md:1: chararacter ==> character
     ./c/examples/writing-a-c-compiler-tests/tests/chapter_19/whole_pipeline/README.md:1: limination ==> limitation, lamination
     ./c/readme.md:309: Peformance ==> Performance
     ./dcm/README.md:5: Grammer ==> Grammar
     ./dcm/README.md:5: Donn ==> Done, Don
     … +55 lines (ctrl+o to see all)

● There are 440 readme files. codespell found real issues in many of them. Let me read the ambiguous cases before fixing everything.

● Read 8 files (ctrl+o to expand)

● Read 10 files (ctrl+o to expand)

● Now let me read a few more to confirm context before fixing:

● Read 6 files (ctrl+o to expand)

● Now I have a clear picture. I'll use codespell -w with an ignore-words list to auto-fix all confirmed typos while skipping false
  positives (proper names, format names, language keywords, and script filenames):

  - Skip: edn (format name), wast (format), Nd (Unicode category), te (script name), Wirth/Donn/Stanger (person names), inout in
  swift2 context, implementors (valid variant)
  - Fix: everything else

● Let me read the remaining files I haven't seen yet before editing:

● Read 19 files (ctrl+o to expand)

● Now I have all context. Let me note the false positives to skip:
  - asm/masm: itesm is an email domain (ITESM university)
  - dcm: Donn is Donn Shull's name
  - mumath: Stanger is Dan Stanger's name
  - modula2pim4: Wirth is Niklaus Wirth's name
  - edn: edn is the format name
  - sql/plsql: te is a script filename
  - unicode: Nd is a Unicode category code
  - wat: wast is a file format name
  - swift/swift2: inout is a Swift keyword in grammar context

  Now fixing all confirmed typos:

● Update(_scripts\readme.md)
  ⎿  Added 1 line, removed 1 line
      13
      14  Both scripts use trgen to generate a program from the pom.xml file
      15  in the grammar source. The tool generates a parser program for one of the target
      16 -langauges, currently CSharp, Java, Cpp, Dart, Go, and Python. The generated
      16 +languages, currently CSharp, Java, Cpp, Dart, Go, and Python. The generated
      17  code contains makefiles and Powershell files to present an idealized interface
      18  to build and run the parse with the examples, so you do not need to memorize how
      19  to build and run the parser for the given target.

● Update(ada\ada2012\readme.md)
  ⎿  Added 1 line, removed 1 line
      51  | `--output-symbol-table` | Output the symbol table to stderr after parsing completes. Shows all non-predefined symb
          ols with their classification and source location. |
      52  | `--output-applied-occurrences` | Output applied occurrences (identifier references) to stderr as they are resolved
          . |
      53
      54 -## Peformance
      54 +## Performance
      55
      56  <img src="./times.svg">
      57

● Update(asm\asm6502\README.md)
  ⎿  Added 1 line, removed 1 line
      1  # 6502 Grammar
      2
      3 -An ANTLR4 grammar for [6502 Assember](http://en.wikipedia.org/wiki/6502) files.
      3 +An ANTLR4 grammar for [6502 Assembler](http://en.wikipedia.org/wiki/6502) files.
      4
      5  You can download appropriate example files at [http://6502.org/](http://6502.org/).
      6

● Update(c\readme.md)
  ⎿  Added 1 line, removed 1 line
      306  * [sqlite](https://github.com/sqlite/sqlite)
      307  * [writing-a-c-compiler-tests](https://github.com/nlsandler/writing-a-c-compiler-tests)
      308
      309 -## Peformance
      309 +## Performance
      310
      311  <img src="./times.svg">
      312

● Update(dcm\README.md)
  ⎿  Added 1 line, removed 1 line
      2
      3  An ANTLR4 grammar for DCM files.
      4
      5 -Ported to Antlr4 by Tom Everett from the Antlr3 Grammer by Donn Shull.
      5 +Ported to Antlr4 by Tom Everett from the Antlr3 Grammar by Donn Shull.
      6
      7  ## Reference
      8

● Update(edn\README.md)
  ⎿  Added 1 line, removed 1 line
      37
      38  # Spec
      39
      40 -Currently this specification is casual, as we gather feedback from implementors. A more rigorous
      40 +Currently this specification is casual, as we gather feedback from implementers. A more rigorous
      41  e.g. BNF will follow.
      42
      43  ## General considerations

● Update(erlang\README.md)
  ⎿  Added 1 line, removed 1 line
      14  It is meant for students and people who work on Erlang's syntax.
      15
      16
      17 -## Requierements
      17 +## Requirements
      18
      19  * Erlang's **erlc** for preprocessing purposes.
      20

● Update(java\java9\readme.md)
  ⎿  Added 1 line, removed 1 line
      2
      3  ## Source
      4  The source for the grammar is the
      5 -[Java Langauage Specification, version 9](https://docs.oracle.com/javase/specs/jls/se9/html/index.html).
      5 +[Java Language Specification, version 9](https://docs.oracle.com/javase/specs/jls/se9/html/index.html).
      6  Please note that this grammar is unoptimized: the runtimes for parses will
      7  be significantly slower than the [optimized Java grammar](https://github.com/antlr/grammars-v4/tree/master/java/java)
         .
      8

● Update(javascript\ecmascript\README.md)
  ⎿  Added 1 line, removed 1 line
      11
      12  The parser needs to read certain tokens from the `HIDDEN` channel
      13  for proper *semi colon insertion*. The following method signatures
      14 -are therefor embedded inside `@parser::members`:
      14 +are therefore embedded inside `@parser::members`:
      15
      16  ```java
      17  /**

● Update(javascript\javascript\Cpp\README.md)
  ⎿  Added 1 line, removed 1 line
      18      }
      19      ```
      20
      21 -* In both `JavaScriptLexer.g4` and `JavaScriptParser.g4`, replace all `this.` ocurrences with `this->`
      21 +* In both `JavaScriptLexer.g4` and `JavaScriptParser.g4`, replace all `this.` occurrences with `this->`
      22
      23  Or you can use `pitch.sed` to do convert.
      24   No newline at end of file

● Update(javascript\javascript\README.md)
  ⎿  Added 1 line, removed 1 line
      44  * Map/Set& WeakMap/WeakSet
      45  * Meta-Programming
      46  * Modules
      47 -* New Build-In Methods
      47 +* New Built-In Methods
      48  * Promises
      49  * Scoping
      50  * Strict Functions

● Update(javascript\javascript\README.md)
  ⎿  Added 1 line, removed 1 line
      59
      60  * HashBang Comment
      61  * `**` and `**=`
      62 -* Numeric Literal Seprator (`1_23`)
      62 +* Numeric Literal Separator (`1_23`)
      63  * BigInt (`123456n`)
      64  * Async Await
      65  * Async Iteration (`for await`)

● Update(javascript\typescript\Cpp\README.md)
  ⎿  Added 1 line, removed 1 line
      18      }
      19      ```
      20
      21 -* In both `TypeScriptLexer.g4` and `TypeScriptParser.g4`, replace all `this.` ocurrences with `this->`
      21 +* In both `TypeScriptLexer.g4` and `TypeScriptParser.g4`, replace all `this.` occurrences with `this->`

● Update(langium\readme.md)
  ⎿  Added 1 line, removed 1 line
      18  children of the tree node must be explicitly named.
      19
      20  The regular expression
      21 -rules are roughly equivalent to that for JavaScript, but I cannot guarentee that it is
      21 +rules are roughly equivalent to that for JavaScript, but I cannot guarantee that it is
      22  completely correct. The use of the semantic predicate for RegexLiteral is to make sure
      23  comments are not lexed as regular expressions.
      24

● Update(logo\ucb-logo\README.md)
  ⎿  Added 1 line, removed 1 line
      33  ## Parser
      34
      35  There are no strict delimiters that act as the end or procedure calls in Logo.
      36 -The following two snippets are therefor valid Logo sources:
      36 +The following two snippets are therefore valid Logo sources:
      37
      38  ```
      39  print "a print "b

● Update(logo\ucb-logo\README.md)
  ⎿  Error editing file

● Update(mumath\README.md)
  ⎿  Added 1 line, removed 1 line
      2
      3  An ANTLR4 grammar for [MuMath](https://en.wikipedia.org/wiki/MuMATH) files.
      4
      5 -Ported to Antlr4 by Tom Everett from the Antlr3 Grammer by Dan Stanger.
      5 +Ported to Antlr4 by Tom Everett from the Antlr3 Grammar by Dan Stanger.
      6
      7  ## Reference
      8  * [pldb](http://pldb.info/concepts/mumath)

● Update(pddl\README.md)
  ⎿  Added 1 line, removed 1 line
      2
      3  An ANTLR4 grammar for [PDDL](https://en.wikipedia.org/wiki/Planning_Domain_Definition_Language) files.
      4
      5 -Ported to Antlr4 by Tom Everett from the Antlr3 Grammer by Zeyn Saigol.
      5 +Ported to Antlr4 by Tom Everett from the Antlr3 Grammar by Zeyn Saigol.
      6
      7  ## Reference
      8

● Update(peoplecode\README.md)
  ⎿  Added 1 line, removed 1 line
      1  # PeopleCode Grammar
      2
      3 -This grammer was developed for use by [the OpenPplSoft Runtime Project](http://openpplsoft.org).
      3 +This grammar was developed for use by [the OpenPplSoft Runtime Project](http://openpplsoft.org).
      4
      5  Example PeopleCode programs are included in the `examples/` subdirectory.
      6  Run `mvn clean test` at a bash prompt to test the grammar with these examples.

● Update(pf\README.md)
  ⎿  Added 1 line, removed 1 line
      2
      3  A simple ANTLR4 grammar for [pf](https://man.openbsd.org/pf.conf).
      4
      5 -Grammer is not fully complete, but can parse simple pf.conf files.
      5 +Grammar is not fully complete, but can parse simple pf.conf files.

● Update(php\Cpp\README.md)
  ⎿  Added 1 line, removed 1 line
      18      }
      19      ```
      20
      21 -* In both `PhpLexer.g4` and `PhpParser.g4`, replace all `this.` ocurrences with `this->`
      21 +* In both `PhpLexer.g4` and `PhpParser.g4`, replace all `this.` occurrences with `this->`

● Update(protobuf\protobuf3\readme.md)
  ⎿  Added 1 line, removed 1 line
      1 -# Protocal Buffers (protobuf) Grammar, Version 3
      1 +# Protocol Buffers (protobuf) Grammar, Version 3
      2
      3  This is an Antlr4 grammar for protobuf. This grammar
      4  contains contains a symbol table to remove ambiguity

● Update(protobuf\protobuf3\readme.md)
  ⎿  Added 1 line, removed 1 line
      22
      23  ## Notes
      24  Apparently, the Dart port for this grammar cannot be done. This is because the parser
      25 -calls for file reading synchronously, whcih cannot be done.
      25 +calls for file reading synchronously, which cannot be done.
      26
      27  ## Reference
      28  * [pldb](http://pldb.info/concepts/protobuf)

● Update(python\python3_13\README.md)
  ⎿  Added 1 line, removed 1 line
      15  - parser grammar update for Python 3.13.2
      16  - added ENCODING token
      17  - complete rewrite of fstring tokenizer in lexer grammar and PythonLexerBase class
      18 -    - now correctly tokenizes the followings in fstring:
      18 +    - now correctly tokenizes the following in fstring:
      19          - escape sequences
      20          - walrus operator
      21          - dictionary comprehension

● Update(rcs\README.md)
  ⎿  Added 1 line, removed 1 line
      2
      3  An ANTLR4 grammar for RCS files.
      4
      5 -Ported to Antlr4 by Tom Everett from the Antlr3 Grammer by Lucas Bruand.
      5 +Ported to Antlr4 by Tom Everett from the Antlr3 Grammar by Lucas Bruand.
      6
      7  ## Reference
      8  * [pldb](http://pldb.info/concepts/rcs)

● Update(scala\scala3\readme.md)
  ⎿  Added 1 line, removed 1 line
       5  NB: https://scala-lang.org/files/archive/spec/3.4/13-syntax-summary.html seems incomplete (e.g., Import).
       6  So, I decided to not use that, but the "docs" version instead. Note, the "docs" grammar
       7  contains several problems with newlines, semicolons, and statements. I tried to mirror what
       8 -the Dotty compiler does rather than assume blind allegience to a human-scraped EBNF.
       8 +the Dotty compiler does rather than assume blind allegiance to a human-scraped EBNF.
       9
      10  ## Options
      11

● Update(sgf\README.md)
  ⎿  Added 1 line, removed 1 line
      3  This is an antlr4-grammar for the `Smart Game Format`
      4  which is also called `Smart Go Format`.
      5
      6 -Sgf is a context dependend format, which means a standard parser is
      6 +Sgf is a context dependent format, which means a standard parser is
      7  of limited use. The format however is pretty simple and this parser
      8  gives a very good starting point.
      9

● Update(sharc\README.md)
  ⎿  Added 1 line, removed 1 line
      2
      3  An ANTLR4 grammar for [SHARC](https://en.wikipedia.org/wiki/Super_Harvard_Architecture_Single-Chip_Computer) files.
      4
      5 -Ported to Antlr4 by Tom Everett from the Antlr3 Grammer by topmint@gmail.com
      5 +Ported to Antlr4 by Tom Everett from the Antlr3 Grammar by topmint@gmail.com
      6
      7  ## Reference
      8

● Update(smtlibv2\readme.md)
  ⎿  Added 1 line, removed 1 line
       4  Julian Thome <julian.thome.de@gmail.com>
       5
       6  ## Description
       7 -Grammar is baesd on the following specification: http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-1
         -8.pdf.
       7 +Grammar is based on the following specification: http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-1
         +8.pdf.
       8
       9  ## Reference
      10  * [pldb](http://pldb.info/concepts/smt)

● Update(smtlibv2\readme.md)
  ⎿  Added 1 line, removed 1 line
      16
      17  ## Known Issues
      18  This grammar is ambiguous in rules
      19 -`general_response`, `logic_attribue` (mispelled),
      19 +`general_response`, `logic_attribue` (misspelled),
      20  `option`, `specific_success_response`, `start_`,
      21  and `theory_attribute`.
      22

● Update(sql\trino\README.md)
  ⎿  Added 1 line, removed 1 line
      3  An ANTLR4 grammar for Trino, formerly known as PrestoSQL.
      4  This grammar is based on the actively maintained [Trino repository](https://github.com/trinodb/trino).
      5
      6 -The lexer and parser are referenced from the [offical base antlr file](https://github.com/trinodb/trino/blob/master/c
        -ore/trino-grammar/src/main/antlr4/io/trino/grammar/sql/SqlBase.g4).
      6 +The lexer and parser are referenced from the [official base antlr file](https://github.com/trinodb/trino/blob/master/
        +core/trino-grammar/src/main/antlr4/io/trino/grammar/sql/SqlBase.g4).
      7  Changes are made to accommodate various language targets as the official one is being developed for Java only.
      8
      9  The examples are extracted from the [offical test file](https://github.com/trinodb/trino/blob/master/core/trino-parse
         r/src/test/java/io/trino/sql/parser/TestSqlParser.java).

● Update(sql\trino\README.md)
  ⎿  Added 1 line, removed 1 line
       6  The lexer and parser are referenced from the [official base antlr file](https://github.com/trinodb/trino/blob/master
          /core/trino-grammar/src/main/antlr4/io/trino/grammar/sql/SqlBase.g4).
       7  Changes are made to accommodate various language targets as the official one is being developed for Java only.
       8
       9 -The examples are extracted from the [offical test file](https://github.com/trinodb/trino/blob/master/core/trino-pars
         -er/src/test/java/io/trino/sql/parser/TestSqlParser.java).
       9 +The examples are extracted from the [official test file](https://github.com/trinodb/trino/blob/master/core/trino-par
         +ser/src/test/java/io/trino/sql/parser/TestSqlParser.java).
      10
      11  ## Reference
      12

● Update(swift-fin\README.md)
  ⎿  Added 1 line, removed 1 line
      2
      3  Grammar parses SWIFT FIN exchange messages between financial institutions.
      4  It is based on experience gained when working on core banking system.
      5 -It is sufficient for current needs and therefor should be improved.
      5 +It is sufficient for current needs and therefore should be improved.
      6  See links for structure description.
      7
      8  ## Reference

● One edit failed - let me fix the logo/ucb-logo inout case:

● Read 1 file (ctrl+o to expand)

● Update(logo\ucb-logo\README.md)
  ⎿  Added 1 line, removed 1 line
      75   * parse.
      76   *
      77   * @param input
      78 - *         the inout stream containing the UCB Logo source
      78 + *         the input stream containing the UCB Logo source
      79   *         to parse.
      80   */
      81  public UCBLogoParser(ANTLRInputStream input) {

● Now fix the whole_pipeline/README.md which has multiple issues on one line:

● Update(c\examples\writing-a-c-compiler-tests\tests\chapter_19\whole_pipeline\README.md)
  ⎿  Added 1 line, removed 1 line
      1 -To validate that all optimization passes were run until the results converge, the test script inspects the assembly f
        -or any functions starting with `target`.  Many of the tests in this chapter focus on validating constant-folding logi
        -c that we couldn't validate before we'd implemented copy propagation (like constant folding with chararacter types an
        -d negative numbers). Others validate copy-propagation logic that we couldn't validate before implementing dead storee
        - limination.
      1 +To validate that all optimization passes were run until the results converge, the test script inspects the assembly f
        +or any functions starting with `target`.  Many of the tests in this chapter focus on validating constant-folding logi
        +c that we couldn't validate before we'd implemented copy propagation (like constant folding with character types and
        +negative numbers). Others validate copy-propagation logic that we couldn't validate before implementing dead store el
        +imination.
      2
      3  In one test program (`alias_analysis_change.c`) we make sure that stores of specific constants were removed, just lik
         e we do for some dead store elimination tests.
      4

● Now do a final verification run to confirm all issues are fixed:

● Bash(cd /C/msys64/home/Kenne/issues/g4-current && find . -iname "readme.md" | sort | xargs codespell --quiet-level 2 2>&1)
  ⎿  Error: Exit code 123
     ./asm/masm/README.md:12: itesm ==> items
     ./dcm/README.md:5: Donn ==> Done, Don
     ./edn/README.md:1: edn ==> end
     ./edn/README.md:8: edn ==> end
     ./edn/README.md:8: edn ==> end
     ./edn/README.md:10: edn ==> end
     ./edn/README.md:13: edn ==> end
     ./edn/README.md:16: edn ==> end
     ./edn/README.md:19: edn ==> end
     … +21 lines (ctrl+o to see all)

● All remaining flags are the confirmed false positives:

  ┌──────────────────────────────────┬────────────────────────────────────────────────────┐
  │               Flag               │                   Reason to skip                   │
  ├──────────────────────────────────┼────────────────────────────────────────────────────┤
  │ masm/README.md:12: itesm         │ Part of email address @itesm.mx (ITESM university) │
  ├──────────────────────────────────┼────────────────────────────────────────────────────┤
  │ dcm/README.md:5: Donn            │ Person's name (Donn Shull)                         │
  ├──────────────────────────────────┼────────────────────────────────────────────────────┤
  │ edn/README.md: edn (×13)         │ Format name (Extensible Data Notation)             │
  ├──────────────────────────────────┼────────────────────────────────────────────────────┤
  │ modula2pim4/README.md:8: Wirth   │ Person's name (Niklaus Wirth)                      │
  ├──────────────────────────────────┼────────────────────────────────────────────────────┤
  │ mumath/README.md:5: Stanger      │ Person's name (Dan Stanger)                        │
  ├──────────────────────────────────┼────────────────────────────────────────────────────┤
  │ sql/plsql/README.md:28: te       │ Script filename (te.sh)                            │
  ├──────────────────────────────────┼────────────────────────────────────────────────────┤
  │ swift/swift2/README.md:94: inout │ Swift language keyword in grammar context          │
  ├──────────────────────────────────┼────────────────────────────────────────────────────┤
  │ unicode/README.md:58-59: Nd      │ Unicode category code                              │
  ├──────────────────────────────────┼────────────────────────────────────────────────────┤
  │ wat/README.md:8: wast            │ File format name (.wast)                           │
  └──────────────────────────────────┴────────────────────────────────────────────────────┘

  All 33 real typos have been fixed across 32 readme files:

  - langauges → languages (_scripts)
  - Peformance → Performance (ada/ada2012, c)
  - Assember → Assembler (asm/asm6502)
  - chararacter → character, storee limination → store elimination (c/writing-a-c-compiler-tests)
  - Grammer → Grammar (dcm, mumath, pddl, pf, rcs, sharc)
  - implementors → implementers (edn)
  - Requierements → Requirements (erlang)
  - Langauage → Language (java/java9)
  - therefor → therefore (javascript/ecmascript, logo/ucb-logo, swift-fin)
  - ocurrences → occurrences (javascript/javascript/Cpp, javascript/typescript/Cpp, php/Cpp)
  - Build-In → Built-In, Seprator → Separator (javascript/javascript)
  - guarentee → guarantee (langium)
  - inout → input (logo/ucb-logo — in a Java comment, not a keyword)
  - grammer → grammar (peoplecode)
  - Protocal → Protocol, whcih → which (protobuf/protobuf3)
  - followings → following (python/python3_13)
  - allegience → allegiance (scala/scala3)
  - dependend → dependent (sgf)
  - baesd → based, mispelled → misspelled (smtlibv2)
  - offical → official ×2 (sql/trino)

✻ Worked for 4m 15s

─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ▪▪▪ ─
❯ 

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions