Here's a comparison of the Burbidge implementation vs the Official ISO website spec.
╭─── Claude Code v2.1.71 ───────────────────────────────────────────────────────────────────────────────────╮
│ │ Tips for getting started │
│ Welcome back Ken! │ Ask Claude to create a new app or clone a repository │
│ │ ──────────────────────────────────────────────────── │
│ │ Recent activity │
│ ▐▛███▜▌ │ No recent activity │
│ ▝▜█████▛▘ │ │
│ ▘▘ ▝▝ │ │
│ Sonnet 4.6 · Claude Pro · ken.domino@gmail.com's │ │
│ Organization │ │
│ C:\msys64\home\Kenne\issues\g4-gql\gql │ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯
❯ Please compare the EBNF at https://standards.iso.org/iso-iec/39075/ed-1/en/ISO_IEC_39075(en).bnf.txt with the Antlr4 grammar file
at https://github.com/opengql/grammar/blob/main/GQL.g4. Report a name to name mapping, any odd renamings, and any deviations in the
EBNF rules.
● Fetch(https://standards.iso.org/iso-iec/39075/ed-1/en/ISO_IEC_39075(en).bnf.txt)
⎿ Received 76.9KB (200 )
● The WebFetch summarized instead of returning verbatim content. Let me fetch both files more directly.
● Bash(curl -s "https://standards.iso.org/iso-iec/39075/ed-1/en/ISO_IEC_39075(en).bnf.txt" | head -200)
⎿ ************************************************************************************************
This file is a "digital artifact" that contains the grammar specified by ISO/IEC 39075.
… +46 lines (ctrl+o to expand)
● Bash(gh api repos/opengql/grammar/contents/GQL.g4 --jq '.content' | base64 -d | head -200)
⎿ grammar GQL;
options { caseInsensitive = true; }
… +67 lines (ctrl+o to expand)
● Good start. Let me fetch both files in full for a thorough comparison.
● Bash(curl -s "https://standards.iso.org/iso-iec/39075/ed-1/en/ISO_IEC_39075(en).bnf.txt" > /tmp/iso_bnf.txt && wc -l
/tmp/iso_bnf.txt)
⎿ 3502 /tmp/iso_bnf.txt
● Bash(gh api repos/opengql/grammar/contents/GQL.g4 --jq '.content' | base64 -d > /tmp/gql.g4 && wc -l /tmp/gql.g4)
⎿ 3774 /tmp/gql.g4
● Read 2 files (ctrl+o to expand)
● Let me extract all rule names from both files for systematic comparison.
● Bash(# Extract ISO BNF rule names (lines starting with <...> ::=)
grep -oP '^<[^>]+>' /tmp/iso_bnf.txt | sed 's/^<//;s/>$//' | sort > /tmp/iso_rules.txt…)
⎿ 814 /tmp/iso_rules.txt
● Bash(# Extract ANTLR4 rule names (lines that are rule definitions: start of line, lowercase, end with newline before colon)
grep -oP '^[a-zA-Z][a-zA-Z0-9_]*(?=\s*$|\…)
⎿ 964 /tmp/g4_rules.txt
● Bash(# Better extraction of ANTLR4 parser rule names (start of line, lowercase first char, followed by word chars, then
whitespace/newline then colon or newline only…)
⎿ Traceback (most recent call last):
File "<string>", line 3, in <module>
content = open('/tmp/gql.g4').read()
… +3 lines (ctrl+o to expand)
● Bash(ls /tmp/iso_bnf.txt /tmp/gql.g4 2>&1 || echo "Files missing"
# Recheck where they ended up…)
⎿ /tmp/gql.g4
/tmp/iso_bnf.txt
33912e71-4475-4b8e-b935-9284f4f16469
33912e71-4475-4b8e-b935-9284f4f16469.jsonl
● Bash(python3 << 'EOF'
import re…)
⎿ Error: Exit code 1
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/iso_bnf.txt'
● Searched for 1 pattern (ctrl+o to expand)
● Bash(python3 /dev/stdin << 'PYEOF'
import re…)
⎿ Error: Exit code 1
File "C:/Program Files/Git/dev/stdin", line 1
!<symlink>\ufffd\ufffd/
^
SyntaxError: Non-UTF-8 code starting with '\xff' in file C:/Program Files/Git/dev/stdin on line 1, but no encoding declared; see
https://peps.python.org/pep-0263/ for details
● Bash(cat > /tmp/compare.py << 'PYEOF'
import re…)
⎿ Error: Exit code 1
Traceback (most recent call last):
File "C:/msys64/tmp/compare.py", line 4, in <module>
iso_content = open('/tmp/iso_bnf.txt').read()
~~~~^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/iso_bnf.txt'
● Bash(python3 -c "import os; print(os.path.abspath('/tmp/iso_bnf.txt'))")
⎿ C:/tmp/iso_bnf.txt
● Bash(TMPDIR=$(cygpath -w /tmp) && echo "TMPDIR=$TMPDIR"
cp /tmp/iso_bnf.txt /tmp/gql.g4 /c/tmp/ 2>/dev/null || mkdir -p /c/tmp && cp /tmp/iso_bnf.txt /tmp/gql.g4 /c/…)
⎿ TMPDIR=C:\msys64\tmp
ISO rules: 814
G4 parser rules: 571
… +2 lines (ctrl+o to expand)
● Bash(cat > /c/tmp/compare2.py << 'PYEOF'
import re…)
⎿ === ISO rules NOT in G4 (255) ===
<GQL language character> => GQLLanguageCharacter
<GQL special character> => GQLSpecialCharacter
… +98 lines (ctrl+o to expand)
● Now let me look more carefully at the significant deviations — what's missing in G4 vs collapsed, and content differences in shared
rules.
● Bash(cat > /c/tmp/deep_compare.py << 'PYEOF'
import re…)
⎿ === Rules in both - checking for structural deviations ===
!!! <value expression> (ISO) => valueExpression NOT IN G4 !!!
… +37 lines (ctrl+o to expand)
● Bash(cat > /c/tmp/compare3.py << 'PYEOF'
import re…)
⎿ === Content comparison for shared rules ===
<boolean value expression> (ISO: 3 alts) vs booleanValueExpression (G4: 1 alts) *** ALT COUNT DIFFERS ***
… +51 lines (ctrl+o to expand)
● Bash(cat > /c/tmp/compare4.py << 'PYEOF'
import re…)
⎿ === predicate ===
ISO: <comparison predicate>
| <exists predicate>
… +51 lines (ctrl+o to expand)
● Bash(cat > /c/tmp/compare5.py << 'PYEOF'
import re…)
⎿ === ISO rules missing from G4, classified ===
--- Likely moved to LEXER (punctuation/chars/tokens) ---
… +83 lines (ctrl+o to expand)
● Now I have enough data for a comprehensive report. Let me compile it.
---
ISO/IEC 39075 BNF vs opengql/grammar GQL.g4 Comparison
Rule counts: ISO BNF: 814 production rules · ANTLR4 G4: 571 parser rules + lexer rules
Exact name matches (after converting ISO names to camelCase): 559
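The extraction scripts used in the session are elided in the log, so the following is only a minimal sketch of how such rule counts could be gathered. The function names `iso_rule_names` and `antlr_parser_rule_names` are hypothetical, and the comment stripping is deliberately naive (it assumes comment markers do not appear inside quoted literals that span rule definitions).

```python
import re


def iso_rule_names(bnf_text: str) -> list[str]:
    """Return names defined by lines of the form '<name> ::= ...' at column 0."""
    return re.findall(r"^<([^>]+)>\s*::=", bnf_text, flags=re.MULTILINE)


def antlr_parser_rule_names(g4_text: str) -> list[str]:
    """Return identifiers that start lowercase at column 0 and are followed by ':'."""
    # Naive comment removal so commented-out rules are not counted.
    # Assumption: '/*' and '//' inside quoted literals do not hide real rules.
    no_block = re.sub(r"/\*.*?\*/", "", g4_text, flags=re.DOTALL)
    no_line = re.sub(r"//[^\n]*", "", no_block)
    return re.findall(r"^([a-z][A-Za-z0-9_]*)\s*:", no_line, flags=re.MULTILINE)


# Tiny inline samples (not the real files) to show the behaviour.
sample_bnf = (
    "<GQL-program> ::=\n"
    "    <program activity>\n\n"
    "<program activity> ::=\n"
    "    <session activity>\n"
)
sample_g4 = (
    "grammar GQL;\n\n"
    "gqlProgram\n    : programActivity EOF\n    ;\n\n"
    "// oldRule: x;\n"
    "programActivity\n    : sessionActivity\n    ;\n\n"
    "SEMI: ';';\n"
)

print(iso_rule_names(sample_bnf))
print(antlr_parser_rule_names(sample_g4))
```

Lexer rules (uppercase first letter) are skipped on purpose, which is why the parser-rule count can be much lower than the total number of rule-like lines in the grammar file.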
---
1. Name-to-Name Mapping (Standard Transformation)
The ANTLR4 file systematically converts ISO BNF names by the rule:
- <space- or hyphen-separated name> → camelCaseName (angle brackets dropped, spaces and hyphens removed, each word after the first capitalized)
Examples of clean mappings:
┌─────────────────────────────┬─────────────────────────┐
│ ISO BNF │ ANTLR4 │
├─────────────────────────────┼─────────────────────────┤
│ <GQL-program> │ gqlProgram │
├─────────────────────────────┼─────────────────────────┤
│ <session set command> │ sessionSetCommand │
├─────────────────────────────┼─────────────────────────┤
│ <procedure body> │ procedureBody │
├─────────────────────────────┼─────────────────────────┤
│ <start transaction command> │ startTransactionCommand │
├─────────────────────────────┼─────────────────────────┤
│ <graph pattern quantifier> │ graphPatternQuantifier │
└─────────────────────────────┴─────────────────────────┘
Note on <GQL-program>: applying the rule strictly would yield GQLProgram, but ANTLR4 uses gqlProgram (lowercase first word) because
ANTLR4 parser rule names must start with a lowercase letter.
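The naming convention described above can be sketched as a small helper. This is an illustration only: `iso_to_antlr_name` is a hypothetical name, and lowercasing the whole first word (so that `GQL` becomes `gql`) is an assumption based on the `gqlProgram` example, not something confirmed for every rule.

```python
import re


def iso_to_antlr_name(iso_name: str) -> str:
    """Convert an ISO BNF name such as '<session set command>' to camelCase."""
    # Drop the angle brackets, then split on spaces, hyphens, and slashes.
    words = [w for w in re.split(r"[\s/-]+", iso_name.strip("<> ")) if w]
    # Assumption: the first word is fully lowercased (GQL -> gql).
    first, rest = words[0].lower(), words[1:]
    return first + "".join(w[0].upper() + w[1:] for w in rest)


for name in ("<GQL-program>", "<session set command>",
             "<source/destination predicate>", "<non-reserved word>"):
    print(name, "=>", iso_to_antlr_name(name))
```

A name whose converted form does not appear in the grammar (for example `nonReservedWord` versus the grammar's `nonReservedWords`) is then a candidate for the renaming list in the next section.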
---
2. Odd / Deliberate Renamings
┌─────────────────────────────────────┬───────────────────────────────────┬─────────────────────────────────────────────────────┐
│ ISO BNF Rule │ ANTLR4 Rule │ Reason │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <source/destination predicate> │ sourceDestinationPredicate │ Slash is illegal in an identifier; renamed │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <non-reserved word> (singular) │ nonReservedWords (plural) │ Changed to plural without comment │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <fold> │ foldCharacterString │ More descriptive name │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <normalize function> │ normalizeCharacterString │ More descriptive name │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <substring function> │ subCharacterOrByteString │ Broadened to cover byte strings too │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <single-character trim function> │ trimSingleCharacterOrByteString │ Broadened to cover byte strings │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <multi-character trim function> │ trimMultiCharacterCharacterString │ Renamed │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <trim character string> │ trimCharacterOrByteString │ Broadened to byte string │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <trim source> │ trimCharacterOrByteStringSource │ Broadened to byte string │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <catalog-modifying procedure │ (commented out) │ Merged into procedureSpecification; distinction │
│ specification> │ │ made semantically │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <data-modifying procedure │ (commented out) │ Same as above │
│ specification> │ │ │
├─────────────────────────────────────┼───────────────────────────────────┼─────────────────────────────────────────────────────┤
│ <query specification> │ (commented out) │ Same as above │
└─────────────────────────────────────┴───────────────────────────────────┴─────────────────────────────────────────────────────┘
---
3. New Rules in G4 Not Present in ISO
┌───────────────────────────────┬───────────────────────────────────────────────────────────────────────────────────────────────┐
│ ANTLR4 Rule │ Description │
├───────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ characterOrByteStringFunction │ New grouping rule combining subCharacterOrByteString, trim, fold, normalize functions │
├───────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ valueFunction │ New aggregation rule combining numeric, datetime, duration, char/byte, and list value │
│ │ functions │
├───────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ nonReservedWords │ Renamed from ISO <non-reserved word> (see section 2) rather than strictly new; enumerates │
│ │ the non-reserved words inline (the ISO rule delegates to a list elsewhere) │
└───────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────┘
---
4. Structural Deviations in Shared Rules
4a. Expression Hierarchy — Collapsed to valueExpression
ISO defines a full type-directed expression hierarchy; G4 collapses many branches to simply delegate to valueExpression:
┌──────────────────────────────────┬───────────────────────────────────────────────────────────────────────┬──────────────────┐
│ ISO Rule │ ISO Body (abbreviated) │ G4 Body │
├──────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────────┤
│ <value expression> │ <common value expression> | <boolean value expression> │ ABSENT │
├──────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────────┤
│ <common value expression> │ <numeric | string | datetime | duration | list | record | path | │ ABSENT │
│ │ reference> │ │
├──────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────────┤
│ <boolean value expression> │ <boolean term> | … OR … | … XOR … │ : │
│ │ │ valueExpression │
├──────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────────┤
│ <character string value │ <char string concat> | <char string primary> │ : │
│ expression> │ │ valueExpression │
├──────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────────┤
│ <list value expression> │ <list concatenation> | <list primary> │ : │
│ │ │ valueExpression │
├──────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────────┤
│ <duration value expression> │ <duration term> | <duration addition> | <datetime subtraction> │ : │
│ │ │ valueExpression │
├──────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────────┤
│ <path value expression> │ <path value concatenation> | <path value primary> │ : │
│ │ │ valueExpression │
├──────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────────┤
│ <datetime value expression> │ <datetime primary> | … ± durations … │ ABSENT │
└──────────────────────────────────┴───────────────────────────────────────────────────────────────────────┴──────────────────┘
In ISO, these type-specific expression rules serve only to express typing constraints in the syntax; G4 defers all those
checks to semantic analysis.
4b. Boolean Expression Hierarchy — Completely Flattened
ISO defines the chain booleanValueExpression → booleanTerm → booleanFactor → booleanTest → booleanPrimary → booleanPredicand. All five
rules below booleanValueExpression are absent from G4:
┌─────────────────────┬──────────────────────────────────────────────────────────────────────────┐
│ Missing ISO Rule │ ISO Definition │
├─────────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ <boolean test> │ <boolean primary> [ IS [ NOT ] <truth value> ] │
├─────────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ <boolean primary> │ <predicate> | <boolean predicand> │
├─────────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ <boolean predicand> │ <parenthesized bool expr> | <non-parenthesized value expression primary> │
├─────────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ <boolean factor> │ [ NOT ] <boolean test> │
├─────────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ <boolean term> │ <boolean factor> | <boolean term> AND <boolean factor> │
└─────────────────────┴──────────────────────────────────────────────────────────────────────────┘
4c. Numeric Expression Hierarchy — Expanded Inline
ISO uses the classic term/factor/numericPrimary precedence ladder; G4 eliminates those rules and inlines everything into
numericValueExpression:
┌──────────────────────────────────────────────┬───────────────────────────────────────────────────────────────────────────────┐
│ ISO │ G4 │
├──────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ <factor> → [ <sign> ] <numeric primary> │ Absent │
├──────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ <term> → <factor> | <term> * <factor> | │ Absent │
│ <term> / <factor> │ │
├──────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ <numeric value expression> → <term> | … + … │ 5 alternatives: unary sign, *, /, +, -, plus direct valueExpressionPrimary │
│ | … - … │ and numericValueFunction │
└──────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────────────────┘
4d. Label Expression Hierarchy — Flattened with Labeled Alternatives
ISO defines a stratified precedence hierarchy (one rule per precedence level): labelPrimary → labelFactor → labelTerm (via labelConjunction) → labelExpression
(via labelDisjunction). G4 collapses all into a single labelExpression with ANTLR4 labeled alternatives:
┌─────────────────────┬────────────────────────────────────────────────────────────────────┐
│ Missing ISO Rule │ ISO Definition │
├─────────────────────┼────────────────────────────────────────────────────────────────────┤
│ <label primary> │ <label name> | <wildcard label> | <parenthesized label expression> │
├─────────────────────┼────────────────────────────────────────────────────────────────────┤
│ <label factor> │ <label primary> | <label negation> │
├─────────────────────┼────────────────────────────────────────────────────────────────────┤
│ <label term> │ <label factor> | <label conjunction> │
├─────────────────────┼────────────────────────────────────────────────────────────────────┤
│ <label conjunction> │ <label term> & <label factor> │
├─────────────────────┼────────────────────────────────────────────────────────────────────┤
│ <label disjunction> │ <label expression> <vertical bar> <label term> │
└─────────────────────┴────────────────────────────────────────────────────────────────────┘
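To illustrate why a single flat rule can stand in for the layered ISO rules, here is a minimal Python sketch of a precedence-climbing parser for label expressions. It is not ANTLR4's generated parser, only an analogous technique. The operator set (`!` for negation, `&` for conjunction, `|` for disjunction, `%` for the wildcard) and all function names are assumptions made for the example.

```python
import re

# Binding strength of the binary label operators: '&' binds tighter than '|'.
PRECEDENCE = {"|": 1, "&": 2}
NODE_NAME = {"|": "or", "&": "and"}


def tokenize(text):
    """Split into label names and the single-character operators."""
    return re.findall(r"[A-Za-z_]\w*|[!&|()%]", text)


def parse_unary(tokens):
    """Plays the role of <label factor>/<label primary> in the layered grammar."""
    tok = tokens.pop(0)
    if tok == "!":
        return ("not", parse_unary(tokens))
    if tok == "(":
        inner = parse_label_expression(tokens)
        tokens.pop(0)  # consume ')'
        return inner
    return tok  # label name or '%' wildcard


def parse_label_expression(tokens, min_prec=1):
    """One flat rule; precedence comes from the table, not from rule nesting."""
    left = parse_unary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Left-associative: the right operand may only contain tighter operators.
        right = parse_label_expression(tokens, PRECEDENCE[op] + 1)
        left = (NODE_NAME[op], left, right)
    return left


def parse(text):
    return parse_label_expression(tokenize(text))


print(parse("A|B&!C"))
print(parse("(A|B)&C"))
```

The tree shapes match what the layered rules would produce: conjunction groups before disjunction, and negation applies to the nearest factor.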
4e. <predicate> — Missing Two Alternatives
ISO <predicate> has 11 alternatives; G4 predicate has 9. Missing:
- <comparison predicate> — moved into valueExpression per G4 comment: "moved to valueExpression to avoid left mutually recursive
productions"
- <normalized predicate> — absent with no comment; normalizedPredicate rule exists in G4 but is not referenced from predicate
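Alternative counts like these depend on splitting a rule body only at top-level `|` symbols. A minimal sketch of such a splitter follows; `split_alternatives` is a hypothetical helper, it does not handle escaped quotes, and the rule bodies in the usage lines are abbreviated illustrations rather than the actual text of either file.

```python
def split_alternatives(body: str) -> list[str]:
    """Split a rule body on '|' at nesting depth 0, skipping quoted literals."""
    alts, current, depth, in_quote = [], [], 0, False
    for ch in body:
        if ch == "'":
            in_quote = not in_quote
        elif not in_quote:
            if ch in "([{":
                depth += 1
            elif ch in ")]}":
                depth -= 1
            elif ch == "|" and depth == 0:
                # Top-level separator: close the current alternative.
                alts.append("".join(current).strip())
                current = []
                continue
        current.append(ch)
    alts.append("".join(current).strip())
    return alts


# Abbreviated, illustrative bodies (not the full rules from either file).
iso_alts = {"comparisonPredicate", "existsPredicate", "normalizedPredicate"}
g4_alts = set(split_alternatives("existsPredicate"))
print(sorted(iso_alts - g4_alts))
print(split_alternatives("a | b (c | d) | 'x|y'"))
```

Comparing the resulting sets, rather than just the counts, shows which alternatives moved and not merely that the numbers differ.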
4f. <value expression primary> — Significantly Expanded
ISO: 2 alternatives. G4: 11 alternatives. G4 absorbs nonParenthesizedValueExpressionPrimarySpecialCase inline and also absorbs
ISO <property reference> (<property source> . <property name>) directly as the alternative valueExpressionPrimary PERIOD propertyName,
marked with the comment // <property reference>.
The ISO rule <record expression> → <value expression primary> is simply absent; semantics are embedded in the grammar indirectly.
4g. <path term> and Path Pattern Rules
┌────────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────┐
│ ISO │ G4 │
├────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────┤
│ <path term> → <path factor> | <path concatenation> │ pathTerm : pathFactor+ — repetition replaces explicit │
│ │ concatenation rule │
├────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────┤
│ <quantified path primary> → <path primary> <graph pattern │ Inlined into pathFactor as labeled alt #pfQuantifiedPathPrimary │
│ quantifier> │ │
├────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────┤
│ <questioned path primary> → <path primary> <question mark> │ Inlined into pathFactor as labeled alt #pfQuestionedPathPrimary │
├────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────┤
│ <path pattern union> → <path term> | <path term> … │ Absent; handled upstream │
├────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────┤
│ <path concatenation> → <path term> <path factor> │ Absent; replaced by pathFactor+ │
├────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────┤
│ <path multiset alternation> → <path term> |+| <path term> … │ Absent │
└────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────┘
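Replacing the left-recursive <path concatenation> rule with pathFactor+ changes the tree shape but not the set of accepted inputs. As a sketch, a consumer that wants the ISO-style left-nested structure could rebuild it from the flat list; `as_left_nested` and the tuple encoding are hypothetical choices for the example.

```python
from functools import reduce


def as_left_nested(factors):
    """Rebuild the left-recursive concatenation shape from a flat pathFactor+ list."""
    if not factors:
        raise ValueError("pathFactor+ requires at least one factor")
    # Fold from the left: ((f1 f2) f3) ..., mirroring <path term> <path factor>.
    return reduce(lambda term, factor: ("pathConcatenation", term, factor), factors)


print(as_left_nested(["f1", "f2", "f3"]))
print(as_left_nested(["f1"]))
```

A single factor is returned unchanged, which corresponds to the ISO alternative where <path term> is just a <path factor>.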
4h. <trim operands> — Uses Renamed References
G4 trimOperands uses trimCharacterOrByteString (new name) and trimCharacterOrByteStringSource instead of ISO's <trim character
string> and <trim source>.
4i. <boolean literal> — Missing
ISO defines <boolean literal> → TRUE | FALSE | UNKNOWN. G4 has no booleanLiteral parser rule; instead TRUE, FALSE, UNKNOWN are
likely handled as keyword tokens in the lexer/literal rules.
---
5. Rules Present in ISO but Absent from G4 (Parser-Level, Significant)
Beyond what's covered above, notable absent rules:
┌─────────────────────────────────────────────┬──────────────────────────────────┬─────────────────────────────────────────────┐
│ ISO Rule │ ISO Definition │ Disposition in G4 │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <byte string literal> │ X'...' hex literal │ No dedicated parser rule; token-level │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <literal> │ <signed numeric literal> | │ Absent; unsignedLiteral / │
│ │ <general literal> │ signedNumericLiteral used differently │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <constructed value type> │ <path|list|record type> │ No grouping rule │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <dynamic union type> / <open dynamic union │ Type system │ Absent │
│ type> / <closed dynamic union type> │ │ │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <component type> / <component type list> │ Record type components │ Absent │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <string value expression> │ <char string|byte string value │ Absent │
│ │ expression> │ │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <general parameter reference> │ $ <parameter name> │ GENERAL_PARAMETER_REFERENCE is a lexer │
│ │ │ token │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <substituted parameter reference> │ $$ <parameter name> │ Similarly a lexer token │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <authorization identifier> │ Used in schema │ Absent │
├─────────────────────────────────────────────┼──────────────────────────────────┼─────────────────────────────────────────────┤
│ <external object reference> │ Implementation-defined │ Absent │
└─────────────────────────────────────────────┴──────────────────────────────────┴─────────────────────────────────────────────┘
---
Summary Table
┌──────────────────────────────────────────────────────┬───────┐
│ Category │ Count │
├──────────────────────────────────────────────────────┼───────┤
│ ISO rules cleanly mapped to G4 (camelCase) │ 559 │
├──────────────────────────────────────────────────────┼───────┤
│ ISO rules absorbed into lexer tokens │ ~155 │
├──────────────────────────────────────────────────────┼───────┤
│ ISO parser rules absent from G4 (collapsed/semantic) │ ~50 │
├──────────────────────────────────────────────────────┼───────┤
│ G4 parser rules with different structure than ISO │ ~15 │
├──────────────────────────────────────────────────────┼───────┤
│ G4 rules with no ISO counterpart (new/renamed) │ 12 │
└──────────────────────────────────────────────────────┴───────┘
The overarching pattern: G4 aggressively collapses ISO's multi-level expression precedence hierarchies (boolean, numeric, string,
datetime, path, label) into flatter structures, relying either on ANTLR4's support for directly left-recursive rules (where the order of
alternatives sets operator precedence) or on simple delegation to valueExpression, and moves purely lexical structure (punctuation
names, character classes, comment syntax) entirely to lexer rules.
✻ Cogitated for 4m 25s