Macaulay2 language grammar
Note: This grammar was AI-generated by Claude Opus 4.8. 🤖
EBNF description of the Macaulay2 expression grammar. This matches the semantics of Macaulay2's internal Pratt parser and can serve as a reference for writing any Macaulay2 parser.
identifier ::= alpha (alpha | [0-9] | "'" | "$")*
(* alpha = ASCII letter, or any non-ASCII Unicode letter
that is not an operator symbol *)
integer ::= [0-9]+
| ("0b" | "0B") [01]+
| ("0o" | "0O") [0-7]+
| ("0x" | "0X") [0-9a-fA-F]+
float ::= ( [0-9]+ ("." [0-9]*)? | "." [0-9]+ )
("p" [0-9]+)? ([eE] [+-]? [0-9]+)?
string ::= '"' ([^"\\] | "\\" .)* '"'
| "///" .*? "///" (* non-greedy *)
comment ::= "--" [^\n]*
| "-*" .*? "*-" (* non-greedy *)
| "#!" [^\n]* (* only on line 1, column 0 *)
A token matches float only when it contains a fractional part, a `p`
precision field, or an exponent; a bare `[0-9]+` is an integer. (The two
productions above overlap on bare digit sequences; the presence of `.`, `p`,
or `[eE]` is what distinguishes a float.)
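As a rough illustration of the integer/float rule above, the two productions can be transcribed into regular expressions and tried in order. This is only a sketch: `classify_number` is a hypothetical helper, not part of Macaulay2, and it classifies a complete candidate token rather than performing the scanner's longest-match tokenization.

```python
import re

# Regexes transcribed from the integer and float productions above.
INTEGER = re.compile(r"(?:0[bB][01]+|0[oO][0-7]+|0[xX][0-9a-fA-F]+|[0-9]+)\Z")
FLOAT = re.compile(
    r"(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:p[0-9]+)?(?:[eE][+-]?[0-9]+)?\Z"
)


def classify_number(text: str) -> str:
    """Return 'integer', 'float', or 'neither' for a complete candidate token.

    Hypothetical helper: integer is tried first, so a bare digit sequence
    (which both productions accept) is reported as an integer.
    """
    if INTEGER.match(text):
        return "integer"
    if FLOAT.match(text):
        return "float"
    return "neither"


if __name__ == "__main__":
    for sample in ["42", "0x1F", "1.5", ".5", "1p53", "1e10", "0b12"]:
        print(sample, classify_number(sample))
```

Trying integer first is what resolves the overlap: anything containing `.`, `p`, or an exponent fails the integer pattern and falls through to float.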
Each table is ordered from highest to lowest precedence (top to bottom). Operators on the same row share the same precedence level. The name in the first column is used as a reference label in the grammar rules below.
| Name | Assoc | Operators |
|---|---|---|
| element-access | left | `#` `#?` `.` `.?` `^` `^**` `^<` `^<=` `^>` `^>=` `_` `_<` `_<=` `_>` `_>=` `\|_` |
| composition | left | `@@` `@@?` |
| (adjacent: no symbol, see Tier 2) | right | |
| direct-sum | right | `@` |
| multiplicative | left | `%` `*` `/` `//` |
| quotient | right | `\` `\\` |
| tensor | left | `**` `⊠` `⧢` |
| cdot | left | `·` |
| additive | left | `+` `++` `-` |
| range | left | `..` `..<` |
| intersection | left | `&` |
| exterior-power | left | `^^` |
| union | left | `\|` |
| coercion | right | `:` |
| vertical-concatenation | left | `\|\|` |
| comparison | right | `!=` `<` `<=` `=!=` `==` `===` `>` `>=` `?` `~` |
| and | right | `and` |
| xor | right | `xor` |
| or | right | `??` `or` |
| implication | right | `<==` `==>` |
| biconditional | right | `<==>` |
| long-implication | right | `<===` `===>` |
| entailment | right | `\|-` |
| output | left | `<<` |
| assignment | right | `=` `:=` `->` `=>` `<-` `>>` `+=` `-=` `*=` `/=` `//=` `%=` `**=` `++=` `..=` `..<=` `<<=` `>>=` `??=` `@=` `@@=` `@@?=` `\=` `\\=` `^=` `^**=` `^^=` `_=` `&=` `\|-=` `\|=` `\|_=` `\|\|=` `~=` `<==>=` `===>=` `==>=` `·=` `⊠=` `⧢=` |
| sequence | left | `,` |
The adjacent row marks the boundary between strong operators (above, prec > adjacent) and weak operators (below, prec < adjacent).
Note: in the interpreter, multiplicative and quotient are actually a
single precedence level with mixed associativity (% * / // left, \ \\
right). Splitting them into the two rows above is observationally equivalent —
the right-associative operators bind their right operand one level lower — so a
parser may use either representation.
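To make the role of the table concrete, here is a minimal precedence-climbing sketch over a handful of the weak binary operators. It is not Macaulay2's implementation: the numeric levels are hypothetical (only their relative order is taken from the table), operands are single tokens, and `parse` is an illustrative name.

```python
# Hypothetical numeric levels; only the relative order and the
# associativity of each row are taken from the Binary Operators table.
OPS = {
    "*": (6, "left"), "/": (6, "left"),   # multiplicative
    "\\": (5, "right"),                   # quotient
    "+": (4, "left"), "-": (4, "left"),   # additive
    ":": (3, "right"),                    # coercion
    "=": (2, "right"),                    # assignment
    ",": (1, "left"),                     # sequence
}


def parse(tokens):
    """Parse a token list (operands and operators from OPS) into nested tuples."""
    pos = 0

    def climb(min_level):
        nonlocal pos
        left = tokens[pos]  # an operand: any token that is not an operator
        pos += 1
        while pos < len(tokens) and tokens[pos] in OPS:
            op = tokens[pos]
            level, assoc = OPS[op]
            if level < min_level:
                break
            pos += 1
            # Left-associative: the right operand must bind strictly tighter.
            # Right-associative: the right operand may reuse the same level.
            right = climb(level + 1 if assoc == "left" else level)
            left = (op, left, right)
        return left

    return climb(0)


if __name__ == "__main__":
    print(parse("a - b - c".split()))
    print(parse("a = b = c".split()))
    print(parse("a + b * c".split()))
```

In this sketch, associativity is the only difference between the two branches of the recursive call, which is why a single table of (level, associativity) pairs is enough to drive the parser.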
All postfix operators have higher precedence than adjacent. Rows are ordered from highest to lowest precedence.
| Name | Operators |
|---|---|
| shriek | `!` `^!` `_!` |
| sheaf | `^*` `^~` `_*` `_~` |
| sum-of-twists | `(*)` |
Rows are ordered from highest to lowest precedence. Operators marked with (also binary) appear in both this table and the binary table; context (whether a left operand is present) determines which role applies.
| Name | Operators |
|---|---|
| count | `#` (also binary: element-access) |
| star | `*` (also binary: multiplicative) |
| sign | `+` `-` (also binary: additive) |
| comparison-test | `<` `<=` `>` `>=` `?` `~` (also binary: comparison) |
| not | `not` |
| null-test | `??` (also binary: or) |
| left-implication | `<==` (also binary: implication) |
| long-left-implication | `<===` (also binary: long-implication) |
| deduction | `\|-` (also binary: entailment) |
| output | `<<` (also binary: output) |
| control-flow | `break` `breakpoint` `catch` `continue` `elapsedTime` `elapsedTiming` `finish` `profile` `return` `shield` `step` `TEST` `throw` `time` `timing` `trap` |
| comma | `,` (also binary: sequence) |
Note: count (#) sits at the same precedence level as adjacent.
It is unary prefix when no left operand is present; binary element-access (at a
higher level) when preceded by an expression.
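The "is a left operand present?" test can be illustrated with a small pass over a token list. This is a simplified sketch, not the interpreter's logic: `tag_roles` is a hypothetical helper, every token that is neither a dual-use operator nor a bracket is treated as an operand, and postfix operators and keywords are ignored.

```python
# Operators marked "(also binary)" in the unary prefix table above.
DUAL_USE = {"#", "*", "+", "-", "<", "<=", ">", ">=", "?", "~",
            "??", "<==", "<===", "|-", "<<", ","}
OPENERS = {"(", "[", "{", "<|"}
CLOSERS = {")", "]", "}", "|>"}


def tag_roles(tokens):
    """Label each dual-use operator token as 'prefix' or 'binary'.

    Assumption of this sketch: a left operand is "present" exactly when the
    previous token was an operand or a closing bracket.
    """
    roles = []
    have_left = False
    for tok in tokens:
        if tok in DUAL_USE:
            roles.append((tok, "binary" if have_left else "prefix"))
            have_left = False  # an operator never completes an expression
        elif tok in OPENERS:
            have_left = False  # a new expression starts inside the bracket
        else:
            have_left = True   # operand or closing bracket
    return roles


if __name__ == "__main__":
    print(tag_roles(["-", "x", "+", "y"]))
    print(tag_roles(["(", "#", "L", ")", "#", "i"]))
```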
source_file ::= statement* expression?
statement ::= expression (newline+ | ";")
The general expression type is a flat union of all forms. The adjacent rule is what distinguishes M2's grammar from most other languages: function application is written by juxtaposition, and it sits at a middle precedence level (between the high-precedence element-access/composition operators and the low-precedence arithmetic/logical operators). Macaulay2's own parser handles this with a Pratt parser; see the Adjacent section below for the disambiguation rule.
expression ::= token
| adjacent
| strong_binary
| binary
| unary
| postfix
| parentheses
| if_expr
| quote
| try_expr
| while_expr
| for_expr
| new_expr
Adjacency (juxtaposition): function application without an explicit operator
symbol. Right-associative, so f g x = f (g x).
adjacent ::= expression expression (* right-associative *)
Two expressions written next to each other, with any amount of whitespace between them (including none), form an adjacent expression, provided the text immediately following the whitespace does not begin with an operator symbol or with a keyword that can only appear as an infix binary operator. This is the same disambiguation rule used in Macaulay2's own Pratt parser, where adjacency is encoded as a SPACE token with left-binding-power 61.
The rule for whether adjacency applies is determined by the first non-whitespace character after the left expression:
| Next character | Adjacent? |
|---|---|
| Letter `[a-zA-Z]` | Yes, unless the full word is a binary-only keyword: `and`, `or`, `xor`, `do`, `list`, `then`, `else`, `of`, `from`, `in`, `to`, `when`, `except` |
| Digit `[0-9]` | Yes |
| `"` | Yes |
| `(` | Yes, unless followed by `*)` (which is the postfix sum-of-twists operator) |
| `[` or `{` | Yes |
| Operator symbol, newline, `;`, closing bracket | No |
Whitespace is optional: f(x) and QQ[x] are adjacent expressions with no
space between the parts, equivalent to f (x) and QQ [x].
Note: bracket and angle-bar juxtaposition ([…], <|…|>) binds at a lower
precedence than paren and brace juxtaposition ((…), {…}) — the former are
installed at precBracket, the latter at precSpace. This is observable:
a*b[c] parses as (a*b)[c], but a*b(c) parses as a*(b c).
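The lookahead table above translates fairly directly into a predicate. The following is a sketch under stated assumptions: `starts_adjacent_operand` is a hypothetical name, only ASCII letters are handled (as in the table), and the function looks only at the text after an already complete left expression.

```python
import re

# Keywords listed in the table as binary-only.
BINARY_ONLY_KEYWORDS = {
    "and", "or", "xor", "do", "list", "then", "else",
    "of", "from", "in", "to", "when", "except",
}
WORD = re.compile(r"[A-Za-z][A-Za-z0-9'$]*")


def starts_adjacent_operand(rest: str) -> bool:
    """Return True if `rest` (text after a left expression) begins an adjacent operand."""
    rest = rest.lstrip(" \t")  # newlines are not skipped: the table maps them to "No"
    if not rest:
        return False
    ch = rest[0]
    if ch.isascii() and ch.isalpha():
        # Compare the full word, so "theory" is not mistaken for "the..." keywords.
        return WORD.match(rest).group() not in BINARY_ONLY_KEYWORDS
    if ch in "0123456789" or ch == '"' or ch in "[{":
        return True
    if ch == "(":
        return not rest.startswith("(*)")  # "(*)" is the postfix sum-of-twists operator
    return False  # operator symbol, newline, ";", closing bracket


if __name__ == "__main__":
    for sample in [" x", "(x)", " (*)", " then y", " theory", " + y", "[x]"]:
        print(repr(sample), starts_adjacent_operand(sample))
```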
High-precedence binary operators (element-access and composition from the Binary Operators table). These bind more tightly than adjacent.
strong_binary ::= expression strong_binary_op expression (* element-access: left *)
(* composition: left *)
strong_binary_op ::= (* element-access *)
"#" | "#?" | "." | ".?" | "^" | "^**"
| "^<" | "^<=" | "^>" | "^>="
| "_" | "_<" | "_<=" | "_>" | "_>=" | "|_"
| (* composition *)
"@@" | "@@?"
All binary operators with lower precedence than adjacent (the rows below adjacent in the Binary Operators table).
binary ::= expression weak_binary_op expression (* see Binary Operators table *)
weak_binary_op ::= (* direct-sum *) "@"
| (* multiplicative *) "%" | "*" | "/" | "//"
| (* quotient *) "\" | "\\"
| (* tensor *) "**" | "⊠" | "⧢"
| (* cdot *) "·"
| (* additive *) "+" | "++" | "-"
| (* range *) ".." | "..<"
| (* intersection *) "&"
| (* exterior-power *) "^^"
| (* union *) "|"
| (* coercion *) ":"
| (* vertical-concatenation *) "||"
| (* comparison *) "!=" | "<" | "<=" | "=!=" | "==" | "===" | ">" | ">=" | "?" | "~"
| (* and *) "and"
| (* xor *) "xor"
| (* or *) "??" | "or"
| (* implication *) "<==" | "==>"
| (* biconditional *) "<==>"
| (* long-implication *) "===>" | "<==="
| (* entailment *) "|-"
| (* output *) "<<"
| (* assignment *) "=" | ":=" | "->" | "=>" | "<-" | ">>"
| "+=" | "-=" | "*=" | "/=" | "//=" | "%="
| "**=" | "++=" | "..=" | "..<=" | "<<=" | ">>="
| "??=" | "@=" | "@@=" | "@@?=" | "\=" | "\\="
| "^=" | "^**=" | "^^=" | "_=" | "&=" | "|-=" | "|="
| "|_=" | "||=" | "<==>=" | "===>=" | "==>="
| "·=" | "⊠=" | "⧢=" | "~="
| (* sequence *) ","
Each operator has its own precedence and associativity as listed in the Binary Operators table. The EBNF above is deliberately ambiguous about those relative precedences; the table governs.
unary ::= unary_prefix_op expression
unary_prefix_op ::= (* count *) "#"
| (* star *) "*"
| (* sign *) "+" | "-"
| (* comparison-test *) "<" | "<=" | ">" | ">=" | "?" | "~"
| (* not *) "not"
| (* null-test *) "??"
| (* left-implication *) "<=="
| (* long-left-implication *) "<==="
| (* deduction *) "|-"
| (* output *) "<<"
| (* control-flow *) "break" | "breakpoint" | "catch" | "continue"
| "elapsedTime" | "elapsedTiming" | "finish" | "profile"
| "return" | "shield" | "step" | "TEST"
| "throw" | "time" | "timing" | "trap"
| (* comma *) ","
postfix ::= expression postfix_op
postfix_op ::= (* shriek *) "!" | "^!" | "_!"
| (* sheaf *) "^*" | "^~" | "_*" | "_~"
| (* sum-of-twists *) "(*)"
token ::= identifier | string | integer | float
parentheses ::= "(" paren_contents? ")"
| "[" paren_contents? "]"
| "{" paren_contents? "}"
| "<|" paren_contents? "|>"
paren_contents ::= expression
| semicolon_sequence
(* ; inside brackets is a sequence separator, not a statement terminator.
The trailing expression after the last ; may be absent, corresponding to
M2's "dummy" token: (foo;) evaluates foo and discards the result. *)
semicolon_sequence ::= expression (";" expression?)+
quote ::= ("symbol" | "global" | "local" | "threadLocal")
identifier
if_expr ::= "if" expression "then" expression ("else" expression)?
try_expr ::= "try" expression
("then" expression)?
( "else" expression
| "except" identifier "do" expression )?
while_expr ::= "while" expression
( "do" expression
| "list" expression ("do" expression)? )
for_expr ::= "for" identifier
( "in" expression
| ("from" expression)? ("to" expression)? )
("when" expression)?
( "do" expression
| "list" expression ("do" expression)? )
new_expr ::= "new" expression
("of" expression)?
("from" expression)?
Adjacent is right-associative: f g x parses as f (g x). This means
g is applied to x first, and the result is passed to f.
Adjacent works with or without whitespace: f x, f(x), and QQ[x]
are all adjacent expressions.
count (#) sits at the adjacent level: Unary # has the same precedence
as adjacent. It is always unary when no left operand precedes it; when preceded
by an expression, the # symbol triggers the higher-precedence
element-access binary form instead.
Dual-use operators: Many symbols appear in both the binary and unary prefix
tables (e.g., +, -, *, <<, ??). The parser disambiguates by position:
the operator is binary when it appears between two expressions, and unary prefix
when it appears at the start of an expression (no left operand present).
x + y is never adjacent: The scanner does not emit SPACE before
operator symbols, so dual-use operators like +, -, *, << in infix
position are always parsed as binary, never as the start of adjacent's RHS.
Strong binary RHS is expression: The right-hand side of element-access
and composition operators is the full expression type. This allows constructs
like x . if y then z and x # not y. Operator precedence prevents
lower-precedence constructs from being incorrectly absorbed: in x # f y,
element-access (higher precedence) reduces before adjacent (lower precedence)
can form, giving adjacent(strong_binary(#, x, f), y).
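The notes above can be checked against a small Pratt-style parser. This is a sketch only, covering identifiers, parentheses, binary `#`, binary `+`, and adjacency; it omits prefix and postfix operators and does not model the full-expression right-hand side of strong binary operators. The binding power 61 for adjacency comes from the text above, while the values for `#` and `+` are hypothetical placeholders that merely respect the table order.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z][A-Za-z0-9'$]*|[#+()])")
ACCESS, ADJACENT, ADDITIVE = 80, 61, 40  # 80 and 40 are hypothetical values
NOT_OPERAND_START = {"#", "+", ")"}


def tokenize(src):
    tokens, pos, src = [], 0, src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(src):
    tokens, pos = tokenize(src), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def primary():
        nonlocal pos
        tok = peek()
        if tok is None or tok in NOT_OPERAND_START:
            raise SyntaxError("operand expected (prefix operators are not modelled)")
        pos += 1
        if tok == "(":
            inner = expr(0)
            if peek() != ")":
                raise SyntaxError("missing ')'")
            pos += 1
            return inner
        return tok

    def expr(min_bp):
        nonlocal pos
        left = primary()
        while True:
            tok = peek()
            if tok == "#" and ACCESS >= min_bp:
                pos += 1
                left = ("strong_binary", "#", left, expr(ACCESS + 1))  # left-assoc
            elif tok == "+" and ADDITIVE >= min_bp:
                pos += 1
                left = ("binary", "+", left, expr(ADDITIVE + 1))       # left-assoc
            elif tok is not None and tok not in NOT_OPERAND_START and ADJACENT >= min_bp:
                # No token is consumed: juxtaposition itself is the operator.
                left = ("adjacent", left, expr(ADJACENT))              # right-assoc
            else:
                return left

    return expr(0)


if __name__ == "__main__":
    print(parse("f g x"))
    print(parse("x # f y"))
    print(parse("f x + y"))
```

In this sketch, `x # f y` comes out as `adjacent(strong_binary(#, x, f), y)` because the right operand of `#` is parsed with a minimum binding power above 61, so adjacency cannot start inside it.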