Macaulay2 language grammar

Doug Torrance edited this page Jun 25, 2026 · 1 revision

Note: This grammar was AI-generated by Claude Opus 4.8. 🤖

Macaulay2 Expression Grammar

EBNF description of the Macaulay2 expression grammar. This matches the semantics of Macaulay2's internal Pratt parser and can serve as a reference for writing any Macaulay2 parser.

Lexical Rules

identifier ::= alpha (alpha | [0-9] | "'" | "$")*
             (* alpha = ASCII letter, or any non-ASCII Unicode letter
                that is not an operator symbol *)

integer    ::= [0-9]+
             | ("0b" | "0B") [01]+
             | ("0o" | "0O") [0-7]+
             | ("0x" | "0X") [0-9a-fA-F]+

float      ::= ( [0-9]+ ("." [0-9]*)? | "." [0-9]+ )
               ("p" [0-9]+)? ([eE] [+-]? [0-9]+)?

string     ::= '"' ([^"\\] | "\\" .)* '"'
             | "///" .*? "///"    (* non-greedy *)

comment    ::= "--" [^\n]*
             | "-*" .*? "*-"      (* non-greedy *)
             | "#!" [^\n]*        (* only on line 1, column 0 *)

A token matches float only when it contains a fractional part, a p precision field, or an exponent; a bare [0-9]+ is an integer. (The two productions above overlap on bare digit sequences; the presence of ., p, or [eE] is what distinguishes a float.)
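
The integer/float split can be checked mechanically. The Python sketch below transcribes the two productions into regular expressions and tries the integer pattern first, so a bare digit run never reaches the float pattern. The names INTEGER, FLOAT, and classify_number are made up for this illustration. The function classifies one complete candidate token rather than scanning a stream, so it says nothing about how a real lexer splits surrounding input.

```python
import re

# Transcription of the `integer` production.
INTEGER = re.compile(r"0[bB][01]+|0[oO][0-7]+|0[xX][0-9a-fA-F]+|[0-9]+")

# Transcription of the `float` production: mantissa, optional p-precision
# field, optional exponent.
FLOAT = re.compile(
    r"(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:p[0-9]+)?(?:[eE][+-]?[0-9]+)?"
)

def classify_number(text):
    """Return 'integer', 'float', or None for a complete candidate token."""
    # Integer is tested first: a bare [0-9]+ matches both patterns, and the
    # rule in the text resolves that overlap in favor of integer.
    if INTEGER.fullmatch(text):
        return "integer"
    if FLOAT.fullmatch(text):
        return "float"
    return None
```

With these patterns, "42" and "0x1F" classify as integer, "1.5", ".5", "1p53", and "2e-3" classify as float, and "0b102" is rejected.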

Operator Tables

Each table is ordered from highest to lowest precedence (top to bottom). Operators on the same row share the same precedence level. The name in the first column is used as a reference label in the grammar rules below.

Binary Operators

Name                    Assoc  Operators
element-access          left   # #? . .? ^ ^** ^< ^<= ^> ^>= _ _< _<= _> _>= |_
composition             left   @@ @@?
adjacent                right  (no symbol; see the Adjacent section)
direct-sum              right  @
multiplicative          left   % * / //
quotient                right  \ \\
tensor                  left   ** ⊠ ⧢
cdot                    left   ·
additive                left   + ++ -
range                   left   .. ..<
intersection            left   &
exterior-power          left   ^^
union                   left   |
coercion                right  :
vertical-concatenation  left   ||
comparison              right  != < <= =!= == === > >= ? ~
and                     right  and
xor                     right  xor
or                      right  ?? or
implication             right  <== ==>
biconditional           right  <==>
long-implication        right  <=== ===>
entailment              right  |-
output                  left   <<
assignment              right  = := -> => <- >> += -= *= /= //= %= **= ++= ..= ..<= <<= >>= ??= @= @@= @@?= \= \\= ^= ^**= ^^= _= &= |-= |= |_= ||= ~= <==>= ===>= ==>= ·= ⊠= ⧢=
sequence                left   ,

The adjacent row marks the boundary between strong operators (above, prec > adjacent) and weak operators (below, prec < adjacent).

Note: in the interpreter, multiplicative and quotient are actually a single precedence level with mixed associativity (% * / // left, \ \\ right). Splitting them into the two rows above is observationally equivalent — the right-associative operators bind their right operand one level lower — so a parser may use either representation.
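
One way to turn such a table into numbers that a Pratt-style loop can compare is sketched below in Python. It covers only a subset of the rows, and the numbering scheme (even levels, with a left-associative operator parsing its right operand at level + 1) is a common convention chosen for this illustration, not the interpreter's actual values. ROWS, binding_powers, BP, is_strong, and the stand-in token name "SPACE" for adjacency are all names assumed here.

```python
# Subset of the Binary Operators table, highest precedence first.
# "SPACE" is a stand-in name for adjacency, which has no symbol.
ROWS = [
    ("element-access", "left",  ["#", ".", "^", "_"]),
    ("composition",    "left",  ["@@"]),
    ("adjacent",       "right", ["SPACE"]),
    ("direct-sum",     "right", ["@"]),
    ("multiplicative", "left",  ["%", "*", "/", "//"]),
    ("quotient",       "right", ["\\", "\\\\"]),
    ("additive",       "left",  ["+", "++", "-"]),
    ("assignment",     "right", ["=", ":=", "->"]),
    ("sequence",       "left",  [","]),
]

def binding_powers(rows):
    """Map each operator to (left_bp, right_bp); a larger number binds tighter."""
    table = {}
    for depth, (_name, assoc, ops) in enumerate(rows):
        left_bp = 2 * (len(rows) - depth)   # even values, top row largest
        # A left-associative operator parses its right operand one step
        # tighter, so an equal-precedence operator to its right is not absorbed.
        right_bp = left_bp + 1 if assoc == "left" else left_bp
        for op in ops:
            table[op] = (left_bp, right_bp)
    return table

BP = binding_powers(ROWS)

def is_strong(op):
    """True when op sits above the adjacent row of the table."""
    return BP[op][0] > BP["SPACE"][0]
```

Under this numbering, is_strong("#") is true and is_strong("*") is false, matching the strong/weak boundary described above.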

Postfix Operators

All postfix operators have higher precedence than adjacent. Rows are ordered from highest to lowest precedence.

Name           Operators
shriek         ! ^! _!
sheaf          ^* ^~ _* _~
sum-of-twists  (*)

Unary Prefix Operators

Rows are ordered from highest to lowest precedence. Operators marked with (also binary) appear in both this table and the binary table; context (whether a left operand is present) determines which role applies.

Name                   Operators
count                  #  (also binary: element-access)
star                   *  (also binary: multiplicative)
sign                   + -  (also binary: additive)
comparison-test        < <= > >= ? ~  (also binary: comparison)
not                    not
null-test              ??  (also binary: or)
left-implication       <==  (also binary: implication)
long-left-implication  <===  (also binary: long-implication)
deduction              |-  (also binary: entailment)
output                 <<  (also binary: output)
control-flow           break breakpoint catch continue elapsedTime elapsedTiming finish profile return shield step TEST throw time timing trap
comma                  ,  (also binary: sequence)

Note: count (#) sits at the same precedence level as adjacent. It is unary prefix when no left operand is present; binary element-access (at a higher level) when preceded by an expression.

Grammar

Top Level

source_file ::= statement* expression?

statement   ::= expression (newline+ | ";")

expression

The general expression type — a flat union of all forms. The adjacent rule is what distinguishes M2's grammar from most other languages: function application is written by juxtaposition, and it sits at a middle precedence level (between the high-precedence element-access/composition operators and the low-precedence arithmetic/logical operators). Macaulay2's own parser handles this with a Pratt parser; see the Adjacent section below for the disambiguation rule.

expression ::= token
             | adjacent
             | strong_binary
             | binary
             | unary
             | postfix
             | parentheses
             | if_expr
             | quote
             | try_expr
             | while_expr
             | for_expr
             | new_expr

Adjacent

Adjacency (juxtaposition): function application without an explicit operator symbol. Right-associative, so f g x = f (g x).

adjacent ::= expression expression   (* right-associative *)

Two expressions written next to each other — with any amount of whitespace between them, including none — form an adjacent expression, provided the character immediately following the whitespace is not an operator symbol or a keyword that can only appear as an infix binary operator. This is the same disambiguation rule used in Macaulay2's own Pratt parser, where adjacency is encoded as a SPACE token with left-binding-power 61.

The rule for whether adjacency applies is determined by the first non-whitespace character after the left expression:

Next character                                Adjacent?
Letter [a-zA-Z]                               Yes, unless the full word is a binary-only keyword: and, or, xor, do, list, then, else, of, from, in, to, when, except
Digit [0-9]                                   Yes
"                                             Yes
(                                             Yes, unless followed by *) (which is the postfix sum-of-twists operator)
[ or {                                        Yes
Operator symbol, newline, ;, closing bracket  No

Whitespace is optional: f(x) and QQ[x] are adjacent expressions with no space between the parts, equivalent to f (x) and QQ [x].
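
The decision table can be written as a small predicate. The Python sketch below is a direct reading of the rows above and nothing more: it treats only spaces and tabs as skippable whitespace (since newline is in the "No" row), follows the table's [a-zA-Z] letter class, and answers "No" for anything the table does not list. The names BINARY_ONLY_KEYWORDS and starts_adjacent are assumptions of this illustration.

```python
# Keywords that, per the table, never begin the right operand of adjacency.
BINARY_ONLY_KEYWORDS = {
    "and", "or", "xor", "do", "list", "then", "else",
    "of", "from", "in", "to", "when", "except",
}

def starts_adjacent(rest):
    """Decide whether `rest`, the input following the left expression,
    begins the right operand of an adjacent expression."""
    rest = rest.lstrip(" \t")   # newline is deliberately not skipped
    if not rest:
        return False
    ch = rest[0]
    if ch.isascii() and ch.isalpha():
        # Read the whole word using the identifier character set.
        end = 1
        while end < len(rest) and (rest[end].isalnum() or rest[end] in "'$"):
            end += 1
        return rest[:end] not in BINARY_ONLY_KEYWORDS
    if "0" <= ch <= "9" or ch == '"':
        return True
    if ch == "(":
        return not rest.startswith("(*)")   # (*) is the postfix sum-of-twists
    if ch in "[{":
        return True
    return False   # operator symbol, newline, ';', or closing bracket
```

For example, starts_adjacent("(x)") and starts_adjacent(" thenx") are true, while starts_adjacent(" then y"), starts_adjacent("(*)"), and starts_adjacent(" + y") are false.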

Note: bracket and angle-bar juxtaposition ([…], <|…|>) binds at a lower precedence than paren and brace juxtaposition ((…), {…}) — the former are installed at precBracket, the latter at precSpace. This is observable: a*b[c] parses as (a*b)[c], but a*b(c) parses as a*(b c).


Strong Binary

High-precedence binary operators (element-access and composition from the Binary Operators table). These bind more tightly than adjacent.

strong_binary ::= expression strong_binary_op expression   (* element-access: left *)
                                                           (* composition:    left *)

strong_binary_op ::= (* element-access *)
                     "#" | "#?" | "." | ".?" | "^" | "^**"
                   | "^<" | "^<=" | "^>" | "^>="
                   | "_" | "_<" | "_<=" | "_>" | "_>=" | "|_"
                   | (* composition *)
                     "@@" | "@@?"

Binary

All binary operators with lower precedence than adjacent (the rows below adjacent in the Binary Operators table).

binary ::= expression weak_binary_op expression   (* see Binary Operators table *)

weak_binary_op ::= (* direct-sum *)             "@"
                 | (* multiplicative *)         "%" | "*" | "/" | "//"
                 | (* quotient *)               "\" | "\\"
                 | (* tensor *)                 "**" | "⊠" | "⧢"
                 | (* cdot *)                   "·"
                 | (* additive *)               "+" | "++" | "-"
                 | (* range *)                  ".." | "..<"
                 | (* intersection *)           "&"
                 | (* exterior-power *)         "^^"
                 | (* union *)                  "|"
                 | (* coercion *)               ":"
                 | (* vertical-concatenation *) "||"
                 | (* comparison *)             "!=" | "<" | "<=" | "=!=" | "==" | "===" | ">" | ">=" | "?" | "~"
                 | (* and *)                    "and"
                 | (* xor *)                    "xor"
                 | (* or *)                     "??" | "or"
                 | (* implication *)            "<==" | "==>"
                 | (* biconditional *)          "<==>"
                 | (* long-implication *)       "===>" | "<==="
                 | (* entailment *)             "|-"
                 | (* output *)                 "<<"
                 | (* assignment *)             "=" | ":=" | "->" | "=>" | "<-" | ">>"
                                              | "+=" | "-=" | "*=" | "/=" | "//=" | "%="
                                              | "**=" | "++=" | "..=" | "..<=" | "<<=" | ">>="
                                              | "??=" | "@=" | "@@=" | "@@?=" | "\=" | "\\="
                                              | "^=" | "^**=" | "^^=" | "_=" | "&=" | "|-=" | "|="
                                              | "|_=" | "||=" | "<==>=" | "===>=" | "==>="
                                              | "·=" | "⊠=" | "⧢=" | "~="
                 | (* sequence *)               ","

Each operator has its own precedence and associativity as listed in the Binary Operators table. The EBNF above is deliberately ambiguous about those relative precedences; the table governs.


Unary

unary ::= unary_prefix_op expression

unary_prefix_op ::= (* count *)                "#"
                  | (* star *)                 "*"
                  | (* sign *)                 "+" | "-"
                  | (* comparison-test *)      "<" | "<=" | ">" | ">=" | "?" | "~"
                  | (* not *)                  "not"
                  | (* null-test *)            "??"
                  | (* left-implication *)     "<=="
                  | (* long-left-implication *) "<==="
                  | (* deduction *)            "|-"
                  | (* output *)               "<<"
                  | (* control-flow *)         "break" | "breakpoint" | "catch" | "continue"
                                             | "elapsedTime" | "elapsedTiming" | "finish" | "profile"
                                             | "return" | "shield" | "step" | "TEST"
                                             | "throw" | "time" | "timing" | "trap"
                  | (* comma *)                ","

Postfix

postfix ::= expression postfix_op

postfix_op ::= (* shriek *)        "!" | "^!" | "_!"
             | (* sheaf *)         "^*" | "^~" | "_*" | "_~"
             | (* sum-of-twists *) "(*)"

Atoms

token ::= identifier | string | integer | float

parentheses ::= "(" paren_contents? ")"
              | "[" paren_contents? "]"
              | "{" paren_contents? "}"
              | "<|" paren_contents? "|>"

paren_contents ::= expression
                 | semicolon_sequence

(* ; inside brackets is a sequence separator, not a statement terminator.
   The trailing expression after the last ; may be absent, corresponding to
   M2's "dummy" token: (foo;) evaluates foo and discards the result. *)
semicolon_sequence ::= expression (";" expression?)+

quote ::= ("symbol" | "global" | "local" | "threadLocal")
          identifier

Keyword Expressions

if_expr ::= "if" expression "then" expression ("else" expression)?

try_expr ::= "try" expression
             ("then" expression)?
             ( "else" expression
             | "except" identifier "do" expression )?

while_expr ::= "while" expression
               ( "do" expression
               | "list" expression ("do" expression)? )

for_expr ::= "for" identifier
             ( "in" expression
             | ("from" expression)? ("to" expression)? )
             ("when" expression)?
             ( "do" expression
             | "list" expression ("do" expression)? )

new_expr ::= "new" expression
             ("of" expression)?
             ("from" expression)?
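
The clause structure of for_expr (either "in", or optional "from" and "to", then optional "when", then "do" or "list" with an optional trailing "do") can be read off as a short recursive-descent routine. The Python sketch below makes one strong simplifying assumption: every sub-expression is exactly one token. The name parse_for and its helpers are hypothetical.

```python
def parse_for(tokens):
    """Read the clauses of a for_expr from a list of token strings.

    Assumption for brevity: every sub-expression is exactly one token.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {tok!r}")
        pos += 1
        return tok

    def clause(keyword, out):
        # Consume `keyword expression` and record the expression.
        take(keyword)
        out[keyword] = take()

    take("for")
    clauses = {"var": take()}
    if peek() == "in":
        clause("in", clauses)
    else:
        for keyword in ("from", "to"):   # each optional, in this order
            if peek() == keyword:
                clause(keyword, clauses)
    if peek() == "when":
        clause("when", clauses)
    if peek() == "do":
        clause("do", clauses)
    else:
        clause("list", clauses)          # "list" is required when "do" is absent
        if peek() == "do":
            clause("do", clauses)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return clauses
```

Calling parse_for("for i from 1 to 5 list x".split()) returns {"var": "i", "from": "1", "to": "5", "list": "x"}, while mixing "in" with "to" raises SyntaxError because the grammar offers them as alternatives.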

Notes

Adjacent is right-associative: f g x parses as f (g x). This means g is applied to x first, and the result is passed to f.

Adjacent works with or without whitespace: f x, f(x), and QQ[x] are all adjacent expressions.

count (#) sits at the adjacent level: Unary # has the same precedence as adjacent. It is always unary when no left operand precedes it; when preceded by an expression, the # symbol triggers the higher-precedence element-access binary form instead.

Dual-use operators: Many symbols appear in both the binary and unary prefix tables (e.g., +, -, *, <<, ??). The parser disambiguates by position: the operator is binary when it appears between two expressions, and unary prefix when it appears at the start of an expression (no left operand present).

x + y is never adjacent: The scanner does not emit SPACE before operator symbols, so dual-use operators like +, -, *, << in infix position are always parsed as binary, never as the start of adjacent's RHS.

Strong binary RHS is expression: The right-hand side of element-access and composition operators is the full expression type. This allows constructs like x . if y then z and x # not y. Operator precedence prevents lower-precedence constructs from being incorrectly absorbed: in x # f y, element-access (higher precedence) reduces before adjacent (lower precedence) can form, giving adjacent(strong_binary(#, x, f), y).
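
The notes above can be exercised with a very small Pratt-style parser. The Python sketch below works on a pre-split list of token strings and handles only a handful of operators. Its binding-power numbers are hypothetical (only their relative order follows the Binary Operators table), and the operand binding powers chosen for prefix - and # are assumptions of this sketch. It is an illustration of the technique, not a transcription of Macaulay2's parser.

```python
# Hypothetical (left, right) binding powers; a larger number binds tighter.
# Left-associative operators use right = left + 1.
BINARY = {
    "#": (18, 19), ".": (18, 19),   # element-access (strong)
    "*": (10, 11), "/": (10, 11),   # multiplicative
    "+": (6, 7), "-": (6, 7),       # additive
    "=": (4, 4),                    # assignment, right-associative
}
ADJACENT = (14, 14)                 # juxtaposition, right-associative
PREFIX = {"-": 7, "#": 14}          # assumed operand binding powers

def parse(tokens):
    """Parse a list of token strings into nested tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def prefix():
        tok = advance()
        if tok == "(":
            inner = expr(0)
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return inner
        if tok in PREFIX:           # no left operand present: unary role
            return ("unary", tok, expr(PREFIX[tok]))
        if tok in BINARY or tok == ")":
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok                  # identifier or number

    def expr(min_bp):
        left = prefix()
        while True:
            tok = peek()
            if tok in BINARY:       # left operand present: binary role
                lbp, rbp = BINARY[tok]
                if lbp < min_bp:
                    break
                advance()
                left = ("binary", tok, left, expr(rbp))
            elif tok is not None and tok != ")":
                # Any token that is not an infix operator starts adjacency.
                lbp, rbp = ADJACENT
                if lbp < min_bp:
                    break
                left = ("adjacent", left, expr(rbp))
            else:
                break
        return left

    tree = expr(0)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

In this sketch, parse("f g x".split()) yields ("adjacent", "f", ("adjacent", "g", "x")), and parse("x # f y".split()) yields ("adjacent", ("binary", "#", "x", "f"), "y"), the same shapes the notes describe. Because an infix-position token is looked up in BINARY before adjacency is considered, a - b is always binary, while a leading - or # with no left operand takes the unary branch.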
