You'll need just, bun, and a C compiler.
Currently tree-sitter requires a JS runtime (Node, Deno, or Bun) to create grammar.json from grammar.js, the former of which is used to auto-generate the C parser. Given this requirement I've chosen to install the tree-sitter CLI using said JS runtime (Bun specifically) as a small win on the things-to-install burden.
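To make the grammar.js to grammar.json step concrete, here is a minimal sketch of why a JS runtime is involved: the grammar is ordinary JavaScript that, once executed, yields plain data. The helpers below (grammar, seq, repeat, normalize) are simplified stand-ins written for this example, not tree-sitter's real DSL implementation, and the toy rules are hypothetical rather than taken from this repository's grammar.js.

```javascript
// Simplified stand-ins for the grammar DSL (illustrative only).
// Strings become STRING nodes, regexes become PATTERN nodes.
function normalize(rule) {
  if (typeof rule === "string") return { type: "STRING", value: rule };
  if (rule instanceof RegExp) return { type: "PATTERN", value: rule.source };
  return rule;
}
const seq = (...members) => ({ type: "SEQ", members: members.map(normalize) });
const repeat = (content) => ({ type: "REPEAT", content: normalize(content) });

// `$` hands out references to other rules by name.
const $ = new Proxy({}, { get: (_, name) => ({ type: "SYMBOL", name }) });

// Evaluate every rule function and collect the results as plain data.
function grammar({ name, rules }) {
  const out = { name, rules: {} };
  for (const [ruleName, build] of Object.entries(rules)) {
    out.rules[ruleName] = normalize(build($));
  }
  return out;
}

// Hypothetical toy grammar; rule names are made up for this sketch.
const toy = grammar({
  name: "toy",
  rules: {
    source_file: ($) => repeat($.let_statement),
    let_statement: ($) => seq("let", $.identifier, ";"),
    identifier: () => /[a-z_]+/,
  },
});

// The JSON text is what a parser generator could consume without any JS.
console.log(JSON.stringify(toy, null, 2));
```

The point of the sketch is only that the rule functions have to be run by something that speaks JavaScript before the C parser can be generated from the resulting JSON.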
If you'd prefer Node, Deno, or the standalone tree-sitter CLI, modify the justfile at the root of this repository accordingly. I'd recommend using Bun until upstream tree-sitter improves its developer experience, since Bun is trivial to install, installs packages incredibly quickly, and nothing crazy is going on in grammar.js (such as Node-specific APIs that Bun doesn't implement).
Execute just --list to view the available commands. In summary:
just bootstrap
just generate
just compile
just test
The tree-sitter CLI will automatically re-run the generate and compile steps itself when using commands like test if it detects the grammar has changed since the last run; this has nothing to do with just.
No bindings (as generated by the tree-sitter CLI) are currently 'supported', meaning I am not managing them, because tree-sitter currently litters the repository with files all over the place, and tracking what is a true source file and what is a generated binding file that will be overwritten is... not fun at all. You can trivially generate bindings by running what the just generate recipe runs, simply with the --no-bindings flag removed.
When upstream tree-sitter improves its littering behaviour (something they are already planning for), I will manage bindings in-repo.
Currently the tree-sitter CLI creates a lot of autogenerated files at the repo root, and options to configure another directory are not properly respected. Until that is fixed upstream, beware that only the following files are sources of truth, with all else autogenerated (and overwritten):
TODO: Put other oracle files here.
grammar.js
TODO: All possible (valid) noir syntax combinations with parse tree assertions.
TODO: Parse all of noir_stdlib in repo noir-lang/noir, but without asserting the structure of the parse tree, only that it does not error. Otherwise I would have to add a manual parse tree for every single line of code in noir_stdlib. Maybe the stuff in test_programs of that repo too?
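A rough sketch of how that "parse everything, only assert no errors" check could look. All function names here are hypothetical, and the default check assumes that tree-sitter parse --quiet exits non-zero when a file's tree contains errors; verify that against the installed CLI version before relying on it. The parse check is injectable so the file walking can be exercised without the CLI.

```javascript
const fs = require("node:fs");
const path = require("node:path");
const { spawnSync } = require("node:child_process");

// Recursively collect every .nr file under `dir`, sorted for stable output.
function collectNoirFiles(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) found.push(...collectNoirFiles(full));
    else if (entry.isFile() && entry.name.endsWith(".nr")) found.push(full);
  }
  return found.sort();
}

// Assumption: the CLI signals parse errors through a non-zero exit status.
function parsesCleanly(file) {
  const result = spawnSync("tree-sitter", ["parse", "--quiet", file]);
  return result.status === 0;
}

// Return the files that failed; the shape of the parse tree is never inspected.
function findParseFailures(dir, parses = parsesCleanly) {
  return collectNoirFiles(dir).filter((file) => !parses(file));
}
```

Pointed at a local checkout (for example findParseFailures("path/to/noir_stdlib"), where the path is a placeholder), an empty result would mean every file parsed without error.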
TODO: Track issues on repo noir-lang/noir which will change language syntax, collab with Aztec?
^^ examples include lowercase keyword for field instead of current Field: noir-lang/noir#3631
^^ new unsafe keyword?: noir-lang/noir#4429
Existing vscode-noir textmate syntax issues:
fmtstr broken: noir-lang/vscode-noir#50
textmate grammar: https://github.com/noir-lang/vscode-noir/blob/master/syntaxes/noir.tmLanguage.json
TODO: When it's time for this, mimic the canonical formatting as described in tooling/nargo_fmt/tests of repo noir-lang/noir; maybe some collab with Noir people too.
--- noirc compiler internals:
ASTs at: https://github.com/noir-lang/noir/tree/master/compiler/noirc_frontend/src/ast
Lexer token definitions: https://github.com/noir-lang/noir/blob/master/compiler/noirc_frontend/src/lexer/token.rs
Lexer-proper: https://github.com/noir-lang/noir/blob/master/compiler/noirc_frontend/src/lexer/lexer.rs
Type check errors at: https://github.com/noir-lang/noir/blob/master/compiler/noirc_frontend/src/hir/type_check/errors.rs
Interesting noirc_frontend debug stuff: https://github.com/noir-lang/noir/blob/c2eab6f4eb1437c16a0bad8cfca4634991df31c7/compiler/noirc_frontend/src/debug/mod.rs#L200-L211
Mentions cancer syntax like: let (((a,b,c),D { d }),e,f) = x; being translated automatically, so we can write such cancer let bindings? So this parser needs to account for that too?
--- Scratch notes:
- Zed for some general highlighting grammars, queries etc (for other projects).
- That guy that maintains 250 neovim treesitter languages or whatever, any stuff there?
- Zed blog has some nice posts about TS.
- The tree-sitter CLI creates A LOT of binding-file spam at the root of the directory and currently does not respect paths to place these elsewhere. Very annoying indeed. As and when that's fixed (upstream) clean up repo root and what not. Relevant issues for this:
--- Random down-the-rabbit-hole ideas:
- Use rust's tree-sitter parser to parse noir's lexer tokens file and watch for changes in keywords as a form of weak automation concerning keyword changes and what not? Not fool-proof but one component along the line of keeping things in-sync.
- Generate textmate grammar from tree-sitter grammar?
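The keyword-watching idea above could be prototyped along these lines. Note the swap: this sketch uses a plain regex instead of a tree-sitter Rust parser, purely to keep it short. It assumes the token file contains lines shaped like Keyword::As => write!(f, "as"); the sample text and the "known" keyword set below are made up for illustration and should be checked against the real token.rs and grammar.

```javascript
// Pull keyword spellings out of Rust source text.
// Assumption: keywords appear as `Keyword::Name => write!(f, "spelling")`.
function extractKeywords(tokenSource) {
  const pattern = /Keyword::(\w+)\s*=>\s*write!\(f,\s*"([^"]+)"\)/g;
  const keywords = new Set();
  for (const match of tokenSource.matchAll(pattern)) keywords.add(match[2]);
  return keywords;
}

// Compare what the grammar knows about with what the lexer currently defines.
function diffKeywords(known, current) {
  return {
    added: [...current].filter((k) => !known.has(k)).sort(),
    removed: [...known].filter((k) => !current.has(k)).sort(),
  };
}

// Hypothetical sample input, not copied from the real file.
const sampleTokenSource = `
    Keyword::As => write!(f, "as"),
    Keyword::Field => write!(f, "Field"),
    Keyword::Unsafe => write!(f, "unsafe"),
`;

// Hypothetical set of keywords the grammar currently handles.
const knownToGrammar = new Set(["as", "Field", "let"]);

const report = diffKeywords(knownToGrammar, extractKeywords(sampleTokenSource));
console.log(report); // { added: [ 'unsafe' ], removed: [ 'let' ] }
```

As the note itself says, this is weak automation: a non-empty report only flags that something moved, and a human still has to decide what the grammar should do about it.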
--- ts ideas put elsewhere later
TODO:
- Shouldn't need Node (or Bun or Deno): just embed QuickJS and have it execute grammar.js to produce the canonical grammar.json that the tree-sitter generate CLI needs to create the parser. Contribute that upstream to tree-sitter later.