Fix 'tree-sitter generate' memory explosion on Hack grammar #365
Open
Description
This is preventing us from merging and using semgrep/ocaml-tree-sitter-core#48.
What we know:
- Processing the tree-sitter-hack grammar after rewriting by ocaml-tree-sitter has always consumed a lot of memory. It now requires over 16 GB, which is more than a reasonable host should have to support.
- ocaml-tree-sitter unhides all the rules by removing the leading underscore from the rule name. Re-hiding all these rules except the entry point leads to high CPU usage instead, and the run times out after 50 min (on @mjambon's old laptop).
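
For anyone who wants to reproduce the re-hiding experiment, here is a minimal sketch of how it could be done on a `grammar.json` file. It is not the code used by ocaml-tree-sitter. The function name `hide_rules` is hypothetical, and the sketch assumes the usual `grammar.json` layout: a `rules` object whose references are `{"type": "SYMBOL", "name": ...}` nodes, plus plain rule names in `word`, `inline`, `supertypes` and `conflicts`.

```python
import json


def hide_rules(grammar, keep):
    """Return a copy of a grammar.json dict in which every rule not in `keep`
    gets a leading underscore, with references updated to match.

    Hypothetical helper; assumes the standard grammar.json layout.
    """
    rules = grammar["rules"]
    rename = {
        name: "_" + name
        for name in rules
        if name not in keep and not name.startswith("_")
    }
    for old, new in rename.items():
        if new in rules:
            raise ValueError(f"cannot hide {old!r}: {new!r} already exists")

    def walk(node):
        # Rebuild rule bodies, renaming SYMBOL references along the way.
        if isinstance(node, dict):
            out = {key: walk(value) for key, value in node.items()}
            if out.get("type") == "SYMBOL":
                out["name"] = rename.get(out["name"], out["name"])
            return out
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    def rename_names(value):
        # Fields that hold bare rule names (possibly in nested lists).
        if isinstance(value, str):
            return rename.get(value, value)
        if isinstance(value, list):
            return [rename_names(item) for item in value]
        return value

    name_fields = {"word", "inline", "supertypes", "conflicts"}
    result = {}
    for key, value in grammar.items():
        if key == "rules":
            result[key] = {rename.get(n, n): walk(body) for n, body in value.items()}
        elif key in name_fields:
            result[key] = rename_names(value)
        else:
            result[key] = walk(value)
    return result


if __name__ == "__main__":
    demo = {
        "name": "demo",
        "rules": {
            "program": {"type": "REPEAT", "content": {"type": "SYMBOL", "name": "statement"}},
            "statement": {"type": "STRING", "value": "x"},
        },
    }
    # Keep only the entry point (the first rule) visible.
    print(json.dumps(hide_rules(demo, keep={"program"}), indent=2))
```

The entry point is passed explicitly in `keep` rather than guessed, since the sketch should not rename the start rule by accident.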
We need to investigate tree-sitter, which is a Rust program. The first step would be to come up with a minimal test case and file a bug with the tree-sitter project. Right now, we know that the modified Hack grammar is problematic, but other grammars of similar size don't show excessive memory or CPU consumption.
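
While shrinking the grammar toward a minimal test case, it may help to record peak memory and run time for each candidate so they can be compared against the other grammars. The following is a rough sketch for Unix hosts only. The helper name `measure` is hypothetical, and the command to run is whatever invocation of `tree-sitter generate` is already in use.

```python
import resource
import subprocess
import sys
import time


def measure(cmd, cwd=None, timeout_s=None):
    """Run `cmd` and report its exit code, wall time and peak resident memory.

    Hypothetical helper. Unix only, since it relies on the `resource` module.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            cwd=cwd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        returncode = proc.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the child before raising.
        returncode = None
    elapsed = time.monotonic() - start

    # ru_maxrss is the high-water mark over all children waited for so far,
    # so use one Python process per measurement to keep numbers comparable.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    if sys.platform != "darwin":
        peak *= 1024  # Linux reports kilobytes, macOS reports bytes
    return {"returncode": returncode, "seconds": elapsed, "peak_rss_bytes": peak}


if __name__ == "__main__":
    # Example: python measure.py tree-sitter generate
    print(measure(sys.argv[1:], timeout_s=50 * 60))
```

A `returncode` of `None` means the run hit the timeout, which distinguishes the CPU-bound failure from the memory-bound one.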