
Fix 'tree-sitter generate' memory explosion on Hack grammar #365

Open
@mjambon

Description

This is preventing us from merging and using semgrep/ocaml-tree-sitter-core#48.

What we know:

  • Running 'tree-sitter generate' on the tree-sitter-hack grammar, after ocaml-tree-sitter has rewritten it, has always consumed a lot of memory. It now requires over 16 GB, which is more than we can reasonably expect a host to provide.
  • ocaml-tree-sitter unhides all the rules by removing the leading underscore from each rule name. Re-hiding all of these rules except the entry point leads to high CPU usage, and the run times out after 50 min (on @mjambon's old laptop).
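To make the two rewrites above concrete, here is a rough sketch of what unhiding and re-hiding look like on a `grammar.json`-style dictionary. This is not ocaml-tree-sitter's actual implementation (which is written in OCaml); the helper names `rename_rules`, `unhide_all`, and `hide_all_but_entry` are made up for illustration, and the sketch only touches the `rules` table, so fields such as `inline`, `conflicts`, or `externals` would need the same treatment in a real grammar.

```python
def rename_symbols(node, rename):
    """Return a copy of a rule body with every SYMBOL reference renamed."""
    if isinstance(node, dict):
        out = {key: rename_symbols(value, rename) for key, value in node.items()}
        if out.get("type") == "SYMBOL":
            out["name"] = rename(out["name"])
        return out
    if isinstance(node, list):
        return [rename_symbols(item, rename) for item in node]
    return node


def rename_rules(grammar, rename):
    """Rename rule definitions and the references to them (hypothetical helper)."""
    rules = {}
    for name, body in grammar["rules"].items():
        new_name = rename(name)
        if new_name in rules:
            raise ValueError(f"renaming {name!r} collides with {new_name!r}")
        rules[new_name] = rename_symbols(body, rename)
    return {**grammar, "rules": rules}


def unhide_all(grammar):
    """Drop one leading underscore so that hidden rules become visible."""
    return rename_rules(grammar, lambda n: n[1:] if n.startswith("_") else n)


def hide_all_but_entry(grammar):
    """Prefix every rule except the first one (the start rule) with '_'."""
    entry = next(iter(grammar["rules"]))
    return rename_rules(
        grammar,
        lambda n: n if n == entry or n.startswith("_") else "_" + n,
    )


# Toy grammar, invented for this example.
toy = {
    "name": "toy",
    "rules": {
        "program": {"type": "REPEAT", "content": {"type": "SYMBOL", "name": "_statement"}},
        "_statement": {"type": "SYMBOL", "name": "expression"},
        "expression": {"type": "PATTERN", "value": "\\d+"},
    },
}

unhidden = unhide_all(toy)
rehidden = hide_all_but_entry(unhidden)
print(list(unhidden["rules"]))
print(list(rehidden["rules"]))
```

The collision check matters because a grammar may define both `_foo` and `foo`; stripping the underscore blindly would silently merge them.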

We need to investigate tree-sitter, which is a Rust program. The first step would be to come up with a minimal test case and file a bug with the tree-sitter project. Right now, we know that the modified Hack grammar is problematic but other grammars of similar size don't show excessive memory or CPU consumption.
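While shrinking the grammar toward a minimal test case, it helps to classify each candidate quickly instead of letting it exhaust the machine. The following is a minimal sketch, assuming a Unix host where `RLIMIT_AS` is enforced (Linux; the limit may not be honored on macOS). The function name `run_capped` and the chosen limits are illustrative, not part of any existing tooling.

```python
import resource
import subprocess


def run_capped(cmd, max_bytes, timeout_s):
    """Run cmd with an address-space cap and a wall-clock timeout.

    Returns "ok", "failed" (non-zero exit, e.g. allocation failure),
    or "timeout".
    """

    def limit_memory():
        # Runs in the child just before exec; caps its virtual memory.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    try:
        proc = subprocess.run(
            cmd,
            preexec_fn=limit_memory,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if proc.returncode == 0 else "failed"
```

For example, calling `run_capped(["tree-sitter", "generate"], 4 * 2**30, 600)` from the directory containing a candidate grammar would report whether generation succeeds within roughly 4 GiB and 10 minutes. A candidate that still yields "failed" or "timeout" after rules have been removed remains a reproducer and can be reduced further.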


Labels

  • blocking (A task in some other project depends on this)
  • bug (Something isn't working)
  • help wanted (Extra attention is needed)
  • priority:high (blocks a user)
  • r2c user (not originally reported by an external user)
