Skip to content

Avoid stack overflows by computing tree depth and giving up before attempting OCaml translation #73

Open
@mjambon

Description

Typically, long sequences of items such as long array literals or long sequences of statements are found only in generated files. Such files may not be interesting to analyze (e.g. in the case of semgrep) so it would be fine to give up on them. The problem is that at least the translation to the OCaml tree uses stack space that's proportional to the length of the tree (including lists) and it results in segfaults on some platforms while on other platforms it raises a Stack_overflow exception. To avoid such a crash, a solution may be to calculate the depth of the tree returned by the tree-sitter parser and return an error if it exceeds some limit. This assumes the tree-sitter parser itself doesn't crash due to insufficient stack space.

Here's an example of a large generated C++ file whose parsing results in stack overflows: https://github.com/juce-framework/JUCE/blob/d054f0d14dcac387aebda44ce5d792b5e7a625b3/extras/Projucer/JuceLibraryCode/BinaryData.cpp

Tasks:

  1. Check whether we can parse the input file above with just tree-sitter-cpp (e.g. with tree-sitter parse).
  2. If so, add a pass to calculate/estimate the depth of the tree returned by tree-sitter before its translation to OCaml.
  3. Add an option to fail if the tree depth exceeds a limit.

Ideas to avoid complete failure:

  • Truncate excessively deep trees/lists without aborting when possible (e.g. tree-sitter's repeat() and repeat1() constructs).
  • Increase the system's stack size limit or ask the user to do so in a last-gasp error message.

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingenhancementNew feature or requestpriority:mediumto do, not blocking users

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions