Parsing Performance & Maintainability. #3369

Open
@andrevidela

Long-term users of the Idris programming language will know that one of the weaknesses of the compiler is
its most common point of interaction: the parser. Because the parser is quite a complex piece of software, there
is no single element that we can point to and fix to resolve all the symptoms we experience because of it.

The role of this issue is to keep track of problems related to the parser, and of the different efforts made to solve
them.

This issue will be closed once the current state of the parser has improved in at least two respects: performance
and maintainability. First, here is a non-exhaustive list of symptoms we experience and would like to address.

Existing and known problems

  1. Performance is subpar
  2. Inconsistent error recovery messages
  3. Unexpected whitespace rules
  4. Code architecture regarding location is brittle
  5. Code architecture around desugaring and resugaring is brittle
  6. No tree-sitter support
  7. Code synthesis generates unparsable code

Next steps

Here is a list of smaller steps that do not necessarily involve code we can submit as a PR to the
project, but that we still need to take in order to make progress toward better identifying and scoping solutions.

  • Measure performance in multiple scenarios: Large files, interactive modes, visually ambiguous code, etc.
  • Investigate options around tree-sitter. Are there ways to generate Idris-compatible code from tree-sitter? Is there a way to generate a tree-sitter spec from the parser? What kind of custom lexer do we need to write?
  • Investigate a better API for file locations that is compatible with desugaring and resugaring, and that does not require mindful use of FC in constructors but handles locations automatically.
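
As a possible starting point for the first item (measuring performance in multiple scenarios), here is a minimal timing harness. It is only a sketch under some assumptions: the function names `time_command` and `benchmark_files` are made up for illustration, the compiler is assumed to be reachable as `idris2` on the PATH, and `--check` is assumed to be the flag that checks a file without producing an executable. It also measures wall-clock time for the whole invocation, so it gives an upper bound that includes elaboration rather than parsing time in isolation.

```python
import statistics
import subprocess
import sys
import time
from typing import Dict, Iterable, List, Sequence


def time_command(cmd: Sequence[str], runs: int = 5) -> Dict[str, float]:
    """Run `cmd` `runs` times and return wall-clock statistics in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded and the exit code ignored: only elapsed time is recorded.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def benchmark_files(
    files: Iterable[str],
    compiler: str = "idris2",            # assumption: compiler binary name on PATH
    flags: Sequence[str] = ("--check",),  # assumption: flag for check-only mode
    runs: int = 5,
) -> Dict[str, Dict[str, float]]:
    """Time the compiler on each file and return per-file statistics."""
    return {path: time_command([compiler, *flags, path], runs) for path in files}
```

Calling `benchmark_files(["Large.idr", "Ambiguous.idr"])` (hypothetical file names) would return one `min`/`median`/`max` entry per file, which makes it easy to compare scenarios such as large files against visually ambiguous code. Measuring interactive modes would need a different setup, since those go through a long-running process rather than one invocation per file.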

Conclusion

There are definitely more things we could do, but I thought these would be small, achievable targets within the currently underspecified
goal of addressing parser runtime performance and maintainability. Please feel free to add to this discussion and share ideas for
small goals we can achieve easily to better design issues. Please also share your experience with the parser if it is not represented
here, in a way that we might be able to address with this kind of project.
