
Syntax symbol pickers #12275

Draft
wants to merge 6 commits into
base: master

the-mikedavis
Member

This adds two new symbol picker commands that use tree-sitter rather than LSP. We run a new symbols.scm query across the file and extract tagged things like function definitions, types, classes, etc. For languages with unambiguous syntax this behaves roughly the same as the LSP symbol picker (<space>s). It is less precise, though, since we don't have semantic information about the language: for example, it can easily produce false positives for C/C++ because of preprocessor magic. Prior art for this feature is GitHub's imprecise code navigation, which I believe works the same way and leverages tags.scm queries. (I have no internal GitHub knowledge, so this is an educated guess.) It should also be possible to find definitions and references, like gd and gr; this is left as a follow-up.

The hope is to start introducing LSP-like features for navigation that can work without installing or running a language server. I made these two pickers in particular because I don't like the LSP equivalents in ErlangLS or ELP: the document symbol picker can take a long time to show up during boot, and the workspace symbol picker only searches for module names. The other motivation is to have some navigation features in cases where running a language server is too cumbersome, either because of the installation effort or because of resource constraints. For example, clangd needs a fair amount of setup (compile_commands.json) that you might not want to do when quickly reading through a codebase.

This PR also adds commands that open the LSP symbol picker when a language server supporting it is available and fall back to the syntax symbol picker otherwise. This way you can configure a language to not use the LSP symbol pickers, for example:

[[language]]
name = "erlang"
language-servers = [{ name = "erlang-ls", except-features = ["document-symbols", "workspace-symbols"] }]

With this configuration, <space>s on an Erlang file will use the syntax symbol picker, while <space>s on a Rust file will still prefer the language server.
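The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `PickerChoice`, `DocumentCapabilities`, and `choose_document_symbol_picker` are hypothetical names invented for this example, and only the error string is taken from this conversation.

```rust
/// Which picker a combined "symbol picker" command ends up opening.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
enum PickerChoice {
    Lsp,
    Syntax,
}

/// Hypothetical summary of what is available for the current document.
struct DocumentCapabilities {
    /// True if some configured language server supports document symbols,
    /// i.e. the feature was not excluded with `except-features`.
    lsp_document_symbols: bool,
    /// True if the language ships a `symbols.scm` query.
    has_symbols_query: bool,
}

/// Prefer the LSP picker, fall back to the syntax picker, and report an
/// error when neither source of symbols is available.
fn choose_document_symbol_picker(
    caps: &DocumentCapabilities,
) -> Result<PickerChoice, &'static str> {
    if caps.lsp_document_symbols {
        Ok(PickerChoice::Lsp)
    } else if caps.has_symbols_query {
        Ok(PickerChoice::Syntax)
    } else {
        Err("No language server supporting document symbols or syntax info available")
    }
}

fn main() {
    // Erlang-like setup: LSP symbol features excluded, symbols.scm present.
    let erlang = DocumentCapabilities {
        lsp_document_symbols: false,
        has_symbols_query: true,
    };
    println!("{:?}", choose_document_symbol_picker(&erlang));
}
```

Under these assumptions, a language with neither a capable language server nor a symbols.scm query ends in the error branch.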

Some prior discussion of a feature like this is in #3518 talking about Ctags support. The idea here is similar but extracts tags/symbols with tree-sitter instead.

Outstanding question: how closely should we try to match the LSP symbol kinds? Not at all? Should we have markup-specific symbol kinds? (For example, see Markdown's symbols.scm.)

the-mikedavis added the labels A-tree-sitter (Area: Tree-sitter), E-medium (Call for participation: Experience needed to fix: Medium / intermediate), and A-command (Area: Commands) on Dec 16, 2024
nikvoid added a commit to nikvoid/helix that referenced this pull request Dec 28, 2024
@EricHenry
Contributor

I'm having trouble getting this to work. I pulled down the branch, but when I try to open the symbol picker without LSP enabled, I get the error "No language server supporting document symbols or syntax info available". I am testing this on a Rust project.

Any ideas?

@the-mikedavis
Member Author

There are only a few languages with symbols.scm queries so far: C, C++, Erlang, Elixir, Markdown and Python. Rust queries would need to be added. (Feel free to send a PR to this branch if you'd like. I always have rust-analyzer going so I haven't felt the need to add Rust yet.)

@cgahr
Contributor

cgahr commented Feb 6, 2025

I added symbols for typst: #12793

Co-authored-by: Constantin Gahr <[email protected]>
@EricHenry EricHenry mentioned this pull request Feb 13, 2025
repository.

[tree-sitter-captures]: https://tree-sitter.github.io/tree-sitter/using-parsers#capturing-nodes
[example-queries]: https://github.com/search?q=repo%3Ahelix-editor%2Fhelix+path%3A%2A%2A/symbols.scm&type=Code&ref=advsearch&l=&l=
Contributor

Suggested change
[example-queries]: https://github.com/search?q=repo%3Ahelix-editor%2Fhelix+path%3A%2A%2A/symbols.scm&type=Code&ref=advsearch&l=&l=
[example-queries]: https://github.com/search?q=repo%3Ahelix-editor%2Fhelix+path%3A%2A%2A/symbols.scm&type=Code

This should lead to the same result without the cruft in the URL.

Comment on lines +83 to +84
// TODO: the workspace symbol picker will take advantage of this.
#[allow(dead_code)]
Contributor

Is this still valid, or is it used now?

Comment on lines +41 to +48
Function,
Macro,
Module,
Constant,
Struct,
Interface,
Type,
Class,
Contributor

Ordering them alphabetically here, in the docs, and in match expressions will make it easier to avoid merge conflicts if more variants are added in the future. The current order encourages adding new variants at the end.
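As a rough sketch of this suggestion, the variants quoted above could be declared alphabetically and guarded by a small check. The enum name `SymbolKind`, the `ALL` constant, the lowercase names returned by `as_str`, and the `is_alphabetical` helper are all assumptions made for this example and are not taken from the PR.

```rust
/// Hypothetical enum holding the variants quoted above, in alphabetical order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SymbolKind {
    Class,
    Constant,
    Function,
    Interface,
    Macro,
    Module,
    Struct,
    Type,
}

impl SymbolKind {
    /// Every variant, in declaration order.
    const ALL: [SymbolKind; 8] = [
        SymbolKind::Class,
        SymbolKind::Constant,
        SymbolKind::Function,
        SymbolKind::Interface,
        SymbolKind::Macro,
        SymbolKind::Module,
        SymbolKind::Struct,
        SymbolKind::Type,
    ];

    /// Display name for the variant (the lowercase spelling is an assumption).
    fn as_str(self) -> &'static str {
        match self {
            SymbolKind::Class => "class",
            SymbolKind::Constant => "constant",
            SymbolKind::Function => "function",
            SymbolKind::Interface => "interface",
            SymbolKind::Macro => "macro",
            SymbolKind::Module => "module",
            SymbolKind::Struct => "struct",
            SymbolKind::Type => "type",
        }
    }
}

/// Returns true when the kinds are in strictly ascending order by name.
fn is_alphabetical(kinds: &[SymbolKind]) -> bool {
    kinds.windows(2).all(|pair| pair[0].as_str() < pair[1].as_str())
}

fn main() {
    for kind in SymbolKind::ALL {
        println!("{}", kind.as_str());
    }
    assert!(is_alphabetical(&SymbolKind::ALL));
}
```

A check like `is_alphabetical` could live in a unit test, so that a variant appended at the end fails fast instead of quietly breaking the ordering.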

};

use arc_swap::ArcSwapAny;
use dashmap::DashMap;
Contributor

dashmap can be very slow to free memory (though it's fast on all other operations).

In my personal experience, scc is better at memory reclamation, which is probably good for a picker (avoid lingering effects from mapping the whole workspace for example).

Of course, we would want benchmark results first; it may not be an issue at all in practice.

@the-mikedavis the-mikedavis marked this pull request as draft February 17, 2025 19:14
Labels
A-command Area: Commands A-tree-sitter Area: Tree-sitter E-medium Call for participation: Experience needed to fix: Medium / intermediate
4 participants