Description
Goal: make it simple to use the ocaml-tree-sitter runtime library as a git submodule without pulling the big tree-sitter-* submodules that contain generated files for real-world languages.
Proposed split:
- ocaml-tree-sitter-core repo containing:
  - src and scripts folders with all the machinery for code generation and runtime
  - tests folder with the end-to-end tests on simple grammars
- ocaml-tree-sitter-semgrep repo containing:
  - lang folder which wraps around real-world grammars and includes tree-sitter-* repos as submodules
  - ocaml-tree-sitter-core repo as a submodule
- ocaml-tree-sitter-languages repo: similar to ocaml-tree-sitter-semgrep but without the grammar extensions for semgrep patterns. This is for the community of tree-sitter and OCaml users, independently of semgrep.
Two-step plan
Phase 1
- Clone ocaml-tree-sitter to ocaml-tree-sitter-core.
- Remove the languages from the lang/ folder in ocaml-tree-sitter-core.
- Create core submodule in ocaml-tree-sitter. Remove duplicate code. Create symlinks to core as needed.
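The Phase 1 steps can be sketched with throwaway local repositories. This is only an illustration under stated assumptions: the file contents, the sample lang/ruby folder, and the demo identity are made up, and the real plan would point the submodule at the hosted ocaml-tree-sitter-core repo rather than a local path. The protocol.file.allow override is needed only because recent git versions refuse local-path submodule clones by default.

```shell
#!/bin/sh
set -eu

# Identity for the demo commits, so the script does not depend on global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for the existing ocaml-tree-sitter repo (contents are placeholders).
git init -q ocaml-tree-sitter
(
  cd ocaml-tree-sitter
  mkdir -p src scripts tests lang/ruby
  echo "code generation and runtime" > src/README
  echo "helper scripts" > scripts/README
  echo "end-to-end tests" > tests/README
  echo "real-world grammar wrapper" > lang/ruby/README
  git add .
  git commit -q -m "Initial layout"
)

# Step 1: clone ocaml-tree-sitter to ocaml-tree-sitter-core.
git clone -q ocaml-tree-sitter ocaml-tree-sitter-core

# Step 2: remove the languages from lang/ in ocaml-tree-sitter-core.
(
  cd ocaml-tree-sitter-core
  git rm -r -q lang/ruby
  git commit -q -m "Remove real-world languages"
)

# Step 3: add core as a submodule, remove duplicate code, symlink to core.
(
  cd ocaml-tree-sitter
  # Local-path submodule: allowed here only via the explicit config override.
  git -c protocol.file.allow=always submodule --quiet add "$work/ocaml-tree-sitter-core" core
  git rm -r -q src scripts tests
  for dir in src scripts tests; do
    ln -s "core/$dir" "$dir"
  done
  git add src scripts tests
  git commit -q -m "Use core submodule"
)

# Show the resulting layout: core/ submodule, symlinks, and the untouched lang/.
ls -l "$work/ocaml-tree-sitter"
```

After this runs, src, scripts and tests in the demo ocaml-tree-sitter are symlinks into core, while lang stays in place with its real-world grammars.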
Phase 2
- Rename ocaml-tree-sitter → ocaml-tree-sitter-semgrep.
- Make semgrep use ocaml-tree-sitter-core instead of ocaml-tree-sitter.
- Create community repo ocaml-tree-sitter-languages on the same model as ocaml-tree-sitter-semgrep.
[Phase 3 - later]
Simplify the structure of the repos, minimize reliance on symlinks.
Progress
- phase 1
- phase 2.1
- phase 2.2
- phase 2.3
- update documentation in ocaml-tree-sitter-core
- update documentation in ocaml-tree-sitter-semgrep
- update links to documentation from semgrep