fix(lang-kotlin): support fun interface extraction via tree-sitter-kotlin re-vendor#2271

Open
glier wants to merge 3 commits into
abhigyanpatwari:mainfrom
glier:fix/kotlin-fun-interface

Conversation

@glier

@glier glier commented Jun 22, 2026

Contributor

Problem

The vendored tree-sitter-kotlin@0.3.8 (fwcd) parses a Kotlin functional (SAM) interface (fun interface Foo { ... }) as an ERROR node and drops the declaration and its abstract method. As a result, fun interface types are never extracted (no Interface node, no method).

Repro (old grammar):

fun interface Clicker { fun onClick(): Boolean }

(source_file (ERROR "fun" (user_type (type_identifier)) (simple_identifier)) (lambda_literal ...))

The fix exists upstream: fwcd/tree-sitter-kotlin#169 (closes #87), merged to main 2025-04-25 — but it is not in any npm release (latest tag is still 0.3.8, from 2024-08-03; main is the unreleased 0.4.0).

Change

Re-vendor the grammar from the unreleased fwcd main commit c8ac3d26:

  • Refresh src/{parser.c,scanner.c,node-types.json,tree_sitter/*.h} and bindings/node/index.js; bump the vendor version 0.3.8 → 0.4.0; record the pinned SHA + rationale in _vendoredBy and the vendor README.md.
  • Switch the prebuild workflow's kotlin registry kind: 'npm' → 'vendored' — the fix is unreleased on npm, so prebuilds must build from the vendored C source (like swift/dart/proto).
  • Add a hold to .github/vendored-grammars.json so the weekly auto-update monitor does not revert the pin: isNewer is a strict-inequality check, so npm's 0.3.8 !== the vendored 0.4.0 counts as an update and would otherwise re-vendor the broken npm 0.3.8 over 0.4.0.
  • Add 3 regression tests + a fixture asserting fun interfaces extract as Interface nodes with their methods, and that plain-interface heritage still resolves.
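
To illustrate why the hold is needed, here is a minimal sketch of the monitor's decision. The `GrammarEntry` shape, the `hold` field type, and `shouldReVendor` are hypothetical names for illustration; only the strict-inequality behaviour of `isNewer` comes from this PR, and the real monitor and the schema of .github/vendored-grammars.json may differ.

```typescript
// Hypothetical shapes: the real monitor and the JSON schema may differ.
interface GrammarEntry {
  name: string;
  vendoredVersion: string; // version currently vendored in the repo
  hold?: string;           // assumed field: reason the pin must not be auto-updated
}

// Strict inequality, as described in this PR: any difference counts as "newer".
function isNewer(registryVersion: string, vendoredVersion: string): boolean {
  return registryVersion !== vendoredVersion;
}

// Assumed ordering: a held grammar is skipped before the version comparison runs.
function shouldReVendor(entry: GrammarEntry, registryVersion: string): boolean {
  if (entry.hold !== undefined) return false;
  return isNewer(registryVersion, entry.vendoredVersion);
}

const unheld: GrammarEntry = { name: "tree-sitter-kotlin", vendoredVersion: "0.4.0" };
const held: GrammarEntry = { ...unheld, hold: "pinned to unreleased fwcd main c8ac3d26" };

console.log(shouldReVendor(unheld, "0.3.8")); // true: would re-vendor npm 0.3.8 over 0.4.0
console.log(shouldReVendor(held, "0.3.8"));   // false: the hold keeps the pin
```

Without the hold, the comparison cannot tell a downgrade from an upgrade, which is the scenario the hold guards against.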

No KOTLIN_QUERIES change needed: the new grammar models fun interface Foo as a class_declaration with an "interface" keyword child (plus an extra "fun" modifier child), which the existing interface rule (class_declaration "interface" (type_identifier) @name) already matches.
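
A toy model of why the existing rule keeps matching, assuming the usual tree-sitter query behaviour that listed children must appear in order while unlisted siblings are allowed around them. `ToyNode`, `leaf`, and `matchInterfaceName` are illustrative stand-ins, not the real tree-sitter API, and the child layout is an assumption based on the description above.

```typescript
// Toy stand-in for a syntax node; not the real tree-sitter SyntaxNode API.
interface ToyNode {
  type: string;
  text: string;
  children: ToyNode[];
}

const leaf = (type: string, text = ""): ToyNode => ({ type, text, children: [] });

// Mimics (class_declaration "interface" (type_identifier) @name): an "interface"
// child followed later by a type_identifier; extra siblings such as "fun" are ignored.
function matchInterfaceName(node: ToyNode): string | undefined {
  if (node.type !== "class_declaration") return undefined;
  const kw = node.children.findIndex((c) => c.type === "interface");
  if (kw === -1) return undefined;
  const name = node.children.slice(kw + 1).find((c) => c.type === "type_identifier");
  return name?.text;
}

// Assumed child layouts for the new grammar.
const funInterface: ToyNode = {
  type: "class_declaration",
  text: "",
  children: [leaf("fun"), leaf("interface"), leaf("type_identifier", "Clicker"), leaf("class_body")],
};
const plainInterface: ToyNode = {
  type: "class_declaration",
  text: "",
  children: [leaf("interface"), leaf("type_identifier", "Plain"), leaf("class_body")],
};
const plainClass: ToyNode = {
  type: "class_declaration",
  text: "",
  children: [leaf("class"), leaf("type_identifier", "Impl"), leaf("class_body")],
};

console.log(matchInterfaceName(funInterface));   // Clicker
console.log(matchInterfaceName(plainInterface)); // Plain
console.log(matchInterfaceName(plainClass));     // undefined
```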

Verification

Full Kotlin suite green against the new grammar (built locally for darwin-arm64):

  • test/unit/{kotlin-scope-captures,kotlin-static-marker}.test.ts, test/unit/cfg/kotlin-visitor.test.ts, test/integration/resolvers/{kotlin,kotlin-coverage}.test.ts: 300 + 233 pass (incl. the 3 new fun-interface cases).
  • test/unit/{vendored-grammars,grammar-update-monitor}.test.ts: 28 pass.
  • Confirmed every node type referenced by KOTLIN_QUERIES is present in the new node-types.json (no silent node-type drift from the 0.3.8→main jump).

⚠️ Prebuilds / CI sequencing (please read before merge)

prebuilds/ are intentionally not in this PR — only darwin-arm64 is buildable locally; the other 5 platforms require the cross-build CI.

The version bump (0.3.8 → 0.4.0) auto-triggers build-tree-sitter-prebuilds, which regenerates all 6 platform binaries from the vendored source. Until those binaries land, node-gyp-build keeps loading the committed 0.3.8 prebuild, so:

  • CI for the new kotlin tests will be RED on this PR, and
  • the grammar change is inert at runtime until prebuilds are regenerated.
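
The sequencing problem can be sketched as a simplified model. This is not node-gyp-build's actual code: `GrammarPackage` and `loadedGrammarVersion` are hypothetical, and the `linux-x64` platform key is an assumption (only darwin-arm64 is named in this PR).

```typescript
// Simplified model of the behaviour described above; not node-gyp-build's code.
interface GrammarPackage {
  sourceVersion: string;             // version of the vendored C source
  prebuilds: Record<string, string>; // platform -> grammar version baked into the committed binary
}

// If a committed prebuild exists for the platform, it is loaded and the source
// build is skipped; otherwise the vendored source is what ends up being used.
function loadedGrammarVersion(pkg: GrammarPackage, platform: string): string {
  return pkg.prebuilds[platform] ?? pkg.sourceVersion;
}

const beforeRegen: GrammarPackage = {
  sourceVersion: "0.4.0",
  prebuilds: { "linux-x64": "0.3.8", "darwin-arm64": "0.3.8" },
};
const afterRegen: GrammarPackage = {
  sourceVersion: "0.4.0",
  prebuilds: { "linux-x64": "0.4.0", "darwin-arm64": "0.4.0" },
};

// Before regeneration the old binary wins, so the new tests fail.
console.log(loadedGrammarVersion(beforeRegen, "linux-x64"));
// After regeneration the binary carries the re-vendored grammar.
console.log(loadedGrammarVersion(afterRegen, "linux-x64"));
```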

Because this is a fork PR, the prebuild workflow's aggregate (auto-PR) job is skipped — it will only upload artifacts. Recommended: a maintainer runs build-tree-sitter-prebuilds (workflow_dispatch, grammars=kotlin) to regenerate + commit the 6 binaries + SHA256SUMS, and that lands first or together with this PR. (SHA256SUMS is only written by that workflow, never verified at install/runtime.)

🤖 Generated with Claude Code

…kotlin re-vendor

Vendored tree-sitter-kotlin@0.3.8 (fwcd) parsed `fun interface Foo` as an
ERROR node and dropped the declaration plus its abstract method, so functional
(SAM) interfaces were never extracted. The fix landed upstream in
fwcd/tree-sitter-kotlin#169 (closes abhigyanpatwari#87), merged to main 2025-04-25, but is not
in any npm release (latest tag 0.3.8; main is the unreleased 0.4.0).

Re-vendor the grammar from the unreleased fwcd main commit c8ac3d26:
- refresh src/{parser.c,scanner.c,node-types.json,tree_sitter/*.h} and
  bindings/node/index.js; bump the vendor version 0.3.8 -> 0.4.0; record the
  pinned SHA + rationale in _vendoredBy and the vendor README.
- switch the prebuild workflow's kotlin registry kind 'npm' -> 'vendored' (the
  fix is unreleased on npm, so prebuilds must build from the vendored C source,
  like swift/dart/proto).
- add a hold to .github/vendored-grammars.json so the weekly auto-update
  monitor does not strict-inequality-revert the pin to the broken npm 0.3.8
  (isNewer compares 0.3.8 != 0.4.0).
- add 3 regression tests + a fixture asserting fun interfaces extract as
  Interface nodes with their abstract methods, and that plain-interface
  heritage still resolves.

Existing KOTLIN_QUERIES need no change: the new grammar models `fun interface`
as a class_declaration with an "interface" keyword child (plus an extra "fun"
modifier child), which the existing interface rule already matches. Full Kotlin
suite green against the new grammar (300 unit/cfg/resolver + 233 integration).

NOTE: prebuilds/ are intentionally not in this commit. The version bump
auto-triggers .github/workflows/build-tree-sitter-prebuilds.yml, which
regenerates all 6 platform binaries from the vendored source in a separate PR.
Until that lands, CI loads the committed 0.3.8 prebuild, so the new kotlin
tests are red and the grammar change is inert at runtime. Merge the prebuild PR
first or together.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@glier glier requested a review from magyargergo as a code owner June 22, 2026 15:31
@vercel

vercel Bot commented Jun 22, 2026


@glier is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions

github-actions Bot commented Jun 22, 2026

Contributor

CI Report

Some checks failed

Pipeline Status

| Stage | Status | Details |
| --- | --- | --- |
| ✅ Typecheck | success | tsc --noEmit |
| ❌ Tests | failure | unit tests, 3 platforms |
| ✅ E2E | success | gitnexus-web changes only |

Test Results

| Tests | Passed | Failed | Skipped | Duration |
| --- | --- | --- | --- | --- |
| 12942 | 12881 | 2 | 55 | 764s |

2 failed / 12881 passed

55 test(s) skipped — expand for details

Code Coverage

Tests

| Metric | Coverage | Covered | Base | Delta | Status |
| --- | --- | --- | --- | --- | --- |
| Statements | 79.28% | 50382/63545 | 79.24% | 📈 +0.0 | 🟢 ███████████████░░░░░ |
| Branches | 66.37% | 30692/46243 | 66.3% | 📈 +0.1 | 🟢 █████████████░░░░░░░ |
| Functions | 85.28% | 5766/6761 | 85.22% | 📈 +0.1 | 🟢 █████████████████░░░ |
| Lines | 82.76% | 44899/54249 | 82.72% | 📈 +0.0 | 🟢 ████████████████░░░░ |

📋 View full run · Generated by CI

@magyargergo

Collaborator

Please double-check this against #2194. We need to make sure it fits into the matrix. For now, we only support tree-sitter 0.21.x, so you need to compile it locally on that tree-sitter version and double-check that it generates the grammar properly.

@magyargergo magyargergo left a comment

Collaborator

Please check my comment above.

@glier

glier commented Jun 22, 2026

Contributor Author

Thanks @magyargergo — I checked against #2194, and on the ABI/matrix side this already fits, so I want to confirm that and then ask you about the right way to get CI green, since the grammar-bump + prebuild flow is your pipeline.

ABI / #2194 matrix — fits without a runtime bump:

  • The vendored src/parser.c is #define LANGUAGE_VERSION 14, i.e. ABI 14, inside the supported tree-sitter@0.21.x range (13–14). It's also within the 0.25 target range (13–15), so it won't block that upgrade.
  • Tree-sitter 0.25 upgrade readiness #2194 already tracks this exact commit: tree-sitter-kotlin … upstream fwcd/tree-sitter-kotlin@c8ac3d262724 ABI 14 (in target range) — the pin here is that same commit.
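
For reference, the ABI check above amounts to something like the following sketch. `readLanguageVersion`, `AbiRange`, and `inRange` are hypothetical helpers, not the readiness script itself; the ranges are the ones quoted in this comment.

```typescript
// Extract the ABI number from generated parser source text.
function readLanguageVersion(parserC: string): number | undefined {
  const m = /^#define\s+LANGUAGE_VERSION\s+(\d+)/m.exec(parserC);
  return m ? Number(m[1]) : undefined;
}

interface AbiRange {
  min: number;
  max: number;
}

// Ranges as quoted in this comment.
const CURRENT_RUNTIME: AbiRange = { min: 13, max: 14 }; // tree-sitter 0.21.x
const TARGET_RUNTIME: AbiRange = { min: 13, max: 15 };  // tree-sitter 0.25

const inRange = (abi: number, r: AbiRange): boolean => abi >= r.min && abi <= r.max;

// Stand-in for the head of the vendored src/parser.c.
const sampleParserC = '#include "tree_sitter/parser.h"\n\n#define LANGUAGE_VERSION 14\n';

const abi = readLanguageVersion(sampleParserC);
if (abi !== undefined) {
  console.log(abi, inRange(abi, CURRENT_RUNTIME), inRange(abi, TARGET_RUNTIME)); // 14 true true
}
```

An ABI of 15 would pass the target range but fail the current one, which is the case the matrix in #2194 is meant to catch.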

Compiled + verified locally on tree-sitter@0.21.1 (the pinned runtime):

  • Built the committed parser.c/scanner.c against tree-sitter@0.21.1; Parser.setLanguage() accepts it (the ABI gate throws on >14 — it doesn't), and fun interface IntPredicate { fun accept(i: Int): Boolean } parses with no ERROR.
  • python3 .github/scripts/check-tree-sitter-upgrade-readiness.py --assert-current → green: tree-sitter-kotlin: vendored ABI 14 in range [vendored, held].
  • Full Kotlin suite green (300 unit/cfg/resolver + 233 integration incl. the new cases). We vendor upstream's committed ABI-14 parser.c directly rather than regenerating, so no newer tree-sitter CLI is in play.

The red CI — and a process question for you:
The only failing checks are the 3 new fun interface tests. They fail because the committed prebuilds/ are still the 0.3.8 binaries, and node-gyp-build prefers a committed prebuild over a source build — so CI loads the old grammar until the prebuilds are regenerated from the new source. The version bump triggers build-tree-sitter-prebuilds, but on a fork PR its aggregate (auto-PR) job is skipped, so it only uploads artifacts and can't commit the binaries here.

I'm not sure of the intended sequencing for a vendored-grammar bump in this repo — could you point me the right way? A couple of options I see:

  1. A maintainer dispatches build-tree-sitter-prebuilds (grammars=kotlin) so the 6 rebuilt binaries + SHA256SUMS land (in this branch or a companion PR merged first/together), after which the tests go green; or
  2. I pull the 3 fun interface tests out into a follow-up PR that merges right after the prebuilds, so this PR is green on its own.

Happy to do whichever fits your workflow — or follow a different process if there's an established one I've missed.

@magyargergo

Collaborator

I saw one CI job is red: https://github.com/abhigyanpatwari/GitNexus/actions/runs/27964323565/job/82753978209#step:4:1

I'll also check locally whether I can build it on tree-sitter 0.22.1.

The kotlin `hold` added in the previous commit makes the tree-sitter
upgrade-readiness report count it as a blocker — the report treats every
held vendored grammar as frozen below a runtime upgrade (same as the
intentionally-pinned tree-sitter-cpp and the ABI-held tree-sitter-c),
"in-range ABI or not". So the report's blocker count goes 2 -> 3.

Update the hardcoded count in
test_issue_update_summary_regex_matches_current_report (and the
_render_report docstring) accordingly — exactly as that test instructs:
"if a grammar is added/removed or a pin/hold changes, update the expected
counts". kotlin's ABI (14) is in range; the hold is what flags it, with the
reason recorded in .github/vendored-grammars.json.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@glier

glier commented Jun 22, 2026

Contributor Author

Thanks for the link — didn't spot that failing job at first. Tracked it down: the hold I added for kotlin makes the readiness report count it as a blocker (it treats any held vendored grammar that way), so the hardcoded blocker count in the readiness unit test was off by one. Fixed in d406620 — bumped it 2→3. ABI's still 14, so it builds fine on 0.21.x/0.22.1, the hold's just what flags it.

Only thing left red is the 3 fun-interface tests — still the prebuild question from above.

Two committed baselines pinned the pre-bump kotlin state and broke when the
grammar was re-vendored (0.3.8 -> 0.4.0):

- cli-commands.test.ts pinned the vendored kotlin package version at 0.3.8 ->
  update to 0.4.0.
- bench/scope-capture/baselines.json: the new kotlin-fun-interface fixture joins
  the lang-resolution/kotlin-* corpus AND the new grammar parses `fun interface`
  as a class_declaration (not an ERROR node), so the capture fingerprint drifts.
  Rebaselined to the NEW grammar's fingerprint (verified by building the vendored
  parser.c against tree-sitter@0.21.1 and running measure.mjs --check); scaling
  ~0.83 (linear).

Like the fun-interface integration tests, the scope-capture --check passes only
once the regenerated prebuilds land; until then CI loads the committed 0.3.8
binary, so it stays red.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@magyargergo

magyargergo commented Jun 22, 2026

Collaborator

We also need to regenerate the fingerprints for the Kotlin benchmarks: https://github.com/abhigyanpatwari/GitNexus/actions/runs/27979931111/job/82807100858?pr=2271

@glier

glier commented Jun 22, 2026

Contributor Author

Already did that in 6105cf7 — but I baselined the kotlin scope-capture fingerprint to the new grammar (4900431791…), so like the fun-interface tests it only goes green once the regenerated prebuilds land. Right now CI loads the committed 0.3.8 binary, computes the old fingerprint (b4406b7c…, 2 fewer capture groups — the missing fun-interface ones) and stays red.

I can flip the baseline to the current old-grammar value so the bench goes green now, but the fun-interface tests need the new binary regardless — so it really all hinges on regenerating the kotlin prebuilds. Want me to handle that somehow, or will you kick build-tree-sitter-prebuilds?

Development

Successfully merging this pull request may close these issues.

Functional interface parsing has ERROR

2 participants