Expose `ts_parser_set_encoding` on the Parser API #61

Tree-sitter's C API supports `ts_parser_set_encoding`, which makes node offsets line up with utf16 code units instead of utf8 bytes. This is a natural fit for JVM languages, where `String` is already utf16 internally.

ktreesitter `v0.24.1` hardcodes utf8 in both parse paths:

```kotlin
parse(source: String)
parse(oldTree, callback)
```

This forces Kotlin callers to maintain a byte-to-char offset table on every parse to bridge tree-sitter's utf8 byte offsets to Kotlin's utf16 string indices. Exposing the encoding as a parser option would let callers skip that entirely.
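For context, the workaround looks roughly like the sketch below on the JVM. This is an illustration, not ktreesitter code: `buildByteToCharTable` is a hypothetical helper name, and it assumes the source is well-formed utf16 (no unpaired surrogates), so its byte counts match a standard utf8 encoding of the string.

```kotlin
// Hypothetical helper (not part of ktreesitter): for each utf8 byte offset,
// store the utf16 index of the code point that the byte belongs to.
fun buildByteToCharTable(source: String): IntArray {
    // A single utf16 code unit never needs more than 3 utf8 bytes
    // (a surrogate pair is 2 units and 4 bytes), so this bound is safe.
    val table = IntArray(source.length * 3 + 1)
    var byteOffset = 0
    var charIndex = 0
    while (charIndex < source.length) {
        val codePoint = source.codePointAt(charIndex)
        val byteCount = when {
            codePoint < 0x80 -> 1
            codePoint < 0x800 -> 2
            codePoint < 0x10000 -> 3
            else -> 4
        }
        // Every byte of this code point maps back to the index where it starts.
        for (i in 0 until byteCount) {
            table[byteOffset + i] = charIndex
        }
        byteOffset += byteCount
        charIndex += Character.charCount(codePoint)
    }
    // One extra entry so an end-of-input byte offset maps to source.length.
    table[byteOffset] = source.length
    return table.copyOf(byteOffset + 1)
}
```

A byte offset `b` reported by the parser then maps to string index `table[b]`. The table costs one pass over the source and one `Int` per utf8 byte, and it has to be rebuilt whenever the text changes, which is the overhead a utf16 parse mode would remove.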

Is there interest in this? Happy to contribute a PR if the direction sounds reasonable.
