Summary
Parser.ParseWithOptions leaks one cgo handle on every call made with a non-nil *ParseOptions. The options value is registered in mattn/go-pointer's process-global handle map and never released, so it — and the ProgressCallback it holds — is retained for the lifetime of the process. One handle per call, unbounded in the number of calls, and independent of the input, the grammar, or whether the parse succeeds: the handle is saved before parsing even starts.
Present in v0.25.0 and on the current master (c949200) — the same four entry points in both.
Affected functions (parser.go)
ParseWithOptions
ParseUTF16LEWithOptions
ParseUTF16BEWithOptions
ParseCustomEncoding
Root cause
Each function registers two handles via mattn/go-pointer: the input payload and the options. The payload is released with a deferred Unref; the options handle has no matching Unref.
// parser.go: func (p *Parser) ParseWithOptions
cptr := pointer.Save(&payload)
defer pointer.Unref(cptr) // input payload: Saved AND Unref'd ✅

var cOptions C.TSParseOptions
if options != nil {
	cOptions = C.TSParseOptions{
		progress_callback: (*[0]byte)(C.parserProgressCallback),
		payload:           pointer.Save(options), // options: Saved, never Unref'd ❌
	}
}

cNewTree := C.ts_parser_parse_with_options(p._inner, cOldTree, cInput, cOptions)
mattn/go-pointer.Save mallocs one byte and stores the value in a process-global map; Unref is the only path that deletes the entry and frees the byte:
var (
	mutex sync.RWMutex
	store = map[unsafe.Pointer]interface{}{} // process-global, grows forever
)

func Save(v interface{}) unsafe.Pointer {
	var ptr unsafe.Pointer = C.malloc(C.size_t(1))
	mutex.Lock()
	store[ptr] = v // retains v (the *ParseOptions)
	mutex.Unlock()
	return ptr
}
Each leaked call therefore costs one malloc(1) byte and one permanent map entry retaining the *ParseOptions (and its progress closure and whatever it captures). Because pointer.Save(options) runs before ts_parser_parse_with_options, the leak does not depend on a grammar being set or a tree being produced.
Why this path matters
ParseWithOptions is the only non-deprecated parse entry point that exposes progress reporting / cooperative cancellation. ParseCtx, SetTimeoutMicros, and CancellationFlag/SetCancellationFlag are all marked Deprecated: Use Parser.ParseWithOptions .... The non-deprecated Parse forwards nil options (so it doesn't leak through this path) but offers no progress/cancellation hook — so a long-lived process that needs cancellation is steered onto the one API that leaks.
Relationship to #52
#52 ("Possible memory leak when processing certain files") looks related but describes a different effect. It appears to occur before this code path is reached, so I believe it is a separate issue.
Suggested direction
Mirror what each function already does for the input-payload handle a few lines above: capture the saved options handle and defer pointer.Unref(...) on it so it is released after the parse (the parse is synchronous; tree-sitter does not retain the payload past ts_parser_parse_with_options).
Environment
github.com/tree-sitter/go-tree-sitter v0.25.0 (and current master, c949200)
github.com/mattn/go-pointer v0.0.1 (transitive)
Reproduced on macOS (arm64), Go 1.26.3; requires CGO. The leak is platform-independent.
Provenance: this leak was found autonomously by Claude Opus 4.7 (claude-opus-4-7) while it was working on an unrelated project of mine that depends on go-tree-sitter — it detected the leak and worked around it without my prompting it to look at memory. The root-cause analysis and reproduction above are the model's; I'm filing the report.