Possible memory leak when processing certain files #52

@petergardfjall

We are using go-tree-sitter as a library to parse a stream of source files, and we've noticed that the process's memory appears to grow over time, in particular when encountering certain files. We sometimes see the resident memory (RSS) of the process jump to 700MiB, and these surges never appear to be returned to the operating system.

Also, according to pprof, the memory is not tracked by the Go runtime, which makes me suspect that there is a memory leak somewhere in the CGO interactions.
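One way to check the "not tracked by the Go runtime" observation from inside the process is to compare what the Go runtime reports with the RSS the kernel reports. The following is a minimal sketch, not part of the attached reproducer; it assumes Linux (it reads /proc/self/status), and the helper name parseVmRSSKiB is hypothetical. If RSS is far larger than the runtime's Sys figure, the difference was most likely allocated outside the Go heap, for example by C code called through CGO.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// parseVmRSSKiB extracts the VmRSS value (in KiB) from the text of
// /proc/<pid>/status. It returns false if the field is missing or malformed.
// The name is hypothetical; it is not part of any library.
func parseVmRSSKiB(status string) (uint64, bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Expected shape of the line: "VmRSS:   123456 kB"
		if len(fields) >= 2 && fields[0] == "VmRSS:" {
			n, err := strconv.ParseUint(fields[1], 10, 64)
			return n, err == nil
		}
	}
	return 0, false
}

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	// Sys is the total memory the Go runtime itself has obtained from the OS.
	fmt.Printf("Go runtime Sys: %d KiB\n", ms.Sys/1024)
	fmt.Printf("Go heap in use: %d KiB\n", ms.HeapInuse/1024)

	data, err := os.ReadFile("/proc/self/status") // Linux only
	if err != nil {
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	if rss, ok := parseVmRSSKiB(string(data)); ok {
		fmt.Printf("Process RSS:    %d KiB\n", rss)
	}
}
```

Calling the body of main after each batch of parses would show whether RSS and the runtime's own figures drift apart over time.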

I've attached a small program and testfiles to reproduce the issue. Unzipping the attached zip should make it simple to reproduce.

leak-reproducer.zip

The program parses an input file N times (-N being a flag). It seems like memory is not always reclaimed when a parsed Tree is closed, but the behavior is not consistent. The leak only seems to manifest with certain (bigger?) files.

For example, parsing a small file 100000 times only shows a memory use (RSS) of around 15-20MiB on my system. There is also no obvious growth in memory use over time.

make
./leak-reproducer -N=100000 --wait testdata/small.go

However, trying to parse a larger file 100 times shows memory quickly grow to around 200-300MiB. The memory does not grow linearly but seems to make sudden jumps.

make
./leak-reproducer -N=100 --wait testdata/large.go

Profiling

This example uses the memleak tool from https://github.com/iovisor/bcc, which is installed on Ubuntu as memleak-bpfcc.

To run and profile, use something like:

make
sudo memleak-bpfcc -c './leak-reproducer -N=30 --wait-start --wait testdata/large.go'

# 1. Wait for memleak-bpfcc to start outputting "Top 10 stacks with outstanding allocations:"
#    before pressing RETURN (and starting the run).
# 2. Wait for all iterations (-N) to complete.
# 3. When you see "Press RETURN to terminate ..." the program is
#    done and all memory should be reclaimed.
# 4. Inspect the "Top 10 stacks with outstanding allocations:" output
#    before pressing RETURN to terminate.

I must admit that I'm not well-versed in profiling CGO code, but I'm hoping that this is helpful to (1) identify whether there actually is a memory leak and (2) if so, help pinpoint where memory is being leaked.

What I see is output similar to the following:

[12:46:04] Top 10 stacks with outstanding allocations:
	262144 bytes in 1 allocations from stack
		0x00000000008eea0a	x_cgo_mmap+0x45 [leak-reproducer]
		0x000000000049ec38	runtime.callCgoMmap.abi0+0x38 [leak-reproducer]
		0x000000000041d35f	runtime.mmap.func1+0x3f [leak-reproducer]
		0x000000000041d29a	runtime.mmap+0x5a [leak-reproducer]
		0x000000000042fe33	runtime.sysAllocOS+0x33 [leak-reproducer]
		0x000000000042faaa	runtime.sysAlloc+0x4a [leak-reproducer]
		0x0000000000429f5f	runtime.persistentalloc1+0xff [leak-reproducer]
		0x0000000000429e48	runtime.persistentalloc.func1+0x28 [leak-reproducer]
		0x0000000000429e05	runtime.persistentalloc+0x45 [leak-reproducer]
		0x0000000000433777	runtime.(*fixalloc).alloc+0x77 [leak-reproducer]
		0x00000000004478c9	runtime.(*mheap).allocMSpanLocked+0xa9 [leak-reproducer]
		0x0000000000447c12	runtime.(*mheap).allocSpan+0x232 [leak-reproducer]
		0x000000000044743f	runtime.(*mheap).allocManual+0x3f [leak-reproducer]
		0x0000000000477c25	runtime.stackalloc+0x125 [leak-reproducer]
		0x000000000046751f	runtime.malg.func1+0x1f [leak-reproducer]
		0x00000000004674a5	runtime.malg+0x65 [leak-reproducer]
		0x000000000045855d	runtime.mpreinit+0x1d [leak-reproducer]
		0x000000000045fee5	runtime.mcommoninit+0xc5 [leak-reproducer]
		0x0000000000462232	runtime.allocm+0xb2 [leak-reproducer]
		0x0000000000462cba	runtime.newm+0x3a [leak-reproducer]
		0x000000000046330b	runtime.startm+0x16b [leak-reproducer]
		0x0000000000495fc7	runtime.wakep+0x87 [leak-reproducer]
		0x000000000046547c	runtime.resetspinning+0x3c [leak-reproducer]
		0x00000000004658a5	runtime.schedule+0x145 [leak-reproducer]
		0x0000000000465c19	runtime.park_m+0x1d9 [leak-reproducer]
		0x000000000049b595	runtime.mcall+0x55 [leak-reproducer]
	524288 bytes in 2 allocations from stack
		0x00000000008eea0a	x_cgo_mmap+0x45 [leak-reproducer]
		0x000000000049ec38	runtime.callCgoMmap.abi0+0x38 [leak-reproducer]
		0x000000000041d35f	runtime.mmap.func1+0x3f [leak-reproducer]
		0x000000000041d29a	runtime.mmap+0x5a [leak-reproducer]
		0x000000000042fe33	runtime.sysAllocOS+0x33 [leak-reproducer]
		0x000000000042faaa	runtime.sysAlloc+0x4a [leak-reproducer]
		0x0000000000429f5f	runtime.persistentalloc1+0xff [leak-reproducer]
		0x0000000000429e48	runtime.persistentalloc.func1+0x28 [leak-reproducer]
		0x000000000049b60a	runtime.systemstack.abi0+0x4a [leak-reproducer]
		0x0000000000429e05	runtime.persistentalloc+0x45 [leak-reproducer]
		0x0000000000454b49	runtime.(*spanSetBlockAlloc).alloc+0x49 [leak-reproducer]
		0x00000000004545f3	runtime.(*spanSet).push+0x153 [leak-reproducer]
		0x00000000004446de	runtime.(*sweepLocked).sweep+0x75e [leak-reproducer]
		0x0000000000443cc5	runtime.sweepone+0xc5 [leak-reproducer]
		0x0000000000443a77	runtime.bgsweep+0xb7 [leak-reproducer]
		0x0000000000433a77	runtime.gcenable.gowrap1+0x17 [leak-reproducer]
		0x000000000049d1e1	runtime.goexit.abi0+0x1 [leak-reproducer]
	786432 bytes in 3 allocations from stack
		0x00000000008eea0a	x_cgo_mmap+0x45 [leak-reproducer]
		0x000000000049ec38	runtime.callCgoMmap.abi0+0x38 [leak-reproducer]
		0x000000000041d35f	runtime.mmap.func1+0x3f [leak-reproducer]
		0x000000000041d29a	runtime.mmap+0x5a [leak-reproducer]
		0x000000000042fe33	runtime.sysAllocOS+0x33 [leak-reproducer]
		0x000000000042faaa	runtime.sysAlloc+0x4a [leak-reproducer]
		0x0000000000429f5f	runtime.persistentalloc1+0xff [leak-reproducer]
		0x0000000000429e48	runtime.persistentalloc.func1+0x28 [leak-reproducer]
		0x0000000000429e05	runtime.persistentalloc+0x45 [leak-reproducer]
		0x0000000000433777	runtime.(*fixalloc).alloc+0x77 [leak-reproducer]
		0x00000000004478c9	runtime.(*mheap).allocMSpanLocked+0xa9 [leak-reproducer]
		0x0000000000447c12	runtime.(*mheap).allocSpan+0x232 [leak-reproducer]
		0x000000000044743f	runtime.(*mheap).allocManual+0x3f [leak-reproducer]
		0x0000000000445e4d	runtime.getempty.func1+0x2d [leak-reproducer]
		0x0000000000445d3d	runtime.getempty+0xfd [leak-reproducer]
		0x0000000000445474	runtime.(*gcWork).init+0x14 [leak-reproducer]
		0x000000000044558c	runtime.(*gcWork).putObj+0xcc [leak-reproducer]
		0x000000000043b3e5	runtime.greyobject+0x1e5 [leak-reproducer]
		0x000000000043aeab	runtime.scanblock+0x14b [leak-reproducer]
		0x0000000000438948	runtime.markrootBlock+0x88 [leak-reproducer]
		0x000000000043871c	runtime.markroot+0x3fc [leak-reproducer]
		0x000000000043a9c6	runtime.gcDrain+0x4a6 [leak-reproducer]
		0x000000000043a3c5	runtime.gcDrainMarkWorkerDedicated+0x25 [leak-reproducer]
		0x000000000043638c	runtime.gcBgMarkWorker.func2+0x8c [leak-reproducer]
		0x000000000049b60a	runtime.systemstack.abi0+0x4a [leak-reproducer]
		0x000000000043619f	runtime.gcBgMarkWorker+0x1bf [leak-reproducer]
		0x0000000000435f97	runtime.gcBgMarkStartWorkers.gowrap1+0x17 [leak-reproducer]
		0x000000000049d1e1	runtime.goexit.abi0+0x1 [leak-reproducer]
	8388608 bytes in 2 allocations from stack
		0x00000000008eea0a	x_cgo_mmap+0x45 [leak-reproducer]
		0x000000000049ec38	runtime.callCgoMmap.abi0+0x38 [leak-reproducer]
		0x000000000041d35f	runtime.mmap.func1+0x3f [leak-reproducer]
		0x000000000041d29a	runtime.mmap+0x5a [leak-reproducer]
		0x000000000043045d	runtime.sysMapOS+0x3d [leak-reproducer]
		0x000000000042fda5	runtime.sysMap+0x45 [leak-reproducer]
		0x00000000004486a5	runtime.(*mheap).grow+0x325 [leak-reproducer]
		0x0000000000447b9e	runtime.(*mheap).allocSpan+0x1be [leak-reproducer]
		0x00000000004473df	runtime.(*mheap).alloc.func1+0x5f [leak-reproducer]
		0x000000000049b60a	runtime.systemstack.abi0+0x4a [leak-reproducer]
		0x0000000000447337	runtime.(*mheap).alloc+0x57 [leak-reproducer]
		0x000000000042cec8	runtime.(*mcache).allocLarge+0x88 [leak-reproducer]
		0x000000000042972d	runtime.mallocgcLarge+0x6d [leak-reproducer]
		0x0000000000493dd6	runtime.mallocgc+0x116 [leak-reproducer]
		0x0000000000498106	runtime.makeslice+0x86 [leak-reproducer]
		0x0000000000517c7a	os.readFileContents+0xba [leak-reproducer]
		0x0000000000517936	os.ReadFile+0x1f6 [leak-reproducer]
		0x00000000007c3a49	main.parseFile+0x69 [leak-reproducer]
		0x00000000007c385e	main.main+0x3be [leak-reproducer]
		0x000000000045e587	runtime.main+0x267 [leak-reproducer]
		0x000000000049d1e1	runtime.goexit.abi0+0x1 [leak-reproducer]
	8392704 bytes in 1 allocations from stack
		0x000070441249d537	__pthread_create_2_1+0x977 [libc.so.6]
		0x00000000008ee733	_cgo_try_pthread_create+0x3d [leak-reproducer]
		0x00000000008ee927	_cgo_sys_thread_start+0xac [leak-reproducer]
		0x00000000008eee53	x_cgo_thread_start+0x7a [leak-reproducer]
		0x000000000049ceca	runtime.asmcgocall.abi0+0xaa [leak-reproducer]
		0x0000000000462e73	runtime.newm1+0x93 [leak-reproducer]
		0x0000000000462d7c	runtime.newm+0xfc [leak-reproducer]
		0x000000000046166a	runtime.startTheWorldWithSema+0x12a [leak-reproducer]
		0x0000000000434557	runtime.gcStart.func4+0x37 [leak-reproducer]
		0x000000000049b60a	runtime.systemstack.abi0+0x4a [leak-reproducer]
		0x00000000004343df	runtime.gcStart+0x53f [leak-reproducer]
		0x0000000000429814	runtime.mallocgcLarge+0x154 [leak-reproducer]
		0x0000000000493dd6	runtime.mallocgc+0x116 [leak-reproducer]
		0x000000000047b397	runtime.slicebytetostring+0x77 [leak-reproducer]
		0x00000000007c14d9	github.com/tree-sitter/go-tree-sitter.readUTF8+0x159 [leak-reproducer]
		0x00000000007c2d2c	_cgoexp_ee3fac1ee088_readUTF8+0x6c [leak-reproducer]
		0x000000000041dc08	runtime.cgocallbackg1+0x2a8 [leak-reproducer]
		0x000000000041d87e	runtime.cgocallbackg+0x11e [leak-reproducer]
		0x000000000049f629	runtime.cgocallbackg.abi0+0x29 [leak-reproducer]
		0x000000000049cfac	runtime.cgocallback.abi0+0xcc [leak-reproducer]
		0x00000000005c90e1	crosscall2+0x41 [leak-reproducer]
		0x00007ffd05d93430	[unknown]
		0x00000000008bf21e	readUTF8+0xaa [leak-reproducer]
		0x00000000008c73da	ts_lexer__get_chunk+0x51 [leak-reproducer]
		0x00000000008c80a7	ts_lexer_start+0x89 [leak-reproducer]
		0x00000000008cda78	ts_parser__lex+0x551 [leak-reproducer]
		0x00000000008d16d8	ts_parser__advance+0x144 [leak-reproducer]
		0x00000000008d351b	ts_parser_parse+0x616 [leak-reproducer]
		0x00000000008d3956	ts_parser_parse_with_options+0x78 [leak-reproducer]
		0x00000000008c18f9	_cgo_ee3fac1ee088_Cfunc_ts_parser_parse_with_options+0x78 [leak-reproducer]
		0x000000000049ce84	runtime.asmcgocall.abi0+0x64 [leak-reproducer]
		0x000000000049315f	runtime.cgocall+0x7f [leak-reproducer]
		0x00000000007bfd0a	github.com/tree-sitter/go-tree-sitter._Cfunc_ts_parser_parse_with_options.abi0+0x4a [leak-reproducer]
		0x00000000007c1d48	github.com/tree-sitter/go-tree-sitter.(*Parser).ParseWithOptions.func2+0x248 [leak-reproducer]
		0x00000000007c1a4e	github.com/tree-sitter/go-tree-sitter.(*Parser).ParseWithOptions+0x38e [leak-reproducer]
		0x00000000007c1232	github.com/tree-sitter/go-tree-sitter.(*Parser).Parse+0xb2 [leak-reproducer]
		0x00000000007c3e6a	main.parseFile+0x48a [leak-reproducer]
		0x00000000007c385e	main.main+0x3be [leak-reproducer]
		0x000000000045e587	runtime.main+0x267 [leak-reproducer]
		0x000000000049d1e1	runtime.goexit.abi0+0x1 [leak-reproducer]
	8392704 bytes in 1 allocations from stack
		0x000070441249d537	__pthread_create_2_1+0x977 [libc.so.6]
		0x00000000008ee733	_cgo_try_pthread_create+0x3d [leak-reproducer]
		0x00000000008ee927	_cgo_sys_thread_start+0xac [leak-reproducer]
		0x00000000008eee53	x_cgo_thread_start+0x7a [leak-reproducer]
		0x000000000049ceca	runtime.asmcgocall.abi0+0xaa [leak-reproducer]
		0x0000000000462e73	runtime.newm1+0x93 [leak-reproducer]
		0x0000000000462d7c	runtime.newm+0xfc [leak-reproducer]
		0x000000000046330b	runtime.startm+0x16b [leak-reproducer]
		0x0000000000495fc7	runtime.wakep+0x87 [leak-reproducer]
		0x000000000046174a	runtime.startTheWorldWithSema+0x20a [leak-reproducer]
		0x0000000000434557	runtime.gcStart.func4+0x37 [leak-reproducer]
		0x000000000049b60a	runtime.systemstack.abi0+0x4a [leak-reproducer]
		0x00000000004343df	runtime.gcStart+0x53f [leak-reproducer]
		0x0000000000429814	runtime.mallocgcLarge+0x154 [leak-reproducer]
		0x0000000000493dd6	runtime.mallocgc+0x116 [leak-reproducer]
		0x000000000047b397	runtime.slicebytetostring+0x77 [leak-reproducer]
		0x00000000007c14d9	github.com/tree-sitter/go-tree-sitter.readUTF8+0x159 [leak-reproducer]
		0x00000000007c2d2c	_cgoexp_ee3fac1ee088_readUTF8+0x6c [leak-reproducer]
		0x000000000041dc08	runtime.cgocallbackg1+0x2a8 [leak-reproducer]
		0x000000000041d87e	runtime.cgocallbackg+0x11e [leak-reproducer]
		0x000000000049f629	runtime.cgocallbackg.abi0+0x29 [leak-reproducer]
		0x000000000049cfac	runtime.cgocallback.abi0+0xcc [leak-reproducer]
		0x00000000005c90e1	crosscall2+0x41 [leak-reproducer]
		0x00007ffd05d93430	[unknown]
		0x00000000008bf21e	readUTF8+0xaa [leak-reproducer]
		0x00000000008c73da	ts_lexer__get_chunk+0x51 [leak-reproducer]
		0x00000000008c80a7	ts_lexer_start+0x89 [leak-reproducer]
		0x00000000008cda78	ts_parser__lex+0x551 [leak-reproducer]
		0x00000000008d16d8	ts_parser__advance+0x144 [leak-reproducer]
		0x00000000008d351b	ts_parser_parse+0x616 [leak-reproducer]
		0x00000000008d3956	ts_parser_parse_with_options+0x78 [leak-reproducer]
		0x00000000008c18f9	_cgo_ee3fac1ee088_Cfunc_ts_parser_parse_with_options+0x78 [leak-reproducer]
		0x000000000049ce84	runtime.asmcgocall.abi0+0x64 [leak-reproducer]
		0x000000000049315f	runtime.cgocall+0x7f [leak-reproducer]
		0x00000000007bfd0a	github.com/tree-sitter/go-tree-sitter._Cfunc_ts_parser_parse_with_options.abi0+0x4a [leak-reproducer]
		0x00000000007c1d48	github.com/tree-sitter/go-tree-sitter.(*Parser).ParseWithOptions.func2+0x248 [leak-reproducer]
		0x00000000007c1a4e	github.com/tree-sitter/go-tree-sitter.(*Parser).ParseWithOptions+0x38e [leak-reproducer]
		0x00000000007c1232	github.com/tree-sitter/go-tree-sitter.(*Parser).Parse+0xb2 [leak-reproducer]
		0x00000000007c3e6a	main.parseFile+0x48a [leak-reproducer]
		0x00000000007c385e	main.main+0x3be [leak-reproducer]
		0x000000000045e587	runtime.main+0x267 [leak-reproducer]
		0x000000000049d1e1	runtime.goexit.abi0+0x1 [leak-reproducer]
	8392704 bytes in 1 allocations from stack
		0x000070441249d537	__pthread_create_2_1+0x977 [libc.so.6]
		0x00000000008ee733	_cgo_try_pthread_create+0x3d [leak-reproducer]
		0x00000000008ee927	_cgo_sys_thread_start+0xac [leak-reproducer]
		0x00000000008eee53	x_cgo_thread_start+0x7a [leak-reproducer]
		0x000000000049ceca	runtime.asmcgocall.abi0+0xaa [leak-reproducer]
		0x0000000000462e73	runtime.newm1+0x93 [leak-reproducer]
		0x0000000000462d7c	runtime.newm+0xfc [leak-reproducer]
		0x000000000046330b	runtime.startm+0x16b [leak-reproducer]
		0x0000000000463425	runtime.handoffp+0x45 [leak-reproducer]
		0x000000000046372a	runtime.stoplockedm+0x6a [leak-reproducer]
		0x000000000046579a	runtime.schedule+0x3a [leak-reproducer]
		0x0000000000465fb3	runtime.preemptPark+0xf3 [leak-reproducer]
		0x000000000047932b	runtime.newstack+0x3cb [leak-reproducer]
		0x000000000049b71d	runtime.morestack.abi0+0x7d [leak-reproducer]
	16785408 bytes in 2 allocations from stack
		0x000070441249d537	__pthread_create_2_1+0x977 [libc.so.6]
		0x00000000008ee733	_cgo_try_pthread_create+0x3d [leak-reproducer]
		0x00000000008ee927	_cgo_sys_thread_start+0xac [leak-reproducer]
		0x00000000008eee53	x_cgo_thread_start+0x7a [leak-reproducer]
		0x000000000049ceca	runtime.asmcgocall.abi0+0xaa [leak-reproducer]
		0x0000000000462e73	runtime.newm1+0x93 [leak-reproducer]
		0x0000000000462d7c	runtime.newm+0xfc [leak-reproducer]
		0x000000000046330b	runtime.startm+0x16b [leak-reproducer]
		0x0000000000495fc7	runtime.wakep+0x87 [leak-reproducer]
		0x000000000046547c	runtime.resetspinning+0x3c [leak-reproducer]
		0x00000000004658a5	runtime.schedule+0x145 [leak-reproducer]
		0x0000000000465c19	runtime.park_m+0x1d9 [leak-reproducer]
		0x000000000049b595	runtime.mcall+0x55 [leak-reproducer]
	134217728 bytes in 2 allocations from stack
		0x00007044124aa034	alloc_new_heap+0x84 [libc.so.6]
		0x00007044124aa599	arena_get2.part.0+0x299 [libc.so.6]
		0x00007044124aceb9	tcache_init.part.0+0xa9 [libc.so.6]
		0x00007044124ade52	__libc_free+0x102 [libc.so.6]
		0x00000000008ee9a4	threadentry+0x37 [leak-reproducer]
		0x000070441249caa4	start_thread+0x384 [libc.so.6]
		0x0000704412529c6c	__GI___clone3+0x2c [libc.so.6]
	268435456 bytes in 2 allocations from stack
		0x00007044124aa0dc	alloc_new_heap+0x12c [libc.so.6]
		0x00007044124aa599	arena_get2.part.0+0x299 [libc.so.6]
		0x00007044124aceb9	tcache_init.part.0+0xa9 [libc.so.6]
		0x00007044124ade52	__libc_free+0x102 [libc.so.6]
		0x00000000008ee9a4	threadentry+0x37 [leak-reproducer]
		0x000070441249caa4	start_thread+0x384 [libc.so.6]
		0x0000704412529c6c	__GI___clone3+0x2c [libc.so.6]
