Skip to content

feat: add recover to goroutine to prevent unexpected panics#226

Merged
wu-sheng merged 2 commits into
apache:mainfrom
youngxhui:main
Jun 29, 2025
Merged

feat: add recover to goroutine to prevent unexpected panics#226
wu-sheng merged 2 commits into
apache:mainfrom
youngxhui:main

Conversation

@youngxhui

Copy link
Copy Markdown
Contributor

The online service experienced a panic after integrating SkyWalking.

SkyWalking’s panic should not affect the normal operation of the service.
Each goroutine should be guarded with a recover function.
Currently, many goroutines in the codebase are missing recover protection.

panic log

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x48333b]

goroutine 23 [running]:
google.golang.org/protobuf/encoding/protowire.AppendString(...)
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/encoding/protowire/wire.go:455
google.golang.org/protobuf/internal/impl.appendStringNoZeroValidateUTF8({0xc00338a000?, 0x10a?, 0x1e7?}, {0xc00026a978?}, 0x42d52b?, {0x2?})
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/internal/impl/codec_gen.go:5113 +0x95
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshalAppendPointer(0xc000165708, {0xc00338a000?, 0xa90705?, 0x1070293?}, {0xaa53ac?}, {0x8?})
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/internal/impl/encode.go:139 +0x137
google.golang.org/protobuf/internal/impl.appendMessageSliceInfo({0xc00338a000?, 0x0?, 0x3a00?}, {0x2964e20?}, 0xc0000c5de0, {0x20?})
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/internal/impl/codec_field.go:485 +0xee
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshalAppendPointer(0xc000022d18, {0xc00338a000?, 0xa90705?, 0x1070322?}, {0x1072000?}, {0x0?})
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/internal/impl/encode.go:139 +0x137
google.golang.org/protobuf/internal/impl.appendMessageSliceInfo({0xc00338a000?, 0xc00026aa60?, 0x419d45?}, {0xc000072808?}, 0xc0000b8118, {0x0?})
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/internal/impl/codec_field.go:485 +0xee
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshalAppendPointer(0xc000022a88, {0xc00338a000?, 0xc000072808?, 0xc00338a000?}, {0xc00026aaf0?}, {0xe5?})
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/internal/impl/encode.go:139 +0x137
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshal(0xc001ed2a01?, {{}, {0x1b4baf0, 0xc001081a70}, {0xc00338a000, 0x0, 0x10703c9}, 0x2})
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/internal/impl/encode.go:107 +0x85
google.golang.org/protobuf/proto.MarshalOptions.marshal({{}, 0x0?, 0xac?, 0x26?}, {0x0, 0x0, 0x0}, {0x1b4baf0, 0xc001081a70})
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/proto/encode.go:166 +0x24e
google.golang.org/protobuf/proto.MarshalOptions.MarshalAppend({{}, 0xe0?, 0xd4?, 0x80?}, {0x0, 0x0, 0x0}, {0x1b2da60?, 0xc001081a70?})
        /root/go/pkg/mod/google.golang.org/protobuf@v1.33.0/proto/encode.go:125 +0x73
github.com/golang/protobuf/proto.marshalAppend({0x0, 0x0, 0x0}, {0x7f693418b118?, 0xc001081a70?}, 0x0)
        /root/go/pkg/mod/github.com/golang/protobuf@v1.5.3/proto/wire.go:40 +0x92
github.com/golang/protobuf/proto.Marshal(...)
        /root/go/pkg/mod/github.com/golang/protobuf@v1.5.3/proto/wire.go:23
google.golang.org/grpc/encoding/proto.codec.Marshal({}, {0x180d4e0?, 0xc001081a70?})
        /root/go/pkg/mod/google.golang.org/grpc@v1.57.0/encoding/proto/proto.go:45 +0x52
google.golang.org/grpc.encode({0x7f69342c6178?, 0x297be40?}, {0x180d4e0?, 0xc001081a70?})
        /root/go/pkg/mod/google.golang.org/grpc@v1.57.0/rpc_util.go:633 +0x3e
google.golang.org/grpc.prepareMsg({0x180d4e0?, 0xc001081a70?}, {0x7f69342c6178?, 0x297be40?}, {0x0, 0x0}, {0x0, 0x0})
        /root/go/pkg/mod/google.golang.org/grpc@v1.57.0/stream.go:1766 +0xc7
google.golang.org/grpc.(*clientStream).SendMsg(0xc000170000, {0x180d4e0, 0xc001081a70})
        /root/go/pkg/mod/google.golang.org/grpc@v1.57.0/stream.go:882 +0xd7
skywalking.apache.org/repo/goapi/collect/language/agent/v3.(*traceSegmentReportServiceCollectClient).Send(0xc00004cad0?, 0xc000e70e10?)
        /root/go/pkg/mod/skywalking.apache.org/repo/goapi@v0.0.0-20230314034821-0c5a44bb767a/collect/language/agent/v3/Tracing_grpc.pb.go:61 +0x25
github.com/apache/skywalking-go/agent/reporter.(*gRPCReporter).initSendPipeline.func1()
        grpc.go:322 +0x1d1
created by github.com/apache/skywalking-go/agent/reporter.(*gRPCReporter).initSendPipeline in goroutine 1
        grpc.go:304 +0x5f

@wu-sheng

Copy link
Copy Markdown
Member

Could you explain why panic in this case? If there is any code bug, we should fic them rather than recovering.
Because it could cause CPU overload, and happen again, which makes your app overloaded too.

@youngxhui

youngxhui commented Jun 29, 2025

Copy link
Copy Markdown
Contributor Author

I haven’t identified the root cause yet. Many services in our production environment are integrated with SkyWalking, but no similar issues have been observed in other services. Just before the service panic, a large number of error logs were printed:

2025-06-29T00:00:36.914+0800 ERROR tracer.go:191 reach max tracing send buffer {"SW_CTX": "[sever-name,instance-name,N/A,N/A,-1]"}

@wu-sheng

Copy link
Copy Markdown
Member

We should merge this directly. Instead, we should try to analyze why.

Because, if it is a predicted case, we should process it properly.

@youngxhui

Copy link
Copy Markdown
Contributor Author

Currently, I'm unable to reproduce the issue with a demo. Since we cannot guarantee the safety of goroutines, every goroutine must have a recover .

@wu-sheng wu-sheng requested a review from mrproliu June 29, 2025 07:54
@wu-sheng wu-sheng added this to the 0.7.0 milestone Jun 29, 2025
@mrproliu

Copy link
Copy Markdown
Contributor

Please update the CHANGES.md file.

@wu-sheng wu-sheng added the enhancement New feature or request label Jun 29, 2025
@youngxhui

Copy link
Copy Markdown
Contributor Author

Please update the CHANGES.md file.

The documentation has been updated.

@wu-sheng wu-sheng merged commit 04d4d36 into apache:main Jun 29, 2025
44 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants