
fix(grpc): sanitize invalid UTF-8 in sockaddr conversion #5325

Open
arpitjain099 wants to merge 1 commit into aquasecurity:main from arpitjain099:fix/grpc-invalid-utf8-args-5292

Conversation

@arpitjain099


1. Explain what the PR does

Fixes #5292.

getSockaddr was the one place left in the gRPC event-to-proto conversion that assigned raw map[string]string values straight into proto3 string fields without UTF-8 sanitization. A getsockname / sockaddr_t argument can carry non-UTF-8 bytes:

  • sun_path of an abstract AF_UNIX socket begins with a NUL byte and can hold arbitrary binary bytes (uninitialized tail of the 108-byte sun_path buffer, abstract namespace names, etc.).
  • sin_addr / sin6_addr can likewise contain raw bytes that are not valid UTF-8.

proto3 string fields must be valid UTF-8, so proto.Marshal rejected these values with "string field contains invalid UTF-8". That error propagated out of StreamEvents, killing the stream and disconnecting clients (grpcurl and others). The reporter confirmed via bisection that simply having getsockname in the event list triggered the crash.
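
To make the failure condition concrete, here is a minimal standard-library sketch of the validity check a proto3 string field implies. The helper name `isProtoSafe` and the sample values are illustrative assumptions, not Tracee code. Note that the leading NUL of an abstract socket name is itself valid UTF-8 (it encodes U+0000); it is the arbitrary binary bytes after it that can make the value invalid.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isProtoSafe is a hypothetical helper (not part of Tracee). It reports
// whether s is valid UTF-8, which is what a proto3 string field requires.
func isProtoSafe(s string) bool {
	return utf8.ValidString(s)
}

func main() {
	// Illustrative sun_path values; the byte sequences are made up.
	samples := []struct {
		name  string
		value string
	}{
		{"filesystem path", "/tmp/socket"},
		{"abstract, text name", "\x00my-socket"},
		{"abstract, binary name", "\x00\xff\xfe\x80"},
	}
	for _, s := range samples {
		fmt.Printf("%-22s %q valid=%v\n", s.name, s.value, isProtoSafe(s.value))
	}
}
```

Only the third sample is reported as invalid, which is the kind of value that would make marshaling fail.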

The earlier UTF-8 hardening (#4894, #4916) sanitized nearly every other string path in this file but missed getSockaddr. This change closes that gap by routing the three affected fields through the existing sanitizeStringForProtobuf helper, matching the convention already used everywhere else in event_data.go.

2. Explain how to test it

Unit test added in pkg/server/grpc/event_data_test.go (Test_getSockaddr_InvalidUTF8). It feeds AF_UNIX, AF_INET, and AF_INET6 sockaddr arguments whose address/path fields contain invalid UTF-8 bytes through getEventData, then calls proto.Marshal on the resulting pb.Event.

Before this change the marshal fails:

proto: ...: string field contains invalid UTF-8

After this change, proto.Marshal succeeds, the sanitized fields are valid UTF-8, and a valid sun_path like /tmp/socket is returned byte-for-byte unchanged.

make test-unit PKG=pkg/server/grpc TEST=Test_getSockaddr_InvalidUTF8
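
The shape of that test can be approximated with the standard library alone. The sketch below is not the test from this PR: it replaces proto.Marshal with a plain UTF-8 validity check, uses `strings.ToValidUTF8` as a stand-in for sanitizeStringForProtobuf, and assumes the sockaddr map uses the keys `sa_family`, `sun_path`, `sin_addr`, and `sin6_addr`. The names `sanitize`, `sanitizeSockaddr`, and `allValid` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// sanitize is a hypothetical stand-in for Tracee's sanitizeStringForProtobuf.
// It drops invalid byte sequences and leaves valid UTF-8 untouched.
func sanitize(s string) string {
	return strings.ToValidUTF8(s, "")
}

// sanitizeSockaddr returns a copy of m with the three address/path fields
// sanitized. The key names are assumptions made for this sketch.
func sanitizeSockaddr(m map[string]string) map[string]string {
	out := make(map[string]string, len(m))
	for k, v := range m {
		switch k {
		case "sun_path", "sin_addr", "sin6_addr":
			out[k] = sanitize(v)
		default:
			out[k] = v
		}
	}
	return out
}

// allValid reports whether every value in m is valid UTF-8.
func allValid(m map[string]string) bool {
	for _, v := range m {
		if !utf8.ValidString(v) {
			return false
		}
	}
	return true
}

func main() {
	// Made-up inputs, one per address family, plus one valid path.
	cases := []map[string]string{
		{"sa_family": "AF_UNIX", "sun_path": "\x00\xff\xfesock"},
		{"sa_family": "AF_INET", "sin_addr": "127.0.0.\xff"},
		{"sa_family": "AF_INET6", "sin6_addr": "::\xc0\xaf1"},
		{"sa_family": "AF_UNIX", "sun_path": "/tmp/socket"},
	}
	for _, c := range cases {
		got := sanitizeSockaddr(c)
		fmt.Printf("%-8s valid before=%v after=%v\n", c["sa_family"], allValid(c), allValid(got))
	}
}
```

The two properties checked here mirror the ones described above: every sanitized field is valid UTF-8, and an already valid path comes back unchanged.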

3. Other comments

The sanitization is lossy only for already-malformed input: invalid byte sequences are dropped (the same behavior sanitizeStringForProtobuf already applies to every other string field), while valid UTF-8 is preserved exactly. An alternative would be to expose the raw bytes through a dedicated bytes field on SockAddr, but that is an API/proto change; string sanitization keeps the wire format and existing consumers unchanged and is consistent with the rest of this file.

getSockaddr assigned raw map[string]string values straight into the
proto3 string fields sun_path, sin_addr and sin6_addr without UTF-8
sanitization. A getsockname / sockaddr_t argument can carry non-UTF-8
bytes (for example an abstract AF_UNIX sun_path that begins with a NUL
byte and holds binary data, or address fields with raw bytes). proto3
string fields require valid UTF-8, so proto.Marshal failed with
"string field contains invalid UTF-8", which killed the StreamEvents
gRPC stream and disconnected clients.

The earlier UTF-8 hardening covered the other string paths in this file
but missed getSockaddr. Route the three affected fields through the
existing sanitizeStringForProtobuf helper so the message marshals
cleanly. Valid UTF-8 values are returned unchanged.

Add Test_getSockaddr_InvalidUTF8 feeding AF_UNIX, AF_INET and AF_INET6
sockaddr arguments with invalid UTF-8 bytes through the conversion and
asserting proto.Marshal succeeds while a valid sun_path is preserved.

Fixes aquasecurity#5292

Signed-off-by: Arpit Jain <arpitjain099@gmail.com>
@arpitjain099 arpitjain099 requested review from a team and geyslan June 5, 2026 05:33
@geyslan

geyslan commented Jun 16, 2026

Member

@arpitjain099 thank you for contributing. @trvll do you mind reviewing this? Cheers.



Development

Successfully merging this pull request may close these issues.

v0.24.1: Tracee gRPC server tries to marshal invalid UTF-8 in strings
