fix: data race and nil deref reading S3 bucket config#331
Conversation
ProcessMediaForS3 read m.configs[userID].Bucket without holding m.mu while the configs map is written under the lock from SetS3Config/InitializeS3Client/ RemoveClient. That is a data race, and if the config was removed between the upload and the metadata build it nil-dereferenced and crashed. Reads now go through bucketFor, which takes the RLock and returns an empty string when there is no config. Tests: TestBucketFor (nil-safe, red->green) and TestBucketForConcurrent (guards the read under go test -race). go build/vet/test pass on host.
There was a problem hiding this comment.
Code Review
This pull request introduces a thread-safe bucketFor helper method and associated unit tests to prevent concurrent map read/write races when retrieving S3 bucket configurations. The reviewer suggests simplifying the implementation by reusing the existing thread-safe GetClient method instead of adding a new helper, and provides a code suggestion to adapt the concurrent tests accordingly.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| // bucketFor returns the configured bucket for userID, read under the lock so it | ||
| // cannot race with a concurrent reconfigure/removal (the configs map is written | ||
| // under m.mu elsewhere). Returns an empty string when there is no config instead | ||
| // of nil-dereferencing. | ||
| func (m *S3Manager) bucketFor(userID string) string { | ||
| m.mu.RLock() | ||
| defer m.mu.RUnlock() | ||
| if cfg, ok := m.configs[userID]; ok && cfg != nil { | ||
| return cfg.Bucket | ||
| } | ||
| return "" | ||
| } |
There was a problem hiding this comment.
There was a problem hiding this comment.
Agreed — done in 71d3be9: dropped the bucketFor helper and now read the bucket through the existing thread-safe GetClient (if _, config, ok := m.GetClient(userID); ok && config != nil). No new locking surface.
| func TestBucketFor(t *testing.T) { | ||
| m := &S3Manager{configs: map[string]*S3Config{}} | ||
|
|
||
| if got := m.bucketFor("nobody"); got != "" { | ||
| t.Errorf("no config: got %q; want empty string", got) | ||
| } | ||
|
|
||
| m.configs["u1"] = &S3Config{Bucket: "my-bucket"} | ||
| if got := m.bucketFor("u1"); got != "my-bucket" { | ||
| t.Errorf("got %q; want %q", got, "my-bucket") | ||
| } | ||
| } | ||
|
|
||
| // TestBucketForConcurrent runs bucketFor against concurrent writers. The unlocked | ||
| // map read this replaced raced with config updates: Go's runtime flags it with | ||
| // "concurrent map read and map write" (and the race detector confirms it). The | ||
| // locked read passes. | ||
| func TestBucketForConcurrent(t *testing.T) { | ||
| m := &S3Manager{configs: map[string]*S3Config{}} | ||
| var wg sync.WaitGroup | ||
| for i := 0; i < 100; i++ { | ||
| wg.Add(2) | ||
| go func() { | ||
| defer wg.Done() | ||
| _ = m.bucketFor("u1") | ||
| }() | ||
| go func() { | ||
| defer wg.Done() | ||
| m.mu.Lock() | ||
| m.configs["u1"] = &S3Config{Bucket: "b"} | ||
| m.mu.Unlock() | ||
| }() | ||
| } | ||
| wg.Wait() | ||
| } |
There was a problem hiding this comment.
Since bucketFor is removed in favor of GetClient, we can adapt the concurrent test to target GetClient directly. This ensures that the core configuration retrieval path remains thread-safe and free of data races.
func TestGetClientConcurrent(t *testing.T) {
m := &S3Manager{configs: map[string]*S3Config{}}
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
wg.Add(2)
go func() {
defer wg.Done()
_, _, _ = m.GetClient("u1")
}()
go func() {
defer wg.Done()
m.mu.Lock()
m.configs["u1"] = &S3Config{Bucket: "b"}
m.mu.Unlock()
}()
}
wg.Wait()
}There was a problem hiding this comment.
Done in 71d3be9: the concurrent test is now TestGetClientConcurrent, exercising GetClient directly under -race against a concurrent writer (verified on Linux via Docker).
Addresses Gemini review: instead of a new bucketFor helper, ProcessMediaForS3 now reads the config through the existing GetClient, which already takes the read lock and returns ok=false when there is no config. Drops the duplicated locking; the concurrent test now targets GetClient (TestGetClientConcurrent).
|
Thanks — applied. |
What & why
ProcessMediaForS3readm.configs[userID].Bucketwithout holdingm.mu, while that map is written under the lock fromSetS3Config/InitializeS3Client/RemoveClient. That is a data race, and if the config was removed between the upload and the metadata build, it nil-dereferenced and crashed the process.Change
The read now goes through
bucketFor, which takes theRLockand returns an empty string when there is no config (no nil deref). Mirrors the locking already used byGetClient.Testing
TestBucketFor— returns the bucket when configured, empty (never a nil panic) when not. Red→green: the old direct read nil-panics on a missing config.TestBucketForConcurrent— exercises the read against concurrent writers undergo test -race.go build ./...·go vet ./...·go test ./...— pass on host.go test -race ./...— pass on linux/amd64 (Dockergolang:1.25).No API change — internal stability fix, same class as #325.