Description
Summary
cachemulti.Store passes traceContext map by reference in newCacheMultiStoreFromCMS(), causing fatal error: concurrent map iteration and map write when multiple goroutines concurrently create and configure cache stores with tracing enabled.
This is the same class of bug that was fixed for rootmulti.Store in #11114 / #11117, but the fix was not applied to cachemulti.Store.
Stack Trace
fatal error: concurrent map iteration and map write

goroutine X [running]:
internal/runtime/maps.(*Iter).Next(...)
maps.Copy[...](...)
cosmossdk.io/store/types.TraceContext.Clone(...)
	cosmossdk.io/store/types/store.go:464
cosmossdk.io/store/cachemulti.NewFromKVStore(...)
	cosmossdk.io/store/cachemulti/store.go:55
cosmossdk.io/store/cachemulti.newCacheMultiStoreFromCMS(...)
	cosmossdk.io/store/cachemulti/store.go:82
cosmossdk.io/store/cachemulti.Store.CacheMultiStore(...)
	cosmossdk.io/store/cachemulti/store.go:141
Root Cause
PR #11117 added traceContextMutex to rootmulti.Store and introduced getTracingContext() which returns a copy of the trace context. This fixed the race at the rootmulti level.
However, cachemulti.Store was not updated with the same protection. Specifically, newCacheMultiStoreFromCMS() passes cms.traceContext by reference to NewFromKVStore():
// store/cachemulti/store.go:76-83
func newCacheMultiStoreFromCMS(cms Store) Store {
	stores := make(map[types.StoreKey]types.CacheWrapper)
	for k, v := range cms.stores {
		stores[k] = v
	}
	// cms.traceContext is passed as a shared reference, not a copy.
	return NewFromKVStore(cms.db, stores, nil, cms.traceWriter, cms.traceContext)
}

When multiple goroutines share a parent CacheMultiStore:
- Read path: CacheMultiStore() → NewFromKVStore() → Clone() → maps.Copy(ret, tc) iterates the shared map
- Write path: SetTracingContext() → maps.Copy(cms.traceContext, tc) writes to the same shared map
Concurrent execution of these two paths triggers Go's fatal concurrent map access detection.
rootmulti.Store
    │
    │  getTracingContext() → creates copy A (with mutex, safe)
    │
    ▼
cachemulti.Store (parent)
    │  traceContext = copy A
    │
    │  Multiple goroutines call CacheMultiStore() on the SAME parent:
    │
    │  goroutine 1: newCacheMultiStoreFromCMS()
    │      → NewFromKVStore(..., copy A)   ← passes reference directly
    │      → copy A.Clone()                ← READ (maps.Copy iterates copy A)
    │
    │  goroutine 2: newCacheMultiStoreFromCMS()
    │      → NewFromKVStore(..., copy A)   ← same reference
    │      → copy A.Clone()                ← READ
    │
    │  goroutine 3: SetTracingContext({"txHash":...})
    │      → maps.Copy(copy A, tc)         ← WRITE to the same map
    │
    │  goroutine 1 READ + goroutine 3 WRITE = concurrent map iteration and map write
    │
    ▼
FATAL ERROR (thrown by the Go runtime; cannot be recovered, the process exits)
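The aliasing itself can be seen without any concurrency. The following is a minimal sketch, not code from cosmossdk.io/store: `TraceContext` is a local stand-in assumed to be a plain map type, and `store`, `childShared`, `childCloned`, and `setTracingContext` are hypothetical names that only mimic the behavior described above.

```go
package main

import (
	"fmt"
	"maps"
)

// TraceContext is a local stand-in for types.TraceContext, assumed here to be
// a plain map type. It is not imported from cosmossdk.io/store.
type TraceContext map[string]any

// Clone returns an independent copy, mirroring the maps.Copy(ret, tc) call
// that appears in the stack trace.
func (tc TraceContext) Clone() TraceContext {
	ret := TraceContext{}
	maps.Copy(ret, tc)
	return ret
}

// store is a hypothetical, stripped-down cache store that holds only a
// trace context.
type store struct {
	traceContext TraceContext
}

// childShared mimics the reported behavior: the child keeps the parent's map.
func childShared(parent store) store {
	return store{traceContext: parent.traceContext}
}

// childCloned mimics the suggested fix: the child gets its own copy.
func childCloned(parent store) store {
	return store{traceContext: parent.traceContext.Clone()}
}

// setTracingContext merges tc into the store's existing map.
func (s store) setTracingContext(tc TraceContext) {
	maps.Copy(s.traceContext, tc)
}

func main() {
	// Shared map: a write through the child is visible in the parent.
	p1 := store{traceContext: TraceContext{"initial": "context"}}
	childShared(p1).setTracingContext(TraceContext{"txHash": "TX_1"})
	fmt.Println(len(p1.traceContext)) // prints 2

	// Cloned map: the parent is untouched by the child's write.
	p2 := store{traceContext: TraceContext{"initial": "context"}}
	childCloned(p2).setTracingContext(TraceContext{"txHash": "TX_1"})
	fmt.Println(len(p2.traceContext)) // prints 1
}
```

In the shared case, every child write lands in the one map that other goroutines are iterating, which is the condition the Go runtime aborts on.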
Reproduction
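// Placed in the cachemulti package so that NewStore resolves unqualified.
// Import paths below are assumed for cosmossdk.io/store v1.1.2.
package cachemulti

import (
	"bytes"
	"fmt"
	"sync"
	"testing"

	dbm "github.com/cosmos/cosmos-db"

	"cosmossdk.io/store/dbadapter"
	"cosmossdk.io/store/types"
)

func TestConcurrentCacheMultiStoreTraceContext(t *testing.T) {
	db := dbm.NewMemDB()
	stores := map[types.StoreKey]types.CacheWrapper{
		types.NewKVStoreKey("store1"): dbadapter.Store{DB: dbm.NewMemDB()},
	}
	// A non-nil trace writer enables tracing on the parent store.
	store := NewStore(db, stores, nil, &bytes.Buffer{}, types.TraceContext{"initial": "context"})

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine branches from the same parent, then writes a trace context.
			child := store.CacheMultiStore()
			child.SetTracingContext(types.TraceContext{
				"txHash": fmt.Sprintf("TX_%d", i),
			})
		}(i)
	}
	wg.Wait()
}

Run from the store/ directory:

cd store && go test ./cachemulti/... -run TestConcurrentCacheMultiStoreTraceContext

This reliably reproduces fatal error: concurrent map iteration and map write.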
Suggested Fix
Copy traceContext in newCacheMultiStoreFromCMS before passing it to NewFromKVStore, consistent with the pattern established in rootmulti.Store.getTracingContext() by #11117:
func newCacheMultiStoreFromCMS(cms Store) Store {
	stores := make(map[types.StoreKey]types.CacheWrapper)
	for k, v := range cms.stores {
		stores[k] = v
	}
	// Clone traceContext to prevent concurrent map access
	var tc types.TraceContext
	if cms.traceContext != nil {
		tc = cms.traceContext.Clone()
	}
	return NewFromKVStore(cms.db, stores, nil, cms.traceWriter, tc)
}

Version
cosmossdk.io/store v1.1.2