Open
Description
What happened?
Test TestExecOkRestartThenExit
failed as part of https://github.com/hyperledger/firefly/actions/runs/14037665734/job/39299605862?pr=1669
2025-03-24T14:23:10.7376736Z [2025-03-24T14:23:08.985Z] INFO API server complete pid=16830
2025-03-24T14:23:10.7376860Z fatal error: concurrent map read and map write
2025-03-24T14:23:10.7376869Z
2025-03-24T14:23:10.7376960Z goroutine 13 [running]:
2025-03-24T14:23:10.7377241Z github.com/spf13/viper.(*Viper).searchMap(0xc000523880, 0xc0008272c0?, {0xc0008272c0, 0x3, 0x3})
2025-03-24T14:23:10.7377524Z /home/runner/go/pkg/mod/github.com/spf13/[email protected]/viper.go:669 +0x53
2025-03-24T14:23:10.7377740Z github.com/spf13/viper.(*Viper).find(0xc000523880, {0xc0005b2138, 0x13}, 0x1)
2025-03-24T14:23:10.7377985Z /home/runner/go/pkg/mod/github.com/spf13/[email protected]/viper.go:1400 +0xb55
2025-03-24T14:23:10.7378205Z github.com/spf13/viper.(*Viper).Get(0xc000523880, {0xc0005b2138?, 0xc000195c88?})
2025-03-24T14:23:10.7378438Z /home/runner/go/pkg/mod/github.com/spf13/[email protected]/viper.go:904 +0x49
2025-03-24T14:23:10.7378653Z github.com/spf13/viper.(*Viper).GetBool(0x1177863?, {0xc0005b2138?, 0x13da430?})
2025-03-24T14:23:10.7378897Z /home/runner/go/pkg/mod/github.com/spf13/[email protected]/viper.go:985 +0x18
2025-03-24T14:23:10.7379004Z github.com/spf13/viper.GetBool(...)
2025-03-24T14:23:10.7379228Z /home/runner/go/pkg/mod/github.com/spf13/[email protected]/viper.go:982
2025-03-24T14:23:10.7379581Z github.com/hyperledger/firefly-common/pkg/config.(*configSection).GetBool(0x1b29700?, {0x1177871?, 0x0?})
2025-03-24T14:23:10.7379939Z /home/runner/go/pkg/mod/github.com/hyperledger/[email protected]/pkg/config/config.go:493 +0xbe
2025-03-24T14:23:10.7380573Z github.com/hyperledger/firefly-common/pkg/httpserver.NewHTTPServer({0x13ea008, 0xc0002aca00}, {0x1177863, 0x7}, 0xc000000780, 0xc0002c89c0, {0x13f55f8, 0x1b29700}, {0x13f55f8, 0x1b29780}, ...)
2025-03-24T14:23:10.7380962Z /home/runner/go/pkg/mod/github.com/hyperledger/[email protected]/pkg/httpserver/httpserver.go:74 +0x89
2025-03-24T14:23:10.7381538Z github.com/hyperledger/firefly/internal/apiserver.(*apiServer).Serve(0xc0005e80c0, {0x13ea008, 0xc0002aca00}, {0x13f0f10, 0xc000500320})
2025-03-24T14:23:10.7381783Z /home/runner/work/firefly/firefly/internal/apiserver/server.go:126 +0x5b4
2025-03-24T14:23:10.7382456Z github.com/hyperledger/firefly/cmd.startFirefly({0x13ea008, 0xc0002aca00}, 0xc0001df6c0, {0x13f0f10, 0xc000500320}, {0x13e3060, 0xc0005e80c0}, 0xc0005e8120, 0xc0002ba1c0, 0xc0002c8540)
2025-03-24T14:23:10.7382645Z /home/runner/work/firefly/firefly/cmd/firefly.go:225 +0x5f2
2025-03-24T14:23:10.7382834Z created by github.com/hyperledger/firefly/cmd.run in goroutine 12
2025-03-24T14:23:10.7383013Z /home/runner/work/firefly/firefly/cmd/firefly.go:155 +0x79e
2025-03-24T14:23:10.7383076Z
2025-03-24T14:23:10.7383178Z goroutine 1 [chan receive]:
2025-03-24T14:23:10.7383325Z testing.(*T).Run(0xc000210b60, {0x11912bf?, 0x0?}, 0x12c4a80)
2025-03-24T14:23:10.7383557Z /opt/hostedtoolcache/go/1.22.12/x64/src/testing/testing.go:1750 +0x3ab
2025-03-24T14:23:10.7383662Z testing.runTests.func1(0xc000210b60)
2025-03-24T14:23:10.7383941Z /opt/hostedtoolcache/go/1.22.12/x64/src/testing/testing.go:2161 +0x37
2025-03-24T14:23:10.7384058Z testing.tRunner(0xc000210b60, 0xc00051fc60)
2025-03-24T14:23:10.7384274Z /opt/hostedtoolcache/go/1.22.12/x64/src/testing/testing.go:1689 +0xfb
2025-03-24T14:23:10.7384541Z testing.runTests(0xc0001e9e30, {0x1acd100, 0x10, 0x10}, {0xc00051fca0?, 0x1018?, 0x1b2a400?})
2025-03-24T14:23:10.7384758Z /opt/hostedtoolcache/go/1.22.12/x64/src/testing/testing.go:2159 +0x445
2025-03-24T14:23:10.7384862Z testing.(*M).Run(0xc000590140)
2025-03-24T14:23:10.7385073Z /opt/hostedtoolcache/go/1.22.12/x64/src/testing/testing.go:2027 +0x68b
2025-03-24T14:23:10.7385154Z main.main()
2025-03-24T14:23:10.7385259Z _testmain.go:111 +0x193
2025-03-24T14:23:10.7385320Z
2025-03-24T14:23:10.7385424Z goroutine 15 [sync.Mutex.Lock]:
2025-03-24T14:23:10.7385606Z sync.runtime_SemacquireMutex(0xc000a47b80?, 0xe8?, 0xc000db4e80?)
2025-03-24T14:23:10.7385807Z /opt/hostedtoolcache/go/1.22.12/x64/src/runtime/sema.go:77 +0x25
2025-03-24T14:23:10.7385905Z sync.(*Mutex).lockSlow(0x1b8c2b0)
2025-03-24T14:23:10.7386101Z /opt/hostedtoolcache/go/1.22.12/x64/src/sync/mutex.go:171 +0x15d
2025-03-24T14:23:10.7386184Z sync.(*Mutex).Lock(...)
2025-03-24T14:23:10.7386355Z /opt/hostedtoolcache/go/1.22.12/x64/src/sync/mutex.go:90
2025-03-24T14:23:10.7386810Z github.com/hyperledger/firefly-common/pkg/config.(*configSection).AddKnownKey(0xc0006eb9b0, {0x1181466, 0xe}, {0xc000db4e70, 0x1, 0x1})
2025-03-24T14:23:10.7387178Z /home/runner/go/pkg/mod/github.com/hyperledger/[email protected]/pkg/config/config.go:413 +0x196
2025-03-24T14:23:10.7387591Z github.com/hyperledger/firefly/internal/events/websockets.(*WebSockets).InitConfig(0x1b299c0?, {0x13f55f8, 0xc0006eb9b0})
2025-03-24T14:23:10.7387875Z /home/runner/work/firefly/firefly/internal/events/websockets/config.go:33 +0x68
2025-03-24T14:23:10.7388180Z github.com/hyperledger/firefly/internal/events/eifactory.InitConfig({0x13f55f8, 0x1b299c0})
2025-03-24T14:23:10.7388451Z /home/runner/work/firefly/firefly/internal/events/eifactory/factory.go:47 +0xb9
2025-03-24T14:23:10.7388638Z github.com/hyperledger/firefly/internal/namespace.InitConfig()
2025-03-24T14:23:10.7388878Z /home/runner/work/firefly/firefly/internal/namespace/config.go:87 +0x3b1
2025-03-24T14:23:10.7389016Z github.com/hyperledger/firefly/cmd.resetConfig()
2025-03-24T14:23:10.7389192Z /home/runner/work/firefly/firefly/cmd/firefly.go:87 +0x76
2025-03-24T14:23:10.7389333Z github.com/hyperledger/firefly/cmd.reloadConfig()
2025-03-24T14:23:10.7389503Z /home/runner/work/firefly/firefly/cmd/firefly.go:92 +0x4e
2025-03-24T14:23:10.7389618Z github.com/hyperledger/firefly/cmd.run()
2025-03-24T14:23:10.7389803Z /home/runner/work/firefly/firefly/cmd/firefly.go:177 +0x90a
2025-03-24T14:23:10.7390041Z github.com/hyperledger/firefly/cmd.TestExecOkRestartThenExit(0xc000210340)
2025-03-24T14:23:10.7390242Z /home/runner/work/firefly/firefly/cmd/firefly_test.go:130 +0x22a
2025-03-24T14:23:10.7390349Z testing.tRunner(0xc000210340, 0x12c4a80)
2025-03-24T14:23:10.7390568Z /opt/hostedtoolcache/go/1.22.12/x64/src/testing/testing.go:1689 +0xfb
2025-03-24T14:23:10.7390686Z created by testing.(*T).Run in goroutine 1
2025-03-24T14:23:10.7390956Z /opt/hostedtoolcache/go/1.22.12/x64/src/testing/testing.go:1742 +0x390
2025-03-24T14:23:10.7390962Z
2025-03-24T14:23:10.7391053Z goroutine 10 [syscall]:
2025-03-24T14:23:10.7391140Z os/signal.signal_recv()
2025-03-24T14:23:10.7391617Z /opt/hostedtoolcache/go/1.22.12/x64/src/runtime/sigqueue.go:152 +0x29
2025-03-24T14:23:10.7391713Z os/signal.loop()
2025-03-24T14:23:10.7391951Z /opt/hostedtoolcache/go/1.22.12/x64/src/os/signal/signal_unix.go:23 +0x13
2025-03-24T14:23:10.7392175Z created by os/signal.Notify.func1.1 in goroutine 23
2025-03-24T14:23:10.7392393Z /opt/hostedtoolcache/go/1.22.12/x64/src/os/signal/signal.go:151 +0x1f
2025-03-24T14:23:10.7392398Z
2025-03-24T14:23:10.7392487Z goroutine 14 [IO wait]:
2025-03-24T14:23:10.7392632Z internal/poll.runtime_pollWait(0x7f4f2c4925d8, 0x72)
2025-03-24T14:23:10.7392906Z /opt/hostedtoolcache/go/1.22.12/x64/src/runtime/netpoll.go:345 +0x85
2025-03-24T14:23:10.7393034Z internal/poll.(*pollDesc).wait(0x7?, 0x10?, 0x0)
2025-03-24T14:23:10.7393300Z /opt/hostedtoolcache/go/1.22.12/x64/src/internal/poll/fd_poll_runtime.go:84 +0x27
2025-03-24T14:23:10.7393408Z internal/poll.(*pollDesc).waitRead(...)
2025-03-24T14:23:10.7393648Z /opt/hostedtoolcache/go/1.22.12/x64/src/internal/poll/fd_poll_runtime.go:89
2025-03-24T14:23:10.7393751Z internal/poll.(*FD).Accept(0xc000552000)
2025-03-24T14:23:10.7393983Z /opt/hostedtoolcache/go/1.22.12/x64/src/internal/poll/fd_unix.go:611 +0x2ac
2025-03-24T14:23:10.7394079Z net.(*netFD).accept(0xc000552000)
2025-03-24T14:23:10.7394282Z /opt/hostedtoolcache/go/1.22.12/x64/src/net/fd_unix.go:172 +0x29
2025-03-24T14:23:10.7394442Z net.(*TCPListener).accept(0xc00033e020)
2025-03-24T14:23:10.7394665Z /opt/hostedtoolcache/go/1.22.12/x64/src/net/tcpsock_posix.go:159 +0x1e
2025-03-24T14:23:10.7394763Z net.(*TCPListener).Accept(0xc00033e020)
2025-03-24T14:23:10.7394963Z /opt/hostedtoolcache/go/1.22.12/x64/src/net/tcpsock.go:327 +0x30
2025-03-24T14:23:10.7395125Z net/http.(*Server).Serve(0xc0005f8000, {0x13e8550, 0xc00033e020})
2025-03-24T14:23:10.7395345Z /opt/hostedtoolcache/go/1.22.12/x64/src/net/http/server.go:3260 +0x33e
2025-03-24T14:23:10.7395473Z net/http.(*Server).ListenAndServe(0xc0005f8000)
2025-03-24T14:23:10.7395684Z /opt/hostedtoolcache/go/1.22.12/x64/src/net/http/server.go:3189 +0x71
2025-03-24T14:23:10.7395847Z github.com/hyperledger/firefly/cmd.startFirefly.func1()
2025-03-24T14:23:10.7396030Z /home/runner/work/firefly/firefly/cmd/firefly.go:202 +0x27
2025-03-24T14:23:10.7396250Z created by github.com/hyperledger/firefly/cmd.startFirefly in goroutine 13
2025-03-24T14:23:10.7396439Z /home/runner/work/firefly/firefly/cmd/firefly.go:201 +0x40b
2025-03-24T14:23:10.7396444Z
2025-03-24T14:23:10.7396538Z goroutine 16 [chan send]:
2025-03-24T14:23:10.7396927Z github.com/hyperledger/firefly-common/pkg/httpserver.(*httpServer).ServeHTTP(0xc000a13320, {0x13ea008, 0xc000500230})
2025-03-24T14:23:10.7397329Z /home/runner/go/pkg/mod/github.com/hyperledger/[email protected]/pkg/httpserver/httpserver.go:176 +0x1a5
2025-03-24T14:23:10.7397639Z created by github.com/hyperledger/firefly/internal/apiserver.(*apiServer).Serve in goroutine 11
2025-03-24T14:23:10.7397891Z /home/runner/work/firefly/firefly/internal/apiserver/server.go:105 +0x20d
2025-03-24T14:23:10.7397895Z
2025-03-24T14:23:10.7397984Z goroutine 123 [chan send]:
2025-03-24T14:23:10.7398370Z github.com/hyperledger/firefly-common/pkg/httpserver.(*httpServer).ServeHTTP(0xc000e01320, {0x13ea008, 0xc0002aca00})
2025-03-24T14:23:10.7398754Z /home/runner/go/pkg/mod/github.com/hyperledger/[email protected]/pkg/httpserver/httpserver.go:176 +0x1a5
2025-03-24T14:23:10.7399064Z created by github.com/hyperledger/firefly/internal/apiserver.(*apiServer).Serve in goroutine 13
2025-03-24T14:23:10.7399312Z /home/runner/work/firefly/firefly/internal/apiserver/server.go:105 +0x20d
2025-03-24T14:23:10.7399318Z
2025-03-24T14:23:10.7399405Z goroutine 38 [chan send]:
2025-03-24T14:23:10.7399797Z github.com/hyperledger/firefly-common/pkg/httpserver.(*httpServer).ServeHTTP(0xc000ca1e60, {0x13ea008, 0xc0003db9f0})
2025-03-24T14:23:10.7400267Z /home/runner/go/pkg/mod/github.com/hyperledger/[email protected]/pkg/httpserver/httpserver.go:176 +0x1a5
2025-03-24T14:23:10.7400575Z created by github.com/hyperledger/firefly/internal/apiserver.(*apiServer).Serve in goroutine 36
2025-03-24T14:23:10.7400812Z /home/runner/work/firefly/firefly/internal/apiserver/server.go:105 +0x20d
2025-03-24T14:23:10.7400943Z FAIL github.com/hyperledger/firefly/cmd 0.065s
Diagnosis
From what I can see is at the same time:
- Go Routine 13 is reading from the config with
GetBool()
- Go Routine 15 is writing to the config with
AddKnownKey()
Okay there is a race condition here:
run
is the main functioncmd.go
method that starts the server as part of that there is aninit
for the manager.- The
init
call can get aresetChan
which is what the test is using so reset to mock a config change - But as part of the
init
call we are initializing some HTTPServer using FireFly Common - FireFly Common uses viper to manage config
- Viper has a mutex lock on the map when reading/writing config, the issue is that the
context.Context
is not propagated to theGetBool()
function. That call is stuck on the mutex but the context has been cancelled as we are reinitializing FireFly. - At that point we reload the config, which means adding
AddKnownKey
and we are reading/write to the same map - There is no
thread-safe
Viper Set method by the sounds of it