many: speed up unit/db tests from#1795
Conversation
|
Fixed the leaking connections, this was not caused by this PR it was there before as well. The pool was never closed after tests were finished. Also noted some warnings in the logs which were caused by the container not using IPv6 which is now the default for |
|
Also appended two more commits that enables The total test run on my system is now 3.3s :-) |
c0fc4cb to
d17aecd
Compare
|
@croissanne deps are broken due to OpenAPI, I had to add an extra commit that configures gobump to run unit tests after every single bump. since tests are now fast this is very nice and also it should improve gobump success rate with its PRs: I just re-executed tests 100x times to ensure this is stable enough, I know timing is not great but in the worst case we can turn off |
There was a problem hiding this comment.
This is a big PR and I'm not sure if all commits are actually helping with the speedup.
I think it might be nice to collect some stats per 'speedup' commit, in a little table or something, to see what is actually worth having the extra complexity.
The following I think are definitely worth it:
- tutils: improve postgres container startup
- distribution: share distro registry between tests
- cicd: fix race conditions in tests
- db: close connection pool in tests
- tutils: further speed up unit tests (but needs a better commit message title).
- and the .github workflow changes etc. which make use of the faster options
4d1348e to
808de5d
Compare
|
I rebased, resolved conflicts, reordered commits so tests are all passing, dropped the "localhost" commit that was not contributing any value and then reexecuted tests 3 times. In the PR description, I present cumulative number. As you can see, each commit contributes to better performance except the two that are technical things that are relevant to this:
|
|
Thank you, fixed both problems. Also fixed the embarrassing Schutzbot git authorship (I accidentally set that while I was testing gobump a week before). |
|
There is more of the |
|
Dropped it, I missed it, thanks. Rebased. Here is the summary of all these improvements:
|
ochosi
left a comment
There was a problem hiding this comment.
This looks good to me! ⚡ I have a slight concern about the test behaving differently and returning different results locally and in CI, but I get that the coverage may not something you always want to execute.
|
Added a warning message: |
Connection pool was never closed in tests, causing database connections
to leak. This became apparent when running tests multiple times in
sequence, as connections would accumulate until exhausting available
resources.
Reproducer:
go test ./... -count 5
Properly closing the connection pool after each test ensures clean
teardown and prevents connection leaks during repeated test runs.
Only enable verbose output and race detector when running in CI environment. This significantly speeds up local development iteration. In CI (GitHub Actions where CI=true): make unit-tests → go test -v -race ./... Locally (CI not set): make unit-tests → go test ./... Coverage targets still work in both modes, with CI getting additional -v -race flags for comprehensive checking.
Unit tests are quite slow. Most of the time is spent either creating and
migrating databases and parsing distro JSONs.
This patch slightly improves postgres performance by tuning some
settings. Then the database creation, which is called 40 times in unit
tests, is improved by using native pgx connection instead of podman exec.
The previous patch allows overriding unit test arguments:
make unit-tests UNIT_TEST_ARGS=
Combined with the changes from this patch, unit tests are now run in 19s
instead of 149s on my system.
By running tern (database migration tool) directly instead of through podman exec, we eliminate process spawning overhead. This allows the migration step to complete faster, shaving off approximately one second from the total test execution time.
Now that tests are faster, we can run them for each gobump.
Instead of loading and parsing distribution JSON files separately for each test, share a single distro registry instance across all tests. Distribution metadata parsing is expensive, and reusing the parsed registry eliminates this redundant work, speeding up tests by an additional 10 seconds.
Some unit tests can be safely executed in parallel. This was tested 50 times with high concurrency. Running tests in parallel exposed several race conditions that were previously hidden. This commit fixes those races by: - Adding proper synchronization where shared state is accessed - Ensuring test cleanup happens after parallel subtests complete - Fixing defer ordering issues with parallel execution Not all tests can be run in parallel, specifically subtests when the parent uses defer shutdownFn(): in Go, t.Run returns as soon as a subtest calls t.Parallel(), so the parent returns, defer runs (server stops), then the parallel subtests run → connection refused.
|
@croissanne did I drop all of the unnecessary commits? |
Several changes that speeds up unit tests significantly. Some patches are easy wins, others are debatable but I really like to have fast tests that just helps me to run them more often when changing things.
Major contributors:
UNIT_TEST_ARGSENV variable)psqlinsidepodman(is now native pgx call)tmpfsso minor improvement)