test: fix flaky audit log and service account status tests #3068

Merged
talos-bot merged 1 commit into siderolabs:main from oguzkilcan:test/fix-flaky-audit-and-service-account-tests on Jul 1, 2026
@oguzkilcan (Member)

The service account status test indexed PublicKeys[0] right after a non-fatal assert.Len. A slow reconcile that left the slice empty would panic instead of retrying, so the index access now sits behind the length check and the poller waits for the key to be aggregated.

The two audit log tests seeded 1500 rows as separate autocommit transactions. Under full-suite disk contention those commits triggered enough checkpoint fsyncs to overrun the 30s context deadline mid-statement, so seeding now runs inside a single transaction.
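The seeding pattern can be illustrated as follows. In SQLite, a statement executed outside an explicit transaction commits on its own, so wrapping the inserts in `BEGIN IMMEDIATE` ... `COMMIT` replaces 1500 commits with one. The sketch below is an assumption-laden illustration rather than the PR's code: `execer`, `recorder`, `seedInTx`, and the `audit_logs` table are hypothetical, and the fake connection only records statements so the example runs without a SQLite driver.

```go
package main

import (
	"errors"
	"fmt"
)

// execer is a hypothetical stand-in for a SQLite connection.
type execer interface {
	Exec(query string, args ...any) error
}

// seedInTx runs n inserts between BEGIN IMMEDIATE and COMMIT so the database
// commits once instead of n times. On any failure it attempts a ROLLBACK.
func seedInTx(conn execer, n int) (err error) {
	if err = conn.Exec("BEGIN IMMEDIATE"); err != nil {
		return fmt.Errorf("begin: %w", err)
	}

	defer func() {
		if err != nil {
			// Best effort: the original error is the one worth reporting.
			_ = conn.Exec("ROLLBACK")
		}
	}()

	for i := 0; i < n; i++ {
		// audit_logs and its columns are illustrative names only.
		err = conn.Exec("INSERT INTO audit_logs (id, payload) VALUES (?, ?)", i, fmt.Sprintf("event-%d", i))
		if err != nil {
			return fmt.Errorf("insert row %d: %w", i, err)
		}
	}

	if err = conn.Exec("COMMIT"); err != nil {
		return fmt.Errorf("commit: %w", err)
	}

	return nil
}

// recorder is a fake connection that records statements and can fail on demand.
type recorder struct {
	stmts  []string
	failAt int // index of the statement that should fail, -1 for none
}

func (r *recorder) Exec(query string, _ ...any) error {
	idx := len(r.stmts)
	r.stmts = append(r.stmts, query)

	if idx == r.failAt {
		return errors.New("simulated failure")
	}

	return nil
}

func main() {
	rec := &recorder{failAt: -1}
	err := seedInTx(rec, 1500)

	fmt.Println(len(rec.stmts), rec.stmts[0], rec.stmts[len(rec.stmts)-1], err)
}
```

With a real connection the same shape applies: one statement to open the transaction, the inserts, and one commit, with a rollback on the error path so a failed seed does not leave the connection inside an open transaction.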

Copilot AI left a comment

Pull request overview

This PR reduces test flakiness in the Omni backend by preventing panics in the service account status reconcile test and by making audit log SQLite test seeding less sensitive to disk/WAL checkpoint contention.

Changes:

  • Guard PublicKeys[0] access behind a length check in the service account status reconcile test.
  • Seed large audit log datasets inside a single SQLite transaction to avoid many WAL autocommit checkpoints.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| internal/backend/runtime/omni/controllers/omni/service_account_status_test.go | Avoids panics during polling by returning early when PublicKeys isn't populated yet. |
| internal/backend/runtime/omni/audit/auditlog/auditlogsqlite/auditlogsqlite_test.go | Seeds test data in a single transaction to reduce WAL/fsync overhead and associated timeouts. |

db.Put(seedConn)
})

require.NoError(t, sqlitex.ExecuteTransient(seedConn, "BEGIN IMMEDIATE", nil))

Signed-off-by: Oguz Kilcan <oguz.kilcan@siderolabs.com>
github-project-automation (bot) moved this from In Review to Approved in Planning on Jul 1, 2026

@oguzkilcan (Member, Author)

/m

talos-bot merged commit 3ddc040 into siderolabs:main on Jul 1, 2026
57 of 58 checks passed
github-project-automation (bot) moved this from Approved to Done in Planning on Jul 1, 2026
oguzkilcan deleted the test/fix-flaky-audit-and-service-account-tests branch on July 1, 2026, 16:31
4 participants