feat: configurable data retention for usage and audit #64

Merged
christianromeni merged 2 commits into main from feat/usage-retention on Apr 7, 2026

@christianromeni
Contributor

Summary

Closes #46.

Adds a background cleanup job (retention.Cleaner) that periodically deletes old rows from usage_events and audit_logs based on per-table retention durations. Opt-in: both durations default to 0, which means "keep forever".

This is a privacy/ops feature available in the Community tier. Not license-gated.

Config

settings:
  retention:
    usage_events: 2160h   # 90 days. 0 = keep forever (default)
    audit_logs: 8760h     # 365 days. 0 = keep forever (default)
    interval: 24h         # how often the cleanup job runs
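
A minimal sketch of how these settings might be represented and validated in Go. The `RetentionConfig` type, its field names, and the choice to check the interval only when retention is enabled are assumptions made for illustration; only the "0 = keep forever" semantics and the 1-minute minimum interval come from this PR description. The values are written in hours because Go's `time.ParseDuration` has no day unit.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// RetentionConfig mirrors the settings.retention keys shown above.
// The type and field names are hypothetical, not taken from the PR.
type RetentionConfig struct {
	UsageEvents time.Duration // 0 = keep forever
	AuditLogs   time.Duration // 0 = keep forever
	Interval    time.Duration // how often the cleanup job runs
}

// minInterval is the 1-minute floor the PR says validation enforces.
const minInterval = time.Minute

// Enabled reports whether at least one table has a retention window.
func (c RetentionConfig) Enabled() bool {
	return c.UsageEvents > 0 || c.AuditLogs > 0
}

// Validate rejects negative durations and, when retention is enabled,
// intervals shorter than minInterval (assumed rule for the disabled case).
func (c RetentionConfig) Validate() error {
	if c.UsageEvents < 0 || c.AuditLogs < 0 {
		return errors.New("retention durations must not be negative")
	}
	if c.Enabled() && c.Interval < minInterval {
		return fmt.Errorf("retention interval %s is below the minimum %s", c.Interval, minInterval)
	}
	return nil
}

func main() {
	// Parse the example values from the YAML snippet.
	usage, _ := time.ParseDuration("2160h")
	audit, _ := time.ParseDuration("8760h")
	every, _ := time.ParseDuration("24h")

	cfg := RetentionConfig{UsageEvents: usage, AuditLogs: audit, Interval: every}
	fmt.Println("enabled:", cfg.Enabled(), "error:", cfg.Validate())
}
```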

Implementation highlights

  • Dialect-aware SQL: uses new Dialect.TimestampLessThan() helper so the predicate parses correctly on both SQLite (datetime(col) < datetime(?)) and Postgres (col::timestamptz < ($1)::timestamptz). Avoids the string-comparison hazard where SQLite and Postgres CURRENT_TIMESTAMP formats differ.
  • Batch deletes: loop with DELETE FROM t WHERE id IN (SELECT id FROM t WHERE ts < ? LIMIT 10000) and 100ms pause between batches. Avoids long write-lock holds on SQLite.
  • New indexes (migration 0010): single-column idx_usage_events_created_at and idx_audit_logs_timestamp. The existing composite (org_id, created_at) indexes are not usable by the cleanup query because it has no org_id predicate.
  • Lifecycle: Cleaner starts in Application.Start() after the usage/audit loggers. Shutdown is LIFO, so retention stops before the write loggers.
  • Initial run on startup gives operators immediate log feedback and trims any backlog on a freshly-enabled instance.
  • Concurrent-safe Stop via sync.Once.
  • Min interval 1 minute enforced in validation to prevent busy loops.
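
The dialect-aware predicate and the batched delete loop described above can be sketched roughly as follows. This is not the code from the PR: the `Dialect` type, `batchDeleteSQL`, `execFunc`, `cleanupTable`, and `simulate` are hypothetical names, the column names are inferred from the index names, and stopping when a batch comes back short is an assumed termination rule. The database call is abstracted behind a function so the sketch runs without a SQL driver; real code would wrap `db.ExecContext` and `RowsAffected`.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Dialect is a hypothetical stand-in for the project's dialect type.
type Dialect int

const (
	SQLite Dialect = iota
	Postgres
)

// TimestampLessThan returns a predicate comparing col with one bound
// parameter, using the two forms quoted in the PR description.
func (d Dialect) TimestampLessThan(col string) string {
	if d == Postgres {
		return fmt.Sprintf("%s::timestamptz < ($1)::timestamptz", col)
	}
	return fmt.Sprintf("datetime(%s) < datetime(?)", col)
}

// batchDeleteSQL builds DELETE ... WHERE id IN (SELECT ... LIMIT n).
func batchDeleteSQL(d Dialect, table, tsCol string, limit int) string {
	return fmt.Sprintf("DELETE FROM %s WHERE id IN (SELECT id FROM %s WHERE %s LIMIT %d)",
		table, table, d.TimestampLessThan(tsCol), limit)
}

// execFunc runs one statement and reports how many rows it deleted.
type execFunc func(ctx context.Context, query string, cutoff time.Time) (int64, error)

// cleanupTable deletes in batches, pausing between them, until a batch
// returns fewer rows than batchSize or the context is cancelled.
func cleanupTable(ctx context.Context, exec execFunc, query string,
	cutoff time.Time, batchSize int, pause time.Duration) (int64, error) {
	var total int64
	for {
		if err := ctx.Err(); err != nil {
			return total, err
		}
		n, err := exec(ctx, query, cutoff)
		if err != nil {
			return total, err
		}
		total += n
		if n < int64(batchSize) {
			return total, nil
		}
		select {
		case <-ctx.Done():
			return total, ctx.Err()
		case <-time.After(pause):
		}
	}
}

// simulate runs cleanupTable against a fake table holding `rows` expired rows
// and returns the rows deleted and the number of statements executed.
func simulate(rows, batchSize int) (int64, int, error) {
	remaining, calls := rows, 0
	fake := func(_ context.Context, _ string, _ time.Time) (int64, error) {
		calls++
		n := batchSize
		if remaining < n {
			n = remaining
		}
		remaining -= n
		return int64(n), nil
	}
	query := batchDeleteSQL(SQLite, "usage_events", "created_at", batchSize)
	deleted, err := cleanupTable(context.Background(), fake, query,
		time.Now(), batchSize, time.Millisecond)
	return deleted, calls, err
}

func main() {
	fmt.Println(batchDeleteSQL(Postgres, "audit_logs", "timestamp", 10000))
	deleted, calls, err := simulate(25, 10)
	fmt.Println(deleted, calls, err)
}
```

With this termination rule, a table holding an exact multiple of the batch size costs one extra statement that deletes zero rows, which is the kind of edge the batchSize and batchSize+1 tests listed below would exercise.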

Tests

15 tests in internal/retention/cleaner_test.go, all with real in-memory SQLite (no mocks):

  • Retention disabled (no-op verification)
  • Usage-only / audit-only / both enabled
  • Exact batchSize / batchSize+1 boundaries
  • Cutoff boundary (strict < semantics)
  • Context cancellation (pre-cancelled and mid-batch)
  • Stop() idempotent / safe after disabled Start()
  • Multi-batch audit_logs path
  • Error isolation between tables (one table failure does not stop the other)
  • Context timeout propagation from runOnce through cleanupTable
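
Two of the lifecycle properties covered above (an initial run on startup, and a Stop() that is idempotent and safe after a disabled Start()) could be implemented along these lines. This is a simplified, hypothetical stand-in for retention.Cleaner: the field names, the `runOnce` callback signature, and the `demo` helper are assumptions for illustration, not the PR's actual code.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// Cleaner is a hypothetical, simplified stand-in for retention.Cleaner.
type Cleaner struct {
	interval time.Duration
	enabled  bool
	runOnce  func(ctx context.Context) // one cleanup pass over all tables

	cancel   context.CancelFunc
	done     chan struct{}
	stopOnce sync.Once
}

// Start launches the background loop. It is a no-op when retention is disabled.
func (c *Cleaner) Start() {
	if !c.enabled {
		return
	}
	ctx, cancel := context.WithCancel(context.Background())
	c.cancel = cancel
	c.done = make(chan struct{})
	go func() {
		defer close(c.done)
		c.runOnce(ctx) // initial run on startup
		t := time.NewTicker(c.interval)
		defer t.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-t.C:
				c.runOnce(ctx)
			}
		}
	}()
}

// Stop cancels the loop and waits for it to exit. sync.Once makes repeated
// and concurrent calls safe; the nil check covers a Start that did nothing.
func (c *Cleaner) Stop() {
	c.stopOnce.Do(func() {
		if c.cancel == nil {
			return // never started (retention disabled)
		}
		c.cancel()
		<-c.done
	})
}

// demo starts a Cleaner, stops it from several goroutines plus once more,
// and returns how many cleanup passes ran.
func demo(enabled bool) int64 {
	var runs int64
	c := &Cleaner{
		interval: time.Hour,
		enabled:  enabled,
		runOnce:  func(context.Context) { atomic.AddInt64(&runs, 1) },
	}
	c.Start()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Stop()
		}()
	}
	wg.Wait()
	c.Stop() // a further call is still a no-op
	return atomic.LoadInt64(&runs)
}

func main() {
	fmt.Println("passes with retention enabled:", demo(true))
	fmt.Println("passes with retention disabled:", demo(false))
}
```

Because Stop() waits for the goroutine to exit and the goroutine always performs its initial pass before entering the ticker loop, the enabled case completes that pass even when Stop() is called immediately after Start().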

Test plan

  • CI green (go test, vet, build)
  • CodeQL + Snyk clean
  • Migration 0010 applies cleanly forward and reverses cleanly backward

snyk-io bot commented Apr 7, 2026

Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|---|
| | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |

codecov bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 72.58065% with 34 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/retention/cleaner.go | 75.26% | 21 Missing and 2 partials ⚠️ |
| internal/app/app.go | 0.00% | 11 Missing ⚠️ |

christianromeni merged commit 2a41eef into main on Apr 7, 2026
7 checks passed
christianromeni deleted the feat/usage-retention branch on April 7, 2026 22:31
Linked issue (may be closed by this pull request): Configurable data retention for usage events and audit logs