Commit d8fd749
fix: reduce Honeycomb column bloat from otellogrus and dynamic span attributes (#4481)
Replace the shell-based honeycomb_cleanup.sh with a Go tool that handles
API rate limits properly, and fix the three code-level root causes that
continuously create stale Honeycomb columns:
1. otellogrus flattening complex objects: LoggingGatewayMessageHandler
logged full *sdp.Item protos via WithField("item", item), and config
maps were logged via WithFields(MapFromServerConfig(...)). Both get
flattened by the otellogrus hook into one Honeycomb column per leaf
field — producing ~1500+ stale columns. Items now log only the
GloballyUniqueName at normal levels (full proto at debug only), and
config maps are rendered as a single "config" string field.
2. Dynamic indexed attribute keys: fmt.Sprintf("prefix.%d", i) in v6.go
(hypothesisUpdated/hypothesisStatus) and blast_radius_tools.go
(affectedResource) created a new column for each unique index. Replaced
with attribute.StringSlice/IntSlice.
3. The gateway's revlink ingest error path logged full proto items/edges
as log fields; it now logs only the GloballyUniqueName / edge endpoints.
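
To make the column arithmetic behind root causes 1 and 2 concrete, here is a minimal sketch using only the standard library. It is not the otellogrus hook or the OpenTelemetry API: `flattenKeys` and `indexedKeys` are hypothetical helpers that only model how many distinct keys each logging style produces, and the sample config values are invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// flattenKeys is a hypothetical model of a flattening log hook: every leaf
// of a nested map becomes its own dotted key, i.e. its own column.
func flattenKeys(prefix string, v any) []string {
	m, ok := v.(map[string]any)
	if !ok {
		return []string{prefix} // a leaf value maps to exactly one column
	}
	var keys []string
	for k, child := range m {
		keys = append(keys, flattenKeys(prefix+"."+k, child)...)
	}
	sort.Strings(keys) // map iteration order is random; sort for stable output
	return keys
}

// indexedKeys models the fmt.Sprintf("prefix.%d", i) pattern: the number of
// distinct keys grows with the number of elements.
func indexedKeys(prefix string, n int) []string {
	keys := make([]string, 0, n)
	for i := 0; i < n; i++ {
		keys = append(keys, fmt.Sprintf("%s.%d", prefix, i))
	}
	return keys
}

func main() {
	// Illustrative config; the real config maps are not shown in this commit message.
	config := map[string]any{
		"port": 8080,
		"nats": map[string]any{"url": "nats://localhost", "retries": 3},
	}

	// Flattened logging: one column per leaf field.
	fmt.Println(len(flattenKeys("config", config)), "columns:", flattenKeys("config", config))

	// Single string field: one column named "config", whatever the map holds.
	fmt.Println("1 column: config =", fmt.Sprint(config))

	// Indexed keys: one column per index. A slice-valued attribute under a
	// single key keeps this at one column regardless of length.
	fmt.Println(len(indexedKeys("affectedResource", 3)), "columns:", indexedKeys("affectedResource", 3))
}
```

The point of the sketch is that the flattened and indexed forms have a key count that depends on the data, while the stringified field and the slice attribute have a fixed count of one.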
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Adds a new, potentially destructive Honeycomb column deletion CLI and
changes logging/trace attribute shapes across multiple services;
misconfiguration could delete more columns than intended or reduce
diagnostic detail at non-debug levels.
>
> **Overview**
> **Reduces runaway Honeycomb column creation** by changing structured
logs and span attributes that were generating unbounded/dynamic field
keys.
>
> Replaces the shell-based Honeycomb column cleanup scripts with a Go
`honeycomb-cleanup` CLI that scans datasets for stale columns and
deletes them in parallel with shared rate-limit backoff, conflict (409)
handling, progress reporting, and `-dry-run` support.
>
> Updates multiple services to log configs and gateway items/edges more
conservatively (e.g., log a single stringified `config` field, and log
`GloballyUniqueName` at normal levels with full protos only at debug),
and replaces dynamically-indexed OpenTelemetry attributes with
`StringSlice`/`IntSlice` attributes to avoid per-index column churn.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
921cff26ef54dbe3a6b9c9a8a064ae35b2d7f9aa. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
GitOrigin-RevId: acb1f94431e4ddc6d6d53233934a2eb2f11e422d
1 parent e58d4b5 commit d8fd749
4 files changed
Lines changed: 924 additions & 11 deletions