[AAP-73135] Backport: Fix Segment event loss by enabling sync_mode (stable-0.7)#391

Open
cshiels-ie wants to merge 1 commit into stable-0.7 from backport/stable-0.7-383

Conversation

Contributor

cshiels-ie commented Apr 30, 2026

Backport of #383 to stable-0.7.

Summary

  • Sets analytics.sync_mode = True on the Segment client before emitting chunked track() events, so each request is sent as a separate blocking HTTP call (~25 KB each) rather than batched
  • Prevents silent event drops caused by Segment discarding batch POSTs that exceed the 500 KB body limit while returning HTTP 200
  • Updates StorageSegment tests to assert sync_mode is enabled and flush() is called exactly once
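The change described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `put()` signature, the `"analytics_event"` name, and the chunk framing are assumptions, and a hypothetical stub stands in for the `segment-analytics-python` module so the sketch runs without the real SDK installed.

```python
CHUNK_SIZE = 25 * 1024  # ~25 KB per chunk, per the PR description


class StubSegmentClient:
    """Hypothetical stand-in for the Segment SDK module."""

    def __init__(self):
        self.sync_mode = False
        self.sent = []

    def track(self, user_id, event, properties):
        # With sync_mode=True the real SDK performs a blocking HTTP POST
        # here instead of queueing the event to a background consumer thread.
        self.sent.append(properties)

    def flush(self):
        pass


def put(analytics, user_id, payload):
    # The fix: force synchronous per-event delivery so chunks are never
    # batched into a POST body that could exceed Segment's 500 KB limit.
    analytics.sync_mode = True
    chunks = [payload[i:i + CHUNK_SIZE] for i in range(0, len(payload), CHUNK_SIZE)]
    for seq, chunk in enumerate(chunks):
        analytics.track(user_id, "analytics_event", {"seq": seq, "data": chunk})
    analytics.flush()  # harmless here; every event was already sent synchronously


client = StubSegmentClient()
put(client, "user-1", "x" * (60 * 1024))  # 60 KB of data -> 3 chunks
```

The trade-off flagged in the risk note follows directly from this structure: each `track()` now blocks on a network round trip, so `put()` latency grows linearly with the number of chunks.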

References


Note

Medium Risk
Changes Segment delivery mode from async batching to synchronous requests, which can affect runtime performance/latency and delivery semantics while fixing silent event drops for large uploads.

Overview
For StorageSegment.put(), forces Segment’s Python SDK into analytics.sync_mode = True so each chunked track() call is sent as its own blocking HTTP request rather than being batched into a potentially oversized payload that Segment can silently drop.

Updates tests to assert sync_mode is enabled, track() is called once per chunk, and flush() is invoked exactly once, adding a regression test covering large multi-chunk uploads.
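The updated assertions might look roughly like this. This is a hypothetical shape using a bare `MagicMock` in place of the Segment module, not the project's actual test fixtures:

```python
from unittest.mock import MagicMock

analytics = MagicMock()
chunks = ["a" * 100, "b" * 100, "c" * 100]  # stand-in chunked payload

# Exercise the same sequence StorageSegment.put() is described as performing.
analytics.sync_mode = True
for seq, chunk in enumerate(chunks):
    analytics.track("user-1", "analytics_event", {"seq": seq, "data": chunk})
analytics.flush()

assert analytics.sync_mode is True                # synchronous delivery enabled
assert analytics.track.call_count == len(chunks)  # one track() per chunk
assert analytics.flush.call_count == 1            # exactly one final flush
```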

Reviewed by Cursor Bugbot for commit dc57991.

* [AAP-73135] Use sync_mode to fix flaky Segment event delivery

Replace time.sleep workaround with analytics.sync_mode = True so each
track() call sends synchronously rather than queuing to a background
thread, eliminating the race condition where the process exits before
the background thread finishes flushing.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* [AAP-73135] Flush batches before exceeding Segment's 500KB batch limit

Segment silently drops events from batch POSTs that exceed 500KB and
returns HTTP 200, making the loss invisible to on_error callbacks. Fix
by tracking accumulated batch size and calling flush() before adding a
chunk that would push the batch over 450KB (leaving headroom for the
per-event metadata the SDK appends).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
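The heuristic this commit describes (and which a later commit reverts) can be sketched as follows. `send_chunks` and the event framing are illustrative names under the assumptions stated above, not the project's actual code:

```python
import json
from unittest.mock import MagicMock

BATCH_SIZE_LIMIT = 450 * 1024  # headroom under Segment's 500 KB body limit


def send_chunks(analytics, user_id, chunks):
    batch_bytes = 0
    for seq, chunk in enumerate(chunks):
        # Estimate this event's contribution to the batch body.
        chunk_bytes = len(json.dumps({"seq": seq, "data": chunk}).encode())
        if batch_bytes + chunk_bytes > BATCH_SIZE_LIMIT:
            analytics.flush()  # push the pending batch before it would overflow
            batch_bytes = 0
        analytics.track(user_id, "analytics_event", {"seq": seq, "data": chunk})
        batch_bytes += chunk_bytes
    analytics.flush()          # final flush for the remaining partial batch


analytics = MagicMock()
send_chunks(analytics, "user-1", ["x" * (25 * 1024)] * 26)  # 26 chunks of ~25 KB
```

As the revert commit below explains, the flaw is that this estimate counts only the event data: the SDK appends ~2-3 KB of per-event metadata, so the actual batch body can still exceed 500 KB even when the tracked total stays under the 450 KB threshold.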

* [AAP-73135] Add tests for Segment batch size limit flushing

Cover two new cases:
- all chunks fit in one batch (flush called once, final only)
- chunks exceed BATCH_SIZE_LIMIT, triggering mid-loop flush (flush
  called more than once, all chunks still tracked)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* [AAP-73135] Revert to sync_mode after batch-limit approach proven insufficient

Testing showed the batch-limit heuristic still dropped events:
- 17 chunks in one batch (~425 KB tracked): only 12 arrived
- 26 chunks split into two batches: only 17 arrived

Root cause: the SDK adds ~2-3 KB of per-event metadata (context,
timestamps, messageId, integrations) that our data-size estimate did
not account for, pushing actual batch bodies over Segment's 500 KB
limit despite our 450 KB threshold.

sync_mode=True sends each track() as a separate blocking HTTP request
(~25 KB each) instead of batching, which eliminates the batch-size
problem entirely. Local end-to-end testing confirmed all 15 chunks
arrive reliably with sync_mode=True.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* [AAP-73135] Strengthen sync_mode test to also verify flush count

Assert both sync_mode=True and flush.call_count==1, confirming that
sync_mode handles per-track delivery and no mid-loop batch flushing
is running alongside it.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* [AAP-73135] Update tests to reflect validated sync_mode fix

- Remove stale top comment
- Add sync_mode assertion to test_put_sends_data_to_segment so every
  put() test verifies the mode is set
- Rename test_put_sync_mode_enabled -> test_put_sync_mode_no_batch_drops
  and document the confirmed end-to-end result (15/15 chunks received
  vs 11-14 without sync_mode)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* [AAP-73135] Enable gzip compression alongside sync_mode

Each chunk is ~25 KB of JSON but compresses to ~3 KB (87% reduction)
due to repeated keys across items. With sync_mode sending one HTTP
request per chunk, gzip significantly reduces per-request transfer
time and overall upload duration.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
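The compressibility claim is easy to reproduce with stdlib gzip on a repetitive JSON payload. The item shape here is invented for illustration; the actual chunk format is not shown in this PR:

```python
import gzip
import json

# Repeated keys across items are what make the chunk JSON so compressible.
items = [{"name": "collection", "version": "1.0.0", "score": i} for i in range(300)]
raw = json.dumps(items).encode()
compressed = gzip.compress(raw)
ratio = len(compressed) / len(raw)
print(f"{len(raw)} B -> {len(compressed)} B ({ratio:.0%})")
```

Note that, as the next commit found, the saving never materialised: Segment's tracking API discarded the gzip-encoded bodies while still returning HTTP 200.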

* [AAP-73135] Revert gzip — Segment silently rejects compressed bodies

Segment's tracking API returns HTTP 200 but discards events when the
request body is gzip-encoded, resulting in 0 events received despite
the SDK reporting success. gzip=True is not a viable optimisation.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
coderabbitai Bot commented Apr 30, 2026

Review skipped: auto reviews are disabled on base/target branches other than the default branch.

