[BACKPORT 2026.1][#31521] CDC: Fix upsert producing double DELETE instead of DELETE+INSERT (#30903)#32082
Open
devansh5398 wants to merge 1 commit into
Open
Conversation
…LETE instead of DELETE+INSERT (yugabyte#30903) ## Summary Fix a bug where an upsert (`INSERT ... ON CONFLICT DO UPDATE`) on an existing row emits two DELETE records instead of a DELETE followed by an INSERT in CDC output, when `ysql_yb_skip_redundant_update_ops` is set to false. ## Motivation When a row is upserted, Yugabyte internally writes a tombstone (delete) followed by a packed row (insert) for the same key. The CDC intent processing logic in `PopulateCDCSDKIntentRecord` decides when to start a new CDC record based on key changes, tombstones, and timestamp differences - but it did not account for the case where a packed row follows a tombstone for the same key at the same timestamp. This caused both the tombstone and the packed row to be merged into a single record, producing two DELETEs instead of a DELETE + INSERT. ## Changes ### `src/yb/cdc/cdcsdk_producer.cc` Added a condition to the `new_cdc_record_needed` check: when the current value is a packed row and the previous record was a DELETE, force a new CDC record. This ensures the packed row (INSERT) is emitted as a separate record rather than being folded into the preceding DELETE. ### `src/yb/integration-tests/cdcsdk_ysql-test.cc` Added `UpsertOnExistingRowProducesDeleteThenInsert` test that: - Inserts a row, then upserts the same key with a new value - Verifies CDC output contains exactly 1 DELETE + 1 INSERT (not 2 DELETEs) ## Test plan - [x] New test `UpsertOnExistingRowProducesDeleteThenInsert` validates correct DELETE+INSERT output - [x] Existing CDC tests pass unchanged - [x] End-to-end verified with a CDC consumer app against a local YugabyteDB build — upserts on existing rows produce DELETE followed by INSERT in CDC output <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Medium Risk** > Touches core CDC intent-to-record grouping logic, which can affect downstream changefeed correctness; change is small and covered by a new integration test for the reported upsert case. > > **Overview** > Fixes a CDCSDK bug where a packed-row write following a tombstone for the same key could be merged into the prior record under `FLAGS_enable_single_record_update`, causing `INSERT ... ON CONFLICT DO UPDATE` to surface as two `DELETE`s. > > Updates `PopulateCDCSDKIntentRecord` to **force a new CDC record** when a packed row is encountered after a `DELETE` for the same primary key, and adds an integration test (`UpsertOnExistingRowProducesDeleteThenInsert`) asserting the expected `DELETE` then `INSERT` sequence with packed rows enabled. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit f6dd846. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --- [CSI](<https://csiweb.dev.yugabyte.com/pull/30903/>) --------- Co-authored-by: Kate Galieva <kate.galieva@shopify.com> Original commit: 9f03ce9 / yugabyte#30903
Contributor
There was a problem hiding this comment.
Code Review
This pull request ensures that an upsert touching a primary key column produces a DELETE + INSERT sequence in the CDC stream instead of DELETE + DELETE by checking for packed rows and delete operations in PopulateCDCSDKIntentRecord. A corresponding integration test has been added. LGTM — no issues found.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Contributor
Author
|
Trigger Jenkins |
|
Jenkins build has been triggered. Results will be available in the CI checks. CSI |
Sumukh-Phalgaonkar
approved these changes
Jun 8, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix a bug where an upsert (
INSERT ... ON CONFLICT DO UPDATE) on anexisting row emits two DELETE records instead of a DELETE followed by an
INSERT in CDC output, when
ysql_yb_skip_redundant_update_opsis set tofalse.
Motivation
When a row is upserted, Yugabyte internally writes a tombstone (delete)
followed by a packed row (insert) for the same key. The CDC intent
processing logic in
PopulateCDCSDKIntentRecorddecides when to start anew CDC record based on key changes, tombstones, and timestamp
differences - but it did not account for the case where a packed row
follows a tombstone for the same key at the same timestamp.
This caused both the tombstone and the packed row to be merged into a
single record, producing two DELETEs instead of a DELETE + INSERT.
Changes
src/yb/cdc/cdcsdk_producer.ccAdded a condition to the
new_cdc_record_neededcheck: when the currentvalue is a packed row and the previous record was a DELETE, force a new
CDC record.
This ensures the packed row (INSERT) is emitted as a separate record
rather than being folded into the preceding DELETE.
src/yb/integration-tests/cdcsdk_ysql-test.ccAdded
UpsertOnExistingRowProducesDeleteThenInserttest that:Inserts a row, then upserts the same key with a new value
Verifies CDC output contains exactly 1 DELETE + 1 INSERT (not 2
DELETEs)
Test plan
New test
UpsertOnExistingRowProducesDeleteThenInsertvalidatescorrect DELETE+INSERT output
End-to-end verified with a CDC consumer app against a local
YugabyteDB build — upserts on existing rows produce DELETE followed by
INSERT in CDC output
Note
Medium Risk
Touches core CDC intent-to-record grouping logic, which can affect
downstream changefeed correctness; change is small and covered by a new
integration test for the reported upsert case.
Overview
Fixes a CDCSDK bug where a packed-row write following a tombstone for
the same key could be merged into the prior record under
FLAGS_enable_single_record_update, causingINSERT ... ON CONFLICT DO UPDATEto surface as twoDELETEs.Updates
PopulateCDCSDKIntentRecordto force a new CDC recordwhen a packed row is encountered after a
DELETEfor the same primarykey, and adds an integration test
(
UpsertOnExistingRowProducesDeleteThenInsert) asserting the expectedDELETEthenINSERTsequence with packed rows enabled.Written by Cursor
Bugbot for commit
f6dd846. This will update automatically
on new commits. Configure
here.
CSI
Co-authored-by: Kate Galieva kate.galieva@shopify.com
Original commit: 9f03ce9 / #30903
Merge conflicts
src/yb/integration-tests/cdcsdk_ysql-test.cc:48— adjacent-but-non-overlapping DECLARE blocks: 2026.1 addedtransaction_resend_applying_interval_usecandTEST_disable_apply_committed_transactions; cherry-pick addedysql_yb_skip_redundant_update_ops; kept all three.src/yb/integration-tests/cdcsdk_ysql-test.cc:779— lint fix: wrapped 101-char comment line to stay within 100-char limit (no logic change).automated · Claude Code (Opus 4.7)