[fix] abort stream load 2pc by label #659
Open
liujiwen-up wants to merge 1 commit into
Proposed Changes
This PR improves Stream Load 2PC recovery by aborting lingering pre-committed transactions with the exact load label, instead of issuing an empty Stream Load request and parsing the txn id from the label-already-exists response.
Main changes
- Add LoadState and support querying Doris FE through: GET /api/{db}/get_load_state?label={label}
- Update DorisStreamLoad#abortPreCommit to:
  - abort through /_stream_load_2pc with the label header
  - UNKNOWN / ABORTED labels
  - COMMITTED or VISIBLE
- Use DorisWriterState table identity and subtask id, instead of the current writer identity.
Motivation
The previous recovery abort logic depended on reconstructing a label, issuing an empty Stream Load request, then parsing the txn id from the returned label conflict message.
That is fragile for labels generated from long or non-ASCII table names. In those cases the old fallback label used a random UUID, so the label could not be reliably reconstructed during recovery, and the lingering pre-committed transaction might not be aborted.
Using the exact label with Doris Stream Load 2PC abort avoids this indirect txn-id parsing path.
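As a rough illustration of the abort-by-label path, the sketch below only builds the HTTP request and does not send it. The class name, the method name, the /api/{db}/_stream_load_2pc path layout, the txn_operation header, and basic auth are assumptions for illustration; only the /_stream_load_2pc endpoint and the label header come from this PR description, and the connector's real DorisStreamLoad#abortPreCommit may differ.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not the connector's actual code.
final class AbortByLabelSketch {

    // Builds (but does not send) a 2PC abort request that identifies the
    // transaction by its load label rather than by a parsed txn id.
    static HttpRequest buildAbortRequest(
            String hostPort, String db, String label, String user, String password) {
        String credentials = user + ":" + password;
        String basicAuth = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));

        return HttpRequest.newBuilder()
                // Assumed path layout; check it against the Doris version in use.
                .uri(URI.create("http://" + hostPort + "/api/" + db + "/_stream_load_2pc"))
                .header("Authorization", "Basic " + basicAuth)
                // The exact label of the lingering pre-committed transaction.
                .header("label", label)
                // Assumed header name for selecting the abort operation.
                .header("txn_operation", "abort")
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request =
                buildAbortRequest("127.0.0.1:8030", "demo_db", "demo_label_0_1", "root", "");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

Because the label is sent as-is, recovery no longer needs to provoke a label conflict just to learn the txn id.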
Compatibility Notes
This change requires Doris server support for:
- GET /api/{db}/get_load_state?label=...
- /_stream_load_2pc abort by label
These APIs are available in Doris 2.1.0 and later.
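For readers unfamiliar with the state lookup, here is a minimal sketch of building the get_load_state URL and interpreting the returned state name. The enum values, class name, and method names are assumptions (the PR's LoadState type may be defined differently), and the response body format is deliberately not parsed here because it is not shown in this description.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not the connector's actual code.
final class LoadStateSketch {

    // Assumed set of states; the PR's LoadState may differ.
    enum LoadState { UNKNOWN, PREPARE, PRECOMMITTED, COMMITTED, VISIBLE, ABORTED }

    // Builds the FE URL for looking up the state of a load label.
    static String buildUrl(String feHostPort, String db, String label) {
        String encodedLabel = URLEncoder.encode(label, StandardCharsets.UTF_8);
        return "http://" + feHostPort + "/api/" + db + "/get_load_state?label=" + encodedLabel;
    }

    // Maps a raw state name to the enum, falling back to UNKNOWN for
    // null or unrecognized input.
    static LoadState parse(String raw) {
        if (raw == null) {
            return LoadState.UNKNOWN;
        }
        try {
            return LoadState.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return LoadState.UNKNOWN;
        }
    }

    // Only a lingering pre-committed transaction is a candidate for abort.
    static boolean needsAbort(LoadState state) {
        return state == LoadState.PRECOMMITTED;
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("127.0.0.1:8030", "demo_db", "demo_label_0_1"));
        System.out.println(needsAbort(parse("PRECOMMITTED")));
    }
}
```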
For transactions created by older connector versions where the invalid-label fallback used a random UUID, the original label still cannot be reconstructed if it was not persisted in state. This PR guarantees deterministic fallback label generation for newly created 2PC labels.
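To make the idea of a deterministic fallback concrete, the sketch below derives the fallback segment from the table identifier with a name-based UUID, so the same inputs always yield the same label. The class name, the label layout (prefix, table, subtask id, checkpoint id), and the validity pattern are assumptions for illustration and are not taken from the PR's LabelGenerator.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.regex.Pattern;

// Hypothetical helper, not the connector's actual LabelGenerator.
final class DeterministicLabelSketch {

    // Assumed label constraint: limited character set, at most 128 characters.
    private static final Pattern VALID_LABEL = Pattern.compile("^[-_A-Za-z0-9:]{1,128}$");

    static String label(String prefix, String tableIdentifier, int subtaskId, long checkpointId) {
        String suffix = "_" + subtaskId + "_" + checkpointId;
        String candidate = prefix + "_" + tableIdentifier.replace('.', '_') + suffix;
        if (VALID_LABEL.matcher(candidate).matches()) {
            return candidate;
        }
        // Name-based (type 3) UUID: derived from the bytes of the table
        // identifier, so recovery can rebuild the same label later.
        // UUID.randomUUID() would produce a different value on every call.
        String stableId = UUID
                .nameUUIDFromBytes(tableIdentifier.getBytes(StandardCharsets.UTF_8))
                .toString();
        return prefix + "_" + stableId + suffix;
    }

    public static void main(String[] args) {
        // "\u8868" is a non-ASCII table name, which forces the fallback path.
        String first = label("doris", "db.\u8868", 0, 1L);
        String second = label("doris", "db.\u8868", 0, 1L);
        System.out.println(first);
        System.out.println(first.equals(second));
    }
}
```

In this sketch a very long prefix could still push the fallback label past the length limit; a real implementation would need to handle that case as well.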
Tests
mvn -Pflink1 -pl flink-doris-connector-base \
  -Dtest=TestLabelGenerator,TestDorisWriter,TestDorisStreamLoad,TestRestService,TestResponseUtil test