dumpling: improve string key handling, streaming process with chunking #62172
Open: takaidohigasi wants to merge 72 commits into pingcap:master from takaidohigasi:improve-string-key-handling
Commits (72, all by takaidohigasi):
5d7e3d4 [pingcap/tidb#61999] improve dumpling string key handling
443b673 remove unused adaptive chunking function
b4e5bb7 fix: enable chunk progress tracking for streaming string key chunking
2c909dc disable ROW_NUMBER() implementation
dd0d259 add unit tests and integration tests
641588e implement buffering strategy for string chunking to enable concatenab…
dcf5296 fix: use standard SQL escaping in composite string key test
44d8341 fix: update test expectation to match standard SQL escaping behavior
b0b7735 docs: clarify escapeSQLString is for internal queries only
f859cb0 fix: update Unicode test expectation for standard SQL escaping mode
d653691 revert: restore default dumpling escaping behavior in tests
f6384c5 fix: correct test expectation to match actual dumpling quote behavior
9d599bb fix: correct single quote escaping in test expectation
30a00a7 fix: update composite string key test to expect escaped double quotes
aec5def fix: configure UTF8MB4 charset for composite string key test
58ec5ea fix: resolve linting warnings and add charset support
ff6fd4e revert: remove SET NAMES modification for backward compatibility
648abb2 style: format Go files with gofmt
004eb20 fix: configure UTF8MB4 charset for composite string key test
53cf8a0 fix: implement UTF8MB4 charset detection and SET NAMES output
34cfe34 revert: remove collation detection code for separate branch
a09ebdb fix: configure TiDB server charset settings for Unicode test
70f3576 fix: restore parameter names in WriteInsert functions
8b28936 fix: correct chunking logic for row-based vs string-based modes
1aa9577 revert: restore original chunking logic
870d508 fix: distinguish string chunking vs row chunking by table name
3ebc221 Revert "fix: distinguish string chunking vs row chunking by table name"
23303f1 feat: add isStringChunking parameter to WriteInsert functions
c2518cb fix: add missing isStringChunking parameter to writer_serial_test.go
508267a fix: resolve linting issues
02976b9 make bazel_prepare
1cd7cc6 delete extra file
b0b7400 delete useless logic
6713d20 store isStringChunking to conf
5586425 remove extra condition
e71d085 fix unused input to _
592ae9b fix chunking for statesize limit
a4e73d4 fix: resolve INSERT statement duplication in string-based chunking
2d67784 refactor: remove unused totalChunks parameter from writer functions
2521058 fix test expectations for removed unused params
4f3cd56 fix: handle column names containing commas in extractOrderByColumns
9cb87f2 fix format
dd6da56 revert failpoint removal
525e0a0 fix -r option expectation
238a6af fix: update composite_string_key test expectations based on actual ro…
82f2794 fix: use * for row estimation with string fields to handle composite …
8cc1847 fix: add fallback to direct COUNT(*) when EXPLAIN returns 0 for strin…
d378833 fix: add missing empty function parameter to QuerySQL call
c13eba4 fix: address gofmt and security linter issues
9af9864 fix: update test comments and simplify test assertions for WriteInsert
2042078 fix: update composite_string_key test to use chunked result files
92da39f fix: add header comments to all chunk files in test expectations
b1c4d8b fix: use all composite key fields for string chunking
1a87518 Merge branch 'upstream/master' into improve-string-key-handling
919858e dumpling/tests: add large-scale composite-string-key round-trip test
629efbb dumpling: drop LLM-noise helpers and flatten chunking API
54566ad dumpling/export: document tableChunkStat finalized invariant
03f2801 dumpling: verify COUNT(*) when EXPLAIN under-estimates for string chu…
0e2468c dumpling/tests: regenerate composite_string_key expected fixtures
0dc993b dumpling/tests: escape single quotes in composite_string_key_large ge…
bbbc580 dumpling/tests: scope composite_string_key_large chunk assertions to …
42711a0 dumpling/export: satisfy nogo lint — fmt.Fprintf and drop unused numC…
e2edf36 dumpling/export: gofmt writer_serial_test.go
23e7d56 dumpling: close tableChunkStat double-increment race + CodeRabbit nits
987f030 dumpling: drop tautology test and redundant WHERE-clause alias
4e7015b dumpling/tests: address CodeRabbit shell-script findings
6f6ec69 dumpling/export: fix chunkedTables accounting in concat + TiDB TABLES…
6c0c495 dumpling/export: drop tautological tests and tighten extractOrderByCo…
5068102 dumpling: add license header and fix run.sh shebangs / backslash esca…
3cf73d2 dumpling/export: address CodeRabbit follow-up review
8e5eb75 dumpling/export: fail loudly on boundary-sampling errors and unknown …
0610dc4 Merge remote-tracking branch 'upstream/master' into improve-string-ke…
Can you explain why this failpoint was deleted?
I'll check again. thanks
reverted.
Reverted in dd6da56: the `failpoint.Inject("EnableLogProgress", ...)` in `setFinishTableCallBack` is present in the current tree (the line shifted after the master merge and is now at dump.go:366).