DM: Add foreign key causality for DM syncer. #12414
…ign key integration test.
[APPROVALNOTIFIER] This PR is NOT APPROVED.
Summary of Changes
This pull request significantly enhances the DM syncer's capability to handle foreign key constraints. By tracking and incorporating foreign key causality relations into row change processing, the syncer can now ensure data consistency and correct behavior when dealing with complex parent-child table dependencies, especially in multi-worker environments. This improvement is vital for reliable data migration and synchronization where relational integrity is paramount.
Code Review
This pull request introduces foreign key causality support to the DM syncer, a significant feature for ensuring data consistency in multi-worker scenarios. The changes are well-structured. The core logic in buildForeignKeyRelations correctly traverses the foreign key graph to find root dependencies, handling memoization and cycle detection. The new causality keys are then correctly integrated into the existing DML generation and validation paths. The addition of a comprehensive integration test (foreign_key_multi_worker) is excellent and provides strong validation for this new feature. I have a few suggestions to improve logging for better observability and a fix for a typo in the new test script.
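The traversal described in this review can be illustrated with a small standalone sketch. It is not the code of buildForeignKeyRelations: the fkGraph type, the findRoots function, and the table names are hypothetical, and the sketch only assumes that each child table lists the parent tables it references. It shows memoization of resolved tables and a visiting set that breaks cycles.

```go
package main

import "fmt"

// fkGraph maps a child table to the parent tables it references.
// Hypothetical structure for illustration only, not the DM syncer's types.
type fkGraph map[string][]string

// findRoots returns the root ancestors of table (tables that reference no
// parent). memo caches resolved tables; visiting marks tables on the current
// recursion path so that a cycle ends the walk instead of looping forever.
// Simplification: results memoized while inside a cycle may be incomplete.
func findRoots(g fkGraph, table string, memo map[string][]string, visiting map[string]struct{}) []string {
	if roots, ok := memo[table]; ok {
		return roots
	}
	if _, ok := visiting[table]; ok {
		// Cycle detected: this path contributes no roots.
		return nil
	}
	visiting[table] = struct{}{}
	defer delete(visiting, table)

	parents := g[table]
	if len(parents) == 0 {
		roots := []string{table}
		memo[table] = roots
		return roots
	}

	seen := map[string]struct{}{}
	var roots []string
	for _, p := range parents {
		for _, r := range findRoots(g, p, memo, visiting) {
			if _, dup := seen[r]; !dup {
				seen[r] = struct{}{}
				roots = append(roots, r)
			}
		}
	}
	memo[table] = roots
	return roots
}

func main() {
	g := fkGraph{
		"fk_chain.grandchild": {"fk_chain.child"},
		"fk_chain.child":      {"fk_chain.parent"},
	}
	roots := findRoots(g, "fk_chain.grandchild", map[string][]string{}, map[string]struct{}{})
	fmt.Println(roots) // prints [fk_chain.parent]
}
```

In this sketch a table that sits on a cycle simply resolves to no roots, which mirrors the "return nil, nil" behavior discussed in the inline comment further down.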
run_sql_tidb_with_retry "SELECT COUNT(*) FROM fk_chain.parent;" "COUNT(*): 2"
run_sql_tidb_with_retry "SELECT COUNT(*) FROM fk_chain.child;" "COUNT(*): 4"
run_sql_tidb_with_retry "SELECT data FROM fk_chain.child WHERE child_id=100;" "data: c100_updated"
run_sql_tidb_with_retry "SELECT parent_id FROM fk_chain.child WHERE child_id=201;" "parent_id: 10"
There's a typo in the table name in this SELECT query. It should be fk_chain.child instead of fk.chain.child. This will cause the integration test to fail.
run_sql_tidb_with_retry "SELECT parent_id FROM fk_chain.child WHERE child_id=201;" "parent_id: 10"
if _, ok := visiting[tableID]; ok {
	return nil, nil
}
When a foreign key cycle is detected, the function currently returns nil, nil silently. While this correctly breaks the recursion, it would be beneficial for debugging and observability to log a warning when this happens. This would make it easier to diagnose unexpected causality behavior in schemas with cyclic dependencies.
if _, ok := visiting[tableID]; ok {
	tctx.Logger.Warn("foreign key cycle detected, will be ignored for causality", zap.String("table", tableID))
	return nil, nil
}
if idx >= len(values) {
	skip = true
	break
}
The code currently handles an out-of-bounds index by silently skipping the foreign key relation. While this is safe, it would be better to log a warning. An out-of-bounds index could indicate a discrepancy between the schema information and the row data, which might be a symptom of a deeper issue. Logging this would improve debuggability.
if idx >= len(values) {
	log.L().Warn("foreign key child column index out of bounds, skipping relation",
		zap.Stringer("table", r.sourceTable),
		zap.Int("index", idx),
		zap.Int("values-count", len(values)))
	skip = true
	break
}
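To make the guarded path concrete, here is a minimal standalone sketch of deriving a parent-side key from a child row while skipping the relation when a column index does not fit the row. The fkRelation type, the parentKey function, and the key format are assumptions made for illustration; the PR's actual data structures and key encoding are not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// fkRelation is a hypothetical description of one foreign key: the parent
// table it points at and the positions of the FK columns in the child row.
type fkRelation struct {
	parentTable string
	childColIdx []int
}

// parentKey builds a key identifying the referenced parent row from the
// child row's values. It returns ok=false when any column index falls
// outside the row, which corresponds to skipping the relation.
func parentKey(r fkRelation, values []interface{}) (key string, ok bool) {
	parts := make([]string, 0, len(r.childColIdx))
	for _, idx := range r.childColIdx {
		if idx < 0 || idx >= len(values) {
			// Schema and row data disagree; the caller may want to log this.
			return "", false
		}
		parts = append(parts, fmt.Sprintf("%v", values[idx]))
	}
	// Assumed key format: "<parent table>.<fk value>[.<fk value>...]".
	return r.parentTable + "." + strings.Join(parts, "."), true
}

func main() {
	rel := fkRelation{parentTable: "fk_chain.parent", childColIdx: []int{1}}
	row := []interface{}{100, 10, "c100"} // child_id, parent_id, data

	key, ok := parentKey(rel, row)
	fmt.Println(key, ok) // prints fk_chain.parent.10 true

	_, ok = parentKey(fkRelation{parentTable: "fk_chain.parent", childColIdx: []int{5}}, row)
	fmt.Println(ok) // prints false
}
```

Returning a boolean rather than an error keeps the skip decision with the caller, which is also where the warning suggested above would be emitted.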
@OliverS929: The following tests failed, say
What problem does this PR solve?
Issue Number: ref #12350
What is changed and how it works?
This PR introduces foreign key causality support to the DM syncer.
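As a rough illustration of why shared causality keys matter with multiple workers, the sketch below assumes that row changes are dispatched to workers by hashing a causality key. The workerFor function and the key strings are hypothetical and do not reproduce the syncer's dispatch code. If a child row carries the key of the parent row it references, both changes map to the same worker and keep their relative order.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// workerFor maps a causality key to one of n workers by hashing it.
// Hypothetical helper; the real syncer's dispatch logic may differ.
func workerFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(n))
}

func main() {
	const workers = 4

	// Assumed key format: the parent row with id 10 and a child row whose
	// parent_id is 10 both carry the parent's key.
	parentRowKey := "fk_chain.parent.10"
	childRowKey := "fk_chain.parent.10"

	same := workerFor(parentRowKey, workers) == workerFor(childRowKey, workers)
	fmt.Println(same) // prints true
}
```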
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note