
Fix(connector): expose id_column, timestamp_column, metadata_columns for MySQL/PostgreSQL incremental sync #13849

Open
buildearth wants to merge 1 commit into infiniflow:main from buildearth:fix_mysql_datasource_conf

Conversation

@buildearth

What problem does this PR solve?

The MySQL and PostgreSQL sync classes in sync_data_source.py did not pass
id_column, timestamp_column, and metadata_columns to RDBMSConnector, so
incremental sync and in-place document updates never took effect, even when
these fields were configured.

  • Without id_column: updated records generate new documents instead of
    overwriting existing ones (doc ID is derived from content hash, so any
    change produces a new ID).
  • Without timestamp_column: poll_source always falls back to full sync,
    ignoring the configured time range.
  • The three fields existed in the frontend default values but had no form
    inputs, so users had no way to fill them in.
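
The first bullet can be illustrated with a minimal sketch. It assumes, for illustration only, that a document ID is an MD5 digest of some string. The helper names `doc_id_from_content` and `doc_id_from_key` and the table name `posts` are hypothetical and are not taken from the RAGFlow code base.

```python
import hashlib


def doc_id_from_content(content: str) -> str:
    """Hypothetical content-hash ID: any edit to the text yields a new ID."""
    return hashlib.md5(content.encode("utf-8")).hexdigest()


def doc_id_from_key(table: str, row_id: object) -> str:
    """Hypothetical key-based ID: stable for as long as the row's key is unchanged."""
    return hashlib.md5(f"{table}:{row_id}".encode("utf-8")).hexdigest()


# The same database row before and after an UPDATE.
row_v1 = {"id": 7, "body": "hello"}
row_v2 = {"id": 7, "body": "hello, world"}

# Content-derived IDs diverge, so the updated row would look like a new document.
content_ids_differ = doc_id_from_content(row_v1["body"]) != doc_id_from_content(row_v2["body"])

# Key-derived IDs match, so the updated row can overwrite the existing document.
key_ids_match = doc_id_from_key("posts", row_v1["id"]) == doc_id_from_key("posts", row_v2["id"])
```

Under these assumptions, an ID derived from the configured ID column is what lets an updated row replace its earlier document instead of producing a duplicate.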

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)

Changes

  • Backend (rag/svr/sync_data_source.py): pass id_column,
    timestamp_column, and metadata_columns from self.conf to
    RDBMSConnector for both MySQL and PostgreSQL sync classes.
  • Frontend (web/src/pages/user-setting/data-source/constant/index.tsx):
    add ID Column, Timestamp Column, and Metadata Columns form fields
    to MySQL and PostgreSQL data source configuration UI with tooltips.
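
A rough sketch of the backend wiring described above follows. `ConnectorStub` is a hypothetical stand-in for RDBMSConnector, whose real constructor signature is not shown in this PR page; `build_connector` and the default values are likewise assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ConnectorStub:
    """Hypothetical stand-in for RDBMSConnector; the real signature may differ."""
    db_type: str
    id_column: Optional[str] = None
    timestamp_column: Optional[str] = None
    metadata_columns: str = ""


def build_connector(db_type: str, conf: Dict[str, Any]) -> ConnectorStub:
    """Forward the three optional fields from the data source config.

    Empty strings are treated as "not configured" for the two column names,
    which is an assumption of this sketch.
    """
    return ConnectorStub(
        db_type=db_type,
        id_column=conf.get("id_column") or None,
        timestamp_column=conf.get("timestamp_column") or None,
        metadata_columns=conf.get("metadata_columns", ""),
    )


configured = build_connector(
    "mysql",
    {"id_column": "id", "timestamp_column": "updated_at", "metadata_columns": "author,tag"},
)
unconfigured = build_connector("postgresql", {})
```

The point of the sketch is only that the same three keys are read from the config for both database types; before this change they were present in the config but never reached the connector.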

…for MySQL/PostgreSQL incremental sync

Signed-off-by: lixintao <lixintao@uniontech.com>
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. 🐞 bug Something isn't working, pull request that fix bug. labels Mar 30, 2026
@yingfeng yingfeng requested a review from Copilot March 30, 2026 08:43
Contributor

Copilot AI left a comment


Pull request overview

This PR fixes MySQL/PostgreSQL incremental sync configuration wiring by exposing id_column, timestamp_column, and metadata_columns end-to-end (UI → sync classes → RDBMSConnector), enabling stable document IDs, incremental polling, and metadata extraction when configured.

Changes:

  • Backend: pass metadata_columns, id_column, and timestamp_column from sync config into RDBMSConnector for MySQL and PostgreSQL.
  • Frontend: add form inputs (with tooltips) for Metadata Columns, ID Column, and Timestamp Column for MySQL/PostgreSQL data sources.
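
The incremental polling behaviour mentioned in the overview might look roughly like the sketch below. `build_poll_query` is a hypothetical helper, not the connector's actual method. It assumes the table and column names have already been validated (they are interpolated into the SQL text) and uses the `%s` placeholder style accepted by drivers such as PyMySQL and psycopg2.

```python
from datetime import datetime
from typing import Optional, Tuple


def build_poll_query(
    table: str,
    timestamp_column: Optional[str],
    start: datetime,
    end: datetime,
) -> Tuple[str, tuple]:
    """Return (sql, params) for one poll over the window (start, end].

    Without a timestamp column there is nothing to filter on, so the
    sketch falls back to selecting every row (a full sync).
    """
    if not timestamp_column:
        return f"SELECT * FROM {table}", ()
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE {timestamp_column} > %s AND {timestamp_column} <= %s"
    )
    return sql, (start, end)


window_start = datetime(2026, 3, 1)
window_end = datetime(2026, 3, 30)

full_sql, full_params = build_poll_query("posts", None, window_start, window_end)
incr_sql, incr_params = build_poll_query("posts", "updated_at", window_start, window_end)
```

With no timestamp column the time window is ignored entirely, which matches the full-sync fallback described in the PR.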

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| web/src/pages/user-setting/data-source/constant/index.tsx | Adds MySQL/PostgreSQL form fields to let users configure ID/timestamp/metadata columns. |
| rag/svr/sync_data_source.py | For MySQL/PostgreSQL sync, forwards the new config fields into RDBMSConnector to enable incremental sync and stable IDs. |

@yingfeng yingfeng added the ci Continue Integration label Mar 30, 2026
@yingfeng yingfeng marked this pull request as draft March 30, 2026 11:20
@yingfeng yingfeng marked this pull request as ready for review March 30, 2026 11:20
@codecov

codecov bot commented Mar 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.72%. Comparing base (cdbbd26) to head (05e0b81).
⚠️ Report is 13 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #13849   +/-   ##
=======================================
  Coverage   96.72%   96.72%           
=======================================
  Files          10       10           
  Lines         702      703    +1     
  Branches      112      112           
=======================================
+ Hits          679      680    +1     
  Misses          5        5           
  Partials       18       18           


@yingfeng yingfeng requested a review from Magicbook1108 March 30, 2026 12:16
@Magicbook1108
Contributor

@MkDev11 Hello, can you help with this?

@MkDev11
Contributor

MkDev11 commented Apr 1, 2026

Sure
