Feature/generic api connector#13545
…array join, type hints, defaults, templates)
Separate base URL from query params to prevent pagination from injecting duplicate keys (e.g. &page=1&page=1). Add a dedicated query_params field (key=value per line, like Postman) so users no longer embed params in the URL.
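A minimal sketch of the idea in this commit message, not the connector's actual code: static parameters are parsed from `key=value` lines and merged with pagination parameters in a dict, so a key such as `page` can only appear once in the final URL. The function names `parse_query_params` and `build_url` and the `api.example.com` URL are hypothetical.

```python
from urllib.parse import urlencode


def parse_query_params(text: str) -> dict:
    """Parse 'key=value' lines (one per line); blank or malformed lines are skipped."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split on the first '=' only
        params[key.strip()] = value.strip()
    return params


def build_url(base_url: str, static_params: dict, page_params: dict) -> str:
    """Merge static and pagination params so each key appears exactly once."""
    merged = {**static_params, **page_params}  # pagination values win on conflict
    return f"{base_url}?{urlencode(merged)}" if merged else base_url


# Hypothetical usage: the user also typed page=1, and pagination asks for page 2.
static = parse_query_params("lang=en\npage=1\n")
url = build_url("https://api.example.com/stories", static, {"page": "2"})
print(url)  # https://api.example.com/stories?lang=en&page=2
```

Because the merge happens on a dict rather than by string concatenation, a duplicate such as `&page=1&page=1` cannot be produced.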
Codecov Report
✅ All modified and coverable lines are covered by tests.

| Metric | main | #13545 | +/- |
|---|---|---|---|
| Coverage | 96.52% | 96.52% | |
| Files | 10 | 10 | |
| Lines | 690 | 690 | |
| Branches | 108 | 108 | |
| Hits | 666 | 666 | |
| Misses | 8 | 8 | |
| Partials | 16 | 16 | |
@Magicbook1108 Hi, can you please review the PR?
@Magicbook1108 The issue is with the field mapping configuration. The RAGFlow datasets API returns items in a …

Suggested Fix

Try this configuration:

The connector auto-detects the items array from the response ( …

Reference Example

Here's how I configured it for a news API that returns this structure:

Sample API Response

```json
{
  "totalStories": 20,
  "stories": [
    {
      "id": 27856,
      "title": "رئيس الوزراء العراقي...",
      "content": "رئيس الوزراء العراقي: استمرار الصراع... ",
      "published": "2026-03-15T15:47:13+03:00",
      "category": "أخبار",
      "country": { "id": "108", "name": "العراق", "ISO2": "IQ" },
      "newsType": [{ "name": "عاجل" }],
      "mediaType": "نص",
      "source": "تيليجرام"
    }
  ]
}
```

My configuration:

How I Can Help

Could you share the JSON structure of your API response? I can then tell you exactly which fields to map for Content Fields, Metadata Fields, and ID Field. Also, if your API returns paginated results, make sure to set the Pagination Type accordingly (Page-based, Offset-based, or Cursor-based); you currently have it set to …
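To make the mapping discussion concrete, here is a hedged sketch of how items auto-detection and field paths could behave on a response shaped like the sample above. It is an illustration under assumptions, not the connector's implementation: `find_items` and `extract` are hypothetical names, and the lookup order (`items`, `results`, `data`, `records`, then the first list found) follows the PR description.

```python
from typing import Any

CANDIDATE_KEYS = ("items", "results", "data", "records")


def find_items(response: Any) -> list:
    """Return the items array: a known key first, else the first list value found."""
    if isinstance(response, list):
        return response
    if isinstance(response, dict):
        for key in CANDIDATE_KEYS:
            if isinstance(response.get(key), list):
                return response[key]
        for value in response.values():
            if isinstance(value, list):
                return value
    return []


def extract(item: Any, path: str, default: Any = None) -> Any:
    """Resolve a dot path such as 'country.name' or 'newsType[*].name'."""
    current = [item]
    used_wildcard = False
    for part in path.split("."):
        is_wild = part.endswith("[*]")
        key = part[:-3] if is_wild else part
        next_values = []
        for node in current:
            if not isinstance(node, dict) or key not in node:
                continue
            value = node[key]
            if is_wild:
                if isinstance(value, list):
                    next_values.extend(value)  # fan out over array elements
            else:
                next_values.append(value)
        current = next_values
        used_wildcard = used_wildcard or is_wild
    if not current:
        return default
    # Wildcard paths yield a list; plain paths yield a single value.
    return current if used_wildcard else current[0]


response = {
    "totalStories": 20,
    "stories": [
        {
            "id": 27856,
            "country": {"id": "108", "name": "العراق", "ISO2": "IQ"},
            "newsType": [{"name": "عاجل"}],
        }
    ],
}
story = find_items(response)[0]
print(extract(story, "country.ISO2"))
print(extract(story, "newsType[*].name"))
print(extract(story, "author.name", default="unknown"))
```

In this sketch the `stories` array is picked up by the "first list found" fallback, since none of the four well-known keys is present.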
Pull request overview
This PR adds a generic, configuration-driven REST API connector that allows users to connect any REST API as a data source through the UI without code changes. It implements full and incremental sync, configurable authentication, pagination, and field mapping.
Changes:
- New `RestAPIConnector` class with support for multiple auth types, pagination strategies, and flexible field extraction
- Backend registration in `sync_data_source.py` and a test-connection endpoint in `connector_app.py`
- Frontend UI forms, i18n strings, and SVG icon for the REST API data source
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| common/data_source/rest_api_connector.py | Core connector implementation with auth, pagination, field mapping |
| common/data_source/config.py | Adds REST_API to DocumentSource enum |
| common/data_source/__init__.py | Exports RestAPIConnector |
| common/constants.py | Adds REST_API to FileSource enum |
| rag/svr/sync_data_source.py | Wires REST_API sync class into the factory |
| api/apps/connector_app.py | Test connection endpoint |
| web/src/pages/user-setting/data-source/constant/index.tsx | Form fields, defaults, and data source info for REST API |
| web/src/pages/user-setting/data-source/hooks.ts | useTestDataSource hook |
| web/src/pages/user-setting/data-source/data-source-detail-page/index.tsx | Test Connection button |
| web/src/services/data-source-service.ts | testDataSource service call |
| web/src/utils/api.ts | Test endpoint URL |
| web/src/locales/en.ts | English i18n strings |
| web/src/assets/svg/data-source/rest-api.svg | Connector icon |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…onError immediately without retries
All comments have been resolved, and CI is working as well. Feel free to review :)
GitHub Actions runners time out cloning infinity resources from Gitee. Default Dockerfile already uses GitHub; workflows were overriding with NEED_MIRROR=1. Set NEED_MIRROR=0 in tests and release workflows.
CI failure is unrelated to this PR: the …
@Magicbook1108 I have reverted the changes for CI. Please restart the CI!
@Magicbook1108 @yingfeng Could you please complete the review?
!!!




feat: Add Generic REST API Connector
What problem does this PR solve?
RAGFlow supports many specific data source connectors (MySQL, Slack, Google Drive, etc.), but there was no way to connect an arbitrary REST API as a data source. Users with custom or third-party APIs had to write a new connector class for each one.
This PR adds a generic, configuration-driven REST API connector that lets users connect any REST API as a data source entirely through the UI — no code changes needed per API.
Features
Core Connector (`common/data_source/rest_api_connector.py`)
- Implements `LoadConnector` and `PollConnector` interfaces for full and incremental sync
- Auto-detects the items array from the response (`items`, `results`, `data`, `records`, or first list found)
- … (`country.name`), array wildcards (`newsType[*].name`), type hints, and default values
- … (`"Title: {title}\nBody: {body}"`)
- … `hash128` from a configurable ID field or auto-generated from item content

Backend Registration (`rag/svr/sync_data_source.py`, `common/constants.py`, `common/data_source/config.py`)
- `REST_API` sync class wired into RAGFlow's `func_factory`
- Full sync (`load_from_state`) and incremental polling (`poll_source`) support

Test Connection Endpoint (`api/apps/connector_app.py`)
- `POST /v1/connector/<id>/test` validates config schema, authentication, and API connectivity without triggering a sync

Frontend UI (`web/src/pages/user-setting/data-source/constant/`)

Controls & Safety
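As a rough illustration of two items from the feature list, content templates and stable document IDs, the sketch below renders a `"Title: {title}\nBody: {body}"` template and derives a 128-bit ID either from a configured ID field or from the item content. Everything here is an assumption for illustration: `render` and `document_id` are hypothetical names, and `hashlib.blake2b` with a 16-byte digest stands in for the `hash128` helper named in the PR, whose actual algorithm is not shown here.

```python
import hashlib
import json
from typing import Optional


class _Missing(dict):
    """Mapping that yields an empty string for placeholders the item lacks."""

    def __missing__(self, key):
        return ""


def render(template: str, item: dict) -> str:
    """Fill top-level {field} placeholders; nested paths are out of scope here."""
    return template.format_map(_Missing(item))


def document_id(item: dict, id_field: Optional[str] = None) -> str:
    """128-bit hex ID from the ID field if configured, else from the whole item."""
    if id_field and item.get(id_field) is not None:
        seed = str(item[id_field])
    else:
        # sort_keys makes the ID independent of key order in the payload
        seed = json.dumps(item, sort_keys=True, ensure_ascii=False)
    # Stand-in for hash128: any stable 128-bit hash works for this sketch.
    return hashlib.blake2b(seed.encode("utf-8"), digest_size=16).hexdigest()


item = {"id": 27856, "title": "Headline", "body": "Story text"}
print(render("Title: {title}\nBody: {body}", item))
print(document_id(item, id_field="id"))
print(document_id(item))
```

Deriving the ID from a configured field keeps it stable when other fields of the item change, which matters for incremental sync; the content-based fallback changes whenever any field does.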
Type of change
Visual Screenshots of Features

(Connector can be configured within the external data sources tab)
Configuration Parameters:


Connection can be tested before attaching to dataset:

Ingestion tested with API connector (works perfectly fine):

Search & Retrieval works as well with metadata flow:
