Commit 1f5befa
feat: add Apache Iggy connector (#1969)
* Enable CocoIndex to integrate with Apache Iggy streams
Iggy uses stream/topic/partition addressing and payload-only messages rather than Kafka's topic/key/tombstone model. The connector maps that API into CocoIndex live streams and target states while preserving downstream-ready offset storage for source consumption.
Constraint: Iggy Python SDK does not expose Kafka-style message keys, tombstones, assignment callbacks, or per-partition watermarks.
Rejected: Blind Kafka connector port | would silently mis-handle keys, deletes, and readiness semantics.
Confidence: medium
Scope-risk: moderate
Directive: Do not weaken source readiness for multi-partition topics without an SDK-supported per-partition watermark or explicit initial_high_watermark.
Tested: uv run pytest python/tests/connectors/test_iggy_source.py
Tested: uv run ruff check python/cocoindex/connectors/iggy/_source.py python/cocoindex/connectors/iggy/_target.py python/tests/connectors/test_iggy_source.py
Tested: uv run mypy python/cocoindex/connectors/iggy/_source.py python/cocoindex/connectors/iggy/_target.py python/tests/connectors/test_iggy_source.py
Not-tested: Live integration against an Apache Iggy server.
* Keep Iggy connector tests formatter-clean
The CI prek action runs Ruff formatting across all files and rewrites the long async test signature in the Iggy connector test module. Commit the formatter output so the hook has no working-tree changes to report.
Constraint: CI runs prek 0.4.0 with ruff-format over all files
Confidence: high
Scope-risk: narrow
Reversibility: clean
Tested: uv run ruff format --check python/tests/connectors/test_iggy_source.py
Tested: uv run prek run --all-files
* Prevent duplicate Iggy deliveries during live consumption
A real Iggy broker can return an already-delivered offset while manual offset storage is still catching up. The source now tracks offsets delivered during a watch call and skips duplicate partition offsets before sending them downstream. The mocked regression mirrors the live broker sequence observed during the smoke test: 0, 1, 1, 2.
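As an illustration of the deduplication described above, here is a minimal sketch, not the connector's actual code. `PolledMessage` and `skip_delivered` are hypothetical names introduced for this example, and the sketch assumes each polled message exposes a partition id and an offset.

```python
from typing import Iterable, Iterator, NamedTuple


class PolledMessage(NamedTuple):
    """Hypothetical stand-in for a message polled from the broker."""

    partition_id: int
    offset: int
    payload: bytes


def skip_delivered(messages: Iterable[PolledMessage]) -> Iterator[PolledMessage]:
    """Yield each (partition, offset) pair at most once per call.

    The set lives only for the duration of one call, mirroring the idea of
    tracking offsets delivered during a single watch call.
    """
    delivered: set[tuple[int, int]] = set()
    for msg in messages:
        key = (msg.partition_id, msg.offset)
        if key in delivered:
            # The broker re-sent an offset we already passed downstream.
            continue
        delivered.add(key)
        yield msg


# Replays the sequence from the smoke test: offsets 0, 1, 1, 2 on one partition.
polled = [PolledMessage(1, o, b"x") for o in (0, 1, 1, 2)]
offsets = [m.offset for m in skip_delivered(polled)]
```

Keying on the (partition, offset) pair rather than the offset alone matters because offsets are only unique within a partition; the same offset number on two partitions refers to two different messages.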
Constraint: Iggy consumer groups use server-side offsets with manual store_offset calls
Rejected: Treat mocked source tests as sufficient | live broker returned a duplicate offset that mocks did not cover
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep live-broker smoke coverage in mind when changing Iggy offset handling
Tested: uv run pytest python/tests/connectors/test_iggy_source.py
Tested: real apache/iggy:latest broker smoke over TCP delivered offsets 0,1,2 through topic_as_stream
Tested: uv run ruff check python/cocoindex/connectors/iggy/_source.py python/tests/connectors/test_iggy_source.py
Tested: uv run mypy
Tested: uv run prek run --all-files
1 parent 4421582 commit 1f5befa
6 files changed
Lines changed: 1361 additions & 1 deletion
File tree
- python
- cocoindex/connectors/iggy
- tests/connectors