Commit 9ec9687

feat(planning): add phases 10-17 to v0.4.1 roadmap (#467)
* feat(planning): add phases 10-15 to v0.4.1 roadmap

  Map remaining v0.4.1 milestone issues to new phases:
  - Phase 10: Code cleanups (#461, #466)
  - Phase 11: Collection.ForkCount (#460)
  - Phase 12: Delete with limit (#439)
  - Phase 13: OpenRouter embeddings compatibility (#438)
  - Phase 14: Twelve Labs EF (#190)
  - Phase 15: Cloud RRF/GroupBy test coverage (#462)

* feat(planning): add phases 11-12, renumber 11-15 to 13-17, add #456 to phase 10

  New phases:
  - Phase 11: Fork double-close bug (#454)
  - Phase 12: SDK auto-wiring research (#455)

  Updated:
  - Phase 10 now includes registry test cleanup (#456)
  - Old phases 11-15 renumbered to 13-17
  - Phase 13 (ForkCount) depends on phases 11+12
1 parent 9556690 commit 9ec9687

2 files changed

Lines changed: 145 additions & 6 deletions

.planning/PROJECT.md

Lines changed: 11 additions & 3 deletions
@@ -22,7 +22,15 @@ Go applications can use Chroma and embedding providers through a stable, portabl

 ### Active

-None — all v0.4.1 milestone requirements validated.
+- Convenience constructors reduce Content API verbosity for common modality+source combinations — Phase 9
+- Duplicated path safety utilities consolidated, *context.Context anti-pattern fixed, registry test cleanup added — Phase 10 (issues #456, #461, #466)
+- Fork() double-close bug fixed for shared EF pointers — Phase 11 (issue #454)
+- SDK auto-wiring behavior traced and documented against official Chroma SDKs — Phase 12 (issue #455)
+- Collection.ForkCount provides Go parity with upstream /fork_count endpoint — Phase 13 (issue #460)
+- Delete operations support optional limit parameter matching upstream — Phase 14 (issue #439)
+- OpenAI embedding function supports OpenRouter provider preferences and encoding_format — Phase 15 (issue #438)
+- Twelve Labs multimodal embedding provider added — Phase 16 (issue #190)
+- Cloud integration tests cover Search API RRF and GroupBy primitives end-to-end — Phase 17 (issue #462)

 ### Recently Validated

@@ -32,7 +40,7 @@ None — all v0.4.1 milestone requirements validated.

 ### Out of Scope

-- Shipping every provider on the new multimodal contract in this milestone — Gemini and VoyageAI validate the foundation, remaining providers adopt later
+- Shipping every provider on the new multimodal contract in this milestone — Gemini, VoyageAI, and Twelve Labs are in scope; remaining providers adopt later
 - Replacing or removing existing `EmbeddingFunction` and image-only multimodal APIs — backwards compatibility is an explicit acceptance criterion
 - Changing collection/query semantics outside the embedding abstraction boundary — keep the milestone scoped to shared embedding foundations

@@ -64,4 +72,4 @@ None — all v0.4.1 milestone requirements validated.
 | Pivot Phase 7 from vLLM/Nemotron to VoyageAI | vLLM lacks NVOmniEmbedModel support; VoyageAI multimodal validates portability with text/image/video | ✓ Good |

 ---
-*Last updated: 2026-03-23 — v0.4.1 milestone complete. All 8 phases executed: shared contract, capabilities, registry, intent mapping, docs, Gemini multimodal, VoyageAI multimodal, provider documentation and changelog.*
+*Last updated: 2026-03-25 — v0.4.1 milestone expanded to 17 phases: added fork double-close bug (#454), SDK auto-wiring research (#455), registry test cleanup (#456) alongside prior additions.*

.planning/ROADMAP.md

Lines changed: 134 additions & 3 deletions
@@ -6,7 +6,7 @@ This roadmap initializes GSD planning for the current brownfield milestone focus

 ## Milestones

-- 🚧 **v0.4.1 Provider-Neutral Multimodal Foundations** - Phases 1-9 (current planning milestone)
+- 🚧 **v0.4.1 Provider-Neutral Multimodal Foundations** - Phases 1-17 (current planning milestone)

 ## v0.4.1 Provider-Neutral Multimodal Foundations

@@ -23,6 +23,14 @@ This roadmap initializes GSD planning for the current brownfield milestone focus
 - [x] **Phase 7: Voyage Multimodal Adoption** - Wire VoyageAI into the shared multimodal contract with text, image, and video support to validate the foundation end-to-end.
 - [x] **Phase 8: Document Gemini and VoyageAI multimodal embedding functions** - Update provider docs, add runnable examples, update README, create changelog. (completed 2026-03-23)
 - [ ] **Phase 9: Convenience Constructors and Documentation Polish** - Add shorthand constructors to reduce Content API verbosity and update docs.
+- [ ] **Phase 10: Code Cleanups** - Extract shared path safety utilities, fix *context.Context anti-pattern, add registry test cleanup. (issues #456, #461, #466)
+- [ ] **Phase 11: Fork Double-Close Bug** - Fix EF pointer sharing in Fork() that causes double-close on client.Close(). (issue #454)
+- [ ] **Phase 12: SDK Auto-Wiring Research** - Trace contentEmbeddingFunction auto-wiring behavior in official Chroma SDKs. (issue #455)
+- [ ] **Phase 13: Collection.ForkCount** - Add ForkCount endpoint support for upstream /fork_count API. (issue #460)
+- [ ] **Phase 14: Delete with Limit** - Add delete-with-limit support for upstream limit parameter. (issue #439)
+- [ ] **Phase 15: OpenRouter Embeddings Compatibility** - Add first-class OpenRouter support via provider preferences and encoding_format. (issue #438)
+- [ ] **Phase 16: Twelve Labs Embedding Function** - Add Twelve Labs multimodal embedding provider. (issue #190)
+- [ ] **Phase 17: Cloud RRF and GroupBy Test Coverage** - Add cloud integration tests for Search API RRF and GroupBy primitives. (issue #462)

 ## Phase Details

@@ -162,8 +170,17 @@ Plans:
 | 4. Provider Mapping and Explicit Failures | 2/2 | Complete | 2026-03-20 |
 | 5. Documentation and Verification | 2/2 | Complete | 2026-03-20 |
 | 6. Gemini Multimodal Adoption | 2/2 | Complete | 2026-03-20 |
-| 7. Voyage Multimodal Adoption | 0/2 | Planning complete | - |
-| 8. Document Gemini and VoyageAI | 0/2 | Planning complete | - |
+| 7. Voyage Multimodal Adoption | 2/2 | Complete | 2026-03-22 |
+| 8. Document Gemini and VoyageAI | 2/2 | Complete | 2026-03-23 |
+| 9. Convenience Constructors | 0/0 | Not started | - |
+| 10. Code Cleanups | 0/0 | Not started | - |
+| 11. Fork Double-Close Bug | 0/0 | Not started | - |
+| 12. SDK Auto-Wiring Research | 0/0 | Not started | - |
+| 13. Collection.ForkCount | 0/0 | Not started | - |
+| 14. Delete with Limit | 0/0 | Not started | - |
+| 15. OpenRouter Embeddings | 0/0 | Not started | - |
+| 16. Twelve Labs EF | 0/0 | Not started | - |
+| 17. Cloud RRF/GroupBy Tests | 0/0 | Not started | - |

 ### Phase 9: Convenience Constructors and Documentation Polish

@@ -179,3 +196,117 @@ Plans:

 Plans:
 - [ ] TBD (run /gsd:plan-phase 9 to break down)
+
+### Phase 10: Code Cleanups
+**Goal:** Consolidate duplicated path safety utilities into a shared internal package, fix the *context.Context pointer-to-interface anti-pattern across embedding providers, and add registry test cleanup to prevent global state leaks.
+**Depends on:** Phase 9
+**Issues**: #456, #461, #466
+**Success Criteria** (what must be TRUE):
+1. A shared `pkg/internal/pathutil` package provides `ContainsDotDot`, `ValidateFilePath`, and `SafePath` utilities.
+2. Gemini, Voyage, and default_ef use the shared path utilities instead of local duplicates.
+3. Gemini, Nomic, and Mistral use `context.Context` (not `*context.Context`) for DefaultContext.
+4. Registry tests use `t.Cleanup` with unregister helpers to prevent global state leaks.
+5. All existing tests pass without modification.
+**Plans:** 0 plans
+
+Plans:
+- [ ] TBD (run /gsd:plan-phase 10 to break down)
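
Phase 10's third success criterion asks providers to hold `context.Context` by value instead of `*context.Context`. The sketch below illustrates that change and is not code from the repository: `clientBefore`, `clientAfter`, `ctxKey`, and `contextOrDefault` are hypothetical names, and only the `DefaultContext` field name comes from the roadmap text. Since `context.Context` is an interface, a pointer to it adds an indirection and a second nil state to check without adding any capability.

```go
package main

import (
	"context"
	"fmt"
)

// clientBefore shows the anti-pattern: a pointer to an interface value.
// Callers must check both the pointer and the interface it points to.
type clientBefore struct {
	DefaultContext *context.Context
}

// clientAfter stores the interface directly; nil means "not configured".
type clientAfter struct {
	DefaultContext context.Context
}

// contextOrDefault (hypothetical helper) picks the first usable context:
// the caller's, then the client's default, then context.Background().
func (c *clientAfter) contextOrDefault(ctx context.Context) context.Context {
	if ctx != nil {
		return ctx
	}
	if c.DefaultContext != nil {
		return c.DefaultContext
	}
	return context.Background()
}

// ctxKey is a private key type for the demonstration value below.
type ctxKey string

func main() {
	// Before: the caller has to take the address of a local variable,
	// and readers need two nil checks.
	bg := context.Background()
	old := clientBefore{DefaultContext: &bg}
	fmt.Println(old.DefaultContext != nil && *old.DefaultContext != nil)

	// After: the interface value is assigned and read directly.
	def := context.WithValue(context.Background(), ctxKey("source"), "default")
	c := &clientAfter{DefaultContext: def}
	fmt.Println(c.contextOrDefault(nil).Value(ctxKey("source")))
}
```

With the value form, a zero-valued client still works because the helper falls back to `context.Background()`.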
+
+### Phase 11: Fork Double-Close Bug
+**Goal:** Fix EF pointer sharing in Fork() that causes the same underlying embedding function resource to be closed twice when client.Close() iterates cached collections.
+**Depends on:** None (independent, but should precede ForkCount work)
+**Issues**: #454
+**Success Criteria** (what must be TRUE):
+1. Forked collections do not double-close shared EF resources when client.Close() is called.
+2. Both `embeddingFunction` and `contentEmbeddingFunction` ownership is handled correctly.
+3. Tests cover Fork + Close lifecycle without panics or use-after-close errors.
+4. Existing fork tests continue to pass.
+**Plans:** 0 plans
+
+Plans:
+- [ ] TBD (run /gsd:plan-phase 11 to break down)
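
The double-close failure described in Phase 11 can be reproduced in miniature, along with one possible guard. This is a sketch under stated assumptions, not the fix planned for issue #454: `countingEF`, `closeOnce`, `collection`, `fork`, and `closeAll` are all hypothetical stand-ins, and wrapping the shared closer in `sync.Once` is only one option (explicit ownership flags or reference counting are others).

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"sync"
)

// countingEF stands in for an embedding function that owns a resource
// and reports an error if it is closed more than once.
type countingEF struct{ closes int }

func (e *countingEF) Close() error {
	e.closes++
	if e.closes > 1 {
		return errors.New("embedding function closed twice")
	}
	return nil
}

// closeOnce forwards the first Close call to the wrapped closer and
// returns that same result for every later call.
type closeOnce struct {
	once sync.Once
	c    io.Closer
	err  error
}

func (o *closeOnce) Close() error {
	o.once.Do(func() { o.err = o.c.Close() })
	return o.err
}

// collection is a hypothetical stand-in for a cached collection.
type collection struct {
	name string
	ef   io.Closer
}

// fork copies the collection and shares the same closer, mirroring the
// pointer sharing described in the phase goal.
func (c *collection) fork(name string) *collection {
	return &collection{name: name, ef: c.ef}
}

// closeAll mimics a client closing every cached collection in turn.
func closeAll(cols ...*collection) error {
	var errs []error
	for _, c := range cols {
		if err := c.ef.Close(); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", c.name, err))
		}
	}
	return errors.Join(errs...) // nil when errs is empty (Go 1.20+)
}

func main() {
	ef := &countingEF{}
	parent := &collection{name: "parent", ef: &closeOnce{c: ef}}
	child := parent.fork("child")
	fmt.Println(closeAll(parent, child), ef.closes)
}
```

Without the `closeOnce` wrapper, the same `closeAll` call would reach `countingEF.Close` twice and surface the error from the second call.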
+
+### Phase 12: SDK Auto-Wiring Research
+**Goal:** Trace contentEmbeddingFunction auto-wiring behavior in official Chroma SDKs (Python, JavaScript) to verify chroma-go's approach is consistent or document deliberate differences.
+**Depends on:** None (research task, informs Phase 13)
+**Issues**: #455
+**Success Criteria** (what must be TRUE):
+1. Python SDK auto-wiring behavior documented for get_collection, list_collections, and create_collection.
+2. JavaScript SDK auto-wiring behavior documented for equivalent operations.
+3. Comparison with chroma-go behavior written up with any recommended changes or documented differences.
+**Plans:** 0 plans
+
+Plans:
+- [ ] TBD (run /gsd:plan-phase 12 to break down)
+
+### Phase 13: Collection.ForkCount
+**Goal:** Add `ForkCount(ctx) (int, error)` to the V2 Collection interface with HTTP transport support, matching upstream Chroma's /fork_count endpoint.
+**Depends on:** Phase 11, Phase 12 (benefits from fork bug fix and SDK research)
+**Issues**: #460
+**Success Criteria** (what must be TRUE):
+1. `pkg/api/v2.Collection` includes `ForkCount(ctx context.Context) (int, error)`.
+2. HTTP implementation issues `GET .../fork_count` and decodes `{"count": n}`.
+3. Embedded/local behavior returns an explicit unsupported error.
+4. Tests cover HTTP happy path, failure path, and embedded unsupported path.
+5. Forking docs mention the new method.
+**Plans:** 0 plans
+
+Plans:
+- [ ] TBD (run /gsd:plan-phase 13 to break down)
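
The HTTP half of Phase 13 (a `GET` whose response decodes as `{"count": n}`) might look roughly like the sketch below. It does not use the chroma-go client: `decodeForkCount` and `fetchForkCount` are hypothetical helpers, the caller supplies the full endpoint URL because the roadmap elides the path prefix, and the stub server path is a placeholder only.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// decodeForkCount parses a body of the form {"count": n}. A pointer
// field lets a missing "count" be told apart from a count of zero.
func decodeForkCount(body []byte) (int, error) {
	var payload struct {
		Count *int `json:"count"`
	}
	if err := json.Unmarshal(body, &payload); err != nil {
		return 0, fmt.Errorf("decode fork count: %w", err)
	}
	if payload.Count == nil {
		return 0, errors.New("decode fork count: missing count field")
	}
	return *payload.Count, nil
}

// fetchForkCount issues a GET against endpoint (the full fork_count
// URL, supplied by the caller) and decodes the response body.
func fetchForkCount(ctx context.Context, client *http.Client, endpoint string) (int, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, endpoint, nil)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, err
	}
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("fork count: unexpected status %d", resp.StatusCode)
	}
	return decodeForkCount(body)
}

func main() {
	// Stub server standing in for a Chroma instance.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"count": 3}`)
	}))
	defer srv.Close()

	n, err := fetchForkCount(context.Background(), srv.Client(), srv.URL+"/fork_count")
	fmt.Println(n, err)
}
```

The same `httptest` stub can return a non-200 status or malformed JSON to exercise the failure path named in the success criteria.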
+
+### Phase 14: Delete with Limit
+**Goal:** Add limit parameter support to collection delete operations, matching upstream Chroma PRs #6573/#6582.
+**Depends on:** None (independent)
+**Issues**: #439
+**Success Criteria** (what must be TRUE):
+1. Delete operations accept an optional limit parameter.
+2. HTTP transport sends the limit when specified.
+3. Tests cover delete-with-limit happy path and edge cases.
+**Plans:** 0 plans
+
+Plans:
+- [ ] TBD (run /gsd:plan-phase 14 to break down)
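
One way to satisfy "sends the limit when specified" is a pointer field with `omitempty`, so an unset limit never reaches the wire. The sketch below assumes a functional-options style: `deleteRequest`, `DeleteOption`, `WithWhere`, `WithLimit`, and `buildDeleteBody` are hypothetical names, the JSON keys `where` and `limit` are assumptions about the payload, and rejecting non-positive limits is an illustrative choice rather than documented upstream behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// deleteRequest is a hypothetical wire payload. Limit is a pointer so
// that "not set" is distinguishable and omitted from the JSON.
type deleteRequest struct {
	Where map[string]any `json:"where,omitempty"`
	Limit *int           `json:"limit,omitempty"`
}

// DeleteOption mutates a request and may reject invalid input.
type DeleteOption func(*deleteRequest) error

// WithWhere sets a metadata filter on the request.
func WithWhere(where map[string]any) DeleteOption {
	return func(r *deleteRequest) error {
		r.Where = where
		return nil
	}
}

// WithLimit caps how many records a delete may remove. Rejecting
// non-positive values is an assumption made for this sketch.
func WithLimit(n int) DeleteOption {
	return func(r *deleteRequest) error {
		if n <= 0 {
			return errors.New("limit must be positive")
		}
		r.Limit = &n
		return nil
	}
}

// buildDeleteBody applies the options and serializes the payload.
func buildDeleteBody(opts ...DeleteOption) ([]byte, error) {
	req := &deleteRequest{}
	for _, opt := range opts {
		if err := opt(req); err != nil {
			return nil, err
		}
	}
	return json.Marshal(req)
}

func main() {
	where := map[string]any{"status": "stale"}
	without, _ := buildDeleteBody(WithWhere(where))
	with, _ := buildDeleteBody(WithWhere(where), WithLimit(100))
	fmt.Println(string(without)) // {"where":{"status":"stale"}}
	fmt.Println(string(with))    // {"where":{"status":"stale"},"limit":100}
}
```

Keeping the field optional means callers that never pass a limit produce the same request body as before, which matters for backwards compatibility.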
+
+### Phase 15: OpenRouter Embeddings Compatibility
+**Goal:** Extend the OpenAI embedding function to support OpenRouter-specific fields (encoding_format, input_type, provider preferences) and relax model validation for provider-prefixed IDs.
+**Depends on:** None (independent)
+**Issues**: #438
+**Success Criteria** (what must be TRUE):
+1. `CreateEmbeddingRequest` supports `encoding_format`, `input_type`, and `provider` fields.
+2. `WithModel` accepts provider-prefixed model IDs (e.g. `openai/text-embedding-3-small`).
+3. Provider preferences struct covers documented OpenRouter fields with extensibility.
+4. Existing OpenAI behavior and tests remain unchanged.
+5. Docs include OpenRouter usage example with `WithBaseURL`.
+**Plans:** 0 plans
+
+Plans:
+- [ ] TBD (run /gsd:plan-phase 15 to break down)
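
A possible shape for the request described in Phase 15 is sketched below. The JSON keys `encoding_format`, `input_type`, and `provider` come from the roadmap; everything else is an assumption: the struct layout, the `ProviderPreferences` fields (`order`, `allow_fallbacks`, and the `sort` value used in the usage note are taken to be OpenRouter routing options), the `Extra` map as the extensibility mechanism, and the `encode` helper. Optional fields use `omitempty` so a plain OpenAI request serializes exactly as it did before.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProviderPreferences sketches OpenRouter provider routing options.
// Extra carries fields this struct does not model explicitly.
type ProviderPreferences struct {
	Order          []string
	AllowFallbacks *bool
	Extra          map[string]any
}

// MarshalJSON merges Extra with the typed fields; typed fields win on
// key conflicts. Map keys are emitted in sorted order by encoding/json.
func (p ProviderPreferences) MarshalJSON() ([]byte, error) {
	out := make(map[string]any, len(p.Extra)+2)
	for k, v := range p.Extra {
		out[k] = v
	}
	if len(p.Order) > 0 {
		out["order"] = p.Order
	}
	if p.AllowFallbacks != nil {
		out["allow_fallbacks"] = *p.AllowFallbacks
	}
	return json.Marshal(out)
}

// CreateEmbeddingRequest is an assumed layout, not the library's type.
type CreateEmbeddingRequest struct {
	Model          string               `json:"model"`
	Input          []string             `json:"input"`
	EncodingFormat string               `json:"encoding_format,omitempty"`
	InputType      string               `json:"input_type,omitempty"`
	Provider       *ProviderPreferences `json:"provider,omitempty"`
}

// encode serializes a request to its JSON wire form.
func encode(r CreateEmbeddingRequest) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	plain := CreateEmbeddingRequest{Model: "text-embedding-3-small", Input: []string{"hello"}}
	noFallback := false
	routed := CreateEmbeddingRequest{
		Model:          "openai/text-embedding-3-small",
		Input:          []string{"hello"},
		EncodingFormat: "float",
		Provider:       &ProviderPreferences{Order: []string{"openai"}, AllowFallbacks: &noFallback},
	}
	for _, r := range []CreateEmbeddingRequest{plain, routed} {
		s, err := encode(r)
		if err != nil {
			panic(err)
		}
		fmt.Println(s)
	}
}
```

A field not yet modeled can be passed through `Extra`, for example `Extra: map[string]any{"sort": "price"}`, without changing the struct.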
+
+### Phase 16: Twelve Labs Embedding Function
+**Goal:** Add a new Twelve Labs multimodal embedding provider supporting text, image, and audio embeddings via the Twelve Labs API.
+**Depends on:** Phase 9 (benefits from Content API foundations)
+**Issues**: #190
+**Success Criteria** (what must be TRUE):
+1. `pkg/embeddings/twelvelabs` implements dense embedding and Content API interfaces.
+2. Supports text, image, and audio modalities per Twelve Labs API docs.
+3. Registered in factory/registry with config round-trip support.
+4. Tests cover request construction, modality validation, and config persistence.
+5. Docs and examples added for Twelve Labs provider.
+**Plans:** 0 plans
+
+Plans:
+- [ ] TBD (run /gsd:plan-phase 16 to break down)
+
+### Phase 17: Cloud RRF and GroupBy Test Coverage
+**Goal:** Add end-to-end cloud integration tests that exercise Search API RRF and GroupBy primitives against live Chroma Cloud.
+**Depends on:** None (independent, but best run last as test hardening)
+**Issues**: #462
+**Success Criteria** (what must be TRUE):
+1. RRF smoke test using dense + sparse KNN ranks with `WithKnnReturnRank`.
+2. RRF weighted/custom-k test proves request acceptance and ordering changes.
+3. GroupBy MinK/MaxK tests assert per-group caps and flattened limits.
+4. All tests tagged `cloud` and use existing cloud test infrastructure.
+**Plans:** 0 plans
+
+Plans:
+- [ ] TBD (run /gsd:plan-phase 17 to break down)
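
Phase 17's second criterion expects weights and a custom `k` to change result ordering. The sketch below shows why, using the textbook weighted Reciprocal Rank Fusion formula, score(d) = Σᵢ wᵢ / (k + rankᵢ(d)). It is a local illustration only: `rrfOrder` is a hypothetical helper, the ranks are made up, and Chroma's server-side scoring may use different conventions (for example, sign or rank base).

```go
package main

import (
	"fmt"
	"sort"
)

// rrfOrder fuses several rankings (document ID -> 1-based rank) with
// weighted Reciprocal Rank Fusion and returns IDs from best to worst.
// Documents missing from a ranking contribute nothing for that ranking.
// weights must have one entry per ranking.
func rrfOrder(rankings []map[string]int, weights []float64, k float64) []string {
	scores := map[string]float64{}
	for i, ranking := range rankings {
		for id, rank := range ranking {
			scores[id] += weights[i] / (k + float64(rank))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Higher fused score first; ties are broken by ID for determinism.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	dense := map[string]int{"a": 1, "b": 2, "c": 3}
	sparse := map[string]int{"b": 1, "c": 2, "a": 3}
	rankings := []map[string]int{dense, sparse}

	fmt.Println(rrfOrder(rankings, []float64{1, 1}, 60)) // [b a c]
	fmt.Println(rrfOrder(rankings, []float64{3, 1}, 60)) // [a b c]
}
```

With equal weights, `b` wins because it ranks well in both lists; tripling the dense weight promotes `a`, the dense top hit. An end-to-end test would look for this kind of ordering shift.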
