Open
Labels
L - Large (introducing a new module, implementing major features, adjusting system architecture); enhancement (new feature or request)
Description
Summary
Expose public APIs for multi-table management built on the existing internal CatalogCodec. Enable users to create, list, and manage multiple tables within a single Tonbo instance, each with its own schema.
Why
Agent data naturally separates into distinct schemas:
| Data Type | Schema Example |
|---|---|
| Trajectory | (run_id, step_id, timestamp, action, observation, reasoning) |
| State | (run_id, checkpoint_name, state_blob, created_at) |
| Artifact | (artifact_id, run_id, content_type, data, hash) |
| Telemetry | (timestamp, span_id, trace_id, level, message, attributes) |
Current state:
- Internal `CatalogCodec` exists in manifest
- Per-table `TableId` and manifest entries work
- `register_table()` with schema validation exists
- No public APIs exposed
Without multi-table support, users must either:
- Encode all data types into one wide schema (awkward, inefficient)
- Run separate Tonbo instances per data type (resource duplication)
- Use different storage systems for different data types
What
Public API Surface
- Create table: register a new table with schema and primary key definition
- List tables: enumerate tables in the catalog
- Get table metadata: retrieve schema, stats, version info
- Drop table: remove a table and schedule its data for GC
- Open table handle: obtain a table-specific read/write interface
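The operations above could be sketched as a small in-memory catalog. This is only an illustration of the shape of the API, not the proposed implementation: `Catalog`, `TableSchema`, `TableMeta`, `TableHandle`, `CatalogError`, and every method name are hypothetical, and a real version would persist through the manifest rather than a `BTreeMap`.

```rust
use std::collections::BTreeMap;

/// Hypothetical schema description; a real one would carry typed columns.
#[derive(Debug, Clone, PartialEq)]
pub struct TableSchema {
    pub columns: Vec<String>,
    pub primary_key: String,
}

/// Hypothetical per-table metadata returned by "get table metadata".
#[derive(Debug, Clone, PartialEq)]
pub struct TableMeta {
    pub id: u64,
    pub name: String,
    pub schema: TableSchema,
}

/// Hypothetical table-scoped handle; reads and writes would hang off this.
#[derive(Debug, PartialEq)]
pub struct TableHandle {
    pub table_id: u64,
}

#[derive(Debug, PartialEq)]
pub enum CatalogError {
    AlreadyExists(String),
    NotFound(String),
    InvalidPrimaryKey(String),
}

/// In-memory stand-in for the catalog (assumption: names are unique strings).
#[derive(Default)]
pub struct Catalog {
    next_id: u64,
    tables: BTreeMap<String, TableMeta>,
    pending_gc: Vec<u64>,
}

impl Catalog {
    /// Create table: validate the primary key and assign a fresh id.
    pub fn create_table(&mut self, name: &str, schema: TableSchema) -> Result<u64, CatalogError> {
        if self.tables.contains_key(name) {
            return Err(CatalogError::AlreadyExists(name.to_string()));
        }
        if !schema.columns.contains(&schema.primary_key) {
            return Err(CatalogError::InvalidPrimaryKey(schema.primary_key.clone()));
        }
        let id = self.next_id;
        self.next_id += 1;
        let meta = TableMeta { id, name: name.to_string(), schema };
        self.tables.insert(name.to_string(), meta);
        Ok(id)
    }

    /// List tables: names in sorted order.
    pub fn list_tables(&self) -> Vec<&str> {
        self.tables.keys().map(String::as_str).collect()
    }

    /// Get table metadata.
    pub fn table_meta(&self, name: &str) -> Result<&TableMeta, CatalogError> {
        self.tables.get(name).ok_or_else(|| CatalogError::NotFound(name.to_string()))
    }

    /// Drop table: remove the entry and queue its id for later GC.
    pub fn drop_table(&mut self, name: &str) -> Result<(), CatalogError> {
        let meta = self
            .tables
            .remove(name)
            .ok_or_else(|| CatalogError::NotFound(name.to_string()))?;
        self.pending_gc.push(meta.id);
        Ok(())
    }

    /// Open table handle: resolve a name to a table-scoped handle.
    pub fn open_table(&self, name: &str) -> Result<TableHandle, CatalogError> {
        Ok(TableHandle { table_id: self.table_meta(name)?.id })
    }

    /// Table ids whose data is scheduled for GC.
    pub fn pending_gc(&self) -> &[u64] {
        &self.pending_gc
    }
}
```

Dropping a table here only records its id for GC; deciding when the underlying data is actually reclaimed is left to the GC question below.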
Resource Sharing
- Shared executor, I/O handles, and caches across tables
- Per-table WAL and SST namespacing
- Per-table manifest entries within unified catalog
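One way to picture per-table WAL and SST namespacing under a shared root is a path layout keyed by table id. The sketch below is an assumption for illustration only: the directory names, the zero-padded id format, the `.sst` extension, and the `TableLayout` type are all hypothetical, and the `TableId` shown is a stand-in rather than the existing internal type.

```rust
use std::path::PathBuf;

/// Stand-in for the internal table identifier (hypothetical definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TableId(pub u64);

/// Hypothetical layout helper: one shared root, one subtree per table.
pub struct TableLayout {
    root: PathBuf,
}

impl TableLayout {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }

    /// Every per-table object lives under `<root>/tables/<padded id>/`.
    fn table_dir(&self, table: TableId) -> PathBuf {
        self.root.join("tables").join(format!("{:08}", table.0))
    }

    /// Per-table WAL directory.
    pub fn wal_dir(&self, table: TableId) -> PathBuf {
        self.table_dir(table).join("wal")
    }

    /// Per-table SST path, further split by compaction level.
    pub fn sst_path(&self, table: TableId, level: u8, sst_id: u64) -> PathBuf {
        self.table_dir(table)
            .join("sst")
            .join(format!("L{}", level))
            .join(format!("{:08}.sst", sst_id))
    }

    /// The unified catalog and manifest sit outside any table subtree.
    pub fn manifest_dir(&self) -> PathBuf {
        self.root.join("manifest")
    }
}
```

With a layout like this, dropping a table maps to reclaiming a single prefix, while the executor, I/O handles, and caches stay shared because they are not tied to any path.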
Open Questions
| Area | Question | Options |
|---|---|---|
| Transactions | Cross-table transaction support? | Single-table only / Multi-table atomic / Saga pattern |
| Consistency | Catalog DDL atomicity? | Async eventual / Synchronous with manifest CAS |
| Namespace | Table identification? | String name / UUID / Hierarchical path |
| Schema | Cross-table references? | None / Soft references / Foreign keys |
| Isolation | Failure blast radius? | Shared-fate / Per-table isolation |
| WAL | Per-table WAL or shared? | Separate files / Tagged entries in shared WAL |
| Compaction | Scheduling scope? | Global queue / Per-table budgets |
| GC | Cross-table dependencies? | Independent / Coordinated watermarks |
| Quotas | Resource limits? | None / Per-table storage/throughput caps |
| Versioning | Snapshot scope? | Per-table / Cross-table consistent snapshot |
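To make the "Synchronous with manifest CAS" option for catalog DDL concrete, here is a minimal sketch of a versioned compare-and-swap. It is an in-memory model under stated assumptions: `CatalogManifest`, `DdlEdit`, and `CasError` are hypothetical names, and a real implementation would perform the conditional write against the manifest store rather than a local struct.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
pub enum CasError {
    /// The caller planned its edit against a stale catalog version.
    VersionMismatch { expected: u64, actual: u64 },
}

/// Hypothetical DDL edits applied to the catalog.
#[derive(Debug, Clone)]
pub enum DdlEdit {
    CreateTable(String),
    DropTable(String),
}

/// Hypothetical catalog state guarded by a monotonically increasing version.
#[derive(Debug, Default)]
pub struct CatalogManifest {
    version: u64,
    tables: BTreeSet<String>,
}

impl CatalogManifest {
    pub fn version(&self) -> u64 {
        self.version
    }

    pub fn contains(&self, name: &str) -> bool {
        self.tables.contains(name)
    }

    /// Apply `edit` only if the catalog is still at `expected`.
    /// On success the version advances by one and the new version is returned.
    pub fn compare_and_swap(&mut self, expected: u64, edit: DdlEdit) -> Result<u64, CasError> {
        if self.version != expected {
            return Err(CasError::VersionMismatch { expected, actual: self.version });
        }
        match edit {
            DdlEdit::CreateTable(name) => {
                self.tables.insert(name);
            }
            DdlEdit::DropTable(name) => {
                self.tables.remove(&name);
            }
        }
        self.version += 1;
        Ok(self.version)
    }
}
```

A writer that loses the race sees `VersionMismatch`, re-reads the catalog, and retries, so every DDL change is either fully visible or not applied at all.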
Key Trade-off
| Single-table transactions | Multi-table transactions |
|---|---|
| Simpler manifest | Complex coordination |
| Per-table CAS | Cross-table CAS or 2PC |
| Independent GC | Coordinated watermarks |
| Easier to implement | Agent workflows may need this |
Success Criteria
- Design decisions documented for open questions above
- Public `Catalog` API for table lifecycle (create, list, get, drop)
- `DB::open_table(name)` returns table-scoped handle
- Multiple tables share single `TonboManifest` instance
- Per-table SST namespacing
- Integration tests with multiple concurrent tables
Non-Goals
- Cross-table joins (push to query engine layer)
- Fine-grained access control (future work)
References
- `src/manifest/catalog.rs`: internal `CatalogCodec`
- `docs/rethink-summary.md`: Multi-table marked "in progress"
- Epic: major compaction improvement #550: Compaction must handle per-table scheduling
- Implement manifest-driven GC worker and harden WAL reconciliation #547: GC must handle per-table object cleanup