
Commit 4d2853f

Donach and claude committed
icm-store: minimal-diff libSQL/Turso overlay (alias dbcompat as rusqlite)
Re-architect the Turso backend as a thin overlay to minimise the diff against upstream (store.rs is upstream's most-churned file → fewer rebase conflicts).

dbcompat is now a faithful rusqlite-0.34-shaped drop-in (exact Params model: empty [], [T;1..32], &[&dyn ToSql]; params! -> &[&dyn ToSql]; phantom-lifetime Row<'_>; types::ToSql).

store.rs/schema.rs are byte-identical to upstream except 'use crate::dbcompat as rusqlite;' plus the connection-open path (open_backend + remote PRAGMA guard) — store.rs touched lines: 206 -> ~15 + one isolated fn; schema.rs: 34 -> 3.

Tests: 161 pass / 1 perf-test fail (same as before).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
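The alias trick described in the commit message can be illustrated with a toy module. This is a minimal sketch, not the fork's code: the `dbcompat` module below is a hypothetical, heavily simplified stand-in, the toy `ToSql` just renders SQL literals, `Connection::execute` only records what it was given, and array support stops at length 2 where the commit covers 1..32. The `memories` table and `demo` function are made up for illustration. What it shows is that once the module is imported under the name `rusqlite`, rusqlite-style call sites compile without changes, including the three parameter shapes the message lists.

```rust
// Hypothetical, simplified stand-in for the fork's `dbcompat` module.
mod dbcompat {
    pub mod types {
        /// Toy `ToSql`: renders a value as a SQL literal (the real trait differs).
        pub trait ToSql {
            fn to_sql(&self) -> String;
        }
        impl ToSql for i64 {
            fn to_sql(&self) -> String {
                self.to_string()
            }
        }
        impl ToSql for &str {
            fn to_sql(&self) -> String {
                format!("'{}'", self)
            }
        }
    }

    use self::types::ToSql;

    /// The three parameter shapes named in the commit message.
    pub trait Params {
        fn bind(self) -> Vec<String>;
    }

    // Empty `[]`: the only zero-length impl, so a bare `[]` infers to this type.
    impl Params for [&dyn ToSql; 0] {
        fn bind(self) -> Vec<String> {
            Vec::new()
        }
    }

    // Homogeneous arrays; only lengths 1 and 2 are sketched here.
    impl<T: ToSql> Params for [T; 1] {
        fn bind(self) -> Vec<String> {
            self.iter().map(|v| v.to_sql()).collect()
        }
    }
    impl<T: ToSql> Params for [T; 2] {
        fn bind(self) -> Vec<String> {
            self.iter().map(|v| v.to_sql()).collect()
        }
    }

    // Heterogeneous slice of trait objects.
    impl Params for &[&dyn ToSql] {
        fn bind(self) -> Vec<String> {
            self.iter().map(|v| v.to_sql()).collect()
        }
    }

    /// Toy connection: records each statement instead of talking to a database.
    #[derive(Default)]
    pub struct Connection {
        pub log: Vec<(String, Vec<String>)>,
    }

    impl Connection {
        /// Returns the number of bound parameters (an illustration-only choice).
        pub fn execute<P: Params>(&mut self, sql: &str, params: P) -> usize {
            let bound = params.bind();
            let count = bound.len();
            self.log.push((sql.to_string(), bound));
            count
        }
    }
}

// The one-line alias: everything below is written as if against rusqlite.
use dbcompat as rusqlite;
use rusqlite::types::ToSql;

fn demo() -> (usize, usize, usize) {
    let mut conn = rusqlite::Connection::default();
    let topic = "notes";
    let weight = 3i64;

    let none = conn.execute("DELETE FROM memories", []);
    let one = conn.execute("INSERT INTO memories (topic) VALUES (?1)", [topic]);
    let mixed: &[&dyn ToSql] = &[&topic as &dyn ToSql, &weight as &dyn ToSql];
    let two = conn.execute("UPDATE memories SET weight = ?2 WHERE topic = ?1", mixed);
    (none, one, two)
}
```

Calling `demo()` exercises all three shapes against the aliased module. Inside a crate, the alias line would read `use crate::dbcompat as rusqlite;`, as quoted in the commit message.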
1 parent 804dac2 commit 4d2853f

10 files changed

Lines changed: 1790 additions & 146 deletions


Cargo.lock

Lines changed: 1052 additions & 121 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 4 additions & 0 deletions
@@ -19,6 +19,10 @@ strip = true
 rusqlite = { version = "0.34", features = ["bundled", "modern_sqlite"] }
 sqlite-vec = "0.1"
 zerocopy = { version = "0.8", features = ["derive"] }
+# libSQL / Turso backend (fork): remote (sqld) + embedded replica + sync.
+libsql = { version = "0.9", default-features = false, features = ["core", "replication", "remote", "sync", "tls"] }
+libsql-ffi = "0.9"
+once_cell = "1"
 
 # Embeddings (optional)
 fastembed = "4"

TURSO.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+# ICM — libSQL / Turso backend (fork)
+
+This fork lets ICM store its memory in a **libSQL / Turso** database instead of a
+single local SQLite file, so **multiple machines / processes can read and write
+the same memory concurrently** — the server serialises writes, so there are no
+copies to merge and no SQLite-over-NFS corruption.
+
+Upstream ICM is plain `rusqlite` (one local file). The change is isolated to the
+`icm-store` crate: a thin synchronous facade (`src/dbcompat.rs`) that mirrors the
+slice of the rusqlite API ICM uses, but drives the async `libsql` client under the
+hood. ICM's 6,300 lines of SQL logic are unchanged.
+
+## Backends (chosen by environment)
+
+| Env | Backend | Use |
+|-----|---------|-----|
+| *(none)* | **Local** SQLite file (`--db` / default path) | unchanged single-machine behaviour |
+| `TURSO_DATABASE_URL` (or `LIBSQL_URL`) | **Remote** libSQL/Turso server | **recommended for multi-writer** — every ICM process shares one server |
+| `…URL` + `ICM_TURSO_REPLICA=1` | **Embedded replica** (local file synced to primary) | local-first reads; see limitations |
+
+Auth token: `TURSO_AUTH_TOKEN` (or `LIBSQL_AUTH_TOKEN`); empty is fine for an
+unauthenticated self-hosted `sqld`.
+
+## Run a self-hosted primary (`sqld`) with vector search
+
+ICM's schema uses the `vec0` virtual table (sqlite-vec), so the **server** must
+load that extension (the client doesn't — remote queries run server-side):
+
+```bash
+# 1. grab the sqlite-vec loadable extension matching the crate version (0.1.6)
+mkdir -p ~/.icm-ext && cd ~/.icm-ext
+curl -fsSL https://github.com/asg017/sqlite-vec/releases/download/v0.1.6/sqlite-vec-0.1.6-loadable-linux-x86_64.tar.gz | tar xz
+sha256sum vec0.so > trusted.lst # sqld trusts extensions listed here
+
+# 2. run the server (sqld is in nixpkgs)
+nix run nixpkgs#sqld -- --db-path ~/.icm/primary.sqld \
+--http-listen-addr 0.0.0.0:8080 --extensions-path ~/.icm-ext
+```
+
+Then point every ICM client at it:
+
+```bash
+export TURSO_DATABASE_URL=http://<server-host>:8080
+icm store --topic notes --content "shared across machines"
+icm recall "shared"
+```
+
+## Verified
+
+- Local store/recall: ✅
+- Remote (sqld): store/recall ✅, sqlite-vec loaded server-side ✅
+- **Concurrency: 16 independent `icm` processes writing at once → 16/16 stored,
+zero lost, no corruption**
+
+## Known limitations
+
+- **Embedded replica + vector schema:** libSQL embedded replicas don't forward the
+`vec0` `CREATE VIRTUAL TABLE` DDL (`unsupported statement`). Use **remote** mode
+for the vector schema; embedded replica is best for keyword-only or read-heavy
+use. (Remote mode is the recommended multi-writer setup anyway.)
+- **Client-side embeddings:** generating embeddings needs the fastembed model
+(downloaded on first use). For pure keyword use, pass `--no-embeddings`. Vector
+*search* still runs server-side where `vec0` is loaded.
+- **Drop-on-exit warning:** a benign `libsql::hrana … no runtime was available`
+line can appear at process exit (the write already committed); it's the async
+client closing after the sync shim's runtime context ends.
+
+## Build / run
+
+Needs `libstdc++` and `openssl` at runtime (NixOS keeps them in the store):
+
+```bash
+cargo build --release
+LD_LIBRARY_PATH="$(nix eval --raw nixpkgs#stdenv.cc.cc.lib)/lib:$(nix eval --raw nixpkgs#openssl.out)/lib" \
+./target/release/icm ...
+```
+
+A `flake.nix` is provided for a properly-wrapped Nix build (no `LD_LIBRARY_PATH`
+needed).
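The "Backends (chosen by environment)" table in TURSO.md reads naturally as a small decision function. The following is a minimal sketch of that table only, not the fork's `open_backend`: the names `Backend`, `choose_backend` and `backend_from_env` are hypothetical, and the precedence of `TURSO_*` over `LIBSQL_*` when both are set is an assumption the README does not state. The lookup is passed in as a closure so the logic can be exercised without touching the real process environment.

```rust
/// Hypothetical model of the three backends listed in TURSO.md.
#[derive(Debug, PartialEq)]
enum Backend {
    /// No URL configured: plain local SQLite file.
    Local,
    /// URL configured: every process talks to one libSQL/Turso server.
    Remote { url: String, auth_token: String },
    /// URL plus ICM_TURSO_REPLICA=1: local file synced to the primary.
    EmbeddedReplica { url: String, auth_token: String },
}

/// Picks a backend from an environment-like lookup.
/// Assumption: TURSO_* wins over LIBSQL_* when both are present.
fn choose_backend<F>(get: F) -> Backend
where
    F: Fn(&str) -> Option<String>,
{
    let url = match get("TURSO_DATABASE_URL").or_else(|| get("LIBSQL_URL")) {
        Some(url) => url,
        None => return Backend::Local,
    };

    // An empty token is acceptable for an unauthenticated self-hosted sqld.
    let auth_token = get("TURSO_AUTH_TOKEN")
        .or_else(|| get("LIBSQL_AUTH_TOKEN"))
        .unwrap_or_default();

    if get("ICM_TURSO_REPLICA").as_deref() == Some("1") {
        Backend::EmbeddedReplica { url, auth_token }
    } else {
        Backend::Remote { url, auth_token }
    }
}

/// Convenience wrapper that reads the real process environment.
#[allow(dead_code)]
fn backend_from_env() -> Backend {
    choose_backend(|name| std::env::var(name).ok())
}
```

With no variables set, `choose_backend` returns `Backend::Local`, which matches the "unchanged single-machine behaviour" row. Setting only a URL selects `Remote`, and adding `ICM_TURSO_REPLICA=1` selects `EmbeddedReplica`.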

crates/icm-store/Cargo.toml

Lines changed: 4 additions & 1 deletion
@@ -5,7 +5,10 @@ edition = "2021"
 
 [dependencies]
 icm-core = { path = "../icm-core" }
-rusqlite = { workspace = true }
+libsql = { workspace = true }
+libsql-ffi = { workspace = true }
+once_cell = { workspace = true }
+tokio = { workspace = true }
 sqlite-vec = { workspace = true }
 zerocopy = { workspace = true }
 serde_json = { workspace = true }
