docs/LIP_SPEC.mdx
+21 -2 (21 additions & 2 deletions)
@@ -1100,6 +1100,25 @@ lip-protocol/
 - [x] **`textDocument/typeDefinition` in all 4 Tier 2 backends** — each symbol now carries an `OwnedRelationship { is_type_definition: true }` pointing to the cross-file definition of its type. Enables "which symbols have type `Foo`?" queries on the blast-radius graph.
 - [x] **`textDocument/inlayHints` in rust-analyzer** — local variable bindings (inside function bodies) are now captured as additional `Variable` symbols with their compiler-inferred types. SCIP does not index locals; this is additive coverage unique to LIP.

+### v1.5 — Shipped ✓
+
+- [x] **`BatchQueryNearestByText`** — embed N query strings in a single round-trip and return one nearest-neighbour list per query. Replaces N sequential `QueryNearestByText` calls in multi-query workflows.
+- [x] **`QueryNearestBySymbol`** — find symbols similar to a given symbol URI. The daemon embeds the symbol's text (display_name + signature + doc) on demand and searches the symbol embedding store. `EmbeddingBatch` now routes `lip://` URIs to `symbol_embeddings` and `file://` URIs to `file_embeddings`.
+- [x] **`BatchAnnotationGet`** — retrieve an annotation key for multiple symbol URIs under a single db lock. Replaces N sequential `AnnotationGet` calls; safe inside `BatchQuery`.
+- [x] **`IndexChanged` push notification** — emitted to all active sessions after every `Delta::Upsert` via the broadcast channel. Carries `indexed_files` count and `affected_uris`. Enables precise cache invalidation without polling `QueryIndexStatus`.
+- [x] **`Handshake` / `HandshakeResult`** — clients send `Handshake { client_version }` on connect; daemon replies with `daemon_version` (semver) and `protocol_version` (monotonic integer, currently `1`). Version drift between daemon and client is now detectable at connect time.
+- [x] **`--managed` flag** (`lip daemon start --managed`) — spawns a background watchdog that polls the parent process every 2 s and exits when the parent has exited. Designed for IDE integrations that manage the daemon as a subprocess.
+
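
The `Handshake` / `HandshakeResult` exchange in the v1.5 list could be consumed on the client side roughly as follows. This is a minimal sketch, not the daemon's or any client's actual code: only the field names `client_version`, `daemon_version`, and `protocol_version` come from the spec. The `VersionCheck` enum, the `EXPECTED_PROTOCOL` constant, the `check_versions` function, and the policy of treating a release mismatch as a warning rather than an error are all assumptions made for illustration.

```rust
/// Hypothetical mirror of the connect-time messages. Only the field names
/// are taken from the spec; the concrete types are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct Handshake {
    pub client_version: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct HandshakeResult {
    pub daemon_version: String,
    pub protocol_version: u32,
}

/// Outcome of comparing the client's expectations with the daemon's reply.
/// This type is illustrative and does not appear in the spec.
#[derive(Debug, PartialEq)]
pub enum VersionCheck {
    /// Same protocol integer and same release string.
    Compatible,
    /// Same protocol integer, different release: usable, but worth logging.
    ReleaseDrift { client: String, daemon: String },
    /// Different protocol integer: the client should stop here.
    ProtocolMismatch { expected: u32, actual: u32 },
}

/// Protocol version this sketch assumes (the spec says "currently `1`").
pub const EXPECTED_PROTOCOL: u32 = 1;

/// Classify the daemon's reply. The protocol integer is checked first
/// because it is the stronger signal of incompatibility.
pub fn check_versions(sent: &Handshake, reply: &HandshakeResult) -> VersionCheck {
    if reply.protocol_version != EXPECTED_PROTOCOL {
        return VersionCheck::ProtocolMismatch {
            expected: EXPECTED_PROTOCOL,
            actual: reply.protocol_version,
        };
    }
    if reply.daemon_version != sent.client_version {
        return VersionCheck::ReleaseDrift {
            client: sent.client_version.clone(),
            daemon: reply.daemon_version.clone(),
        };
    }
    VersionCheck::Compatible
}
```

A client would run this check once, right after connecting, and decide whether to proceed, log a warning, or disconnect.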
+### v1.6 — Shipped ✓
+
+- [x] **`ReindexFiles { uris }`** — force a targeted re-index of specific file URIs from disk, bypassing directory scan. Useful when the client knows exactly which files changed out-of-band. Returns `DeltaAck`. Not permitted inside `BatchQuery`.
+- [x] **`Similarity { uri_a, uri_b }`** — pairwise cosine similarity of two stored embeddings. Routes `lip://` URIs to symbol embeddings and `file://` URIs to file embeddings. Returns `SimilarityResult { score: Option<f32> }`. Safe inside `BatchQuery`.
+- [x] **`QueryExpansion { query, top_k, model }`** — embed a query string, find the `top_k` nearest symbols in the symbol store, return their display names as expansion terms. Designed for compound-search paths: expand a short query before running `QueryWorkspaceSymbols`. Not permitted inside `BatchQuery`.
+- [x] **`Cluster { uris, radius }`** — group URIs by embedding proximity using greedy single-link assignment. Returns `ClusterResult { groups }`. Not permitted inside `BatchQuery`.
+- [x] **`ExportEmbeddings { uris }`** — return raw stored embedding vectors for a set of URIs as `HashMap<String, Vec<f32>>`. Enables cross-repo federation: export from each root, merge, query with `QueryNearestInStore`. Safe inside `BatchQuery`.
+- [x] **`lip slice --pip`** — Python dependency slice support. Indexes packages installed in the current Python environment.
+- [x] 5 new MCP tools: `lip_reindex_files`, `lip_similarity`, `lip_query_expansion`, `lip_cluster`, `lip_export_embeddings`.
+
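
To make `Similarity` and `Cluster` from the v1.6 list concrete, here is a small sketch of pairwise cosine similarity and one possible reading of "greedy single-link assignment". Several things are assumed and not confirmed by the spec: a single `HashMap<String, Vec<f32>>` stands in for both the symbol and file stores, `None` is taken to mean "no usable embedding for one of the URIs", and `radius` is interpreted as a maximum cosine distance (`1 - similarity`). The function names `cosine`, `similarity`, and `cluster` are hypothetical.

```rust
use std::collections::HashMap;

/// Cosine similarity of two vectors. Returns `None` when the lengths differ,
/// the vectors are empty, or either has zero norm.
pub fn cosine(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Pairwise similarity over a store keyed by URI. `None` if either URI has
/// no stored embedding (assumed meaning of `score: Option<f32>`).
pub fn similarity(store: &HashMap<String, Vec<f32>>, uri_a: &str, uri_b: &str) -> Option<f32> {
    cosine(store.get(uri_a)?, store.get(uri_b)?)
}

/// Greedy single-link grouping (assumed semantics): walk the URIs in order
/// and put each one into the first existing group that contains at least one
/// member within `radius` cosine distance; otherwise start a new group.
/// URIs without a stored embedding are skipped.
pub fn cluster(
    store: &HashMap<String, Vec<f32>>,
    uris: &[String],
    radius: f32,
) -> Vec<Vec<String>> {
    let mut groups: Vec<Vec<String>> = Vec::new();
    for uri in uris {
        if !store.contains_key(uri) {
            continue;
        }
        let home = groups.iter().position(|group| {
            group
                .iter()
                .any(|member| matches!(similarity(store, uri, member), Some(s) if 1.0 - s <= radius))
        });
        match home {
            Some(i) => groups[i].push(uri.clone()),
            None => groups.push(vec![uri.clone()]),
        }
    }
    groups
}
```

Because assignment is greedy, the result depends on the order of `uris`; a caller that needs stable groups would sort the input first.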
 ### v1.7 — Semantic retrieval primitives ✓

 - [x] **`QueryNearestByContrast`** — vector-arithmetic contrastive search: `normalize(like − unlike)` → nearest neighbours. Enables "similar to X but different from Y" queries.
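
The `normalize(like − unlike)` step behind `QueryNearestByContrast` can be illustrated with a few lines of vector arithmetic. The sketch below is an assumption-laden illustration, not the daemon's implementation: the function names `contrast_vector` and `nearest_by_contrast` are hypothetical, candidates are passed in as a plain slice rather than read from the embedding store, and ranking uses the dot product, which equals cosine similarity only if the candidate vectors are unit length.

```rust
/// Build the contrast query `normalize(like - unlike)`.
/// Returns `None` if the lengths differ or the difference has zero norm
/// (for example when `like` and `unlike` are identical).
pub fn contrast_vector(like: &[f32], unlike: &[f32]) -> Option<Vec<f32>> {
    if like.len() != unlike.len() {
        return None;
    }
    let diff: Vec<f32> = like.iter().zip(unlike).map(|(l, u)| l - u).collect();
    let norm = diff.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return None;
    }
    Some(diff.into_iter().map(|x| x / norm).collect())
}

/// Rank candidates by dot product with the query and keep the best `top_k`.
/// Candidates whose dimension differs from the query are ignored.
pub fn nearest_by_contrast<'a>(
    query: &[f32],
    candidates: &'a [(String, Vec<f32>)],
    top_k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = candidates
        .iter()
        .filter(|c| c.1.len() == query.len())
        .map(|(uri, v)| {
            let dot = v.iter().zip(query).map(|(a, b)| a * b).sum::<f32>();
            (uri.as_str(), dot)
        })
        .collect();
    // Highest score first; `total_cmp` gives a total order even with NaN.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}
```

Subtracting `unlike` pushes the query away from that direction, so a candidate close to both inputs scores lower than one close to `like` alone.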
@@ -1141,7 +1160,7 @@ lip-protocol/
 - [ ] Shared-memory mmap path for zero-copy symbol reads (spec §7.1)
 - [x] **`textDocument/typeDefinition` in all 4 Tier 2 backends** — each symbol now carries an `OwnedRelationship { is_type_definition: true }` pointing to the cross-file definition of its type. Enables "which symbols have type `Foo`?" queries on the blast-radius graph.
 - [x] **`textDocument/inlayHints` in rust-analyzer** — local variable bindings (inside function bodies) are now captured as additional `Variable` symbols with their compiler-inferred types. SCIP does not index locals; this is additive coverage unique to LIP.

+### v1.6 — Shipped ✓
+
+- [x] **`ReindexFiles { uris }`** — force a targeted re-index of specific file URIs from disk, bypassing directory scan. Returns `DeltaAck`. Not permitted inside `BatchQuery`.
+- [x] **`Similarity { uri_a, uri_b }`** — pairwise cosine similarity of two stored embeddings. Routes `lip://` to symbol embeddings and `file://` to file embeddings. Returns `SimilarityResult { score: Option<f32> }`. Safe inside `BatchQuery`.
+- [x] **`QueryExpansion { query, top_k, model }`** — embed a query string, find the `top_k` nearest symbols, return display names as expansion terms. Not permitted inside `BatchQuery`.
+- [x] **`Cluster { uris, radius }`** — group URIs by embedding proximity using greedy single-link assignment. Returns `ClusterResult { groups }`. Not permitted inside `BatchQuery`.
+- [x] **`ExportEmbeddings { uris }`** — return raw stored embedding vectors as `HashMap<String, Vec<f32>>`. Enables cross-repo federation. Safe inside `BatchQuery`.
+- [x] **`lip slice --pip`** — Python dependency slice support. Indexes packages in the current Python environment.
+- [x] 5 new MCP tools: `lip_reindex_files`, `lip_similarity`, `lip_query_expansion`, `lip_cluster`, `lip_export_embeddings`.
 - [x] **`QueryOutliers`** — leave-one-out mean cosine similarity; returns files most semantically displaced from their group.
+- [x] **`QuerySemanticDrift`** — pairwise cosine distance between two stored embeddings. Scalar drift metric.
+- [x] **`SimilarityMatrix`** — all pairwise cosine similarities for a list of URIs in one call.
+- [x] **`FindSemanticCounterpart`** — ranked search over a candidate pool; finds the test file covering a changed implementation even when naming conventions differ.
+- [x] **`QueryCoverage`** — embedding coverage report under a filesystem root, broken down by directory.
 - [x] **`FindBoundaries`** — chunk a file into line-windows, embed each, return positions where cosine distance between adjacent windows exceeds a threshold.
+- [x] **`SemanticDiff`** — embeds two content strings, returns drift distance plus nearest files to the direction of change (`moving_toward`).
+- [x] **`QueryNearestInStore`** — nearest-neighbour search against a caller-provided embedding store. Enables cross-repo federation.
 - [x] **`filter: Option<String>`** on all nearest-neighbour search calls — glob pattern restricts the candidate set before scoring.
+- [x] **`min_score: Option<f32>`** on the same calls — quality gate that drops results below a cosine-similarity threshold.
+- [x] **`GetCentroid { uris }`** — compute and return the embedding centroid of a file set server-side. Safe inside `BatchQuery`.
+- [x] **`QueryStaleEmbeddings { root }`** — report files whose stored embedding is older than their current mtime. Not permitted inside `BatchQuery`.
+- [x] 2 new MCP tools (`lip_get_centroid`, `lip_stale_embeddings`) + `filter`/`min_score` params on 5 existing tools.
+
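
Two of the items above, `GetCentroid` and the `min_score` gate, reduce to very small computations. The sketch below shows one plausible form of each. It assumes the centroid is the plain arithmetic mean of the vectors, without re-normalisation, which the spec does not state. The function names `centroid` and `apply_min_score` and the `(String, f32)` hit representation are hypothetical.

```rust
/// Arithmetic mean of a set of embeddings (assumed definition of "centroid").
/// Returns `None` for an empty set, zero-dimensional vectors, or vectors of
/// mixed dimension.
pub fn centroid(vectors: &[Vec<f32>]) -> Option<Vec<f32>> {
    let dim = vectors.first()?.len();
    if dim == 0 || vectors.iter().any(|v| v.len() != dim) {
        return None;
    }
    let mut sum = vec![0.0f32; dim];
    for v in vectors {
        for (s, x) in sum.iter_mut().zip(v) {
            *s += x;
        }
    }
    let n = vectors.len() as f32;
    Some(sum.into_iter().map(|s| s / n).collect())
}

/// Drop hits whose cosine score is below `min_score`; `None` disables the
/// gate, matching the `Option<f32>` parameter in the spec.
pub fn apply_min_score(mut hits: Vec<(String, f32)>, min_score: Option<f32>) -> Vec<(String, f32)> {
    if let Some(min) = min_score {
        hits.retain(|(_, score)| *score >= min);
    }
    hits
}
```

A centroid computed this way can be used as the query vector for a nearest-neighbour search over the same store.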
+### v2.0 — Semantic explainability + model provenance ✓
+
+- [x] **`ExplainMatch { query, result_uri, top_k, chunk_lines, model }`** — explain *why* a result file ranked as a strong match. Chunks `result_uri`'s source into line-windows, batch-embeds each, and cosine-scores against the query embedding. Returns `ExplainMatchResult { chunks: Vec<ExplanationChunk>, query_model }`. Not permitted inside `BatchQuery`. New MCP tool: `lip_explain_match`.
+- [x] **Model provenance** — every embedding now records the model name that produced it. `QueryFileStatus` returns `embedding_model: Option<String>`. `QueryIndexStatus` returns `mixed_models: bool` and `models_in_index: Vec<String>` with a `⚠ MIXED MODELS` warning when cosine scores are unreliable across a model upgrade boundary.
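
The chunk-and-score procedure that `ExplainMatch` describes can be sketched as below. This is an illustration under stated assumptions, not the daemon's code: the embedder is injected as a closure because the real embedding call is not shown in the spec, the source text is passed in directly instead of being read from `result_uri`, windows are non-overlapping, and the fields of `ExplanationChunk` (`start_line`, `end_line`, `score`) are guesses, since the spec names the type but not its layout.

```rust
/// Hypothetical shape of one scored window; field names are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct ExplanationChunk {
    pub start_line: usize, // 1-based, inclusive
    pub end_line: usize,   // 1-based, inclusive
    pub score: f32,        // cosine similarity to the query embedding
}

/// Cosine similarity; returns 0.0 when either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Split `source` into windows of `chunk_lines` lines, embed each window with
/// the caller-supplied `embed` closure, score it against the query embedding,
/// and return the `top_k` best windows, highest score first.
pub fn explain_match<F>(
    query: &str,
    source: &str,
    top_k: usize,
    chunk_lines: usize,
    embed: F,
) -> Vec<ExplanationChunk>
where
    F: Fn(&str) -> Vec<f32>,
{
    let window_size = chunk_lines.max(1); // guard against a zero window
    let lines: Vec<&str> = source.lines().collect();
    let query_vec = embed(query);
    let mut chunks: Vec<ExplanationChunk> = lines
        .chunks(window_size)
        .enumerate()
        .map(|(i, window)| {
            let start_line = i * window_size + 1;
            ExplanationChunk {
                start_line,
                end_line: start_line + window.len() - 1,
                score: cosine(&query_vec, &embed(&window.join("\n"))),
            }
        })
        .collect();
    chunks.sort_by(|a, b| b.score.total_cmp(&a.score));
    chunks.truncate(top_k);
    chunks
}
```

In this sketch the query and every window go through the same `embed` closure, so the scores are comparable by construction. With stored embeddings that guarantee depends on model provenance: when `mixed_models` is true, the spec warns that cosine scores are unreliable across the model boundary.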