Commit 08c0d74

ruvnet and claude authored
fix(ruvector-wasm): correct adapter for WASM build's flat-index, distance-score, and metadata gaps (#568)
The published @ruvector/wasm build behaves differently from its generated .d.ts in three ways that bite consumers:

1. HNSW is not active: the wasm32 target compiles without the `hnsw` feature and falls back to a flat (brute-force) index, so search is O(n). The O(log n) win is latent until the WASM HNSW lands.
2. `result.score` is a cosine distance (lower is better), not the "higher is better" similarity the .d.ts advertises (ordering is correct: a, b before c).
3. Metadata does not round-trip: search/get return {}.

Add RuvectorWasmAdapter (@ruvector/wasm/adapter) which wraps VectorDB with:

- a metadata sidecar so inserted metadata round-trips
- similarity = 1 - distance (generalised per metric) with `.score` aliased to similarity, plus the raw `distance` preserved
- indexType/usesHnsw + WASM_HNSW_AVAILABLE so callers don't assume HNSW
- client-side metadata filtering with over-fetch

Includes TS declarations with corrected doc comments, a node:test suite covering all three findings, README guidance, and package exports.

Co-authored-by: Claude <noreply@anthropic.com>
1 parent 524751e commit 08c0d74
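
The second finding is easy to underestimate, so here is a minimal sketch of the distance-to-similarity idea and of what goes wrong without it. The helper name `toSimilarity` is hypothetical and is not the adapter's source. Only the cosine branch (`1 - distance`) is stated in the commit message; the dot-product and Euclidean/Manhattan branches are assumptions chosen for illustration.

```javascript
// Hypothetical helper (not the adapter's source). Only the cosine branch is
// stated in the commit message; the other branches are assumptions.
function toSimilarity(metric, distance) {
  switch (metric) {
    case 'cosine':
      return 1 - distance; // stated in the commit: similarity = 1 - distance
    case 'dot':
    case 'dotproduct':
      return -distance; // ASSUMPTION: the index reports a negated dot product
    case 'euclidean':
    case 'manhattan':
      return 1 / (1 + distance); // ASSUMPTION: map [0, inf) onto (0, 1]
    default:
      throw new Error(`unknown metric: ${metric}`);
  }
}

// Raw results shaped like the WASM build's output: `score` is a distance.
const rawScores = [
  { id: 'a', score: 0 },
  { id: 'b', score: 0.25 },
  { id: 'c', score: 0.75 },
];
const withSimilarity = rawScores.map(r => ({
  id: r.id,
  distance: r.score,
  similarity: toSimilarity('cosine', r.score),
}));

// The same ">= 0.5" threshold selects opposite ends of the ranking.
const keptByRawScore = rawScores.filter(r => r.score >= 0.5).map(r => r.id);
const keptBySimilarity = withSimilarity.filter(r => r.similarity >= 0.5).map(r => r.id);
console.log(keptByRawScore);   // [ 'c' ]
console.log(keptBySimilarity); // [ 'a', 'b' ]
```

The ranking itself is unaffected, which matches the commit's note that ordering is correct. The problem only appears when a caller treats `score` as "higher is better", for example in a relevance threshold.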

5 files changed

Lines changed: 688 additions & 0 deletions

crates/ruvector-wasm/README.md

Lines changed: 37 additions & 0 deletions
@@ -115,6 +115,43 @@ results.forEach(result => {
 });
 ```
 
+> ⚠️ **Read this before trusting the raw bindings.** Three behaviours of the
+> current WASM build differ from what the generated `.d.ts` advertises:
+>
+> 1. **HNSW is not active in the WASM build.** It compiles without the `hnsw`
+> cargo feature and silently falls back to a brute-force flat index, so search
+> is O(n), not O(log n). The HNSW win is latent until the WASM HNSW lands.
+> 2. **`result.score` is a cosine *distance* (lower is better)**: the ordering is
+> correct, but it is *not* the "higher is better" similarity the `.d.ts`
+> describes.
+> 3. **Metadata does not round-trip**: `search`/`get` return `{}`.
+>
+> Use the bundled **adapter** instead of the raw `VectorDB` to get these handled
+> correctly (see below).
+
+### Recommended: the corrected adapter
+
+`@ruvector/wasm/adapter` wraps `VectorDB` with a metadata sidecar and a real
+`similarity = 1 - distance` so the documented "higher is better" contract holds.
+
+```javascript
+import { RuvectorWasmAdapter } from '@ruvector/wasm/adapter';
+
+// Loads + inits the WASM module and constructs the VectorDB for you.
+const index = await RuvectorWasmAdapter.create({ dimensions: 384, metric: 'cosine' });
+
+index.insert({ id: 'doc_1', vector: embedding, metadata: { title: 'My Document' } });
+
+const results = index.search({ vector: query, k: 10 });
+results.forEach(r => {
+  console.log(r.id, r.similarity); // similarity: higher is better
+  console.log(r.distance); // raw distance: lower is better
+  console.log(r.metadata); // round-trips correctly via the sidecar
+});
+
+console.log(index.indexType); // 'flat' until WASM HNSW lands
+```
+
 ### React Integration
 
 ```typescript
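
To make the first README caveat concrete, the following is a minimal sketch of what a flat (brute-force) index does on every query: it computes the distance to each stored vector and sorts. The names `cosineDistance` and `flatSearch` are hypothetical and the code is not taken from the crate. It only illustrates why search cost grows linearly with the number of vectors when HNSW is absent.

```javascript
// Cosine distance = 1 - cosine similarity; 0 means "same direction".
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical flat index: every query scans all n stored vectors.
function flatSearch(entries, query, k) {
  return entries
    .map(e => ({ id: e.id, score: cosineDistance(e.vector, query) }))
    .sort((x, y) => x.score - y.score) // ascending: lower distance is better
    .slice(0, k);
}

const entries = [
  { id: 'a', vector: [1, 0] },
  { id: 'b', vector: [1, 1] },
  { id: 'c', vector: [0, 1] },
];
const hits = flatSearch(entries, [1, 0], 2);
console.log(hits.map(h => h.id)); // [ 'a', 'b' ]
```

Note that `score` here is a distance, as in the raw bindings, so the best match has the smallest value.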

crates/ruvector-wasm/package.json

Lines changed: 13 additions & 0 deletions
@@ -4,8 +4,20 @@
   "description": "High-performance Rust vector database for browsers via WASM",
   "main": "pkg/ruvector_wasm.js",
   "types": "pkg/ruvector_wasm.d.ts",
+  "exports": {
+    ".": {
+      "types": "./pkg/ruvector_wasm.d.ts",
+      "default": "./pkg/ruvector_wasm.js"
+    },
+    "./adapter": {
+      "types": "./src/adapter.d.ts",
+      "default": "./src/adapter.js"
+    }
+  },
   "files": [
     "pkg",
+    "src/adapter.js",
+    "src/adapter.d.ts",
     "src/worker.js",
     "src/worker-pool.js",
     "src/indexeddb.js"
@@ -18,6 +30,7 @@
     "build:bundler": "wasm-pack build --target bundler --out-dir pkg-bundler --release",
     "build:all": "npm run build && npm run build:node && npm run build:bundler",
     "test": "wasm-pack test --headless --chrome",
+    "test:adapter": "node --test tests/adapter.test.mjs",
     "test:firefox": "wasm-pack test --headless --firefox",
     "test:node": "wasm-pack test --node",
     "size": "npm run build && gzip -c pkg/ruvector_wasm_bg.wasm | wc -c && echo 'bytes (gzipped)'",
Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
+/**
+ * Type declarations for the RuvectorWasmAdapter.
+ *
+ * Unlike the generated `pkg/ruvector_wasm.d.ts`, the `score` documented here is
+ * a real similarity (higher is better); the raw distance is exposed separately.
+ *
+ * @module @ruvector/wasm/adapter
+ */
+
+/**
+ * Whether the published WASM build ships an active HNSW index.
+ * `false` today: the WASM `VectorDB` falls back to a flat (brute-force) index.
+ */
+export const WASM_HNSW_AVAILABLE: boolean;
+
+/**
+ * Convert a raw distance (lower is better) into a similarity (higher is better).
+ * @param metric 'cosine' | 'dot' | 'dotproduct' | 'euclidean' | 'manhattan'
+ * @param distance Raw score returned by the WASM `search`.
+ */
+export function distanceToSimilarity(metric: string, distance: number): number;
+
+/** A single search result, with similarity and metadata corrected. */
+export interface AdapterSearchResult {
+  /** Vector id. */
+  id: string;
+  /** Similarity score: higher is better. */
+  similarity: number;
+  /** Raw distance from the underlying index: lower is better. */
+  distance: number;
+  /** Alias of `similarity`, so a `.score` read honours "higher is better". */
+  score: number;
+  /** Vector data, when returned by the index. */
+  vector?: Float32Array;
+  /** Round-tripped metadata from the sidecar. */
+  metadata?: Record<string, any>;
+}
+
+/** Minimal shape of the underlying WASM (or test-double) VectorDB. */
+export interface WasmVectorDBLike {
+  insert(
+    vector: Float32Array,
+    id?: string,
+    metadata?: Record<string, any>
+  ): string;
+  insertBatch(
+    entries: Array<{
+      id?: string;
+      vector: Float32Array;
+      metadata?: Record<string, any>;
+    }>
+  ): string[];
+  search(
+    vector: Float32Array,
+    k: number,
+    filter?: Record<string, any>
+  ): Array<{
+    id: string;
+    score: number;
+    vector?: Float32Array;
+    metadata?: Record<string, any>;
+  }>;
+  get(
+    id: string
+  ): { id?: string; vector?: Float32Array; metadata?: Record<string, any> } | null;
+  delete(id: string): boolean;
+  len?(): number;
+  isEmpty?(): boolean;
+}
+
+export interface AdapterOptions {
+  /** Vector dimensions (informational). */
+  dimensions?: number;
+  /** Distance metric the db was created with; controls similarity conversion. */
+  metric?: string;
+  /** Override the index-type report. Defaults to {@link WASM_HNSW_AVAILABLE}. */
+  usesHnsw?: boolean;
+}
+
+export interface CreateOptions {
+  /** Vector dimensions (required). */
+  dimensions: number;
+  /** Distance metric. Defaults to 'cosine'. */
+  metric?: string;
+  /** Requested at the WASM layer (the build falls back to flat regardless). */
+  useHnsw?: boolean;
+  /** Pre-imported WASM module; if omitted, `@ruvector/wasm` is imported. */
+  module?: any;
+}
+
+/** Correct wrapper around the generated WASM `VectorDB`. */
+export class RuvectorWasmAdapter {
+  constructor(db: WasmVectorDBLike, options?: AdapterOptions);
+
+  static create(options: CreateOptions): Promise<RuvectorWasmAdapter>;
+
+  /** `false` for the current WASM build: flat O(n) search. */
+  readonly usesHnsw: boolean;
+  /** 'hnsw' | 'flat': index type backing this adapter. */
+  readonly indexType: 'hnsw' | 'flat';
+
+  insert(entry: {
+    id?: string;
+    vector: Float32Array | number[];
+    metadata?: Record<string, any>;
+  }): string;
+
+  insertBatch(
+    entries: Array<{
+      id?: string;
+      vector: Float32Array | number[];
+      metadata?: Record<string, any>;
+    }>
+  ): string[];
+
+  search(query: {
+    vector: Float32Array | number[];
+    k: number;
+    filter?: Record<string, any>;
+  }): AdapterSearchResult[];
+
+  get(
+    id: string
+  ): { id: string; vector?: Float32Array; metadata?: Record<string, any> } | null;
+
+  delete(id: string): boolean;
+  len(): number;
+  isEmpty(): boolean;
+  clearMetadata(): void;
+}
+
+export default RuvectorWasmAdapter;
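
The declarations above describe a wrapper with a metadata sidecar and client-side filtering with over-fetch, but `src/adapter.js` itself is not shown in this view. The following is a rough sketch of how those two ideas could fit together and should not be read as the adapter's actual implementation. `FakeDb` and `SidecarAdapter` are hypothetical names. The toy one-dimensional distance, the over-fetch factor of 4, and the exact-equality filter are all assumptions made for illustration.

```javascript
// Test double mimicking the raw build's reported behaviour: flat scan,
// `score` is a distance, and metadata is dropped (search returns {}).
class FakeDb {
  constructor() {
    this.rows = new Map();
    this.nextId = 0;
  }
  insert(vector, id, _metadata) {
    const key = id ?? `auto_${this.nextId++}`;
    this.rows.set(key, vector);
    return key;
  }
  search(vector, k) {
    return [...this.rows]
      // ASSUMPTION: toy 1-D distance, standing in for the real metric.
      .map(([id, v]) => ({ id, score: Math.abs(v[0] - vector[0]), metadata: {} }))
      .sort((a, b) => a.score - b.score)
      .slice(0, k);
  }
}

// Hypothetical wrapper: keeps metadata in a Map keyed by id (the "sidecar").
class SidecarAdapter {
  constructor(db) {
    this.db = db;
    this.meta = new Map();
  }
  insert({ id, vector, metadata }) {
    const key = this.db.insert(Float32Array.from(vector), id, metadata);
    if (metadata !== undefined) this.meta.set(key, metadata);
    return key;
  }
  search({ vector, k, filter }) {
    // The filter runs after the search, so ask for extra candidates first.
    const fetchK = filter ? k * 4 : k; // ASSUMPTION: factor of 4 is arbitrary
    return this.db
      .search(Float32Array.from(vector), fetchK)
      .map(r => {
        const similarity = 1 - r.score;
        const metadata = this.meta.get(r.id) ?? {};
        return { id: r.id, similarity, distance: r.score, score: similarity, metadata };
      })
      .filter(r => !filter || Object.entries(filter).every(([key, v]) => r.metadata[key] === v))
      .slice(0, k);
  }
}

const adapter = new SidecarAdapter(new FakeDb());
adapter.insert({ id: 'a', vector: [0], metadata: { lang: 'en' } });
adapter.insert({ id: 'b', vector: [0.25], metadata: { lang: 'de' } });
adapter.insert({ id: 'c', vector: [0.5], metadata: { lang: 'en' } });

const found = adapter.search({ vector: [0], k: 2, filter: { lang: 'en' } });
console.log(found.map(r => r.id)); // [ 'a', 'c' ]
console.log(found[0].metadata);    // { lang: 'en' }
```

Without the over-fetch, the underlying search would return only `a` and `b` for `k: 2`, and the filter would then leave a single result. Fetching more candidates before filtering is what lets the caller still get `k` matches when enough exist.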
