Conversation
Walkthrough

Adds a native Rust core (osgrep-core) with N‑API bindings, replaces JS/ONNX worker pipelines with native embedding/rerank calls, removes the skeleton subsystem and many experiment/benchmark scripts, and migrates build tooling and CI from pnpm to bun.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as CLI / Client
    participant Orch as Orchestrator (TS)
    participant Native as osgrep-core (Rust N-API)
    participant HF as HuggingFace Hub
    participant ONNX as ONNX Runtime (inside Native)

    CLI->>Orch: initNative()
    Orch->>Native: initModels(denseRepo, colbertRepo)
    Native->>HF: fetch model artifacts
    HF-->>Native: model files
    Native->>ONNX: create sessions / load tokenizers
    ONNX-->>Native: sessions ready
    Native-->>Orch: initialized
    Orch-->>CLI: ready

    CLI->>Orch: encodeQuery(text)
    Orch->>Native: encodeQueryColbert(query)
    Native->>ONNX: tokenize + infer
    ONNX-->>Native: query embeddings
    Native-->>Orch: embeddings
    Orch-->>CLI: embeddings

    CLI->>Orch: processFile(fileChunks)
    Orch->>Native: embedBatch(chunks)
    Native->>ONNX: batch tokenize + infer
    ONNX-->>Native: dense + colbert outputs
    Native-->>Orch: packed embeddings
    Orch-->>CLI: prepared chunks

    CLI->>Orch: rerank({queryEmb, docs, topK})
    Orch->>Native: rerankColbert(...)
    Native->>ONNX: maxsim / rerank
    ONNX-->>Native: scores
    Native-->>Orch: ranked indices/scores
    Orch-->>CLI: top-K results
```
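Read as code, the diagram amounts to the following call order. The sketch below is a mock for illustration only: `FakeNative`-style stubs stand in for osgrep-core, and the signatures are invented around the message names in the diagram, not the real binding.

```typescript
// Mock of the native layer so the orchestration order is testable without osgrep-core.
interface NativeCore {
  initModels(denseRepo: string, colbertRepo: string): void;
  encodeQueryColbert(query: string): Float32Array;
  embedBatch(chunks: string[]): Float32Array[];
  rerankColbert(queryEmb: Float32Array, docs: Float32Array[], topK: number): number[];
}

const calls: string[] = [];

const fakeNative: NativeCore = {
  initModels: () => { calls.push("initModels"); },
  encodeQueryColbert: (q) => { calls.push("encodeQueryColbert"); return new Float32Array(q.length); },
  embedBatch: (chunks) => { calls.push("embedBatch"); return chunks.map(() => new Float32Array(4)); },
  rerankColbert: (_q, docs, topK) => {
    calls.push("rerankColbert");
    return docs.map((_, i) => i).slice(0, topK); // trivial ranking for the mock
  },
};

// Orchestrator-side flow mirroring the four phases of the diagram.
fakeNative.initModels("dense-repo", "colbert-repo");                      // initNative()
const queryEmb = fakeNative.encodeQueryColbert("find rerank");            // encodeQuery(text)
const docEmbs = fakeNative.embedBatch(["chunk a", "chunk b", "chunk c"]); // processFile(fileChunks)
const topDocs = fakeNative.rerankColbert(queryEmb, docEmbs, 2);           // rerank({queryEmb, docs, topK})
```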
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120+ minutes
Pre-merge checks and finishing touches

- ❌ Failed checks (1 warning)
- ✅ Passed checks (2 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tools/eval.ts (1)
65-115: Update eval cases to reference existing orchestrator functions only.

The eval cases at lines 67-115 reference removed modules that no longer exist:

- Lines 67-85: `src/lib/workers/pool.ts` — 4 cases will fail (file deleted)
- Lines 87-104: `src/lib/workers/embeddings/granite.ts` and `colbert.ts` — 4 cases will fail (files deleted)
- Lines 107-115: `src/lib/workers/orchestrator.ts` — only `encodeQuery` exists; `computeHybrid` and references to `pooled_colbert_48d` do not exist, causing 1 additional case to fail

Remove or update the 9 failing eval cases to reference only existing functions in the current codebase.
🧹 Nitpick comments (6)
src/lib/setup/setup-helpers.ts (1)
21-24: Stale docstring references models.

The docstring mentions "ensures osgrep directories and models exist", but the model download logic has been removed. Update it to reflect current behavior:
```diff
 /**
- * Idempotent helper that ensures osgrep directories and models exist.
+ * Idempotent helper that ensures osgrep directories exist.
  * Returns status about work performed so callers can decide what to show.
  */
```

src/lib/index/syncer.ts (1)
188-197: Function name is misleading after removing retry logic.
`processFileWithRetry` no longer retries anything. Consider renaming it to `processFileSafe`, or just inlining the call, since it's now a simple wrapper:

```diff
-  const processFileWithRetry = async (
-    relPath: string,
-    absPath: string,
-  ): Promise<ProcessFileResult> => {
-    // No retries needed - native embedding is stable
-    return await processFile({
-      path: relPath,
-      absolutePath: absPath,
-    });
-  };
```

Then at line 262, replace:

```diff
-      const result = await processFileWithRetry(relPath, absPath);
+      const result = await processFile({
+        path: relPath,
+        absolutePath: absPath,
+      });
```

plan.md (1)
262-265: Add language specifier to fenced code blocks.

Per markdownlint, fenced code blocks should have a language specified for proper syntax highlighting. These appear to be file path listings:

````diff
-```
+```text
 src/
 ├── index.ts          # CLI entry
````

The same applies to the code blocks at lines 294 and 299.

src/lib/native/index.ts (2)
src/lib/native/index.ts (2)
79-88: Consider zero-copy optimization with subarray views.

The current implementation copies each vector into a new Float32Array. For better performance, consider using `subarray()` to create zero-copy views, provided the underlying buffer is not reused:

```typescript
const vectors: Float32Array[] = [];
for (let i = 0; i < result.count; i++) {
  const start = i * dim;
  vectors.push(result.embeddings.subarray(start, start + dim));
}
```

However, this creates views that share the same buffer. Only apply this optimization if the native code guarantees buffer stability.
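As a quick standalone illustration of the shared-buffer caveat (not project code): `subarray` returns a view over the same backing buffer, while `slice` returns an independent copy.

```typescript
const backing = new Float32Array([1, 2, 3, 4, 5, 6]);

const view = backing.subarray(0, 3); // zero-copy: shares backing.buffer
const copy = backing.slice(0, 3);    // independent buffer

backing[0] = 99;
// view[0] is now 99; copy[0] is still 1.
```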
165-171: Clarify the interim `colbertOffset` value.

Line 169 sets `colbertOffset: 0` with a comment indicating it will be set later during storage. This creates an intermediate state where the offset is not meaningful. Consider adding a JSDoc comment to the `HybridEmbedding` interface explaining that `colbertOffset` is initialized to 0 and populated during storage, to avoid confusion for callers.

src/lib/workers/orchestrator.ts (1)
209-211: Use `embedDense` instead of `embedBatch` for a single query.

Line 210 uses `embedBatch([text])` to get the dense embedding, but `embedBatch` computes both dense and ColBERT embeddings. Since line 214 separately calls `encodeQueryColbert` for query-specific ColBERT encoding, the ColBERT portion from `embedBatch` is discarded.

Apply this diff to use the more appropriate function:

```diff
   await this.ensureReady();

   // Get dense embedding
-  const hybrids = await embedBatch([text]);
-  const denseVector = hybrids[0].dense;
+  const denseVectors = await embedDense([text]);
+  const denseVector = denseVectors[0];

   // Get ColBERT query embedding
   const colbertFlat = await encodeQueryColbert(text);
```

Don't forget to add `embedDense` to the import at line 21:

```diff
-import { embedBatch, initNative, encodeQueryColbert, rerankColbert } from "../native";
+import { embedBatch, embedDense, initNative, encodeQueryColbert, rerankColbert } from "../native";
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (2)
- `bun.lockb` is excluded by `!**/bun.lockb`
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (38)
- `package.json` (1 hunks)
- `plan.md` (1 hunks)
- `src/commands/doctor.ts` (2 hunks)
- `src/commands/search.ts` (0 hunks)
- `src/commands/serve.ts` (0 hunks)
- `src/commands/setup.ts` (2 hunks)
- `src/commands/skeleton.ts` (0 hunks)
- `src/commands/trace.ts` (0 hunks)
- `src/commands/verify.ts` (0 hunks)
- `src/config.ts` (0 hunks)
- `src/index.ts` (0 hunks)
- `src/lib/graph/graph-builder.ts` (0 hunks)
- `src/lib/index/syncer.ts` (2 hunks)
- `src/lib/native/index.ts` (1 hunks)
- `src/lib/output/formatter.ts` (0 hunks)
- `src/lib/output/json-formatter.ts` (0 hunks)
- `src/lib/search/intent.ts` (0 hunks)
- `src/lib/search/searcher.ts` (9 hunks)
- `src/lib/setup/model-loader.ts` (0 hunks)
- `src/lib/setup/setup-helpers.ts` (2 hunks)
- `src/lib/skeleton/body-fields.ts` (0 hunks)
- `src/lib/skeleton/index.ts` (0 hunks)
- `src/lib/skeleton/retriever.ts` (0 hunks)
- `src/lib/skeleton/skeletonizer.ts` (0 hunks)
- `src/lib/skeleton/summary-formatter.ts` (0 hunks)
- `src/lib/store/types.ts` (0 hunks)
- `src/lib/store/vector-db.ts` (0 hunks)
- `src/lib/utils/exit.ts` (0 hunks)
- `src/lib/workers/colbert-math.ts` (0 hunks)
- `src/lib/workers/colbert-tokenizer.ts` (0 hunks)
- `src/lib/workers/download-worker.ts` (0 hunks)
- `src/lib/workers/embeddings/colbert.ts` (0 hunks)
- `src/lib/workers/embeddings/granite.ts` (0 hunks)
- `src/lib/workers/orchestrator.ts` (8 hunks)
- `src/lib/workers/pool.ts` (0 hunks)
- `src/lib/workers/process-child.ts` (0 hunks)
- `src/lib/workers/worker.ts` (0 hunks)
- `tools/eval.ts` (1 hunks)
💤 Files with no reviewable changes (28)
- src/lib/workers/embeddings/granite.ts
- src/lib/skeleton/retriever.ts
- src/commands/trace.ts
- src/index.ts
- src/commands/serve.ts
- src/config.ts
- src/lib/workers/embeddings/colbert.ts
- src/lib/skeleton/skeletonizer.ts
- src/commands/search.ts
- src/lib/store/types.ts
- src/lib/workers/worker.ts
- src/lib/workers/process-child.ts
- src/lib/workers/colbert-math.ts
- src/lib/search/intent.ts
- src/lib/utils/exit.ts
- src/lib/store/vector-db.ts
- src/lib/workers/pool.ts
- src/commands/verify.ts
- src/lib/output/json-formatter.ts
- src/lib/output/formatter.ts
- src/lib/skeleton/body-fields.ts
- src/lib/setup/model-loader.ts
- src/lib/workers/colbert-tokenizer.ts
- src/lib/skeleton/summary-formatter.ts
- src/commands/skeleton.ts
- src/lib/skeleton/index.ts
- src/lib/workers/download-worker.ts
- src/lib/graph/graph-builder.ts
🧰 Additional context used
🧬 Code graph analysis (5)
src/lib/index/syncer.ts (1)
src/lib/workers/orchestrator.ts (2)
- `processFile` (150-200)
- `processFile` (324-328)
src/commands/setup.ts (1)
src/lib/native/index.ts (1)
- `initNative` (36-52)
src/commands/doctor.ts (1)
src/lib/native/index.ts (1)
- `initNative` (36-52)
src/lib/search/searcher.ts (2)
src/lib/workers/orchestrator.ts (4)
- `encodeQuery` (202-233)
- `encodeQuery` (330-332)
- `rerank` (235-308)
- `rerank` (334-340)

src/lib/utils/filter-builder.ts (1)
- `escapeSqlString` (1-5)
src/lib/native/index.ts (1)
src/config.ts (2)
- `MODEL_IDS` (4-7)
- `CONFIG` (18-26)
🪛 LanguageTool
plan.md
[style] ~252-~252: To form a complete sentence, be sure to include a subject.
Context: ...lex role coloring and breadcrumb logic. Could be simplified to just show: - File path...
(MISSING_IT_THERE)
🪛 markdownlint-cli2 (0.18.1)
plan.md
262-262: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
294-294: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
299-299: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (19)
tools/eval.ts (1)
1-8: Import paths updated correctly.

The import paths are correctly updated to reference `../src/lib/...` from the `tools/` directory location.

src/lib/setup/setup-helpers.ts (1)
31-55: Setup simplification looks good.

The removal of model download logic and simplification to only manage directories aligns well with the migration to native model initialization via `initNative()`.

src/commands/doctor.ts (1)
27-34: Good runtime health check via native initialization.

The doctor command now actually tests that native models can initialize, rather than just checking for directory presence. The error handling provides actionable guidance with the "Try: osgrep setup" message.
src/lib/index/syncer.ts (1)
10-10: Import change aligns with native migration.

Switching from pool-based to direct orchestrator `processFile` is consistent with the architectural shift to native Rust embeddings.

src/lib/search/searcher.ts (4)
16-20: Good consolidation of tuning constants.

Moving these from environment variables to static class constants improves maintainability and reduces cognitive load. The values appear reasonable:

- `PRE_RERANK_K_MIN = 500` ensures a sufficient candidate pool
- `RERANK_CANDIDATES_K = 80` balances quality vs performance
- `MAX_PER_FILE = 3` provides result diversity
140-171: Simplified structure boosting looks good.

The removal of intent-based and role-based boosting simplifies the scoring logic significantly. The remaining penalties are straightforward:
- Anchors: 0.99 (minimal penalty)
- Test paths: 0.5 (reasonable de-prioritization)
- Docs/config: 0.6 (reasonable de-prioritization)
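Sketched as code, with the multipliers listed above (the function shape, path patterns, and name are illustrative, not the actual searcher implementation):

```typescript
// Illustrative path-based score multiplier; values mirror the review summary.
function structureBoost(path: string, isAnchor: boolean): number {
  let boost = 1.0;
  if (isAnchor) boost *= 0.99; // minimal penalty for recall helpers
  if (/(^|\/)(test|tests|__tests__)\//.test(path) || path.includes(".test.")) {
    boost *= 0.5; // de-prioritize test paths
  } else if (/(^|\/)docs?\//.test(path) || path.endsWith(".md") || path.endsWith(".json")) {
    boost *= 0.6; // de-prioritize docs/config
  }
  return boost;
}
```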
218-225: Clean API simplification.

Removing the `intent` parameter and renaming `_search_options`/`_filters` to `options`/`filters` improves the public API. The underscore prefix previously suggested these were unused or internal.
354-368: Verify rerank call matches orchestrator signature.

The rerank call passes `query: queryMatrixRaw`, which is `number[][]` (from the `encodeQuery` return type). The orchestrator's `rerank` method signature is `async rerank(input: { query: number[][], docs: RerankDoc[], colbertDim: number }): Promise<number[]>`, where `RerankDoc` is `{ colbert: Buffer | Int8Array | number[] }`. The call correctly provides all three required parameters with matching types.

src/lib/native/index.ts (5)
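For reference, the `maxsim` scoring that the rerank step performs is: for each query token vector, take the maximum dot product over all document token vectors, then sum over query tokens. A minimal standalone sketch follows; it is not the osgrep-core implementation, which also handles quantized embeddings and token-id filtering.

```typescript
// Minimal MaxSim over dense token matrices. Rows are assumed pre-normalized
// if cosine similarity (rather than raw dot product) is intended.
function maxSim(query: number[][], doc: number[][]): number {
  let score = 0;
  for (const q of query) {
    let best = -Infinity;
    for (const d of doc) {
      let dot = 0;
      for (let i = 0; i < q.length; i++) dot += q[i] * d[i];
      if (dot > best) best = dot;
    }
    score += best;
  }
  return score;
}
```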
14-31: LGTM! Clear lazy-loading pattern with helpful error guidance.

The singleton pattern for native module loading is well-implemented, with appropriate error handling and actionable error messages.
110-121: LGTM! Clean packed format for ColBERT embeddings.

The function correctly wraps the native result in appropriate typed arrays for efficient storage and transmission.
184-190: LGTM! Straightforward query encoding.
40-48: Remove the await suggestion; `initModels` is synchronous.

Line 44 calls `n.initModels()` without `await`, which is correct. All native function calls in this module (including `embedDense`, `embedColbertPacked`, `embedBatch`, `encodeQueryColbert`, and `rerankColbert`) are invoked synchronously. The TypeScript project uses strict type checking, and the code compiles successfully, indicating that `initModels` is a synchronous function, not an asynchronous one. No changes are required.

Likely an incorrect or invalid review comment.
225-232: Verify the native binding signature for `rerankColbert` in osgrep-core.

Lines 226-227 convert `Float32Array` and `Int8Array` to regular arrays using `Array.from()`. If the native binding accepts typed arrays directly, these conversions create unnecessary overhead for large embeddings.

Check the Rust binding definition in osgrep_v2 to determine whether the signature accepts typed arrays. If so, remove the conversions:

```diff
 const result = n.rerankColbert(
-  Array.from(input.queryEmbedding),
-  Array.from(input.docEmbeddings),
+  input.queryEmbedding,
+  input.docEmbeddings,
   input.docLengths,
   input.docOffsets,
   input.candidateIndices,
   input.topK
 );
```

src/lib/workers/orchestrator.ts (6)
12-28: LGTM! Clean imports and focused types.

The migration to native bindings simplifies the type surface by removing obsolete fields from `RerankDoc`.
63-73: LGTM! Efficient parallel initialization.

The `ensureReady` method correctly initializes both the chunker and the native models in parallel, with proper promise caching.
75-109: LGTM! Solid chunking implementation.

The chunk processing correctly handles anchor chunks and maintains proper indexing.
111-148: LGTM! Proper chunk preparation.

The method correctly formats chunks with all necessary metadata and uses the updated `filePath` parameter consistently.
150-200: LGTM! Streamlined file processing pipeline.

The refactored flow efficiently processes files by chunking and embedding in clear sequential steps, properly handling edge cases with the `shouldDelete` flag.
315-340: LGTM! Clean singleton pattern and convenient API wrappers.

The module-level exports provide an ergonomic interface while maintaining a single orchestrator instance.
```json
  "license": "Apache-2.0",
  "description": "Local grep-like search tool for your codebase.",
  "dependencies": {
    "osgrep-core": "file:../osgrep_v2",
```
🧩 Analysis chain
🏁 Script executed:
```shell
# Check the full package.json to understand the context
cat -n package.json | head -60
```

Repository: Ryandonofrio3/osgrep
Length of output: 2126
🏁 Script executed:
```shell
# Check if workspaces are configured
rg '"workspaces"' package.json -A 10
```

Repository: Ryandonofrio3/osgrep
Length of output: 46
🏁 Script executed:
```shell
# Check if osgrep_v2 directory exists and has a package.json
ls -la ../osgrep_v2/ 2>/dev/null | head -20
```

Repository: Ryandonofrio3/osgrep
Length of output: 46
🏁 Script executed:
```shell
# Check parent directory structure
ls -la .. | grep -E "^d.*osgrep"
```

Repository: Ryandonofrio3/osgrep
Length of output: 46
🏁 Script executed:
```shell
# Look for any build or publishing configuration
rg '"publishConfig"|"main"|"exports"' package.json
```

Repository: Ryandonofrio3/osgrep
Length of output: 85
🏁 Script executed:
```shell
# Check if osgrep-core is published to npm
npm view osgrep-core@latest 2>/dev/null | head -20 || echo "Package not found on npm or npm not available"
```

Repository: Ryandonofrio3/osgrep
Length of output: 46
🏁 Script executed:
```shell
# Double-check the prepublishOnly script and files array
cat package.json | grep -A 2 -E '"prepublishOnly"|"files"'
```

Repository: Ryandonofrio3/osgrep
Length of output: 162
Local file dependency will break npm distribution.
The dependency `"osgrep-core": "file:../osgrep_v2"` references a local path outside the package. This will cause installation failures for users installing via npm/pnpm, since the relative path won't exist on their systems. The `files` array in package.json does not include this directory, so it cannot be bundled.
Consider one of:
- Publish `osgrep-core` to npm and reference it by version
- Use a workspace configuration if this is a monorepo
- Bundle the native addon within this package
- Use a postinstall script to build/fetch the native dependency
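As one hedged illustration of the first option: napi-rs projects commonly publish per-platform binary packages that the main package pulls in as optional dependencies, so only the matching platform binary is installed. The package names and versions below are hypothetical, not actual published packages:

```json
{
  "dependencies": {
    "osgrep-core": "^0.1.0"
  },
  "optionalDependencies": {
    "osgrep-core-darwin-arm64": "^0.1.0",
    "osgrep-core-linux-x64-gnu": "^0.1.0",
    "osgrep-core-win32-x64-msvc": "^0.1.0"
  }
}
```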
🤖 Prompt for AI Agents
In package.json at line ~50 (the dependency entry "osgrep-core":
"file:../osgrep_v2"), the package references a local file path which will break
installs for external consumers; replace the file: reference with one of the
recommended approaches: publish osgrep-core to npm and change the dependency to
the published semantic version, convert the repo to a workspace/monorepo and
reference the package via workspace protocol (and ensure workspace config is
committed), bundle the compiled native addon into this package and add it to the
"files" array, or implement a postinstall script that builds or downloads the
native dependency during install; update package.json accordingly and ensure
CI/build scripts reflect the chosen approach.
```typescript
  // Pre-warm native models (downloads via HuggingFace Hub cache on first run)
  console.log("\nInitializing native models...");
  await initNative();
  console.log("✓ Native models ready");
```
Missing error handling for the `initNative()` call.

Unlike doctor.ts, which wraps `initNative()` in a try/catch with a helpful error message, the setup command will crash ungracefully if native initialization fails. Consider consistent error handling:
```diff
   // Pre-warm native models (downloads via HuggingFace Hub cache on first run)
   console.log("\nInitializing native models...");
-  await initNative();
-  console.log("✓ Native models ready");
+  try {
+    await initNative();
+    console.log("✓ Native models ready");
+  } catch (error) {
+    console.error("✗ Native model initialization failed:", error);
+    console.error("  Check network connectivity and disk space.");
+    process.exit(1);
+  }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```typescript
  // Pre-warm native models (downloads via HuggingFace Hub cache on first run)
  console.log("\nInitializing native models...");
  try {
    await initNative();
    console.log("✓ Native models ready");
  } catch (error) {
    console.error("✗ Native model initialization failed:", error);
    console.error("  Check network connectivity and disk space.");
    process.exit(1);
  }
```
🤖 Prompt for AI Agents
In src/commands/setup.ts around lines 37 to 40, the call to initNative() lacks
error handling and can crash the CLI; wrap the await initNative() call in a
try/catch, log a clear error message including the caught error details (use the
same logger pattern as doctor.ts or console.error), then exit with a non-zero
code (e.g., process.exit(1)) or rethrow after logging so the failure is handled
gracefully.
```diff
     // Pack all doc embeddings into a single buffer; offsets are element offsets
     const packedChunks: Int8Array[] = [];
     let totalElements = 0;

     for (let i = 0; i < input.docs.length; i++) {
       const doc = input.docs[i];
       const col = doc.colbert;

       let colbert: Int8Array;
       if (col instanceof Int8Array) {
         colbert = col;
       } else if (Buffer.isBuffer(col)) {
         colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
       } else if (
         col &&
         typeof col === "object" &&
         "type" in col &&
         (col as any).type === "Buffer" &&
         Array.isArray((col as any).data)
       ) {
         // IPC serialization fallback (still copies, but unavoidable without SharedArrayBuffer)
         colbert = new Int8Array((col as any).data);
       } else if (Array.isArray(col)) {
         colbert = new Int8Array(col);
       } else {
         colbert = new Int8Array(0);
       }

       const seqLen = Math.floor(colbert.length / input.colbertDim);
-      const docMatrix: Float32Array[] = [];
-      for (let i = 0; i < seqLen; i++) {
-        const start = i * input.colbertDim;
-        const row = new Float32Array(input.colbertDim);
-        for (let d = 0; d < input.colbertDim; d++) {
-          row[d] = (colbert[start + d] * doc.scale) / 127.0;
-        }
-        docMatrix.push(row);
-      }
-      const tokenIds =
-        Array.isArray(doc.token_ids) && doc.token_ids.length === seqLen
-          ? doc.token_ids
-          : undefined;
-      return maxSim(queryMatrix, docMatrix, tokenIds);
+      const used = colbert.subarray(0, seqLen * input.colbertDim);

       docOffsets.push(totalElements);
       docLengths.push(seqLen);
       candidateIndices.push(i);
       packedChunks.push(used);
       totalElements += used.length;
     }
```
Clarify offset semantics: bytes vs. elements.
The implementation uses element offsets (line 276: `docOffsets.push(totalElements)`, line 280: `totalElements += used.length`), but the `RerankInput` interface in src/lib/native/index.ts (line 203) documents `docOffsets` as "Byte offsets per doc."

For `Int8Array`, elements and bytes are equivalent (1 element = 1 byte), so the code is functionally correct. However, the inconsistent terminology is confusing.
Apply these changes for clarity:
In src/lib/native/index.ts, line 203:
```diff
   /** Token counts per doc */
   docLengths: number[];
-  /** Byte offsets per doc */
+  /** Element offsets per doc in docEmbeddings */
   docOffsets: number[];
```

In src/lib/workers/orchestrator.ts, line 254:
```diff
-    // Pack all doc embeddings into a single buffer; offsets are element offsets
+    // Pack all doc embeddings into a single buffer
     const packedChunks: Int8Array[] = [];
     let totalElements = 0;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```typescript
    // Pack all doc embeddings into a single buffer
    const packedChunks: Int8Array[] = [];
    let totalElements = 0;
    for (let i = 0; i < input.docs.length; i++) {
      const doc = input.docs[i];
      const col = doc.colbert;
      let colbert: Int8Array;
      if (col instanceof Int8Array) {
        colbert = col;
      } else if (Buffer.isBuffer(col)) {
        colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
      } else if (Array.isArray(col)) {
        colbert = new Int8Array(col);
      } else {
        colbert = new Int8Array(0);
      }
      const seqLen = Math.floor(colbert.length / input.colbertDim);
      const used = colbert.subarray(0, seqLen * input.colbertDim);
      docOffsets.push(totalElements);
      docLengths.push(seqLen);
      candidateIndices.push(i);
      packedChunks.push(used);
      totalElements += used.length;
    }
```
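To make the bytes-versus-elements point concrete: for `Int8Array`, element offsets and byte offsets coincide because `BYTES_PER_ELEMENT` is 1, but the packing scheme itself generalizes. A standalone sketch (not the orchestrator code):

```typescript
// Pack variable-length Int8Array docs into one buffer, recording element offsets.
function pack(docs: Int8Array[]): { packed: Int8Array; offsets: number[]; lengths: number[] } {
  const offsets: number[] = [];
  const lengths: number[] = [];
  let total = 0;
  for (const d of docs) {
    offsets.push(total);   // element offset of this doc in the packed buffer
    lengths.push(d.length);
    total += d.length;
  }
  const packed = new Int8Array(total);
  let cursor = 0;
  for (const d of docs) {
    packed.set(d, cursor);
    cursor += d.length;
  }
  return { packed, offsets, lengths };
}

const { packed, offsets } = pack([new Int8Array([1, 2, 3]), new Int8Array([4, 5])]);
// offsets are element offsets: [0, 3]. For Int8Array the byte offsets are identical;
// for a Float32Array buffer they would be 4x larger.
```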
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/lib/workers/orchestrator.ts (1)
257-292: Offset semantics clarification needed (see past review).

As noted in a previous review, the code uses element offsets (line 285: `docOffsets.push(totalElements)`), but the `RerankInput` interface documents `docOffsets` as "Byte offsets per doc." For `Int8Array`, elements equal bytes, so the code is functionally correct, but the terminology should be clarified.

The previous review suggested updating the interface documentation in src/lib/native/index.ts to say "Element offsets per doc in docEmbeddings" instead of "Byte offsets per doc."
🧹 Nitpick comments (5)
src/lib/search/searcher.ts (3)
16-20: Constants are well-defined with reasonable defaults.

The static readonly constants provide clear tuning parameters for the search pipeline. Consider documenting the rationale for these specific values (e.g., why `RERANK_CANDIDATES_K = 80` and `FUSED_WEIGHT = 0.5`) in comments for future maintainers.
21-75: FTS normalization looks good, with one minor suggestion.

The stopword filtering and query normalization are well-implemented. The 16-token limit prevents pathological FTS parsing. One minor consideration: the minimum token length of 3 (line 70) could filter out meaningful short terms like "id", "io", or "db".
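A sketch of that normalization shape, for context. The stopword set, minimum length, and token cap below are illustrative stand-ins, not the actual values or word list in searcher.ts:

```typescript
const STOPWORDS = new Set(["the", "and", "for", "with", "how", "does"]); // illustrative subset

// Lowercase, split on non-word characters, drop short tokens and stopwords,
// and cap the token count to keep FTS parsing well-behaved.
function normalizeFtsQuery(raw: string, minLen = 3, maxTokens = 16): string[] {
  return raw
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((t) => t.length >= minLen && !STOPWORDS.has(t))
    .slice(0, maxTokens);
}
```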
418-438: Defensive handling for token_ids conversion.

The nested ternary for `token_ids` extraction (lines 423-428) handles multiple input formats correctly. However, the deep nesting reduces readability.

Consider extracting a helper function:

```diff
+  private extractTokenIds(doc: any): number[] {
+    if (Array.isArray(doc.doc_token_ids)) {
+      return doc.doc_token_ids;
+    }
+    if (typeof doc.doc_token_ids?.toArray === "function") {
+      return (doc.doc_token_ids.toArray() ?? []).filter(
+        (v: unknown) => typeof v === "number"
+      );
+    }
+    return [];
+  }
```

tools/eval.ts (1)
533-540: Avoidance logic is correct but slightly redundant.

Line 536 checks `avoidPath ? path.includes(avoidPath) : false`, but `avoidPath` is always truthy due to the nullish coalescing on line 533. The `avoidPath ?` condition is technically redundant.

```diff
   const avoidRank = response.data.findIndex((chunk) => {
     const path = chunk.metadata?.path?.toLowerCase() || "";
-    return avoidPath ? path.includes(avoidPath) : false;
+    return path.includes(avoidPath);
   });
```

src/lib/native/index.ts (1)
79-91: Dense embedding extraction is correct but could use subarray/slice.

The manual element-by-element copy works, but `Float32Array.prototype.slice()` would be more idiomatic and potentially faster:

```diff
 const vectors: Float32Array[] = [];
 for (let i = 0; i < result.count; i++) {
   const start = i * dim;
-  const vec = new Float32Array(dim);
-  for (let j = 0; j < dim; j++) {
-    vec[j] = result.embeddings[start + j];
-  }
+  const vec = result.embeddings.slice(start, start + dim);
   vectors.push(vec);
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (7)
- `src/commands/search.ts` (5 hunks)
- `src/lib/native/index.ts` (1 hunks)
- `src/lib/search/searcher.ts` (10 hunks)
- `src/lib/store/types.ts` (0 hunks)
- `src/lib/store/vector-db.ts` (1 hunks)
- `src/lib/workers/orchestrator.ts` (8 hunks)
- `tools/eval.ts` (6 hunks)
💤 Files with no reviewable changes (1)
- src/lib/store/types.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/lib/store/vector-db.ts
🧰 Additional context used
🧬 Code graph analysis (1)
src/lib/native/index.ts (1)
src/config.ts (2)
- `MODEL_IDS` (4-7)
- `CONFIG` (18-26)
🔇 Additional comments (20)
src/lib/search/searcher.ts (4)
11-11: LGTM on orchestrator import migration.

The import correctly switches from worker-pool based encoding to the orchestrator-based `encodeQuery` and `rerank` functions, aligning with the native Rust backend migration.
195-231: Structure boost simplification is appropriate.

The removal of intent-based and complex type boosts aligns with the PR's goal of simplifying the search pipeline. The path-based penalties (tests: 0.5, tools: 0.35, docs: 0.6) are reasonable heuristics. The anchor penalty of 0.99 is minimal and appropriate for recall helpers.
278-289: Search method signature simplification looks correct.

The removal of the `intent` parameter and simplification of options aligns with the broader refactor. The default `rerank: true` is a sensible default for quality results.
481-499: Score normalization and confidence labeling are well-implemented.

The relative normalization against `maxScore` and the confidence thresholds (>0.8 High, >0.5 Medium, else Low) provide meaningful user-facing quality indicators. The guard against `maxScore > 0` prevents division by zero.

tools/eval.ts (3)
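Sketched as code, the normalization and labeling described above look roughly like this (thresholds as quoted in the comment; the function shape and name are illustrative):

```typescript
// Normalize scores relative to the maximum and attach a confidence label.
function labelScores(scores: number[]): { normalized: number; confidence: string }[] {
  const maxScore = Math.max(...scores, 0);
  return scores.map((s) => {
    const normalized = maxScore > 0 ? s / maxScore : 0; // guard against division by zero
    const confidence = normalized > 0.8 ? "High" : normalized > 0.5 ? "Medium" : "Low";
    return { normalized, confidence };
  });
}
```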
1-8: Import path corrections are appropriate.

The imports correctly navigate from `tools/` to `../src/lib/` to access the library modules. This maintains consistency with the project structure.
17-18: Smart self-exclusion from eval results.

Using `DEFAULT_AVOID_PATH = "tools/eval.ts"` prevents the eval script itself from appearing in search results, which is a good practice since this file contains many query-related strings that could match incorrectly.
61-76: Eval cases correctly updated for native binding migration.

The test cases now reference src/lib/native/index.ts for native core-related functionality (`loadNative`, `initNative`, `rerankColbert`), aligning with the architectural shift from worker-based to native bindings.

src/commands/search.ts (4)
264-268: CLI option naming improvement.

The change from `-m <max_count>` to `-m, --max-count <max_count>`, with a clear default of "5", improves CLI usability and self-documentation.
299-309: Options type correctly updated.

The `maxCount: string` type aligns with Commander's string-based option parsing. The removal of the skeleton option from the type reflects the broader skeleton functionality removal.
316-318: Robust limit validation.

The validation ensures a positive finite number, with a sensible fallback to 5. This handles edge cases like NaN, negative numbers, and Infinity.
492-498: Consistent limit usage in local search path.

The `limit` variable is correctly passed to `searcher.search()`, matching the server-side behavior. The parameter ordering aligns with the updated `Searcher.search` signature.

src/lib/native/index.ts (3)
15-17: Module-level state is appropriate for the singleton pattern.

The three state variables (`native`, `initPromise`, `initialized`) correctly implement the lazy-loading singleton pattern with initialization guards.
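The pattern being praised here, module-level state plus a cached init promise so concurrent callers share a single initialization, can be sketched as follows (names and the instrumentation counter are illustrative, not the actual module):

```typescript
let initialized = false;
let initPromise: Promise<void> | null = null;
let initCount = 0; // instrumentation for the demo only

async function loadModels(): Promise<void> {
  initCount++; // stand-in for the real native model loading
}

function ensureReady(): Promise<void> {
  if (initialized) return Promise.resolve();
  if (!initPromise) {
    initPromise = loadModels()
      .then(() => { initialized = true; })
      .finally(() => { initPromise = null; }); // allow retry after failure
  }
  return initPromise;
}

// Concurrent callers all await the same underlying initialization.
await Promise.all([ensureReady(), ensureReady(), ensureReady()]);
```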
151-179: Hybrid embedding extraction handles all formats correctly.

The batch embedding function properly extracts dense vectors, ColBERT embeddings, and token IDs per document. The `tokenCursor` tracking ensures correct token ID slicing across documents.
231-261: Verify native `rerankColbert` parameter type expectations against the `encodeQueryColbert` return type.

Line 235 converts `input.queryEmbedding` (typed as `Float32Array`) to `Float64Array` via `Float64Array.from(input.queryEmbedding as any)`. This creates an inconsistency: `encodeQueryColbert()` at line 197 explicitly returns `Float32Array`, yet `rerankColbert()` immediately converts it to `Float64Array`.

According to NAPI-RS documentation, TypedArray types must match their Rust counterparts: mixing `Float32Array` and `Float64Array` does not trigger automatic reinterpretation, so the native function must expect one or the other. If the native `rerankColbert` function accepts `Float32Array` (matching `encodeQueryColbert`'s output), the conversion introduces unnecessary memory allocation and precision-change overhead. The obscuring `as any` cast makes the intent unclear.

Recommend verifying the Rust binding signature to confirm whether the conversion is required or should be removed to maintain type consistency.
src/lib/workers/orchestrator.ts (6)
1-10: Clear module documentation. The header comment clearly explains the module's responsibilities and the rationale for removing the worker pool (Rust ONNX Runtime is fast and stable).
64-74: Parallel initialization is appropriate. Using `Promise.all` to initialize both the chunker and native models concurrently is efficient. The `.finally()` cleanup ensures `initPromise` is cleared even on failure.
183-201: Clean embedding integration. The `embedBatch` call efficiently embeds all chunks in one native call. Converting ColBERT to Buffer for storage compatibility is appropriate. The `doc_token_ids` are correctly extracted from the hybrid embedding.
204-235: Query encoding correctly reshapes ColBERT output. The function properly converts the flat ColBERT array into a 2D matrix. The calculation `seqLen = colbertFlat.length / dim` correctly determines the sequence length.
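As a minimal sketch of the reshape this comment describes (hypothetical names; the actual implementation lives in the orchestrator):

```typescript
// Reshape a flat ColBERT embedding [seqLen * dim] into a 2D matrix [seqLen][dim].
// `colbertFlat` and `dim` are hypothetical stand-ins for the orchestrator's values.
function reshapeColbert(colbertFlat: Float32Array, dim: number): Float32Array[] {
  if (colbertFlat.length % dim !== 0) {
    throw new Error(`Flat length ${colbertFlat.length} is not a multiple of dim ${dim}`);
  }
  const seqLen = colbertFlat.length / dim;
  const matrix: Float32Array[] = [];
  for (let i = 0; i < seqLen; i++) {
    // slice() copies; subarray() would share the underlying buffer
    matrix.push(colbertFlat.slice(i * dim, (i + 1) * dim));
  }
  return matrix;
}

const flat = Float32Array.from({ length: 6 }, (_, i) => i); // 2 tokens x dim 3
const m = reshapeColbert(flat, 3);
console.log(m.length);          // 2
console.log(Array.from(m[1]));  // [3, 4, 5]
```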
318-327: Score alignment preserves input order. The mapping from result indices back to input order via `candidateIndices.map` ensures the Searcher receives scores in the expected order. This is critical for correct result ranking.
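A minimal sketch of this kind of index-to-input-order alignment (hypothetical helper, not the orchestrator's actual code):

```typescript
// Map reranker output (indices into the candidate list, plus scores) back to a
// score array aligned with the original candidate order. Unranked slots get 0.
function alignScores(candidateCount: number, indices: number[], scores: number[]): number[] {
  const aligned = new Array<number>(candidateCount).fill(0);
  for (let i = 0; i < indices.length; i++) {
    aligned[indices[i]] = scores[i];
  }
  return aligned;
}

console.log(alignScores(4, [2, 0], [0.9, 0.5])); // [0.5, 0, 0.9, 0]
```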
334-341: Singleton pattern is correct. The lazy initialization of the orchestrator singleton is appropriate for this use case. The null check ensures only one instance is created.
```ts
export async function initNative(): Promise<void> {
  if (initialized) return;
  if (initPromise) return initPromise;

  initPromise = (async () => {
    const n = await loadNative();

    if (!n.isInitialized()) {
      n.initModels(MODEL_IDS.embed, MODEL_IDS.colbert);
    }

    initialized = true;
  })();

  await initPromise;
  initPromise = null;
}
```
Idempotent initialization with race condition protection.
The `initNative()` function correctly handles concurrent calls by checking `initPromise` before starting initialization. However, setting `initPromise = null` on line 51 after awaiting could cause a race condition if another call starts during the brief window.
Consider keeping the promise cached or using a different pattern:
```diff
 export async function initNative(): Promise<void> {
   if (initialized) return;
   if (initPromise) return initPromise;
   initPromise = (async () => {
     const n = await loadNative();
     if (!n.isInitialized()) {
       n.initModels(MODEL_IDS.embed, MODEL_IDS.colbert);
     }
     initialized = true;
   })();
   await initPromise;
-  initPromise = null;
+  // Keep initPromise cached to handle concurrent calls during init
 }
```

🤖 Prompt for AI Agents
In src/lib/native/index.ts around lines 36–52, the code clears initPromise after
awaiting which opens a race where another caller could start a new init during
that brief gap; instead, keep initPromise set so concurrent callers continue
awaiting the same promise (i.e., remove the initPromise = null line) or
explicitly set it to a resolved Promise to represent completed initialization,
and wrap the initialization in try/catch so you only clear/reset initPromise on
failure (not on success) to allow retries on error but prevent concurrent
double-initialization.
Actionable comments posted: 15
♻️ Duplicate comments (1)
src/lib/native/index.ts (1)
36-52: Race condition when clearing `initPromise` after initialization. Setting `initPromise = null` on line 51 after awaiting could cause a race condition if another call starts during that brief window. This was flagged in a previous review. Apply the previously suggested fix:

```diff
   await initPromise;
-  initPromise = null;
+  // Keep initPromise cached to handle concurrent calls during init
 }
```
🧹 Nitpick comments (23)
.gitignore (1)
32-33: Minor: Remove redundant target directory pattern. Line 33's `**/target/` already matches `osgrep-core/target/` from line 32, making line 32 redundant. Consider removing line 32 to keep the patterns concise. Apply this diff to remove redundancy:

```diff
 # Native core (Rust / N-API) build outputs
-osgrep-core/target/
 **/target/
 osgrep-core/*.node
```

osgrep-core/tsconfig.json (1)
8-8: Consider removing unused `jsx` configuration. The `jsx: "react-jsx"` option appears unnecessary for a native binding package that likely doesn't contain React components. Removing it would simplify the configuration.
osgrep-core/package.json (1)

9-17: Consider adding Linux ARM64 target for broader platform support. The triples configuration is missing `aarch64-unknown-linux-gnu`, which is increasingly common for AWS Graviton instances, Docker containers on Apple Silicon, and other ARM-based Linux deployments.

```diff
 "additional": [
   "aarch64-apple-darwin",
   "x86_64-apple-darwin",
   "x86_64-unknown-linux-gnu",
-  "x86_64-pc-windows-msvc"
+  "x86_64-pc-windows-msvc",
+  "aarch64-unknown-linux-gnu"
 ]
```
osgrep-core/src/lib.rs (2)

37-55: TOCTOU race in initialization is benign but could be cleaner. The pattern `if MODEL.get().is_none() { MODEL.set(...) }` has a race window where multiple threads could both pass the check and attempt initialization. While `OnceCell::set` safely returns an error if already set, this means redundant model loading could occur.

Consider using `OnceCell::get_or_try_init` for atomic initialization:

```diff
 pub fn init_models(dense_repo: String, colbert_repo: String) -> Result<()> {
-    // Initialize dense model
-    if DENSE_MODEL.get().is_none() {
-        let encoder = DenseEncoderOrt::load_from_hf(&dense_repo, 384)
-            .map_err(|e| Error::from_reason(format!("Failed to load dense model: {:?}", e)))?;
-        DENSE_MODEL.set(Mutex::new(encoder))
-            .map_err(|_| Error::from_reason("Dense model already initialized"))?;
-    }
+    DENSE_MODEL.get_or_try_init(|| {
+        DenseEncoderOrt::load_from_hf(&dense_repo, 384)
+            .map(Mutex::new)
+            .map_err(|e| Error::from_reason(format!("Failed to load dense model: {:?}", e)))
+    })?;

-    // Initialize ColBERT model
-    if COLBERT_MODEL.get().is_none() {
-        let encoder = ColbertEncoderOrt::load_from_hf(&colbert_repo, 48)
-            .map_err(|e| Error::from_reason(format!("Failed to load ColBERT model: {:?}", e)))?;
-        COLBERT_MODEL.set(Mutex::new(encoder))
-            .map_err(|_| Error::from_reason("ColBERT model already initialized"))?;
-    }
+    COLBERT_MODEL.get_or_try_init(|| {
+        ColbertEncoderOrt::load_from_hf(&colbert_repo, 48)
+            .map(Mutex::new)
+            .map_err(|e| Error::from_reason(format!("Failed to load ColBERT model: {:?}", e)))
+    })?;

     Ok(())
 }
```
254-265: Unnecessary clone and double mutex acquisition in `embed_batch`. `texts.clone()` creates an extra allocation, and calling `embed_dense` then `embed_colbert_packed` acquires/releases each mutex separately. For better performance with large batches, consider inlining the logic to avoid the clone and reduce lock contention.

```diff
 pub fn embed_batch(texts: Vec<String>) -> Result<EmbedResult> {
-    let dense = embed_dense(texts.clone())?;
-    let colbert = embed_colbert_packed(texts)?;
+    // Inline to avoid clone and reduce lock acquisitions
+    let dense_model = DENSE_MODEL.get()
+        .ok_or_else(|| Error::from_reason("Models not initialized. Call init_models() first."))?;
+    let colbert_model = COLBERT_MODEL.get()
+        .ok_or_else(|| Error::from_reason("Models not initialized. Call init_models() first."))?;
+
+    let embeddings_f32 = {
+        let mut encoder = dense_model.lock()
+            .map_err(|e| Error::from_reason(format!("Failed to lock dense model: {:?}", e)))?;
+        encoder.encode_batch(texts.clone(), true)
+            .map_err(|e| Error::from_reason(format!("Dense encoding failed: {:?}", e)))?
+    };
+
+    let packed = {
+        let mut encoder = colbert_model.lock()
+            .map_err(|e| Error::from_reason(format!("Failed to lock ColBERT model: {:?}", e)))?;
+        encoder.encode_docs_packed(&texts)
+            .map_err(|e| Error::from_reason(format!("ColBERT encoding failed: {:?}", e)))?
+    };
+
+    let embeddings_i8: Vec<i8> = packed.embeddings.iter()
+        .map(|&x| (x * 127.0).clamp(-128.0, 127.0) as i8)
+        .collect();

     Ok(EmbedResult {
-        dense: dense.embeddings,
-        colbert_embeddings: colbert.embeddings,
-        colbert_token_ids: colbert.token_ids,
-        colbert_lengths: colbert.lengths,
-        colbert_offsets: colbert.offsets,
+        dense: embeddings_f32.iter().map(|&x| x as f64).collect(),
+        colbert_embeddings: embeddings_i8,
+        colbert_token_ids: packed.token_ids,
+        colbert_lengths: packed.lengths,
+        colbert_offsets: packed.offsets,
     })
 }
```

Note: The clone is still needed since both encoders need the texts, but this avoids acquiring each mutex twice when called through the individual functions.
osgrep-core/src/colbert_ort.rs (4)
76-79: Silent fallback for missing special tokens could mask configuration issues. Using `unwrap_or(&0)` for CLS, SEP, MASK, and PAD tokens will silently default to token ID 0 if these tokens are missing from the vocabulary. This could lead to incorrect encoding behavior that's hard to debug. Consider returning an error or at least logging a warning when critical tokens are missing.

```diff
-    let cls_id = *vocab.get("[CLS]").unwrap_or(&0);
-    let sep_id = *vocab.get("[SEP]").unwrap_or(&0);
-    let mask_id = *vocab.get("[MASK]").unwrap_or(&0);
-    let pad_id = *vocab.get("[PAD]").unwrap_or(&mask_id); // Use MASK as PAD if no PAD
+    let cls_id = *vocab.get("[CLS]")
+        .ok_or_else(|| anyhow::anyhow!("Tokenizer missing [CLS] token"))?;
+    let sep_id = *vocab.get("[SEP]")
+        .ok_or_else(|| anyhow::anyhow!("Tokenizer missing [SEP] token"))?;
+    let mask_id = *vocab.get("[MASK]")
+        .ok_or_else(|| anyhow::anyhow!("Tokenizer missing [MASK] token"))?;
+    let pad_id = *vocab.get("[PAD]").unwrap_or(&mask_id); // Use MASK as PAD if no PAD
```
496-502: Token offset calculation is O(n) per document, leading to O(n²) complexity. The `token_offset` is recomputed by summing all previous lengths for each document. Since `PackedDocEmbeddings` already stores `offsets` for embeddings, consider adding a `token_offsets` field to store precomputed token offsets, making lookups O(1).

```diff
 pub struct PackedDocEmbeddings {
     pub embeddings: Vec<f32>,
     pub token_ids: Vec<u32>,
     pub lengths: Vec<u32>,
     pub offsets: Vec<u32>,
+    /// Token ID offsets for each doc (for fast lookup)
+    pub token_offsets: Vec<u32>,
     pub hidden_size: usize,
 }
```

Then update `encode_docs_packed` to populate `token_offsets` alongside `offsets`, and simplify `score_packed`:

```diff
-    let token_offset: usize = packed.offsets[..doc_idx]
-        .iter()
-        .zip(&packed.lengths[..doc_idx])
-        .map(|(&off, &len)| len as usize)
-        .sum();
+    let token_offset = packed.token_offsets[doc_idx] as usize;
```
172-183: Consider extracting L2 normalization to a helper function. The L2 normalization logic is duplicated in `encode_query` (lines 172-183) and `encode_docs` (lines 286-297). Extracting this to a helper would reduce duplication and make future changes easier.

```rust
/// L2 normalize a single token embedding in-place within the destination buffer
fn l2_normalize_token(src: &[f32], dst: &mut [f32], hidden_size: usize) {
    let mut sum_sq = 0.0f32;
    for d in 0..hidden_size {
        sum_sq += src[d] * src[d];
    }
    let norm = sum_sq.sqrt().max(1e-12);
    for d in 0..hidden_size {
        dst[d] = src[d] / norm;
    }
}
```

Also applies to: 286-297
65-68: Thread count differs from dense_ort.rs. This uses 8 intra-op threads while `dense_ort.rs` uses 4. If intentional due to ColBERT's heavier workload, consider adding a comment explaining the choice. Otherwise, consider using a shared constant for consistency.

osgrep-core/test.mjs (2)
24-27: Smoke test lacks assertions. The test logs values but doesn't assert correctness. If `embedDense` returns malformed data, the test would still print "All tests passed". Consider adding basic assertions.

```diff
 const dense = embedDense(texts);
 console.log(`Dense embed ${texts.length} texts in ${Date.now() - t0}ms`);
 console.log(`  count: ${dense.count}`);
-console.log(`  embeddings length: ${dense.embeddings.length} (expected: ${texts.length * 384})`);
+const expectedLength = texts.length * 384;
+console.log(`  embeddings length: ${dense.embeddings.length} (expected: ${expectedLength})`);
+if (dense.count !== texts.length) {
+  throw new Error(`Expected count ${texts.length}, got ${dense.count}`);
+}
+if (dense.embeddings.length !== expectedLength) {
+  throw new Error(`Expected embeddings length ${expectedLength}, got ${dense.embeddings.length}`);
+}
```
4-5: Consider documenting model dimensions. The expected dimension (384) is hardcoded on line 27. Consider adding a comment here noting the expected dimensions for each model, or querying the dimension from the API if available.

```diff
-const DENSE_REPO = 'onnx-community/granite-embedding-30m-english-ONNX';
-const COLBERT_REPO = 'ryandono/mxbai-edge-colbert-v0-17m-onnx-int8';
+const DENSE_REPO = 'onnx-community/granite-embedding-30m-english-ONNX'; // 384-dim
+const COLBERT_REPO = 'ryandono/mxbai-edge-colbert-v0-17m-onnx-int8'; // 48-dim ColBERT
```

osgrep-core/bench/rerank.ts (1)
51-57: Consider bounds checking for candidate preview. If `candidates[idx]` is shorter than 80 characters, `.slice(0, 80)` is safe, but if `idx` is out of bounds due to a bug in the native code, this would throw. Consider adding a guard.

```diff
 for (let i = 0; i < result.indices.length; i++) {
   const idx = result.indices[i];
   const score = result.scores[i];
-  const preview = candidates[idx].slice(0, 80).replace(/\n/g, " ").trim();
+  const preview = idx < candidates.length
+    ? candidates[idx].slice(0, 80).replace(/\n/g, " ").trim()
+    : "<invalid index>";
   console.log(`  [${i + 1}] idx=${idx}, score=${score.toFixed(4)}`);
   console.log(`      "${preview}..."`);
 }
```

osgrep-core/bench/rerank-ort.ts (1)
21-27: Redundant file enumeration. `walkFiles` and `isCodeFile` are called here for logging, but `loadRepoChunks` internally calls them again. Consider either removing the redundant calls or refactoring `loadRepoChunks` to accept pre-computed file lists.

```diff
 // 2. Load chunks
 console.log("Loading code files and chunking...");
-const files = walkFiles(REPO_DIR);
-const codeFiles = files.filter(isCodeFile);
 const allChunks = loadRepoChunks(REPO_DIR, 80);
-console.log(`  Files found: ${files.length}`);
-console.log(`  Code files: ${codeFiles.length}`);
 console.log(`  Total chunks: ${allChunks.length}`);
 console.log(`  Candidates per query: ${NUM_CANDIDATES}`);
 console.log(`  Top-K: ${TOP_K}`);
```

osgrep-core/bench/dense.ts (1)
21-29: Same redundant file enumeration pattern. Like in `rerank-ort.ts`, `walkFiles` and `isCodeFile` are called separately just for logging, while `loadRepoChunks` recomputes them internally. Consider refactoring for consistency across benchmarks.

osgrep-core/index.js (1)
19-31: Linux musl/gnu detection: prefer "try both" over brittle detection. Reading the `ldd` binary to detect musl is fragile, and mis-detection forces a hard failure. Consider attempting the `musl` load, then falling back to `gnu` (or vice versa) when `arch === "x64"`.

Also applies to: 61-77
osgrep-core/bench/codeatlas-system.ts (1)
276-300: Watch out for memory blowups: `number[][]` plus per-chunk slices copies embeddings. For larger repos, this can get expensive fast (many chunks × 384 dims × JS array overhead). Consider keeping a single flat `Float32Array` (or `number[]`) plus `hiddenSize` and computing cosine similarity via offsets, avoiding per-chunk array allocations.

osgrep-core/src/dense_ort.rs (4)
21-66: Silent failure when downloading external data file. Line 31 silently ignores errors when downloading `model.onnx_data`. While the comment says "ignore if not found," this could mask real download failures (network issues, auth problems) for models that require the external data file. Consider logging a debug message or checking if the model actually needs the external data:

```diff
-    let _ = repo.get("onnx/model.onnx_data"); // External data, ignore if not found
+    // External data file - optional for some models
+    if let Err(_) = repo.get("onnx/model.onnx_data") {
+        // External data not found - this is expected for smaller models
+    }
```
43-56: Duplicate tokenizer configuration logic between `load_from_hf` and `load`. The truncation and padding configuration (lines 46-56 and 82-91) is duplicated. Extract this into a helper function to maintain DRY.

```rust
fn configure_tokenizer(tokenizer: &mut Tokenizer, max_len: usize) -> anyhow::Result<()> {
    tokenizer.with_truncation(Some(tokenizers::TruncationParams {
        max_length: max_len,
        ..Default::default()
    })).map_err(|e| anyhow::anyhow!("Failed to set truncation: {}", e))?;
    tokenizer.with_padding(Some(tokenizers::PaddingParams {
        strategy: tokenizers::PaddingStrategy::BatchLongest,
        ..Default::default()
    }));
    Ok(())
}
```

Also applies to: 79-91
103-111: Consider avoiding the `texts.clone()` in tokenization. The `texts` vector is cloned unnecessarily since `encode_batch` should accept references.

```diff
-    pub fn encode_batch(&mut self, texts: Vec<String>, normalize: bool) -> anyhow::Result<Vec<f32>> {
+    pub fn encode_batch(&mut self, texts: &[String], normalize: bool) -> anyhow::Result<Vec<f32>> {
         if texts.is_empty() {
             return Err(anyhow::anyhow!("Empty input texts"));
         }

         // Tokenize
         let encodings = self.tokenizer
-            .encode_batch(texts.clone(), true)
+            .encode_batch(texts.to_vec(), true)
             .map_err(|e| anyhow::anyhow!("Tokenization failed: {}", e))?;
```

Note: This depends on caller patterns; if texts ownership isn't needed downstream, accepting a slice avoids an allocation at call sites.
osgrep-core/index.d.ts (2)
3-8: Hardcoded dimension 384 in comments may become stale. The comments reference `[batch_size * 384]`, which couples documentation to a specific model configuration. Consider referencing `VECTOR_DIM` conceptually or noting it's model-dependent.

```diff
 export interface DenseResult {
-  /** Flat array of embeddings [batch_size * 384] */
+  /** Flat array of embeddings [batch_size * VECTOR_DIM] */
   embeddings: number[];
```

Also applies to: 28-39
10-19: `ColbertPackedResult` missing `tokenIds` documentation consistency. The interface has `tokenIds` but the `EmbedResult` interface uses `colbertTokenIds`. This naming inconsistency could cause confusion. Consider aligning the naming: either both use `tokenIds` or both use the `colbertTokenIds` prefix.

src/lib/native/index.ts (2)
72-91: Dense embedding extraction is correct but could use `Float32Array.from` for clarity. The manual loop works, but `Float32Array.prototype.subarray` or `slice` could be more idiomatic.

```diff
 const vectors: Float32Array[] = [];
 for (let i = 0; i < result.count; i++) {
   const start = i * dim;
-  const vec = new Float32Array(dim);
-  for (let j = 0; j < dim; j++) {
-    vec[j] = result.embeddings[start + j];
-  }
+  // Create a copy of the slice (subarray would share the buffer)
+  const vec = new Float32Array(result.embeddings.slice(start, start + dim));
   vectors.push(vec);
 }
```

Note: This assumes `result.embeddings` is array-like with a `slice` method.
221-226: `RerankResult` interface duplicates the `osgrep-core/index.d.ts` definition. Both this file and `osgrep-core/index.d.ts` define `RerankResult`. Consider importing from the native binding types to maintain a single source of truth.

```diff
+import type { RerankResult as NativeRerankResult } from "osgrep-core";
+
+// Re-export for consumers
+export type RerankResult = NativeRerankResult;
-export interface RerankResult {
-  /** Original indices of top-k docs */
-  indices: number[];
-  /** MaxSim scores */
-  scores: number[];
-}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (3)
- `osgrep-core/Cargo.lock` is excluded by `!**/*.lock`
- `osgrep-core/bun.lockb` is excluded by `!**/bun.lockb`
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (24)
- `.gitignore` (2 hunks)
- `osgrep-core/.gitignore` (1 hunks)
- `osgrep-core/Cargo.toml` (1 hunks)
- `osgrep-core/bench/PARITY.md` (1 hunks)
- `osgrep-core/bench/_util.ts` (1 hunks)
- `osgrep-core/bench/codeatlas-system.ts` (1 hunks)
- `osgrep-core/bench/colbert-parity-rust.ts` (1 hunks)
- `osgrep-core/bench/colbert-parity.ts` (1 hunks)
- `osgrep-core/bench/dense-ort.ts` (1 hunks)
- `osgrep-core/bench/dense.ts` (1 hunks)
- `osgrep-core/bench/rerank-ort.ts` (1 hunks)
- `osgrep-core/bench/rerank-preindexed.ts` (1 hunks)
- `osgrep-core/bench/rerank.ts` (1 hunks)
- `osgrep-core/build.rs` (1 hunks)
- `osgrep-core/index.d.ts` (1 hunks)
- `osgrep-core/index.js` (1 hunks)
- `osgrep-core/package.json` (1 hunks)
- `osgrep-core/src/colbert_ort.rs` (1 hunks)
- `osgrep-core/src/dense_ort.rs` (1 hunks)
- `osgrep-core/src/lib.rs` (1 hunks)
- `osgrep-core/test.mjs` (1 hunks)
- `osgrep-core/tsconfig.json` (1 hunks)
- `package.json` (1 hunks)
- `src/lib/native/index.ts` (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- osgrep-core/.gitignore
- osgrep-core/bench/PARITY.md
🚧 Files skipped from review as they are similar to previous changes (1)
- package.json
🧰 Additional context used
🧬 Code graph analysis (13)
osgrep-core/bench/dense.ts (1)
osgrep-core/bench/_util.ts (4)
`cloneRepoIfMissing` (8-15), `walkFiles` (21-43), `isCodeFile` (48-62), `loadRepoChunks` (107-121)
osgrep-core/bench/rerank-preindexed.ts (1)
osgrep-core/bench/_util.ts (4)
`cloneRepoIfMissing` (8-15), `walkFiles` (21-43), `isCodeFile` (48-62), `loadRepoChunks` (107-121)
osgrep-core/bench/codeatlas-system.ts (1)
osgrep-core/bench/_util.ts (3)
`ChunkMeta` (126-132), `chunkOverlapsSpan` (198-219), `loadRepoChunksWithMeta` (142-192)
osgrep-core/bench/rerank-ort.ts (1)
osgrep-core/bench/_util.ts (4)
`cloneRepoIfMissing` (8-15), `walkFiles` (21-43), `isCodeFile` (48-62), `loadRepoChunks` (107-121)
osgrep-core/test.mjs (1)
src/lib/native/index.ts (2)
`embedDense` (72-91), `embedBatch` (139-182)
osgrep-core/bench/rerank.ts (1)
osgrep-core/bench/_util.ts (3)
`cloneRepoIfMissing` (8-15), `loadRepoChunks` (107-121), `pickCandidates` (100-102)
osgrep-core/src/dense_ort.rs (2)
osgrep-core/src/colbert_ort.rs (3)
`log_native` (7-11), `outputs` (163-163), `outputs` (271-271)

osgrep-core/test.mjs (2)

`texts` (22-22), `start` (15-15)
osgrep-core/bench/dense-ort.ts (3)
osgrep-core/build.rs (1)
`main` (3-5)

osgrep-core/bench/_util.ts (4)

`cloneRepoIfMissing` (8-15), `walkFiles` (21-43), `isCodeFile` (48-62), `loadRepoChunks` (107-121)

osgrep-core/index.js (2)

`process` (14-14), `process` (28-28)
osgrep-core/bench/colbert-parity.ts (1)
osgrep-core/bench/_util.ts (1)
`loadRepoChunksWithMeta` (142-192)
osgrep-core/index.d.ts (1)
src/lib/native/index.ts (5)
`RerankResult` (221-226), `embedDense` (72-91), `encodeQueryColbert` (192-198), `rerankColbert` (231-261), `embedBatch` (139-182)
osgrep-core/src/colbert_ort.rs (1)
osgrep-core/src/dense_ort.rs (3)
`log_native` (6-10), `load_from_hf` (21-66), `hidden_size` (205-207)
osgrep-core/src/lib.rs (3)
osgrep-core/src/colbert_ort.rs (1)
`load_from_hf` (37-103)

osgrep-core/src/dense_ort.rs (2)

`load_from_hf` (21-66), `hidden_size` (205-207)

osgrep-core/index.d.ts (4)

`DenseResult` (3-8), `ColbertPackedResult` (10-19), `RerankResult` (21-26), `EmbedResult` (28-39)
src/lib/native/index.ts (2)
src/config.ts (2)
`MODEL_IDS` (4-7), `CONFIG` (18-26)

osgrep-core/index.d.ts (1)

`RerankResult` (21-26)
🔇 Additional comments (30)
.gitignore (1)
31-37: Excellent addition of Rust backend build outputs. The new ignores properly account for the osgrep-core Rust crate structure and its native N-API bindings:

- `osgrep-core/*.node` captures compiled native modules
- `osgrep-core/bench/opencode/` handles the optional nested benchmark repository
- Patterns are syntactically correct and appropriately scoped
osgrep-core/build.rs (1)
1-5: LGTM! Standard napi-rs build script setup. This correctly initializes the N-API build-time integration required for generating native bindings.
osgrep-core/Cargo.toml (1)
1-32: LGTM on the overall manifest structure. The package configuration, dependency choices (napi-rs, tokenizers, hf-hub), and release profile optimizations (LTO, opt-level 3) are appropriate for a native performance core.
osgrep-core/tsconfig.json (1)
1-27: LGTM on the TypeScript configuration. The ESNext/bundler-mode settings align well with modern Node.js native binding development. Strict mode with selective relaxation of unused-variable checks is reasonable for binding code.
osgrep-core/package.json (1)
1-26: LGTM on package manifest structure. The ESM configuration, build scripts, and N-API settings are correctly structured for a native Rust binding package.
osgrep-core/src/lib.rs (2)
122-125: Verify quantization symmetry for negative values. The quantization uses `x * 127.0` clamped to `[-128, 127]`, but dequantization (line 203) divides by `127.0`. This creates slight asymmetry for values near -1.0 (they map to -127, not -128). This is a common trade-off, but ensure it aligns with your precision requirements.
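A quick numeric sketch of the asymmetry under the scheme described above (plain TypeScript modeling the scale-by-127, clamp, truncate, divide-by-127 round trip; truncation toward zero mirrors Rust's `as i8` cast):

```typescript
// Quantize: scale by 127, clamp to [-128, 127], truncate toward zero.
function quantize(x: number): number {
  return Math.trunc(Math.max(-128, Math.min(127, x * 127)));
}

// Dequantize divides by 127, so only [-127, 127] round-trips symmetrically.
function dequantize(q: number): number {
  return q / 127;
}

console.log(quantize(1.0));              // 127
console.log(quantize(-1.0));             // -127 (not -128)
console.log(dequantize(quantize(-1.0))); // -1
console.log(dequantize(-128));           // about -1.0079, outside the nominal [-1, 1] range
```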
1-265: LGTM on overall architecture. The native core design with global singleton models, proper mutex synchronization, and clean N-API bindings is well-structured. The separation of dense embeddings, ColBERT indexing, and reranking into distinct APIs provides a clear interface for the Node.js layer.
osgrep-core/src/colbert_ort.rs (1)
311-343: MaxSim implementation looks correct. The handling of skiplist tokens and the edge case where all doc tokens are skipped (contributing 0 rather than a negative score) is reasonable for ColBERT scoring semantics.
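For context, the MaxSim operation being reviewed can be sketched as: for each query token, take the maximum similarity over all document tokens, then sum. A toy version, assuming L2-normalized token rows and omitting the skiplist handling (the floor at 0 mirrors the "contribute 0 rather than a negative score" behavior noted above):

```typescript
// query: [numQueryTokens][dim], doc: [numDocTokens][dim], rows assumed L2-normalized
function maxSim(query: number[][], doc: number[][]): number {
  let total = 0;
  for (const q of query) {
    let best = 0; // empty or all-skipped docs contribute 0
    for (const d of doc) {
      let dot = 0;
      for (let k = 0; k < q.length; k++) dot += q[k] * d[k];
      if (dot > best) best = dot;
    }
    total += best;
  }
  return total;
}

const q = [[1, 0], [0, 1]];
const d = [[1, 0], [0.6, 0.8]];
console.log(maxSim(q, d)); // 1.8 (best matches: 1.0 and 0.8)
```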
osgrep-core/test.mjs (1)
7-42: Smoke test structure is appropriate. The test exercises the core API surface (initialization, dense embedding, batch embedding) and provides useful timing information. The async/catch pattern is correct for error handling.
osgrep-core/bench/rerank.ts (1)
1-5: Imports and setup look correct. The benchmark imports utilities from `_util` and native bindings from the index. The `@ts-ignore` is appropriate for build-time generated code.

osgrep-core/bench/rerank-ort.ts (2)
56-74: Benchmark loop implementation is clean. The loop correctly times each query, accumulates results, and provides useful intermediate output. The use of `Array.from` ensures the typed arrays are converted to regular arrays for storage.
76-84: Summary metrics are well-designed. Reporting total time, average time per query, and throughput provides useful performance characterization. The calculations are correct.
osgrep-core/bench/dense.ts (2)
42-53: Batched processing with progress indicator is well-implemented. The batch processing loop correctly handles the last partial batch via `slice`, and the progress indicator using `\r` provides a good user experience. The throughput calculation during progress is a nice touch.
55-66: Results summary provides useful metrics. The final summary includes all relevant metrics (file count, chunk count, time, throughput, checksum). The checksum enables reproducibility validation across runs.
osgrep-core/bench/colbert-parity-rust.ts (1)
82-95: Verify `colbertPreindexDocs()` resets state per call (parity correctness depends on it). This script assumes a fresh "local packed store per query", but if the native layer appends docs, indices/ranking parity will be invalid.

osgrep-core/src/dense_ort.rs (4)
6-10: No-op logging function is consistent with codebase pattern. This matches the same pattern used in `colbert_ort.rs`. Consider extracting to a shared module if more files adopt this pattern.
159-179: Mean pooling implementation is correct. The mean pooling logic properly:

- Accumulates embeddings weighted by the attention mask
- Uses `max(1e-9)` to avoid division by zero
- Iterates correctly over batch, sequence, and hidden dimensions

The indexing `i * max_len * self.hidden_size + j * self.hidden_size + k` correctly accesses the `[batch, seq, hidden]` layout.
max(1e-12)to prevent division by zero, which is appropriate for numerical stability.
155-157: No verification needed. Thetry_extract_tensor::<f32>()API correctly returns a tuple(&Shape, &[T])where index1is the data slice. This usage is consistent acrossdense_ort.rsandcolbert_ort.rsand matches the documented API for ort2.0.0-rc.10.osgrep-core/bench/_util.ts (6)
21-43:walkFilesimplementation is clean and handles common ignored directories.Good use of a Set for O(1) lookup and recursive walking pattern.
48-62: Comprehensive code file extension list.The extension set covers most common languages. Consider adding
.dockerfile,.makefile, or Makefile (no extension) if needed for broader coverage, but current list is reasonable.
84-95:readFileSafeappropriately guards against large files and handles errors.The 1MB limit is sensible for code files. The catch block with empty object handles TypeScript's unknown error type.
100-102:pickCandidatesis a simple slice - consider documenting that order is preserved.The function name suggests selection but it's just taking the first k elements. This is fine for benchmarks but the behavior should be clear to callers.
198-218:chunkOverlapsSpanpath matching logic handles CodeAtlas format variations well.The three-way match (exact, chunk ends with span, span ends with chunk) handles common path format inconsistencies. The line overlap check is correct for inclusive ranges.
175-188: Off-by-one check inendLinecalculation.Line 184 uses
Math.min(i + linesPerChunk, lines.length)which is correct for the last chunk. However, for non-last chunks,i + linesPerChunkwould exceed the actual lines in the chunk by 1 sinceslice(i, i + linesPerChunk)is exclusive on the end.For example, with
linesPerChunk=80and chunk starting at line 0:
chunkLines = lines.slice(0, 80)→ lines 0-79 (80 lines)endLine = Math.min(0 + 80, lines.length)→ 80 (should be 79 for inclusive)if (text.trim().length > 0) { allChunks.push({ id: chunkId++, file: relativePath, startLine: i + 1, // 1-indexed - endLine: Math.min(i + linesPerChunk, lines.length), // 1-indexed, inclusive + endLine: Math.min(i + chunkLines.length, lines.length), // 1-indexed, inclusive text, }); }Likely an incorrect or invalid review comment.
osgrep-core/index.d.ts (1)
41-60: Type declarations align with native binding usage. The function signatures match the usage in `src/lib/native/index.ts`, and the flexible union types (`Int8Array | number[]`, `Uint32Array | number[]`) accommodate both typed arrays and plain arrays from the native binding.

src/lib/native/index.ts (4)
19-31: `loadNative` correctly caches the native module. The lazy loading pattern with caching avoids redundant dynamic imports. The error message is helpful for debugging build issues.
110-121: `embedColbert` properly converts native result to typed arrays. Clean extraction pattern. The interface `ColbertPacked` correctly documents the quantized INT8 format.
139-182: `embedBatch` correctly assembles hybrid embeddings per document. The token cursor tracking for `tokenIds` is well-handled. One observation: `new Uint32Array(tokenSlice)` on line 175 creates a copy, which is appropriate since `subarray` shares the buffer.
204-219: `RerankInput` interface is well-documented. Clear JSDoc comments for each field help consumers understand the expected data format.
```ts
export function cloneRepoIfMissing(repoUrl: string, dir: string): void {
  if (existsSync(dir)) {
    console.log(`Repository already exists at ${dir}`);
    return;
  }
  console.log(`Cloning ${repoUrl} to ${dir}...`);
  execSync(`git clone --depth 1 ${repoUrl} ${dir}`, { stdio: "inherit" });
}
```
Command injection vulnerability in cloneRepoIfMissing.
The `repoUrl` and `dir` parameters are directly interpolated into a shell command without sanitization. A malicious repo URL or directory name could execute arbitrary commands.
```diff
+import { spawnSync } from "child_process";
+
 export function cloneRepoIfMissing(repoUrl: string, dir: string): void {
   if (existsSync(dir)) {
     console.log(`Repository already exists at ${dir}`);
     return;
   }
   console.log(`Cloning ${repoUrl} to ${dir}...`);
-  execSync(`git clone --depth 1 ${repoUrl} ${dir}`, { stdio: "inherit" });
+  const result = spawnSync("git", ["clone", "--depth", "1", repoUrl, dir], { stdio: "inherit" });
+  if (result.status !== 0) {
+    throw new Error(`Failed to clone repository: ${result.error?.message ?? 'unknown error'}`);
+  }
 }
```

Using `spawnSync` with an array of arguments avoids shell interpretation.
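To make the shell-interpretation difference concrete, here is a small standalone sketch (the hostile argument string is invented for illustration): an argument passed through `spawnSync`'s array form reaches the child process verbatim, with no shell evaluating its metacharacters.

```typescript
import { spawnSync } from "node:child_process";

// A string full of shell metacharacters, as a hostile repo URL might contain.
// (Invented for illustration; not from the PR.)
const hostileArg = "repo.git; echo INJECTED";

// With the array form, no shell is involved: the argument arrives at the
// child process as literal text, so the ";" separates nothing.
const result = spawnSync("printf", ["%s", hostileArg], { encoding: "utf8" });

console.log(result.stdout);
```

Had the same string been interpolated into an `execSync` command line, the shell would have split it at the `;` and run the trailing command.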
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
import { spawnSync } from "child_process";

export function cloneRepoIfMissing(repoUrl: string, dir: string): void {
  if (existsSync(dir)) {
    console.log(`Repository already exists at ${dir}`);
    return;
  }
  console.log(`Cloning ${repoUrl} to ${dir}...`);
  const result = spawnSync("git", ["clone", "--depth", "1", repoUrl, dir], { stdio: "inherit" });
  if (result.status !== 0) {
    throw new Error(`Failed to clone repository: ${result.error?.message ?? 'unknown error'}`);
  }
}
```
🤖 Prompt for AI Agents
In osgrep-core/bench/_util.ts around lines 8 to 15, the function
cloneRepoIfMissing interpolates repoUrl and dir into a shell command, creating a
command-injection risk; replace execSync of a single string with a call to
child_process.spawnSync passing the git executable and an argument array (e.g.,
["clone", "--depth", "1", repoUrl, dir]) and set stdio: "inherit" and shell:
false, and additionally validate/normalize repoUrl and dir (reject or sanitize
control characters and disallow shell metacharacters/absolute paths if not
expected) before spawning to ensure inputs cannot inject shell commands.
```ts
// @ts-ignore - generated at build time
import {
  initDenseOrt,
  denseEmbedOrt,
  initColbertOrt,
  colbertPreindexDocs,
  colbertRerankPreindexed,
} from "../index.js";
```
Blocker: imports don’t match osgrep-core/index.js exports (as in this PR).
This benchmark will fail on startup unless osgrep-core/index.js re-exports these names or the script is updated to the new API.
🤖 Prompt for AI Agents
In osgrep-core/bench/codeatlas-system.ts around lines 19 to 27, the named
imports (initDenseOrt, denseEmbedOrt, initColbertOrt, colbertPreindexDocs,
colbertRerankPreindexed) do not match the current exports from
osgrep-core/index.js; update this file so the imports match the actual exported
identifiers (or alternatively add re-exports for these names in
osgrep-core/index.js). Change the import list to the new API names exactly as
exported, or add corresponding export statements in index.js forwarding the real
implementations, and remove any stale or incorrect imports so the benchmark can
start without import errors.
```ts
function topKOverlap(a: number[], b: number[], k: number): number {
  const sa = new Set(a.slice(0, k));
  const sb = new Set(b.slice(0, k));
  let inter = 0;
  for (const x of sa) if (sb.has(x)) inter++;
  return inter / k;
}
```
**Guard `topKOverlap` against `k=0` (and short rankings).**

`return inter / k` will produce `NaN` if `--topk 0` is passed (0/0), and the score can be skewed when either ranking has fewer than `k` items.
```diff
 function topKOverlap(a: number[], b: number[], k: number): number {
+  if (k <= 0) return 0;
   const sa = new Set(a.slice(0, k));
   const sb = new Set(b.slice(0, k));
   let inter = 0;
   for (const x of sa) if (sb.has(x)) inter++;
   return inter / k;
 }
```

Also applies to: 62-63, 99-100
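For illustration, a fuller guarded variant might look like the following sketch; the name `topKOverlapSafe` and the normalization by the effective k are assumptions that go slightly beyond the one-line guard suggested above:

```typescript
// Hypothetical guarded helper (the name and the effective-k normalization
// are illustrative, not part of the repository):
function topKOverlapSafe(a: number[], b: number[], k: number): number {
  if (k <= 0) return 0; // avoid 0/0 => NaN for --topk 0
  // Normalize by the number of items actually comparable, so a ranking
  // shorter than k cannot deflate the score or produce NaN.
  const kEff = Math.min(k, a.length, b.length);
  if (kEff === 0) return 0;
  const sa = new Set(a.slice(0, kEff));
  const sb = new Set(b.slice(0, kEff));
  let inter = 0;
  for (const x of sa) if (sb.has(x)) inter++;
  return inter / kEff;
}

console.log(topKOverlapSafe([1, 2, 3], [3, 2, 9], 3)); // 2 of 3 overlap
console.log(topKOverlapSafe([1, 2], [1, 2], 0)); // 0, not NaN
```

Whether short rankings should normalize by `kEff` or by `k` is a design choice; the version above treats "fewer than k items" as "compare what exists" rather than penalizing the overlap.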
```ts
// @ts-ignore - generated at build time
import {
  initDenseOrt,
  denseEmbedOrt,
  initColbertOrt,
  colbertPreindexDocs,
  colbertRerankPreindexed,
} from "../index.js";
```
Blocker: ../index.js doesn’t export these *Ort functions (as in this PR).
Either update this tool to the new API names or add compatibility exports in osgrep-core/index.js.
🤖 Prompt for AI Agents
In osgrep-core/bench/colbert-parity.ts around lines 18 to 26, this file imports
initDenseOrt, denseEmbedOrt, initColbertOrt, colbertPreindexDocs, and
colbertRerankPreindexed from "../index.js", but ../index.js no longer exports
those names. Fix by either (A) updating these imports to the new exported API
names used in osgrep-core/index.js (replace each old name with the corresponding
new function name throughout this file), or (B) add compatibility exports in
osgrep-core/index.js that re-export the new implementations under the legacy
names (export { newInitDense as initDenseOrt, ... } for each of the five
functions) so existing callers keep working.
```ts
  }
}

main().catch(console.error);
```
Return non-zero exit code on failure.
Same as other bench files.
```diff
-main().catch(console.error);
+main().catch((e) => {
+  console.error(e);
+  process.exit(1);
+});
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
main().catch((e) => {
  console.error(e);
  process.exit(1);
});
```
🤖 Prompt for AI Agents
In osgrep-core/bench/rerank-preindexed.ts around line 94, the catch handler
currently only logs the error; update it to log the error and then exit the
process with a non-zero status so failures propagate to CI. Replace the simple
console.error catch with a handler that prints the error and calls
process.exit(1) (matching the pattern used in the other bench files).
```toml
hf-hub = "0.4"

# ONNX Runtime (the only ML backend we need)
ort = { version = "2.0.0-rc.10", default-features = true, features = ["download-binaries"] }
```
🧩 Analysis chain
🌐 Web query:
ort crate 2.0.0 stable version crates.io rust
💡 Result:
No — there is no stable ort 2.0.0 on crates.io. The crate has 2.0.0 release candidates (e.g. 2.0.0-rc.10) while the latest stable series remains in the 1.x line (latest published stable shown as 1.16.3). [1][2][3]
Sources:
- lib.rs crate page (lists 2.0.0-rc.*). [1]
- docs.rs listings for ort 2.0.0-rc.*. [2]
- ort crate latest/stable docs (1.16.3). [3]
Consider pinning to a stable version instead of release candidate.
Using ort = "2.0.0-rc.10" pins to a release candidate version. The latest stable release is 1.16.3; no stable 2.0.0 has been published. Release candidates can introduce breaking changes before the stable release. Decide if the RC is acceptable for your stability requirements, or consider using 1.16.3 if the 2.0 features are not essential.
🤖 Prompt for AI Agents
In osgrep-core/Cargo.toml around line 25, the dependency is pinned to a
release-candidate ort = "2.0.0-rc.10" which is not a stable release; either
change the version to the latest stable ort = "1.16.3" (or another chosen stable
release) to ensure stability, or if 2.0.0-rc.10 is required, add a comment
explaining the rationale and any planned upgrade path and run cargo update /
cargo test to verify compatibility.
```js
// Clean API exports
export const {
  // Initialization
  initModels,
  isInitialized,

  // Dense embeddings (384-dim)
  embedDense,

  // ColBERT embeddings (48-dim per token, packed)
  embedColbertPacked,

  // ColBERT reranking
  encodeQueryColbert,
  rerankColbert,

  // Convenience: both embeddings in one call
  embedBatch,
} = nativeBinding;
```
Blocker: export surface doesn’t match the bench scripts added in this PR.
Given the bench files import initDenseOrt/denseEmbedOrt/initColbertOrt/colbertPreindexDocs/colbertRerankPreindexed/..., this module should either:
- re-export those legacy names as aliases to the new API, or
- the bench scripts should be updated to only use `initModels`/`embedDense`/`encodeQueryColbert`/`rerankColbert`/`embedBatch`.
Right now, the benches will crash on import.
```diff
 export const {
   // Initialization
   initModels,
   isInitialized,
@@
   embedBatch,
 } = nativeBinding;
+
+// ---------------------------------------------------------------------------
+// Back-compat aliases for bench scripts (remove once benches are updated)
+// ---------------------------------------------------------------------------
+export const initDenseOrt = nativeBinding.initDenseOrt ?? nativeBinding.initModels;
+export const denseEmbedOrt = nativeBinding.denseEmbedOrt ?? nativeBinding.embedDense;
+export const initColbertOrt = nativeBinding.initColbertOrt ?? nativeBinding.initModels;
+export const colbertPreindexDocs = nativeBinding.colbertPreindexDocs ?? nativeBinding.embedColbertPacked;
+export const colbertRerankPreindexed = nativeBinding.colbertRerankPreindexed ?? nativeBinding.rerankColbert;
+export const colbertRerankOrt = nativeBinding.colbertRerankOrt ?? nativeBinding.rerankColbert;
+export const denseEmbedChecksumOrt = nativeBinding.denseEmbedChecksumOrt;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```js
// Clean API exports
export const {
  // Initialization
  initModels,
  isInitialized,

  // Dense embeddings (384-dim)
  embedDense,

  // ColBERT embeddings (48-dim per token, packed)
  embedColbertPacked,

  // ColBERT reranking
  encodeQueryColbert,
  rerankColbert,

  // Convenience: both embeddings in one call
  embedBatch,
} = nativeBinding;

// ---------------------------------------------------------------------------
// Back-compat aliases for bench scripts (remove once benches are updated)
// ---------------------------------------------------------------------------
export const initDenseOrt = nativeBinding.initDenseOrt ?? nativeBinding.initModels;
export const denseEmbedOrt = nativeBinding.denseEmbedOrt ?? nativeBinding.embedDense;
export const initColbertOrt = nativeBinding.initColbertOrt ?? nativeBinding.initModels;
export const colbertPreindexDocs = nativeBinding.colbertPreindexDocs ?? nativeBinding.embedColbertPacked;
export const colbertRerankPreindexed = nativeBinding.colbertRerankPreindexed ?? nativeBinding.rerankColbert;
export const colbertRerankOrt = nativeBinding.colbertRerankOrt ?? nativeBinding.rerankColbert;
export const denseEmbedChecksumOrt = nativeBinding.denseEmbedChecksumOrt;
```
🤖 Prompt for AI Agents
In osgrep-core/index.js around lines 106 to 124, the exported names no longer
match the legacy bench script imports so the benches crash; add re-exports
(aliases) for the legacy names to the new API surface: export initDenseOrt ->
initModels, denseEmbedOrt -> embedDense, initColbertOrt -> initModels (or
another appropriate init if separate), colbertPreindexDocs ->
embedColbertPacked, colbertRerankPreindexed -> rerankColbert, and any other
legacy identifiers used by the benches, so existing bench imports continue to
work without changing the bench files.
```rust
let hidden_size = 48usize;
let query_seq_len = query_embeddings.len() / hidden_size;
```
**Potential zero-length sequence if `query_embeddings` is empty.**

If `query_embeddings` has zero length, `query_seq_len` becomes 0, which may lead to unexpected behavior (including division by zero) downstream. Consider adding input validation.
```diff
 let hidden_size = 48usize;
+if query_embeddings.is_empty() {
+    return Err(Error::from_reason("Query embeddings cannot be empty"));
+}
 let query_seq_len = query_embeddings.len() / hidden_size;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
let hidden_size = 48usize;
if query_embeddings.is_empty() {
    return Err(Error::from_reason("Query embeddings cannot be empty"));
}
let query_seq_len = query_embeddings.len() / hidden_size;
```
🤖 Prompt for AI Agents
In osgrep-core/src/lib.rs around lines 191 to 193, the code calculates
query_seq_len by dividing query_embeddings.len() by a fixed hidden_size (48)
without validating inputs; add validation to ensure query_embeddings is not
empty and its length is divisible by hidden_size, and handle failures by
returning a clear Err (or Result) or panicking with a descriptive message.
Specifically, check if query_embeddings.is_empty() and if query_embeddings.len()
% hidden_size != 0, and return an error (or bail) that identifies the problem
and includes the offending length and expected hidden_size so downstream code
won’t see an invalid sequence length.
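As a cross-check of the validation logic described above, here is a TypeScript sketch of the same checks (the helper name and error wording are invented; the real fix belongs in the Rust binding):

```typescript
const HIDDEN_SIZE = 48; // per-token ColBERT dimension used by the binding

// Hypothetical TS analog of the suggested Rust validation: reject empty
// input and lengths not divisible by the hidden size, reporting the
// offending length so bad callers are easy to diagnose.
function querySeqLen(queryEmbeddings: ArrayLike<number>): number {
  if (queryEmbeddings.length === 0) {
    throw new Error("Query embeddings cannot be empty");
  }
  if (queryEmbeddings.length % HIDDEN_SIZE !== 0) {
    throw new Error(
      `Query embedding length ${queryEmbeddings.length} is not a multiple of ${HIDDEN_SIZE}`
    );
  }
  return queryEmbeddings.length / HIDDEN_SIZE;
}

console.log(querySeqLen(new Float32Array(96))); // 2 tokens
```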
```ts
export async function rerankColbert(input: RerankInput): Promise<RerankResult> {
  await initNative();
  const n = await loadNative();

  const q = Float64Array.from(input.queryEmbedding as any);

  const docs =
    input.docEmbeddings instanceof Int8Array
      ? input.docEmbeddings
      : new Int8Array(input.docEmbeddings as any);

  const tokenIds =
    input.docTokenIds instanceof Uint32Array
      ? input.docTokenIds
      : Uint32Array.from(input.docTokenIds as any);

  const result = n.rerankColbert(
    q,
    docs,
    tokenIds,
    input.docLengths,
    input.docOffsets,
    input.candidateIndices,
    input.topK
  );

  return {
    indices: Array.from(result.indices),
    scores: Array.from(result.scores),
  };
}
```
Type conversion in rerankColbert uses as any casts.
Lines 235, 240, and 245 use as any to convert between Float32Array and Float64Array. This works but loses type safety. The native binding expects Float64Array for query embeddings but the RerankInput interface declares Float32Array.
Consider aligning the interface type with what the native binding expects:
```diff
 export interface RerankInput {
-  /** Query ColBERT embedding from encodeQueryColbert */
-  queryEmbedding: Float32Array;
+  /** Query ColBERT embedding from encodeQueryColbert (will be converted to Float64Array) */
+  queryEmbedding: Float32Array | Float64Array;
```

Or document the conversion explicitly.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/lib/native/index.ts around lines 231 to 261, the function rerankColbert
uses unsafe "as any" casts at lines ~235, ~240 and ~245 to coerce embeddings and
token arrays to the native types; update the code to remove "as any" by either
(a) changing the RerankInput interface so queryEmbedding is declared as
Float64Array and docEmbeddings/docTokenIds have the exact typed arrays the
native binding expects, or (b) keep the interface but perform an explicit,
type-safe conversion/validation (e.g., check instance types and create new
Float64Array/Int8Array/Uint32Array copies where necessary) before calling
n.rerankColbert, and throw a clear error if conversion is not possible; ensure
no "as any" casts remain and that runtime conversions preserve numeric values
(Float32 -> Float64) and validate lengths/offsets before invoking the native
binding.
Actionable comments posted: 0
🧹 Nitpick comments (2)
osgrep-core/package.json (2)
**12-23: N-API configuration looks good for initial release.**

The platform triple setup with `defaults: false` and explicit targets is a pragmatic approach for V1 that avoids unnecessary cross-compilation overhead while covering the most common deployment targets (macOS, Linux glibc, Windows).

Consider adding `aarch64-unknown-linux-gnu` (ARM64 Linux) to the platform triples if this is deployed on cloud platforms supporting ARM instances, as it's increasingly common for cost/efficiency reasons.

**1-30: Add package metadata fields (optional for V1 but recommended for publishing).**

The package is missing standard metadata fields that improve discoverability and clarity for maintainers and users:

- `description`: Brief summary of osgrep-core functionality
- `license`: Important for OSS projects to clarify terms
- `repository`: Helpful for users and tools that want to navigate to the source

These are not blockers for V1 but are best-practice for published packages.

```diff
 {
   "name": "osgrep-core",
   "version": "0.1.0",
+  "description": "Native Rust bindings for osgrep core embedding and reranking functionality",
   "type": "module",
   "main": "index.js",
   "types": "index.d.ts",
+  "license": "MIT",
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/Ryandonofrio3/osgrep.git"
+  },
   "files": [
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (1)
`osgrep-core/package.json` (1 hunks)
🔇 Additional comments (1)
osgrep-core/package.json (1)
**5-6: Entry points exist; clarify intent for native binary declaration.**

The declared entry points `index.js` and `index.d.ts` are present in the osgrep-core directory. However, the `"files"` field includes `"*.node"` (line 10) without any `.node` files currently present. If native bindings are expected to be built at package installation time, confirm this is intentional; otherwise, remove the `"*.node"` pattern from the declared files list.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/lib/workers/orchestrator.ts (1)
155-183: Be explicit about “stored path” vs “read/chunk path” to avoid mismatched metadata.
`absolutePath` is used for reading/validation, but `input.path` is used for chunking (`chunkFile`) and persisted into `PreparedChunk.path`. If `input.path` can differ from the actual file identity (symlinks, callers passing mismatched values), chunks/anchors and stored records can drift.

If the intent is "store relative path, but chunk using the real path", consider passing `absolutePath` into `chunkFile()`/`buildAnchorChunk()` while still storing `input.path` in `PreparedChunk.path` (or rename fields to make this explicit).
♻️ Duplicate comments (1)
src/lib/workers/orchestrator.ts (1)
**244-327: Fix `rerank()` packing invariants before calling native: tokenIds length must equal `sum(docLengths)`, and query flattening must respect `colbertDim`.**

Two native-boundary hazards here:

- `docLengths` is derived from embeddings (`seqLen`), but `tokenIds` can be `[]` (missing) or shorter than `seqLen`. That makes `packedTokenIds.length < sum(docLengths)`, which is a classic out-of-bounds read risk in native code.
- Query flattening currently uses each `row.length`; docs use `input.colbertDim`. If rows are ragged or not exactly `colbertDim`, you can send a query vector whose shape doesn't match the doc shape.

Suggested patch:

```diff
-  const queryEmbedding: number[] = [];
-  for (const row of input.query) {
-    for (let i = 0; i < row.length; i++) {
-      queryEmbedding.push(row[i] ?? 0);
-    }
-  }
+  const queryEmbedding: number[] = [];
+  for (const row of input.query) {
+    for (let d = 0; d < input.colbertDim; d++) {
+      queryEmbedding.push(row?.[d] ?? 0);
+    }
+  }
   for (let i = 0; i < input.docs.length; i++) {
     const doc = input.docs[i];
     const col = doc.colbert;
@@
     const seqLen = Math.floor(colbert.length / input.colbertDim);
     const used = colbert.subarray(0, seqLen * input.colbertDim);
-    const tokenIdsRaw = doc.token_ids ?? [];
-    const tokenIds = Uint32Array.from(
-      tokenIdsRaw.slice(0, seqLen).map((v) => (Number.isFinite(v) ? v : 0)),
-    );
+    const tokenIdsRaw = doc.token_ids ?? [];
+    const tokenIds = new Uint32Array(seqLen);
+    for (let t = 0; t < seqLen; t++) {
+      const v = tokenIdsRaw[t];
+      tokenIds[t] = Number.isFinite(v) ? (v as number) : 0;
+    }
```

Also: your prior review note about offset terminology still applies (these are element offsets, which happen to equal byte offsets for `Int8Array`).
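A lightweight pre-flight check on the TS side could catch such packing bugs before they reach native code. This sketch is illustrative (the helper name and signature are invented); it asserts the core relationships the comment describes:

```typescript
// Hypothetical invariant check for the packed rerank payload: every
// relationship the native side implicitly relies on is asserted here.
function assertPackingInvariants(
  packedTokenIds: Uint32Array,
  packedEmbeddings: Int8Array,
  docLengths: number[],
  colbertDim: number,
): void {
  const totalTokens = docLengths.reduce((a, b) => a + b, 0);
  if (packedTokenIds.length !== totalTokens) {
    throw new Error(
      `tokenIds length ${packedTokenIds.length} != sum(docLengths) ${totalTokens}`
    );
  }
  if (packedEmbeddings.length !== totalTokens * colbertDim) {
    throw new Error(
      `embeddings length ${packedEmbeddings.length} != ${totalTokens} * ${colbertDim}`
    );
  }
}

// A consistent payload passes silently: 2 docs of 3 and 2 tokens, dim 4.
assertPackingInvariants(new Uint32Array(5), new Int8Array(20), [3, 2], 4);
console.log("invariants hold");
```

Failing fast with a descriptive error in TypeScript is cheap; an out-of-bounds read inside the N-API layer is not.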
🧹 Nitpick comments (2)
.gitignore (2)
**2-3: Consider removing redundant node_modules pattern.**

The pattern `node_modules/` (line 2) already matches `node_modules` directories at any depth in the git tree. The `**/node_modules/` (line 3) pattern is redundant.

```diff
 dist/
 node_modules/
-**/node_modules/
 *.tsbuildinfo
```

**48-49: Consider removing redundant target directory pattern.**

The pattern `**/target/` (line 49) already matches the `osgrep-core/target/` case on line 48. The explicit path is redundant.

```diff
 # Native core (Rust / N-API) build outputs
-osgrep-core/target/
 **/target/
 osgrep-core/*.node
 osgrep-core/bench/
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (2)
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
- `public/bench.png` is excluded by `!**/*.png`
📒 Files selected for processing (16)
- `.gitignore` (1 hunks)
- `LICENSE` (0 hunks)
- `NOTICE` (0 hunks)
- `README.md` (3 hunks)
- `experiments/mrr-sweep.ts` (0 hunks)
- `experiments/quick_check.ts` (0 hunks)
- `experiments/ranking-test.ts` (0 hunks)
- `experiments/verify-fix.ts` (0 hunks)
- `package.json` (2 hunks)
- `plugins/osgrep/skills/osgrep/SKILL.md` (1 hunks)
- `scripts/compare-engines.ts` (0 hunks)
- `scripts/index-bench.sh` (0 hunks)
- `scripts/sync-eval-cases.ts` (0 hunks)
- `scripts/verify_skeleton.ts` (0 hunks)
- `src/lib/workers/orchestrator.ts` (8 hunks)
- `test_skeleton.py` (0 hunks)
💤 Files with no reviewable changes (11)
- scripts/index-bench.sh
- LICENSE
- experiments/quick_check.ts
- experiments/ranking-test.ts
- NOTICE
- scripts/verify_skeleton.ts
- scripts/compare-engines.ts
- experiments/mrr-sweep.ts
- scripts/sync-eval-cases.ts
- test_skeleton.py
- experiments/verify-fix.ts
🧰 Additional context used
🧬 Code graph analysis (1)
src/lib/workers/orchestrator.ts (4)

- src/lib/index/chunker.ts (3): `TreeSitterChunker` (211-761), `ChunkWithContext` (60-65), `chunk` (268-313)
- src/lib/native/index.ts (4): `initNative` (36-52), `embedBatch` (139-182), `encodeQueryColbert` (192-198), `rerankColbert` (231-261)
- src/lib/store/types.ts (2): `PreparedChunk` (7-29), `VectorRecord` (31-35)
- src/config.ts (1): `CONFIG` (18-26)
🔇 Additional comments (5)
.gitignore (1)
**47-51: Rust backend additions are correct and well-organized.**

The new entries for osgrep-core build artifacts (`osgrep-core/*.node`, `osgrep-core/bench/`, `**/target/`) align perfectly with the introduction of a native Rust/N-API binding layer. The ignore rules will keep compiled bindings, Rust build outputs, and benchmark directories out of version control.

package.json (1)

**28-28: Verify bun availability in CI and development setup.**

The `prepublishOnly` script now uses `bun run build` instead of `pnpm`. Ensure bun is available in your CI pipeline and documented as a development prerequisite. If developers are expected to use bun, verify the development documentation (README, CONTRIBUTING guides) clearly states this requirement.

README.md (1)

**259-261: Ensure bun is documented as a development requirement.**

The development commands have switched to bun. Verify that a CONTRIBUTING guide or setup documentation clearly states bun as a prerequisite (e.g., "Install bun from https://bun.sh/docs/installation"). This complements the change to use bun in the CI/build pipeline.

plugins/osgrep/skills/osgrep/SKILL.md (1)

**49-56: No changes needed—documentation is correct.**

The SKILL.md examples use `-m 20` (lowercase) and `--compact`, both of which are valid osgrep flags. There is no `-M` uppercase variant; `-m` (lowercase) is the correct flag for maximum results. The documentation matches the actual osgrep CLI.

src/lib/workers/orchestrator.ts (1)

**64-74: Init gating + singleton wrapper pattern looks solid.**

`ensureReady()` correctly coalesces init, and the thin wrappers keep call sites simple.

Also applies to: 334-359
```diff
   "files": [
     "dist",
     "plugins",
-    "README.md",
-    "LICENSE",
-    "NOTICE"
+    "README.md"
   ],
```
Add LICENSE file to the published distribution.
The files array should include the LICENSE file to comply with Apache-2.0 license distribution requirements. While "license": "Apache-2.0" is declared in the manifest, the actual LICENSE file must be included in npm packages for proper legal attribution.
Apply this diff to add the LICENSE file:
```diff
   "files": [
     "dist",
     "plugins",
-    "README.md"
+    "README.md",
+    "LICENSE"
   ],
```

Consider also adding the NOTICE file if it contains important attribution information.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```json
  "files": [
    "dist",
    "plugins",
    "README.md",
    "LICENSE"
  ],
```
🤖 Prompt for AI Agents
In package.json around lines 36 to 40, the "files" array omits the LICENSE file;
update the array to include "LICENSE" (and optionally "NOTICE" if present) so
the LICENSE is packaged and published; modify the files entry to add "LICENSE"
(and "NOTICE" if applicable) ensuring the repository root contains the LICENSE
file before publishing.
```json
  "dependencies": {
    "@clack/prompts": "^0.11.0",
    "@huggingface/transformers": "^3.8.0",
    "osgrep-core": "file:./osgrep-core",
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Check if osgrep-core directory exists locally
echo "=== Checking osgrep-core directory ==="
if [ -d "osgrep-core" ]; then
  echo "✓ osgrep-core directory found locally"
  if [ -f "osgrep-core/package.json" ]; then
    echo "✓ osgrep-core has package.json"
    head -20 osgrep-core/package.json
  fi
else
  echo "✗ osgrep-core directory NOT found locally"
fi

# Check package.json for workspaces configuration
echo -e "\n=== Checking workspace configuration ==="
if [ -f "package.json" ]; then
  rg '"workspaces"' package.json -A 10 || echo "No workspaces config found"
fi

# Check for postinstall/postbuild scripts
echo -e "\n=== Checking scripts in package.json ==="
rg '"(postinstall|postbuild|prepare|build)"' package.json -A 1 || echo "No postinstall/postbuild found"

# Check files array in package.json
echo -e "\n=== Checking files array ==="
rg '"files"' package.json -A 15 || echo "No files array found"

# Check main package.json for osgrep-core reference
echo -e "\n=== Checking osgrep-core dependency ==="
rg '"osgrep-core"' package.json -B 2 -A 2
```

Repository: Ryandonofrio3/osgrep
Length of output: 1518
🏁 Script executed:
```shell
#!/bin/bash
# Check npm registry for osgrep-core package
echo "=== Checking if osgrep-core is published to npm ==="
npm view osgrep-core 2>&1 | head -20

# Check root package.json for complete build/install workflow
echo -e "\n=== Checking root package.json scripts ==="
rg '"scripts"' package.json -A 20

# Check if there's a build workflow that handles both packages
echo -e "\n=== Checking for build configuration files ==="
ls -la | grep -E "tsconfig|webpack|vite|esbuild|build"
find . -maxdepth 2 -name "*.json" -type f | head -20
```

Repository: Ryandonofrio3/osgrep
Length of output: 1341
Fix osgrep-core distribution before publishing to npm—this will cause installation failures.
The dependency "osgrep-core": "file:./osgrep-core" references a local path and cannot be distributed via npm. Verification confirms:
- osgrep-core is not published to npm (404 not found)
- The `files` array does not include the osgrep-core directory
- No workspace configuration exists
- The `prepublishOnly` script only runs TypeScript compilation and does not build osgrep-core
When external users install from npm, the package will fail because the relative file: reference will point to a non-existent directory.
Choose one of these approaches:
- **Use npm workspaces** (recommended for monorepos): Add `"workspaces": ["osgrep-core"]` to root `package.json`
- **Publish osgrep-core separately**: Publish osgrep-core to npm and reference by version (e.g., `"^0.1.0"`)
- **Bundle osgrep-core**: Add `osgrep-core` to the `files` array, build native bindings in `prepublishOnly`, and include prebuilt `.node` files
- **Use postinstall**: Add a postinstall script that builds or downloads osgrep-core
🤖 Prompt for AI Agents
package.json around line 44: the dependency "osgrep-core": "file:./osgrep-core"
references a local path which will break installs from npm; fix by either (A)
enable npm workspaces for the monorepo by adding a "workspaces": ["osgrep-core"]
entry to the root package.json and remove the file: reference, or (B) publish
osgrep-core to npm and change the dependency to its published version (e.g.,
"^0.1.0"), or (C) bundle osgrep-core into the package by adding it to the
"files" array and updating prepublishOnly to build native artifacts (.node) into
the published package, or (D) add a postinstall script that builds or downloads
osgrep-core on install; pick one approach and update package.json scripts/fields
accordingly and ensure the chosen flow is covered by CI before publishing.
```diff
 Natural-language search that works like `grep`. Fast, local, and built for coding agents.

 - **Semantic:** Finds concepts ("where do transactions get created?"), not just strings.
 - **Call Graph Tracing:** Map dependencies with `trace` to see who calls what.
 - **Role Detection:** Distinguishes `ORCHESTRATION` (high-level logic) from `DEFINITION` (types/classes).
 - **Local & Private:** 100% local embeddings via `onnxruntime-node`.
 - **Auto-Isolated:** Each repository gets its own index automatically.
-- **Agent-Ready:** Native output with symbols, roles, and call graphs.
+- **Agent-Ready:** Native output with symbols and roles.
```
Update outdated embeddings technology reference.
Line 24 still references onnxruntime-node for embeddings, but this dependency has been removed from package.json (replaced with Rust-native bindings via osgrep-core). The README should reflect the new architecture.
Apply this diff to update the embeddings description:
```diff
- **Local & Private:** 100% local embeddings via `onnxruntime-node`.
+ **Local & Private:** 100% local embeddings via Rust-native bindings.
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```markdown
Natural-language search that works like `grep`. Fast, local, and built for coding agents.

- **Semantic:** Finds concepts ("where do transactions get created?"), not just strings.
- **Role Detection:** Distinguishes `ORCHESTRATION` (high-level logic) from `DEFINITION` (types/classes).
- **Local & Private:** 100% local embeddings via Rust-native bindings.
- **Auto-Isolated:** Each repository gets its own index automatically.
- **Agent-Ready:** Native output with symbols and roles.
```
🤖 Prompt for AI Agents
In README.md around lines 20 to 26, update the embeddings/technology reference:
replace the outdated mention of `onnxruntime-node` with the current Rust-native
bindings via `osgrep-core`. Edit the bullet "Local & Private" to state that
embeddings are produced locally using the Rust-native `osgrep-core` bindings (or
similar phrasing), and ensure wording reflects that `onnxruntime-node` is no
longer used and the package.json now depends on `osgrep-core`.
Diff excerpt from `src/lib/workers/orchestrator.ts` (old and new lines interleaved as rendered):

```ts
const colbertFlat = await encodeQueryColbert(text);
const dim = CONFIG.COLBERT_DIM;
const seqLen = colbertFlat.length / dim;

// Reshape to matrix
const matrix: number[][] = [];

for (let s = 0; s < seq; s++) {
  let sumSq = 0;
  const offset = s * dim;
  for (let d = 0; d < dim; d++) {
    const val = data[offset + d];
    sumSq += val * val;
  }
  const norm = Math.sqrt(sumSq);

for (let s = 0; s < seqLen; s++) {
  const row: number[] = [];
  if (norm > 1e-9) {
    for (let d = 0; d < dim; d++) {
      row.push(data[offset + d] / norm);
    }
  } else {
    for (let d = 0; d < dim; d++) {
      row.push(data[offset + d]);
    }
  }
  matrix.push(row);
}

// Compute pooled embedding (mean of tokens)
const pooled = new Float32Array(dim);
for (const row of matrix) {
  for (let d = 0; d < dim; d++) {
    pooled[d] += row[d];
  }
}
// Normalize pooled
let sumSq = 0;
for (let d = 0; d < dim; d++) {
  pooled[d] /= matrix.length || 1;
  sumSq += pooled[d] * pooled[d];
}
const norm = Math.sqrt(sumSq);
if (norm > 1e-9) {
  for (let d = 0; d < dim; d++) {
    pooled[d] /= norm;
    row.push(colbertFlat[s * dim + d]);
  }
  matrix.push(row);
}
```
Floor seqLen (and ideally assert divisibility) when reshaping query embeddings.
`const seqLen = colbertFlat.length / dim;` can be non-integer; the loop will then run one extra iteration and index past the end of the typed array.
```diff
- const seqLen = colbertFlat.length / dim;
+ const seqLen = Math.floor(colbertFlat.length / dim);
+ // Optional: if (colbertFlat.length % dim !== 0) throw new Error("Invalid ColBERT query embedding length");
```

🤖 Prompt for AI Agents
+ // Optional: if (colbertFlat.length % dim !== 0) throw new Error("Invalid ColBERT query embedding length");🤖 Prompt for AI Agents
In src/lib/workers/orchestrator.ts around lines 216 to 229, seqLen is computed
as colbertFlat.length / dim which may be non-integer and cause out-of-bounds
access when reshaping; compute seqLen = Math.floor(colbertFlat.length / dim) and
add an assertion or explicit check that colbertFlat.length % dim === 0 (throwing
or logging a clear error if not) so you either fail fast on bad input or handle
the remainder intentionally before running the nested loops.
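The guarded reshape can be exercised in isolation; a minimal sketch (the helper name `reshapeEmbedding` is illustrative, not part of the codebase):

```typescript
// Reshape a flat embedding buffer into rows of `dim` values,
// failing fast when the length is not an exact multiple of `dim`.
function reshapeEmbedding(flat: Float32Array, dim: number): number[][] {
  if (dim <= 0 || flat.length % dim !== 0) {
    throw new Error(`Invalid embedding length ${flat.length} for dim ${dim}`);
  }
  const seqLen = flat.length / dim; // exact after the check above
  const matrix: number[][] = [];
  for (let s = 0; s < seqLen; s++) {
    const row: number[] = [];
    for (let d = 0; d < dim; d++) {
      row.push(flat[s * dim + d]);
    }
    matrix.push(row);
  }
  return matrix;
}
```

Failing fast on a non-divisible length surfaces a corrupted buffer at the boundary instead of silently reading past the end of the typed array.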
Actionable comments posted: 3
🧹 Nitpick comments (6)
.gitignore (3)
2-3: Redundant `node_modules` patterns are ok, but you can simplify to one.
Keeping both `node_modules/` and `**/node_modules/` works; if you want to reduce noise, `**/node_modules/` alone is typically sufficient.
9-10: Deduplicate `.osgrep/` vs `.osgrep` to avoid ambiguity.
You currently ignore both `.osgrep/` (directory) and `.osgrep` (file). If only the directory exists, drop the file pattern; if both can exist, consider a clarifying comment.
Also applies to: 52-52
38-40: `.env*` may unintentionally ignore `.env.example` / `.env.sample`.
If you rely on committed env templates, tighten this (e.g., ignore `.env`, `.env.local`, `.env.*.local` explicitly) and keep `.env.example` tracked.
.github/workflows/release.yml (1)
56-56: Avoid redundant napi-rs CLI installation.
The `@napi-rs/cli` is installed in both `build-native` (line 56) and `publish` (line 92). While idempotent, this adds unnecessary steps and time to the publish job.
Install napi-rs CLI once in a shared setup step or cache it across jobs. Alternatively, if job parallelism is important, this redundancy is acceptable—document the rationale with a comment.
Confirm whether these jobs can share a cached global npm installation or if separate installs are intentional for build isolation.
Also applies to: 92-92
.github/workflows/ci.yml (1)
54-57: Node setup may be redundant after bun setup.
Both bun (line 51-52) and Node (line 54-57) are configured in the ci job. Bun includes Node.js, so the separate Node setup may be unnecessary unless there's a specific version requirement or tooling dependency on npm/pnpm.
Verify whether the explicit Node 20 setup is required. If bun's built-in Node is sufficient for typecheck/test/build scripts, this step can be removed to simplify the workflow.
Confirm whether the osgrep project requires a specific npm/pnpm version or global tools that are not available through bun's Node.
osgrep-core/package.json (1)
39-44: Platform package version pinning is strict; ensure consistency with release workflow.
The optional dependencies on platform-specific packages (e.g., `@osgrep-core/darwin-arm64@0.1.0`) use exact version constraints. Verify that the release workflow publishes platform packages with matching versions, and confirm this version synchronization is documented or automated.
Confirm that the `.github/workflows/release.yml` build-native and publish jobs ensure all platform-specific packages are published at the same version (0.1.0) as declared in this package.json. If versions drift, npm resolution will fail.
Consider documenting the version constraint strategy (e.g., via a comment or CONTRIBUTING.md) to prevent future version mismatches between the meta package and platform packages.
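One way to catch such drift mechanically (a hypothetical check, not part of the repo's tooling) is to compare the meta package's own version against each exact platform pin:

```typescript
interface PkgJson {
  version: string;
  optionalDependencies?: Record<string, string>;
}

// Return the platform packages whose pinned version does not match
// the meta package's own version (exact pins assumed, e.g. "0.1.0").
function findVersionDrift(pkg: PkgJson): string[] {
  const drifted: string[] = [];
  for (const [name, pin] of Object.entries(pkg.optionalDependencies ?? {})) {
    if (pin !== pkg.version) {
      drifted.push(`${name}: ${pin} != ${pkg.version}`);
    }
  }
  return drifted;
}
```

Run against the parsed `osgrep-core/package.json` in CI, a non-empty result would fail the release before npm resolution can break for users.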
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (9)
- .github/workflows/ci.yml (1 hunks)
- .github/workflows/release.yml (2 hunks)
- .gitignore (1 hunks)
- osgrep-core/package.json (1 hunks)
- src/lib/index/chunker.ts (1 hunks)
- src/lib/index/grammar-loader.ts (1 hunks)
- src/lib/output/formatter.ts (2 hunks)
- src/lib/output/json-formatter.ts (0 hunks)
- src/lib/utils/formatter.ts (1 hunks)
💤 Files with no reviewable changes (1)
- src/lib/output/json-formatter.ts
✅ Files skipped from review due to trivial changes (1)
- src/lib/utils/formatter.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/lib/output/formatter.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-native
🔇 Additional comments (8)
.gitignore (3)
24-29: Double-check ignoring `public/` (often a tracked asset directory).
If `public/` contains shipped/static assets, ignoring it will cause accidental deletions/untracked content for other devs/CI. If it's truly dev-only generated output, all good—just worth confirming.
42-45: Lockfile policy: consider explicitly addressing `bun.lockb`.
You're ignoring npm/pnpm/yarn lockfiles "(using bun)"; make sure the intended bun lockfile behavior is explicit (committed vs ignored).
47-51: Native/Rust output ignores look correct; `**/target/` is broad—confirm scope.
If the repo includes non-Rust tooling that also uses `target/`, this could hide relevant artifacts; otherwise it's a good catch-all for Rust builds.
src/lib/index/chunker.ts (1)
4-4: LGTM! Consistent with the import path refactoring.
The import path update from `../core/languages` to `../store/languages` is consistent with the change in `grammar-loader.ts` and aligns with the repository-wide restructuring.
src/lib/index/grammar-loader.ts (1)
5-5: All imports from the old path have been successfully updated.
The search for remaining references to `../core/languages` returned no results, confirming that the refactoring is complete. The import in this file correctly uses the new path `../store/languages`.
.github/workflows/release.yml (2)
108-123: Strong tag and version verification in place.
The workflow correctly verifies that the tag commit is an ancestor of main (lines 108–114) and that the git tag matches the package.json version (lines 116–123). This prevents publishing mismatched or out-of-branch releases.
60-60: napi-rs CLI command syntax is correct and properly supports cross-compilation.
The `napi build --platform --release --target ${{ matrix.target }}` command uses the documented napi-rs v3.x CLI flags correctly. The `--target` flag is directly passed to cargo and supports all five target triples specified in the workflow matrix (aarch64-apple-darwin, x86_64-apple-darwin, x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, x86_64-pc-windows-msvc).
.github/workflows/ci.yml (1)
65-75: Bun-based workflow replaces pnpm successfully.
All three scripts referenced in the workflow are properly defined in package.json: `typecheck` (tsc --noEmit), `test` (vitest run), and `build` (tsc). The transition from pnpm to bun is clean and functional.
```yaml
run: |
  cd osgrep-core
  for dir in artifacts/bindings-*/; do
    mv "$dir"*.node . 2>/dev/null || true
  done
  rm -rf artifacts
```
Artifact movement script lacks error visibility.
The defensive || true silently ignores errors when moving .node files. If artifacts fail to copy due to unexpected directory structure or permissions, the publish job will still proceed with missing binaries.
Add explicit verification that artifacts were moved:
```diff
 cd osgrep-core
 for dir in artifacts/bindings-*/; do
-  mv "$dir"*.node . 2>/dev/null || true
+  if ! mv "$dir"*.node . 2>/dev/null; then
+    echo "Warning: Failed to move artifacts from $dir"
+  fi
 done
 rm -rf artifacts
-ls -la *.node
+if ! ls *.node 1>/dev/null 2>&1; then
+  echo "Error: No .node files found in osgrep-core after moving artifacts"
+  exit 1
+fi
+ls -la *.node
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
.github/workflows/release.yml around lines 100 to 105: the current loop uses "||
true" which hides failures when moving .node artifacts; replace the silent
ignore with explicit checks: attempt to move files, capture whether any .node
files were moved, emit a clear error message if none were moved (including
listing the artifacts/bindings-*/ directories for diagnosis) and exit non‑zero
to fail the job; ensure permissions errors are surfaced by removing the "||
true" and adding a conditional that fails the step when the moved-file count is
zero.
```json
{
  "name": "osgrep-core",
  "version": "0.1.0",
  "description": "Native Rust core for osgrep (embeddings + reranking via ONNX)",
```
Update package description to reflect actual functionality.
The description mentions "ONNX" but based on the PR objective and summary, this is a native Rust core. The ONNX reference may be outdated or a copy-paste error from a previous iteration.
Update the description to accurately reflect the package's purpose:
```diff
- "description": "Native Rust core for osgrep (embeddings + reranking via ONNX)",
+ "description": "Native Rust core for osgrep (embeddings + reranking)",
```

📝 Committable suggestion
🤖 Prompt for AI Agents
In osgrep-core/package.json around line 4, the package description incorrectly
references "ONNX"; update the "description" field to accurately describe the
package as the native Rust core for osgrep (e.g., mentioning embeddings and
reranking implemented in Rust) by replacing the current string with a concise,
factual description that removes any mention of ONNX.
```json
"files": [
  "index.js",
  "index.d.ts"
],
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check for index.js and index.d.ts files in osgrep-core directory
echo "=== Checking for index.js and index.d.ts in osgrep-core ==="
ls -la osgrep-core/index.{js,d.ts} 2>&1
echo ""
echo "=== Checking osgrep-core/package.json content ==="
# Show lines 14-17 (files whitelist)
echo "Lines 14-17 (files whitelist):"
sed -n '14,17p' osgrep-core/package.json
echo ""
echo "Lines 36-37:"
sed -n '36,37p' osgrep-core/package.json
echo ""
echo "Lines 39-44:"
sed -n '39,44p' osgrep-core/package.json
echo ""
echo "=== Full package.json for context ==="
cat osgrep-core/package.json
```
cat osgrep-core/package.jsonRepository: Ryandonofrio3/osgrep
Length of output: 1937
🌐 Web query:
@napi-rs/cli latest stable version and version history
💡 Result:
Latest stable: @napi-rs/cli v3.4.0 (released 2025-10-23). (napi.rs)
Recent version history (selected releases):
- 3.4.0 — 2025-10-23. (napi.rs)
- 3.2.0 — 2025-09-11. (napi.rs)
- 3.1.5 — 2025-08-16. (napi.rs)
- 3.1.2 / 3.1.1 / 3.1.0 — Aug 2025 (Aug 8 / Aug 8 / Aug 7). (napi.rs)
- 3.0.0 — 2025-07-17 (NAPI-RS v3 milestone; many 3.0.0-alpha builds in June–July 2025). (napi.rs)
Sources / full changelog: NAPI-RS changelog (napi-cli) and the npm package page. (napi.rs)
If you want the complete list of all published versions (including all alphas) or links to individual changelog entries, I can fetch and paste them.
Citations:
- https://napi.rs/changelog/napi-cli?utm_source=openai
Update @napi-rs/cli to stable version 3.4.0.
The @napi-rs/cli dependency is pinned to ^3.0.0-alpha.63, an alpha version from early development. Stable version 3.4.0 (released October 2025) is available and should be used instead. Pinning to an alpha version may introduce unexpected breaking changes and prevent automatic updates to stable releases. Update to version ^3.4.0 or at minimum ^3.0.0.
The files whitelist (lines 14–17) correctly references existing files (index.js and index.d.ts).
🤖 Prompt for AI Agents
In osgrep-core/package.json around lines 14–17 the "files" array is fine, but
the @napi-rs/cli dependency must be upgraded from the pinned alpha to a stable
release: change the package.json entry for @napi-rs/cli to use "^3.4.0" (or at
minimum "^3.0.0"), then run your package manager (npm/yarn/pnpm) to update the
lockfile and reinstall, rebuild any native artifacts, and run tests to verify
nothing breaks.
Actionable comments posted: 2
♻️ Duplicate comments (4)
README.md (1)
20-27: README “Local & Private” still looks out of sync with the Rust-native binding story.
Given the PR introduces `osgrep-core` + N-API, this bullet should likely describe "Rust-native core/bindings" (and only mention onnxruntime if it's intentionally user-relevant).

```bash
#!/bin/bash
# Verify what the JS/TS side actually depends on and whether onnxruntime is still user-visible.
rg -n "onnxruntime|onnxruntime-node|osgrep-core|napi|node-addon-api" -S .
rg -n "\"dependencies\"|\"optionalDependencies\"|\"peerDependencies\"" -S package.json osgrep-core/package.json 2>/dev/null || true
```

src/lib/native/index.ts (2)
27-44: Race condition in initialization remains.
The `initPromise = null` in `.finally()` still creates a window for duplicate initialization attempts. While `n.isInitialized()` provides some protection, concurrent callers during init could still trigger redundant work.

```diff
 initPromise = (async () => {
   const n = await loadNative();
   if (!n.isInitialized()) {
     n.initModels(MODEL_IDS.embed, MODEL_IDS.colbert);
   }
   initialized = true;
-})().finally(() => {
-  initPromise = null;
-});
+})();
 return initPromise;
```
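For contrast, a commonly used variant (a sketch, not the project's code) caches the promise for the life of the process on success and resets it only when init fails, so concurrent callers share one in-flight attempt while failures stay retryable:

```typescript
// Cache the in-flight/completed init promise and reset it only on failure,
// so concurrent callers share one attempt while failures stay retryable.
function makeInitOnce(doInit: () => Promise<void>): () => Promise<void> {
  let initPromise: Promise<void> | null = null;
  return () => {
    if (!initPromise) {
      initPromise = doInit().catch((err) => {
        initPromise = null; // a failed attempt should not poison future calls
        throw err;
      });
    }
    return initPromise;
  };
}
```

Because `initPromise` is assigned synchronously before any `await`, a second caller arriving mid-init reuses the same promise instead of starting duplicate work.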
223-253: Type conversions still use `as any` casts.
The `as any` casts at lines 227, 232, and 237 remain. While functionally correct, they bypass TypeScript's type checking. Consider updating the `RerankInput` interface to explicitly allow the types that are actually accepted, or add runtime type guards.

```diff
 export interface RerankInput {
-  queryEmbedding: Float32Array;
-  docEmbeddings: Int8Array;
-  docTokenIds: Uint32Array;
+  queryEmbedding: Float32Array | Float64Array;
+  docEmbeddings: Int8Array | Buffer | ArrayBufferView;
+  docTokenIds: Uint32Array | number[];
```
216-229: Floor `seqLen` to prevent potential out-of-bounds access.
`seqLen` is computed as `colbertFlat.length / dim`, which may be non-integer if the embedding length is not perfectly divisible. This could cause the loop to access past the array bounds.

```diff
- const seqLen = colbertFlat.length / dim;
+ const seqLen = Math.floor(colbertFlat.length / dim);
```
🧹 Nitpick comments (4)
src/lib/index/chunker.ts (1)
106-136: Avoid duplicating the file label (and handle empty `filePath`).
Right now `displayText` includes both `// File: ...` and (usually) a breadcrumb starting with `File: ...`. Also, `filePath === ""` produces `// File:` while the breadcrumb uses `unknown`.

```diff
 export function formatChunkText(
   chunk: ChunkWithContext,
   filePath: string,
 ): { content: string; displayText: string } {
   const breadcrumb = [...chunk.context];
   const fileLabel = `File: ${filePath || "unknown"}`;
   const hasFileLabel = breadcrumb.some(
     (entry) => typeof entry === "string" && entry.startsWith("File: "),
   );
-  if (!hasFileLabel) {
-    breadcrumb.unshift(fileLabel);
-  }
+  // If we always emit a `// File:` header line, keep breadcrumbs focused on symbols/structure.
+  // (Retain existing file labels if already present.)
   const sections: string[] = [];
   // 1. File path (always first)
-  sections.push(`// File: ${filePath}`);
+  sections.push(`// File: ${filePath || "unknown"}`);
   // 2. Imports (if available)
   if (chunk.imports && chunk.imports.length > 0) {
     sections.push(
       chunk.imports
         .map((imp) => (imp.startsWith("import") ? imp : `import ${imp}`))
         .join("\n"),
     );
   }
   // 3. Breadcrumb
-  const header = breadcrumb.length > 0 ? breadcrumb.join(" > ") : fileLabel;
+  const header =
+    breadcrumb.length > 0
+      ? breadcrumb.filter((b) => !String(b).startsWith("File: ")).join(" > ")
+      : fileLabel;
   sections.push(header);
```

tests/git-worktree.test.ts (1)
40-44: Minor: Consider try-catch in cleanup for robustness.
If `rmSync` throws for one directory, cleanup of subsequent directories won't occur. For test isolation, consider wrapping in try-catch:

```diff
 afterEach(() => {
   for (const dir of created.splice(0)) {
-    fs.rmSync(dir, { recursive: true, force: true });
+    try {
+      fs.rmSync(dir, { recursive: true, force: true });
+    } catch { /* ignore cleanup errors */ }
   }
 });
```

src/lib/workers/orchestrator.ts (1)
64-74: Minor race condition in initialization pattern.
Setting `initPromise = null` in `finally` creates a brief window where concurrent callers could start duplicate initialization. However, since this is an internal method and `initNative()` itself guards against double-init, the practical impact is minimal.
src/lib/native/index.ts (1)
196-211: Consider clarifying offset semantics in documentation.
Line 205 documents `docOffsets` as "Byte offsets per doc" which is technically correct for `Int8Array` (1 element = 1 byte), but "element offsets" would be more universally accurate and consistent with usage in `orchestrator.ts`.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (14)
- README.md (3 hunks)
- src/commands/serve.ts (2 hunks)
- src/commands/utility/doctor.ts (2 hunks)
- src/commands/utility/list.ts (1 hunks)
- src/index.ts (1 hunks)
- src/lib/index/chunker.ts (2 hunks)
- src/lib/native/index.ts (1 hunks)
- src/lib/output/formatter.ts (1 hunks)
- src/lib/utils/ansi.ts (1 hunks)
- src/lib/utils/formatter.ts (1 hunks)
- src/lib/utils/git.ts (2 hunks)
- src/lib/workers/orchestrator.ts (8 hunks)
- tests/chunking.test.ts (1 hunks)
- tests/git-worktree.test.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/index.ts
- src/commands/serve.ts
🧰 Additional context used
🧬 Code graph analysis (4)
tests/git-worktree.test.ts (1)
src/lib/utils/git.ts (3)
`isWorktree` (22-46), `getGitCommonDir` (48-82), `getMainRepoRoot` (87-96)
src/lib/workers/orchestrator.ts (4)
src/lib/index/chunker.ts (3)
`ChunkWithContext` (60-65), `chunk` (268-313), `formatChunkText` (106-151)
src/lib/native/index.ts (4)
`initNative` (27-44), `embedBatch` (131-174), `encodeQueryColbert` (184-190), `rerankColbert` (223-253)
src/lib/store/types.ts (2)
`PreparedChunk` (7-29), `VectorRecord` (31-35)
src/config.ts (1)
`CONFIG` (18-26)
src/commands/utility/doctor.ts (1)
src/lib/native/index.ts (1)
`initNative` (27-44)
src/lib/native/index.ts (2)
src/config.ts (2)
`MODEL_IDS` (4-7), `CONFIG` (18-26)
osgrep-core/index.d.ts (1)
`RerankResult` (21-26)
🪛 GitHub Check: ci
src/commands/utility/doctor.ts
[failure] 6-6:
Cannot find module '../lib/native' or its corresponding type declarations.
[failure] 5-5:
Cannot find module '../config' or its corresponding type declarations.
🔇 Additional comments (24)
src/commands/utility/list.ts (1)
4-5: No action needed. The import paths have been correctly updated as part of the reorganization. `gracefulExit` is exported from `src/lib/utils/exit.ts`, and both `ensureProjectPaths` and `findProjectRoot` are exported from `src/lib/utils/project-root.ts`, matching the import statements at lines 4-5.
tests/chunking.test.ts (1)
63-76: Test expectation update matches new header format.
Looks consistent with `formatChunkText()` now emitting `// File: {path}`.
README.md (1)
256-262: Bun workflow + license line updates look fine.
Also applies to: 272-275
src/lib/utils/formatter.ts (1)
1-19: Import and shared ANSI style alignment looks good; extension format handling is correct.
The `detectLanguage` function correctly uses `path.extname()`, which returns extensions with leading dots (e.g., ".ts"), and this matches the format stored in the `LANGUAGES` array (e.g., `[".ts", ".tsx", ".js"]`). The lookup via `getLanguageByExtension()` works as intended.
src/lib/output/formatter.ts (1)
1-11: The extension handling is correct; no issue here.
The code properly passes the `path.extname()` result (which includes the leading dot like `.ts`) to `getLanguageByExtension()`, which expects extensions in this format. The LANGUAGES array stores extensions with leading dots (`.ts`, `.js`, etc.) and the function normalizes input to lowercase before matching.
src/lib/utils/ansi.ts (1)
1-10: LGTM!
Clean utility that correctly respects TTY detection and the `NO_COLOR` convention. ANSI codes and reset sequences are accurate.
tests/git-worktree.test.ts (4)
7-14: LGTM! Helper functions are concise and appropriate for test fixture setup.
16-35: LGTM! The fixture correctly simulates a git worktree structure with the `commondir` file pointing to the main repository's `.git` directory.
46-67: LGTM! Test cases correctly verify the three main scenarios: normal repos, gitfile-based worktrees, and expected return values using `realpathSync.native` for consistency with the implementation.
69-85: LGTM! Good handling of the Windows platform limitation for symlinks. The conditional skip ensures the test suite passes on all platforms while still testing symlink behavior where supported.
src/lib/utils/git.ts (5)
4-10: LGTM! Good defensive helper that gracefully handles non-existent paths or inaccessible symlinks by falling back to the original path.
12-20: LGTM! Correctly parses the git gitfile format with appropriate error handling.
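The gitfile format in question is a single `gitdir: <path>` line written by `git worktree add`; a standalone sketch of the parsing step (a hypothetical helper, the actual `git.ts` implementation also reads the file from disk and handles I/O errors):

```typescript
// Parse the contents of a `.git` *file* (as opposed to a `.git` directory),
// e.g. "gitdir: /main/repo/.git/worktrees/feature\n" -> the pointed-to path.
// Returns null when the content is not a gitfile.
function parseGitFile(contents: string): string | null {
  const match = /^gitdir:\s*(.+?)\s*$/m.exec(contents);
  return match ? match[1] : null;
}
```

A `null` return is the natural discriminator between a worktree checkout (gitfile present) and a normal repository (`.git` is a directory).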
22-46: LGTM! Correctly handles both standard worktree detection via gitfile and the less common symlinked `.git` directory scenario. The `commondir` check is the correct discriminator for worktree detection.
48-82: LGTM with minor observation. The implementation correctly handles both gitfile and symlink-based worktrees. The fallback at line 66 for missing `commondir` in gitfile scenarios is reasonable for non-standard setups, though it differs from the directory case, which returns `null` when `commondir` is absent.
84-96: LGTM! Clean implementation that correctly derives the main repository root by taking the parent of the common git directory.
src/lib/workers/orchestrator.ts (5)
1-51: LGTM! Clean module structure with appropriate imports. The removal of the `scale` field from `RerankDoc` aligns with the native reranking implementation.
76-149: LGTM! Chunking and preparation logic is well-structured. Good use of `crypto.randomUUID()` for chunk IDs and proper context window handling for prev/next chunks.
151-202: LGTM! Clean file processing pipeline with proper validation, chunking, and native embedding. Progress callbacks are appropriately placed throughout the flow.
297-330: LGTM! Packing logic and result mapping are correctly implemented. The score alignment back to input order via `candidateIndices` is the expected behavior for the Searcher.
337-362: LGTM! Clean singleton pattern with well-designed convenience wrappers for the public API surface.
src/lib/native/index.ts (4)
64-83: LGTM! Correctly splits the flat embedding buffer into isolated per-text vectors.
89-113: LGTM! Clean interface for packed ColBERT embeddings with proper typed array handling.
131-174: LGTM! Efficient hybrid embedding extraction with proper cursor management for token IDs. The comment about `colbertOffset` being set during storage is helpful.
184-190: LGTM! Clean wrapper for query encoding.
```ts
let colbert: Int8Array;
if (col instanceof Int8Array) {
  colbert = col;
} else if (Buffer.isBuffer(col)) {
  colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (
  col &&
  typeof col === "object" &&
  "type" in col &&
  (col as any).type === "Buffer" &&
  Array.isArray((col as any).data)
) {
  // IPC serialization fallback (still copies, but unavoidable without SharedArrayBuffer)
  colbert = new Int8Array((col as any).data);
} else if (ArrayBuffer.isView(col)) {
  // Handles Uint8Array and other typed arrays (e.g. from LanceDB)
  colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (Array.isArray(col)) {
  colbert = new Int8Array(col);
} else {
  colbert = new Int8Array(0);
}
```
Potential data corruption when handling non-Int8 typed arrays.
The ArrayBuffer.isView branch at lines 271-273 will accept any typed array (e.g., Float32Array, Uint16Array) and reinterpret its bytes as Int8Array. This could produce incorrect embeddings if the source isn't already Int8Array or Uint8Array.
```diff
 } else if (Buffer.isBuffer(col)) {
   colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
-} else if (ArrayBuffer.isView(col)) {
-  // Handles Uint8Array and other typed arrays (e.g. from LanceDB)
+} else if (col instanceof Uint8Array) {
+  // Handles Uint8Array (e.g. from LanceDB)
   colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
 } else if (Array.isArray(col)) {
```

📝 Committable suggestion
Suggested result:

```ts
let colbert: Int8Array;
if (col instanceof Int8Array) {
  colbert = col;
} else if (Buffer.isBuffer(col)) {
  colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (col instanceof Uint8Array) {
  // Handles Uint8Array (e.g. from LanceDB)
  colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (Array.isArray(col)) {
  colbert = new Int8Array(col);
} else {
  colbert = new Int8Array(0);
}
```
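To see why reinterpreting arbitrary typed arrays is dangerous, compare a byte-level view with element-wise conversion (a standalone demo, independent of the project code):

```typescript
// Viewing a Float32Array's underlying bytes as Int8Array reinterprets
// IEEE-754 bit patterns, which has nothing to do with the numeric values.
const floats = new Float32Array([1, -1]);
const reinterpreted = new Int8Array(
  floats.buffer,
  floats.byteOffset,
  floats.byteLength,
);
// Element-wise conversion keeps the values instead of the raw bytes.
const converted = Int8Array.from(floats);

console.log(reinterpreted.length); // 8 (one element per byte)
console.log(converted.length); // 2
```

This is why restricting the byte-view branch to `Int8Array`/`Uint8Array` sources (where one element really is one byte of quantized data) avoids silently corrupted embeddings.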
Actionable comments posted: 8
🧹 Nitpick comments (5)
src/commands/utility/doctor.ts (1)
5-8: Avoid import-time native crashes by lazy-loading `../../lib/native` inside the action.
If `../../lib/native` ever loads/validates the N-API binary at module import time, `osgrep doctor` (and potentially `osgrep --help`) can crash before your `try/catch` runs. Consider dynamically importing within the `try` so failures are contained.

```diff
-import { initNative } from "../../lib/native";
+// NOTE: keep native load inside the command action so failures are catchable.

 try {
-  await initNative();
+  const { initNative } = await import("../../lib/native");
+  await initNative();
   console.log("✅ Native models initialized");
 } catch (e) {
```

osgrep-core/src/colbert_ort.rs (1)
373-373: Consider making batch size configurable.
The hardcoded `batch_size = 64` works well for most cases, but making it configurable (e.g., via a method parameter with a default) would allow tuning for different hardware or workload characteristics.
osgrep-core/src/dense_ort.rs (3)
132-170: Avoid avoidable clones + per-row allocations in the hot path
- `encode_batch(texts: Vec<String>, ...)` immediately does `texts.clone()` just for tokenization.
- Mean pooling allocates `sum_hidden = vec![...]` for every row.

This is likely on your query-time critical path; consider changing the API to accept `&[String]` (or `impl IntoIterator<Item = impl AsRef<str>>`) and reusing buffers for pooling.

```diff
-pub fn encode_batch(&mut self, texts: Vec<String>, normalize: bool) -> anyhow::Result<Vec<f32>> {
+pub fn encode_batch(&mut self, texts: &[String], normalize: bool) -> anyhow::Result<Vec<f32>> {
     if texts.is_empty() {
         return Err(anyhow::anyhow!("Empty input texts"));
     }

     // Tokenize
-    let encodings = self.tokenizer
-        .encode_batch(texts.clone(), true)
+    let encodings = self.tokenizer
+        .encode_batch(texts.to_vec(), true) // consider a non-cloning form if supported by your tokenizers version
         .map_err(|e| anyhow::anyhow!("Tokenization failed: {}", e))?;
@@
-    for i in 0..batch_size {
-        let mut sum_hidden = vec![0.0f32; self.hidden_size];
+    let mut sum_hidden = vec![0.0f32; self.hidden_size];
+    for i in 0..batch_size {
+        sum_hidden.fill(0.0);
         let mut sum_mask = 0.0f32;
```
24-82: De-duplicate session/tokenizer construction between `load_from_hf()` and `load()`
The two constructors are ~90% identical (session builder + tokenizer config). A small private helper (e.g., `fn build_session(model_path: impl AsRef<Path>) -> ...` and `fn build_tokenizer(tokenizer_path: ...) -> ...`) will reduce drift and make it easier to evolve options (threads, EPs, max_len) consistently.
Also applies to: 85-130
42-56: Hardcoded intra-op threads (`with_intra_threads(4)`) is a portability/perf footgun

This will underutilize big machines and oversubscribe small ones (and may fight JS-layer concurrency). Consider defaulting to `num_cpus` (or making it configurable from the caller / env).

Also applies to: 91-105
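A minimal sketch of the env/auto fallback idea. The `OSGREP_ORT_THREADS` variable name and the fallback default of 4 are assumptions for illustration, not existing configuration:

```rust
use std::thread;

// Hypothetical helper: resolve the intra-op thread count from an optional
// env-var value, else fall back to the machine's available parallelism,
// else the current hardcoded default of 4.
fn intra_threads(env_value: Option<&str>) -> usize {
    env_value
        .and_then(|v| v.parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or_else(|| {
            thread::available_parallelism().map(|n| n.get()).unwrap_or(4)
        })
}
```

The resolved value would then be passed to `with_intra_threads(...)` in both constructors so the two files cannot drift.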
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (6)
- .gitignore (1 hunks)
- osgrep-core/Cargo.toml (1 hunks)
- osgrep-core/src/colbert_ort.rs (1 hunks)
- osgrep-core/src/dense_ort.rs (1 hunks)
- src/commands/agent/mcp.ts (1 hunks)
- src/commands/utility/doctor.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- osgrep-core/Cargo.toml
🧰 Additional context used
🧬 Code graph analysis (3)
src/commands/utility/doctor.ts (1)
- src/lib/native/index.ts (1): initNative (27-44)

osgrep-core/src/dense_ort.rs (1)
- osgrep-core/src/colbert_ort.rs (4): log_native (10-14), load_from_hf (40-120), outputs (180-180), outputs (288-288)

osgrep-core/src/colbert_ort.rs (1)
- osgrep-core/src/dense_ort.rs (4): log_native (9-13), load_from_hf (24-82), hidden_size (234-236), outputs (185-185)
🔇 Additional comments (13)
.gitignore (4)
25-29: Verify that entire directories (scripts/, tools/, experiments/, public/) are intentionally ignored.

Ignoring complete directories can mask important source code from version control. Based on the PR summary mentioning removal of "skeleton/trace/verify CLI commands" and "many worker, embedding, and experiment scripts," please confirm these directories have been deleted or reorganized and are not critical for the project.
Please verify:
- Have the `scripts/`, `tools/`, `experiments/`, and `public/` directories been deleted entirely from the codebase, or are they being relocated?
- If they still exist, do they contain build artifacts or generated files that should be ignored, or is there important source code that should remain tracked?
3-3: Pattern clarity: `**/node_modules/` is correct but adds scope.

The addition of `**/node_modules/` on line 3 complements the root-level `node_modules/` on line 2. This ensures node_modules in any subdirectory (e.g., `osgrep-core/node_modules/`) is ignored, which is appropriate given the introduction of a Rust subproject with potential Node bindings.
48-52: Rust and N-API build artifacts are properly ignored.

The new patterns for osgrep-core build outputs are well-structured:

- `osgrep-core/target/` → Cargo build artifacts
- `**/target/` → Fallback for any Rust project in subdirectories
- `osgrep-core/*.node` → Compiled N-API bindings
- `osgrep-core/bench/` → Benchmark outputs
42-47: Migration to bun is reflected correctly in lockfile ignores.

The removal of `pnpm-lock.yaml`, `package-lock.json`, and `yarn.lock` from tracked files and the addition of `bun.lockb` correctly reflect the tooling migration. Ensure `bun.lock` (the human-readable lockfile) is tracked if your project uses it, or confirm `bun.lockb` is the only lockfile being generated.

Please verify that the build tooling migration includes updating CI/CD workflows, package installation scripts, and documentation to use `bun` instead of `pnpm`.

src/commands/agent/mcp.ts (1)
10-12: All updated import paths resolve correctly—no issues found.

The three modules exist at their expected locations and the import paths are correct:

- syncer.ts → src/lib/index/syncer.ts ✓
- setup-helpers.ts → src/lib/setup/setup-helpers.ts ✓
- project-root.ts → src/lib/utils/project-root.ts ✓

osgrep-core/src/colbert_ort.rs (8)
1-14: LGTM!

The imports are appropriate for ColBERT-ORT functionality, and the no-op logging pattern matches the approach in `dense_ort.rs`.
16-22: LGTM!

The constants are well-defined with clear documentation about the MaxSim cost implications.
24-37: LGTM!

The struct design is well-organized with appropriate types for each field. The use of `Option` for marker tokens and `HashSet` for skip IDs demonstrates good design choices.
122-207: LGTM!

The query encoding logic is well-implemented with proper handling of optional marker tokens, careful truncation accounting for special tokens, and correct L2 normalization with numerical stability protection.
209-325: LGTM!

The document encoding implementation properly handles batching, dynamic padding, and preserves token IDs for skiplist filtering. The L2 normalization is correctly applied per-token.
327-360: LGTM!

The MaxSim scoring algorithm is correctly implemented with proper skiplist filtering and numerical stability.
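For readers unfamiliar with MaxSim, a minimal illustration of the scoring rule (not the crate's actual implementation; it assumes unit-normalized token embeddings and ignores the skiplist):

```rust
// For each query token, take the maximum dot product against all document
// tokens, then sum those maxima across query tokens.
fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| q.iter().zip(d).map(|(a, b)| a * b).sum::<f32>())
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}
```

With unit-normalized vectors the dot product equals cosine similarity, which is why the per-token L2 normalization upstream matters.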
455-495: LGTM!

The packed encoding implementation efficiently concatenates embeddings for storage and reuse. The offset calculation correctly tracks the running position in the flat embeddings array.
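The offset bookkeeping described above can be illustrated with a small stand-alone sketch (names are hypothetical; the real packed layout also stores token IDs):

```rust
// Append each document's token embeddings to one flat buffer, recording the
// element offset (buffer length before the append) and length per document.
fn pack(docs: &[Vec<f32>]) -> (Vec<f32>, Vec<usize>, Vec<usize>) {
    let mut flat = Vec::new();
    let mut offsets = Vec::new();
    let mut lengths = Vec::new();
    for d in docs {
        offsets.push(flat.len()); // running position in the flat buffer
        lengths.push(d.len());
        flat.extend_from_slice(d);
    }
    (flat, offsets, lengths)
}
```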
560-608: LGTM!

The fast query-time reranking path correctly uses pre-encoded documents and properly maps scores back to the original document indices.
.DS_Store

# Local osgrep data
.osgrep/
Redundant entry: remove line 53 .osgrep.
Line 53 (.osgrep) is a duplicate of the new entry on line 9 (.osgrep/). The trailing slash is more precise as it targets only directories. Remove the old entry to avoid confusion.
-.osgrep

Also applies to: 53-53
🤖 Prompt for AI Agents
In .gitignore around lines 9 and 53, there is a duplicate entry for `.osgrep`
(line 9 has `.osgrep/` and line 53 has `.osgrep`); remove the redundant line 53
entry so only the more precise directory pattern `.osgrep/` remains, ensuring no
other references to the same pattern are left elsewhere in the file.
                .build(),
        ])?
        .with_optimization_level(GraphOptimizationLevel::Level3)?
        .with_intra_threads(8)?
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check intra_threads usage across the codebase
rg -n 'intra_threads' --type=rust -C 2
# Also search for any configuration or documentation about thread settings
rg -n 'thread' --type=rust osgrep-core/src/ | grep -i 'intra\|config\|session'

Repository: Ryandonofrio3/osgrep
Length of output: 2561
🏁 Script executed:
#!/bin/bash
# Check for comments around the thread configuration in both files
echo "=== colbert_ort.rs context ==="
sed -n '70,90p' osgrep-core/src/colbert_ort.rs
echo -e "\n=== dense_ort.rs context ==="
sed -n '40,60p' osgrep-core/src/dense_ort.rs
# Search for any comments about thread tuning or performance in the codebase
rg -i 'colbert|thread|performance|tuning' osgrep-core/src/*.rs | head -20

Repository: Ryandonofrio3/osgrep
Length of output: 3587
Add a comment explaining the thread count difference.
colbert_ort.rs uses intra_threads(8) while dense_ort.rs uses intra_threads(4). This difference is consistent across both CoreML and CPU paths in each file, but lacks documentation. Add a comment explaining whether this is intentional based on model characteristics (e.g., different model sizes/architecture) or performance tuning rationale.
🤖 Prompt for AI Agents
In osgrep-core/src/colbert_ort.rs around line 78, add a short comment next to
the .with_intra_threads(8)? call explaining why this file uses 8 intra-thread
workers whereas dense_ort.rs uses 4 (e.g., intentional difference due to model
size/architecture, different layer parallelism, or performance
tuning/benchmarks); if the choice is not intentional, note that it should be
unified and add a TODO referencing where to run/record a quick benchmark or link
to an issue so maintainers can validate and standardize the thread counts across
both files.
        let cls_id = *vocab.get("[CLS]").unwrap_or(&0);
        let sep_id = *vocab.get("[SEP]").unwrap_or(&0);
        let mask_id = *vocab.get("[MASK]").unwrap_or(&0);
        let pad_id = *vocab.get("[PAD]").unwrap_or(&mask_id); // Use MASK as PAD if no PAD
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Execute the provided verification script
rg -n -C3 'PAD.*MASK|pad.*fallback' --type=rust
# Also check for pad_id/mask_id patterns
rg -n -C3 'pad_id.*mask_id' --type=rust -g '**/*test*.rs'

Repository: Ryandonofrio3/osgrep
Length of output: 860
🏁 Script executed:
# Get context around line 96 in colbert_ort.rs
head -120 osgrep-core/src/colbert_ort.rs | tail -40

Repository: Ryandonofrio3/osgrep
Length of output: 1557
🏁 Script executed:
# Search for where mask_id comes from and how it's used
rg -n 'mask_id' osgrep-core/src/colbert_ort.rs -B5 -A5

Repository: Ryandonofrio3/osgrep
Length of output: 2078
🏁 Script executed:
# Check if there are any comments or documentation about token handling
rg -n 'vocab|token' osgrep-core/src/colbert_ort.rs | head -30

Repository: Ryandonofrio3/osgrep
Length of output: 1820
🏁 Script executed:
# Search for where pad_id is used
rg -n 'pad_id' osgrep-core/src/colbert_ort.rs

Repository: Ryandonofrio3/osgrep
Length of output: 252
🏁 Script executed:
# Search across entire codebase for pad_id usage
rg -n 'pad_id' --type=rust

Repository: Ryandonofrio3/osgrep
Length of output: 376
🏁 Script executed:
# Check how padding is actually handled in the code
rg -n 'pad|padding' osgrep-core/src/colbert_ort.rs -i

Repository: Ryandonofrio3/osgrep
Length of output: 578
🏁 Script executed:
# Check if there are any related tests or documentation in the repo
find . -name "*.md" -o -name "*.txt" | xargs grep -l "pad\|PAD" 2>/dev/null | head -10

Repository: Ryandonofrio3/osgrep
Length of output: 151
Remove unused pad_id field or implement its actual usage.
The pad_id variable is assigned at line 96 with a fallback to mask_id, but it is never used in the encoding or processing logic. The code pads with mask_id directly (line 160) or leaves positions as 0 (line 274), making pad_id dead code. Either remove this field and its assignment, or implement the intended padding logic using pad_id instead of mask_id if semantically required for the model.
    pub token_ids: Vec<u32>,
    /// Number of tokens per document
    pub lengths: Vec<u32>,
    /// Byte offsets into embeddings buffer for each doc (for fast lookup)
Correct the misleading comment.
The comment states "Byte offsets into embeddings buffer" but the offsets are actually element indices (f32 counts), not byte offsets. This is evident from line 481 where all_embeddings.len() (an element count) is stored.
Apply this diff to correct the comment:
- /// Byte offsets into embeddings buffer for each doc (for fast lookup)
+ /// Element offsets into embeddings buffer for each doc (for fast lookup)

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- /// Byte offsets into embeddings buffer for each doc (for fast lookup)
+ /// Element offsets into embeddings buffer for each doc (for fast lookup)
🤖 Prompt for AI Agents
In osgrep-core/src/colbert_ort.rs around line 449, the comment "Byte offsets
into embeddings buffer for each doc (for fast lookup)" is misleading because the
values are element indices (counts of f32 elements) not byte offsets; update the
comment to clearly state these are element offsets (e.g., "Element offsets (f32
indices) into embeddings buffer for each doc (for fast lookup)") so it
accurately reflects that values like all_embeddings.len() are stored as element
counts.
        let token_offset: usize = packed.offsets[..doc_idx]
            .iter()
            .zip(&packed.lengths[..doc_idx])
            .map(|(&off, &len)| len as usize)
            .sum();
🛠️ Refactor suggestion | 🟠 Major
Simplify the token offset calculation.
The code zips offsets[..doc_idx] with lengths[..doc_idx] but then only uses the lengths in the map function (the &off binding is unused). This is unnecessarily complex.
Apply this diff to simplify:
- let token_offset: usize = packed.offsets[..doc_idx]
- .iter()
- .zip(&packed.lengths[..doc_idx])
- .map(|(&off, &len)| len as usize)
- .sum();
+ let token_offset: usize = packed.lengths[..doc_idx]
+ .iter()
+ .map(|&len| len as usize)
+     .sum();

Or even more concisely:
- let token_offset: usize = packed.offsets[..doc_idx]
- .iter()
- .zip(&packed.lengths[..doc_idx])
- .map(|(&off, &len)| len as usize)
- .sum();
+ let token_offset: usize = packed.lengths[..doc_idx]
+ .iter()
+     .sum::<u32>() as usize;

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-        let token_offset: usize = packed.offsets[..doc_idx]
-            .iter()
-            .zip(&packed.lengths[..doc_idx])
-            .map(|(&off, &len)| len as usize)
-            .sum();
+        let token_offset: usize = packed.lengths[..doc_idx]
+            .iter()
+            .map(|&len| len as usize)
+            .sum();
🤖 Prompt for AI Agents
In osgrep-core/src/colbert_ort.rs around lines 515 to 519, the token_offset
calculation zips offsets[..doc_idx] with lengths[..doc_idx] but only uses the
lengths, so remove the unnecessary zip and unused binding and compute
token_offset by summing the lengths slice up to doc_idx (casting each length to
usize) — replace the zip/map with a direct iterator over
packed.lengths[..doc_idx].iter().map(|&len| len as usize).sum() (or equivalent)
to simplify and clarify the calculation.
    // Download model and tokenizer files
    // ONNX model is in onnx/ subdirectory for onnx-community repos
    // Also download the external data file if it exists
    let model_path = repo.get("onnx/model.onnx")?;
    let _ = repo.get("onnx/model.onnx_data"); // External data, ignore if not found
    let tokenizer_path = repo.get("tokenizer.json")?;
Don’t silently ignore model.onnx_data download failures (can mask the real root cause)
let _ = repo.get("onnx/model.onnx_data") will swallow all errors (not just “not found”), which makes external-data model failures harder to debug and can hide transient network/auth issues. Consider logging the error (at least) and/or only ignoring a “not found” case.
- let _ = repo.get("onnx/model.onnx_data"); // External data, ignore if not found
+ match repo.get("onnx/model.onnx_data") {
+ Ok(path) => log_native(format!("[ORT] Downloaded external data: {:?}", path)),
+ Err(e) => {
+ // Many models won't have external data; still surface unexpected failures in logs.
+ log_native(format!("[ORT] External data not downloaded (may be absent): {e}"));
+ }
+ }Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In osgrep-core/src/dense_ort.rs around lines 30–36, the call `let _ =
repo.get("onnx/model.onnx_data")` swallows all errors; change it to only ignore
a “not found”/missing-file error and for any other error either log the error
with context (including the path and error) or return it to the caller.
Concretely, call repo.get and match the Result: if the error indicates not
found, continue silently; otherwise log processLogger/error (or use the crate
logger) with the path and error details and propagate the error (or return Err)
so transient/network/auth issues aren’t hidden.
    // Initialize ONNX Runtime session
    // On macOS, use CoreML for GPU acceleration with CPU fallback
    #[cfg(target_os = "macos")]
    let session = Session::builder()?
        .with_execution_providers([
            CoreMLExecutionProvider::default()
                .with_subgraphs(true) // Enable CoreML for subgraphs
                .build(),
        ])?
        .with_optimization_level(GraphOptimizationLevel::Level3)?
        .with_intra_threads(4)?
        .commit_from_file(&model_path)?;

    #[cfg(not(target_os = "macos"))]
    let session = Session::builder()?
        .with_optimization_level(GraphOptimizationLevel::Level3)?
        .with_intra_threads(4)?
        .commit_from_file(&model_path)?;
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the dense_ort.rs file to understand the full context
fd dense_ort.rs

Repository: Ryandonofrio3/osgrep
Length of output: 92
🏁 Script executed:
# Check for the file and read it
fd -e rs dense_ort

Repository: Ryandonofrio3/osgrep
Length of output: 92
🏁 Script executed:
# List all rust files in osgrep-core/src
find osgrep-core/src -name "*.rs" -type f 2>/dev/null | head -20

Repository: Ryandonofrio3/osgrep
Length of output: 146
🏁 Script executed:
# Check Cargo.toml to see ort version
find . -name Cargo.toml -type f | head -5

Repository: Ryandonofrio3/osgrep
Length of output: 88
🏁 Script executed:
# Read the dense_ort.rs file
cat -n osgrep-core/src/dense_ort.rs

Repository: Ryandonofrio3/osgrep
Length of output: 10847
🏁 Script executed:
# Read the Cargo.toml
cat osgrep-core/Cargo.toml

Repository: Ryandonofrio3/osgrep
Length of output: 943
🌐 Web query:
ort 2.0.0-rc.10 session inputs API rust
💡 Result:
Here's a concise summary of how to provide session inputs with ort 2.0.0-rc.10 (Rust), with minimal examples.
Key points
- Use the inputs! macro to build inputs easily (either positional array or named map). [1]
- Session.run accepts types convertible Into<SessionInputs<'i,'v,N>>; common usage is session.run(ort::inputs![...]) or session.run(ort::inputs!{"name" => value}?). [1][2]
- You can pass Tensor, TensorRef, Value, Vec, arrays, or HashMap<String,Value> depending on whether you build statically or dynamically. For async/timeout control use run_with_options / run_async with RunOptions. [1]
Minimal examples
- Positional array:
let outputs = session.run(ort::inputs![Tensor::from_array(([1, 3, 224, 224], image_vec))?])?;

- Named inputs:
let outputs = session.run(ort::inputs!{
"input_ids" => Tensor::from_array(([1, seq_len], ids_vec))?,
"attention_mask" => Tensor::from_array(([1, seq_len], mask_vec))?
})?;

- Using TensorRef (avoid extra copy):
let outputs = session.run(ort::inputs![TensorRef::from_array_view(&ndarray_array)?])?;
References
- Session API (Session.run / run_with_options, Session.inputs/outputs). [1]
- inputs! macro docs and examples. [2]
- ort 2.0.0-rc.10 introduction / getting started. [3]
Sources
[1] Session (ort crate) API docs. [2] inputs! macro docs. [3] ort 2.0.0-rc.10 guide (getting started).
🌐 Web query:
ort rust 2.0.0-rc.10 extract tensor shape
💡 Result:
Use the OrtOwnedTensor’s underlying ndarray view (it implements Deref to an ndarray::ArrayView) and call shape() or dim().
Example:
let outputs: Vec<OrtOwnedTensor<f32,_>> = session.run(inputs)?;
let out = &outputs[0]; // OrtOwnedTensor
let shape: &[usize] = out.shape(); // e.g. &[1,3,224,224]
let dim = out.ndim(); // number of axes

References: OrtOwnedTensor docs (deref to ndarray::ArrayView) and DynArrayRef shape helper. [1][2]
🌐 Web query:
ort 2.0.0-rc.10 session.inputs name field rust
💡 Result:
Yes — in ort 2.0.0-rc.10 Session.inputs is a Vec and Input has a pub name: String. Example:
let session = Session::builder()?.commit_from_file("model.onnx")?;
for input in &session.inputs {
println!("input name = {}", input.name);
}
Docs: Input struct (shows name field). [1] Session struct (shows inputs: Vec). [2]
Sources
[1] https://docs.rs/ort/latest/ort/session/struct.Input.html
[2] https://docs.rs/ort/latest/ort/session/struct.Session.html
🌐 Web query:
ort crate rust Session inputs input metadata
💡 Result:
Short answer: the ort crate’s Session exposes its inputs as Session.inputs: Vec, where Input has name, input_type (TensorElementDataType) and dimensions (Vec<Option>) and helper methods; model-level metadata is available via Session::metadata(). Use the inputs! macro or Value/Tensor helpers to build run inputs. Example:
Code:
- Inspect inputs / metadata:
let mut session = Session::builder()?.with_model_from_file("model.onnx")?;
for input in &session.inputs {
println!("name={} type={:?} dims={:?}", input.name, input.input_type, input.dimensions());
}
let meta = session.metadata()?;
println!("model name: {:?}", meta.name());
- Run using inputs! macro:
let outputs = session.run(ort::inputs! { "input_name" => Tensor::from_array(([1,3,224,224], data))? })?;
Sources:
- Session (fields & methods). [1]
- Input struct (name, input_type, dimensions + dimensions() iterator). [2]
- inputs! macro (convenience macro for building inputs). [3]
References:
[1] docs.rs/ort :: Session.
[2] docs.rs/ort :: session.rs (Input struct).
[3] docs.rs/ort :: inputs! macro.
Add runtime validation for ORT input names and output tensor shape
The encode_batch() method unconditionally passes token_type_ids (line 181), but many embedding models don't accept this input. Additionally, there is no validation that the output tensor has the expected shape [batch, seq, hidden] before indexing into it.
Add a conditional check for the token_type_ids input by inspecting self.session.inputs, and validate the embeddings tensor shape after extraction:
// Run inference
- let outputs = self.session.run(ort::inputs![
- "input_ids" => input_ids,
- "attention_mask" => attention_mask_tensor,
- "token_type_ids" => token_type_ids
- ])?;
+ // Some models do not accept token_type_ids
+ let outputs = if self.session.inputs.iter().any(|i| i.name == "token_type_ids") {
+ self.session.run(ort::inputs![
+ "input_ids" => input_ids,
+ "attention_mask" => attention_mask_tensor,
+ "token_type_ids" => token_type_ids
+ ])?
+ } else {
+ self.session.run(ort::inputs![
+ "input_ids" => input_ids,
+ "attention_mask" => attention_mask_tensor
+ ])?
+ };
// Get the last_hidden_state output (typically first output)
let embeddings_tensor = outputs[0].try_extract_tensor::<f32>()?;
+ // Validate output shape before indexing
+ let shape = embeddings_tensor.shape();
+ if shape.len() != 3 || shape[0] != batch_size || shape[1] != max_len || shape[2] != self.hidden_size {
+ return Err(anyhow::anyhow!(
+ "Unexpected embeddings shape {:?}; expected [{}, {}, {}]",
+ shape, batch_size, max_len, self.hidden_size
+ ));
+ }
let embeddings_data: &[f32] = embeddings_tensor.1;

Committable suggestion skipped: line range outside the PR's diff.
  try {
    await initNative();
    console.log("✅ Native models initialized");
  } catch (e) {
    const msg = e instanceof Error ? e.message : String(e);
    console.log(`❌ Native init failed: ${msg}`);
    console.log("  Try: osgrep setup");
  }
Return a non-zero exit code (and use console.error) when native init fails.
Right now failures are only logged; consumers (CI/scripts) can’t reliably detect an unhealthy state. Consider setting process.exitCode = 1 (or accumulating a healthy flag) and emitting errors to stderr.
} catch (e) {
const msg = e instanceof Error ? e.message : String(e);
- console.log(`❌ Native init failed: ${msg}`);
- console.log(" Try: osgrep setup");
+ console.error(`❌ Native init failed: ${msg}`);
+ console.error(" Try: osgrep setup");
+ process.exitCode = 1;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
  try {
    await initNative();
    console.log("✅ Native models initialized");
  } catch (e) {
    const msg = e instanceof Error ? e.message : String(e);
-   console.log(`❌ Native init failed: ${msg}`);
-   console.log("  Try: osgrep setup");
+   console.error(`❌ Native init failed: ${msg}`);
+   console.error("  Try: osgrep setup");
+   process.exitCode = 1;
  }
🤖 Prompt for AI Agents
In src/commands/utility/doctor.ts around lines 27 to 34, the catch block only
logs failures to stdout and doesn't signal an error to callers; change the catch
to write messages to stderr using console.error (both the "Native init failed:
<msg>" and the "Try: osgrep setup" hint) and set process.exitCode = 1 (or
otherwise ensure the command returns a non-zero exit) so CI/scripts can detect
the failure.
…n osgrep. Added process checks for running servers, improved project root detection, and introduced file watching for live indexing. Updated search command to handle server responses and added tracing functionality. Enhanced ignore patterns for generated files in search results.
Actionable comments posted: 2
♻️ Duplicate comments (7)
osgrep-core/src/colbert_ort.rs (3)
498-502: Simplify the unnecessary zip operation in token offset calculation.

The past review comment on this issue remains unaddressed. The code zips offsets[..doc_idx] with lengths[..doc_idx] but only uses the lengths in the map function (the &off binding is unused).

Apply this diff to simplify:

-        let token_offset: usize = packed.offsets[..doc_idx]
-            .iter()
-            .zip(&packed.lengths[..doc_idx])
-            .map(|(&off, &len)| len as usize)
-            .sum();
+        let token_offset: usize = packed.lengths[..doc_idx]
+            .iter()
+            .map(|&len| len as usize)
+            .sum();

Or even more concisely:

-        let token_offset: usize = packed.offsets[..doc_idx]
-            .iter()
-            .zip(&packed.lengths[..doc_idx])
-            .map(|(&off, &len)| len as usize)
-            .sum();
+        let token_offset: usize = packed.lengths[..doc_idx]
+            .iter()
+            .sum::<u32>() as usize;
- let token_offset: usize = packed.offsets[..doc_idx] - .iter() - .zip(&packed.lengths[..doc_idx]) - .map(|(&off, &len)| len as usize) - .sum(); + let token_offset: usize = packed.lengths[..doc_idx] + .iter() + .sum::<u32>() as usize;
79-79: Remove unused pad_id field or implement its usage.

The past review comment on this issue remains unaddressed. The pad_id is assigned here (with fallback to mask_id) but is never referenced anywhere in the encoding logic. The code uses mask_id directly for padding (line 143) or leaves positions as 0. Either remove this dead code or implement the intended padding logic using pad_id.

Apply this diff to remove the unused field:

     let cls_id = *vocab.get("[CLS]").unwrap_or(&0);
     let sep_id = *vocab.get("[SEP]").unwrap_or(&0);
     let mask_id = *vocab.get("[MASK]").unwrap_or(&0);
-    let pad_id = *vocab.get("[PAD]").unwrap_or(&mask_id); // Use MASK as PAD if no PAD

     // ColBERT marker tokens (may not exist in all tokenizers)

And remove from struct definition (line 29):

     mask_id: u32,
-    pad_id: u32,
     query_marker_id: Option<u32>,

And from struct initialization (line 98):

     mask_id,
-    pad_id,
     query_marker_id,

And from logging (line 85-86):

-    log_native(format!("[ColBERT-ORT] Token IDs: CLS={}, SEP={}, MASK={}, PAD={}",
-        cls_id, sep_id, mask_id, pad_id));
+    log_native(format!("[ColBERT-ORT] Token IDs: CLS={}, SEP={}, MASK={}",
+        cls_id, sep_id, mask_id));
432-432: Correct the misleading comment about offsets.

The past review comment on this issue remains unaddressed. The comment states "Byte offsets into embeddings buffer" but the values are actually element indices (f32 counts), not byte offsets. This is evident from line 464, where all_embeddings.len() (an element count) is stored as an offset.

Apply this diff to correct the comment:

- /// Byte offsets into embeddings buffer for each doc (for fast lookup)
+ /// Element offsets into embeddings buffer for each doc (for fast lookup)

osgrep-core/src/dense_ort.rs (2)
31-31: Silently ignoring all errors for external data download can hide real issues.

The past review comment on this line remains unaddressed. Using let _ = swallows all errors, not just "file not found". This makes debugging difficult when external data downloads fail due to network issues, authentication problems, or other transient failures.

Consider logging the error at minimum:

-    let _ = repo.get("onnx/model.onnx_data"); // External data, ignore if not found
+    match repo.get("onnx/model.onnx_data") {
+        Ok(path) => log_native(format!("[ORT] Downloaded external data: {:?}", path)),
+        Err(e) => {
+            // Many models won't have external data; log unexpected failures for debugging
+            log_native(format!("[ORT] External data not downloaded (may be absent): {e}"));
+        }
+    }
149-153: Runtime failure on models that don't accept token_type_ids input.

The past review comment on this issue remains unaddressed. Many embedding models don't accept the token_type_ids input, which will cause this code to fail at runtime. You need to conditionally check which inputs the model accepts by inspecting self.session.inputs.

Apply this fix to conditionally pass token_type_ids:

-    // Run inference
-    let outputs = self.session.run(ort::inputs![
-        "input_ids" => input_ids,
-        "attention_mask" => attention_mask_tensor,
-        "token_type_ids" => token_type_ids
-    ])?;
+    // Run inference - some models don't accept token_type_ids
+    let outputs = if self.session.inputs.iter().any(|i| i.name == "token_type_ids") {
+        self.session.run(ort::inputs![
+            "input_ids" => input_ids,
+            "attention_mask" => attention_mask_tensor,
+            "token_type_ids" => token_type_ids
+        ])?
+    } else {
+        self.session.run(ort::inputs![
+            "input_ids" => input_ids,
+            "attention_mask" => attention_mask_tensor
+        ])?
+    };

src/lib/native/index.ts (2)
45-47: Race condition remains from previous review.

The .finally(() => { initPromise = null; }) on line 46 still creates a race window. If a caller arrives after initPromise resolves but before the finally block completes, it will see initialized = true and return immediately (line 28). However, if another caller arrives during the brief window when initPromise is being set to null, it could start a new initialization.

As noted in the past review comment, you should either keep initPromise cached permanently or wrap initialization in try/catch to only reset on failure.
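A sketch of the permanently-cached variant with failure-only reset (`doInit` stands in for the real native initialization; names are illustrative):

```typescript
let initPromise: Promise<void> | null = null;

// Stand-in for the actual native model initialization.
async function doInit(): Promise<void> {}

function initNative(): Promise<void> {
  if (!initPromise) {
    // Cache the promise immediately so concurrent callers share one init;
    // clear it only when initialization fails, allowing a retry.
    initPromise = doInit().catch((err) => {
      initPromise = null;
      throw err;
    });
  }
  return initPromise;
}
```

Concurrent callers all await the same promise, and a successful init is never re-run, so there is no window where the cache is cleared after success.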
240-250: Type casts remain from previous review.

Lines 240, 245, and 250 still use `as any` casts. While instanceof checks are present, the `as any` casts bypass type safety.

As noted in the past review comment, consider:

- Updating the `RerankInput` interface to match what the native binding expects, or
The Float32Array → Float64Array conversion on line 240 is necessary but could be documented.
🧹 Nitpick comments (3)
plugins/osgrep/hooks/start.js (1)
61-61: Consider configurable log path.The log path is hardcoded to
/tmp/osgrep.log. Consider using a user-specific path (e.g.,~/.osgrep/logs/hook.log) to avoid permission issues and conflicts with other users on shared systems.src/commands/trace.ts (1)
7-13: Consider centralizing ANSI utilities.The inline ANSI style helpers duplicate functionality that could be in a shared utility (e.g.,
src/lib/utils/ansi.tsmentioned in the PR). While functional, centralizing would improve maintainability.src/commands/serve.ts (1)
613-690: Consider adding timeout/abort handling to /trace.

The /trace endpoint lacks the timeout and AbortController handling present in /search (lines 518-578). While graph queries may be faster, adding timeout protection would make the endpoint more resilient to pathological cases.
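A generic sketch of the timeout pattern the review describes (the helper name and its wiring are assumptions; the real /search handler connects the signal to its query):

```typescript
// Run an async handler under a deadline, aborting its AbortSignal when the
// budget elapses so downstream work can bail out.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await run(controller.signal);
  } finally {
    clearTimeout(timer);
  }
}
```

The /trace handler would pass the signal into its graph query and return a 504-style response when it aborts.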
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (1)
bun.lockb is excluded by !**/bun.lockb
📒 Files selected for processing (14)
- osgrep-core/Cargo.toml (1 hunks)
- osgrep-core/index.d.ts (1 hunks)
- osgrep-core/src/colbert_ort.rs (1 hunks)
- osgrep-core/src/dense_ort.rs (1 hunks)
- plugins/osgrep/hooks/start.js (1 hunks)
- plugins/osgrep/skills/osgrep/SKILL.md (2 hunks)
- src/commands/search.ts (8 hunks)
- src/commands/serve.ts (11 hunks)
- src/commands/trace.ts (1 hunks)
- src/index.ts (1 hunks)
- src/lib/graph/graph-builder.ts (1 hunks)
- src/lib/index/ignore-patterns.ts (1 hunks)
- src/lib/native/index.ts (1 hunks)
- src/lib/search/searcher.ts (10 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- src/index.ts
- osgrep-core/Cargo.toml
- plugins/osgrep/skills/osgrep/SKILL.md
- osgrep-core/index.d.ts
🧰 Additional context used
🧬 Code graph analysis (7)
plugins/osgrep/hooks/start.js (3)

- src/lib/utils/server-registry.ts (2): `isProcessRunning` (33-40), `getServerForProject` (66-70)
- plugins/osgrep/hooks/stop.js (4): `pid` (32-32), `fs` (1-1), `data` (31-31), `cwd` (25-25)
- src/lib/utils/project-root.ts (1): `findProjectRoot` (14-29)

src/commands/trace.ts (3)

- src/lib/graph/graph-builder.ts (2): `CallGraph` (13-17), `GraphBuilder` (26-173)
- src/lib/utils/project-root.ts (2): `findProjectRoot` (14-29), `ensureProjectPaths` (31-51)
- src/lib/utils/exit.ts (1): `gracefulExit` (3-20)

src/commands/search.ts (3)

- src/lib/output/formatter.ts (1): `formatResults` (101-108)
- src/lib/utils/server-registry.ts (1): `unregisterServer` (50-54)
- src/lib/store/vector-db.ts (1): `VectorDB` (20-302)

src/lib/native/index.ts (2)

- src/config.ts (2): `MODEL_IDS` (4-7), `CONFIG` (18-26)
- osgrep-core/index.d.ts (1): `RerankResult` (21-26)

src/commands/serve.ts (8)

- src/lib/utils/server-registry.ts (5): `isProcessRunning` (33-40), `registerServer` (42-48), `unregisterServer` (50-54), `listServers` (56-64), `getServerForProject` (66-70)
- src/lib/store/meta-cache.ts (2): `MetaCache` (12-60), `MetaEntry` (6-10)
- src/lib/store/types.ts (1): `VectorRecord` (31-35)
- src/lib/utils/file-utils.ts (1): `isIndexableFile` (44-63)
- src/lib/workers/orchestrator.ts (2): `processFile` (151-202), `processFile` (346-350)
- src/lib/utils/lock.ts (1): `acquireWriterLockWithRetry` (87-109)
- src/lib/index/ignore-patterns.ts (1): `DEFAULT_IGNORE_PATTERNS` (3-60)
- src/lib/graph/graph-builder.ts (1): `GraphBuilder` (26-173)

src/lib/graph/graph-builder.ts (2)

- src/lib/utils/filter-builder.ts (2): `escapeSqlString` (1-5), `normalizePath` (12-14)
- src/lib/store/types.ts (1): `VectorRecord` (31-35)

osgrep-core/src/colbert_ort.rs (1)

- osgrep-core/src/dense_ort.rs (3): `log_native` (6-10), `load_from_hf` (21-66), `hidden_size` (205-207)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci
🔇 Additional comments (39)
src/lib/native/index.ts (4)
70-89: LGTM: embedDense implementation is correct.

The function properly initializes, calls the native binding, and splits the flat embedding array into per-document vectors.

108-119: LGTM: embedColbert correctly wraps native result.

The function properly wraps the packed ColBERT result in typed arrays.

137-183: LGTM: embedBatch correctly assembles hybrid embeddings.

The function properly extracts dense vectors, ColBERT embeddings, and token IDs from the packed native result, with correct offset/length tracking for each document.

193-202: LGTM: encodeQueryColbert is straightforward.

plugins/osgrep/hooks/start.js (4)
14-42: Helper duplication is acceptable for hook isolation.

The `isProcessRunning`, `findProjectRoot`, and `getServerForProject` functions duplicate logic from `src/lib/utils/server-registry.ts` and `src/lib/utils/project-root.ts`. This is acceptable since this hook runs in a separate Node.js process and cannot import TypeScript modules.

47-57: LGTM: Clean early exit for disabled auto-serve.

The environment variable check provides a clear escape hatch for users who don't want automatic server startup.

59-71: LGTM: Server spawn logic is correct.

The code correctly checks for an existing server before spawning, and uses the proper detached + unref pattern for background processes.

73-83: LGTM: Response structure is clear and informative.

src/lib/index/ignore-patterns.ts (1)

62-98: LGTM: Comprehensive generated file patterns.

The regex patterns provide good coverage of common generated/auto-generated code across multiple languages and frameworks. The case-insensitive matching is appropriate.
src/commands/trace.ts (4)
19-41: LGTM: formatPlain provides clean agent-friendly output.
46-92: LGTM: formatTree provides clean visual hierarchy.

The Unicode box-drawing characters create a clear visual structure for the call graph.

97-118: LGTM: formatJson provides complete structured data.

130-163: LGTM: Trace command correctly integrates GraphBuilder.

The command properly:
- Handles project root discovery with fallback
- Passes parsed options to GraphBuilder
- Selects output format based on flags and TTY status
- Cleans up resources in finally block
src/lib/search/searcher.ts (5)
17-62: LGTM: Search tuning constants are reasonable.

The static constants provide sensible defaults for the search pipeline. The FTS stopwords list covers common English words appropriately.

64-76: LGTM: FTS normalization is well-designed.

The normalization pipeline (lowercase, tokenize, filter short/stopwords, limit tokens) provides good query preparation for full-text search.
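The exact constants live in `searcher.ts`; as a rough sketch of such a pipeline (the stopword set and token cap below are illustrative placeholders, not the real values):

```typescript
// Illustrative stopword subset and token cap; the real values live in searcher.ts.
const FTS_STOPWORDS = new Set([
  "the", "a", "an", "of", "to", "in", "is", "for", "where", "what",
]);
const MAX_FTS_TOKENS = 12;

function normalizeFtsQuery(query: string): string {
  return query
    .toLowerCase()
    .split(/[^a-z0-9_]+/)          // tokenize on non-word characters
    .filter((t) => t.length >= 3)  // drop very short tokens
    .filter((t) => !FTS_STOPWORDS.has(t))
    .slice(0, MAX_FTS_TOKENS)      // cap the token count
    .join(" ");
}
```

A query like "Where is the auth token validated?" reduces to the content-bearing terms before hitting the FTS index.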
198-242: LGTM: Structure boost penalties align with search intent.

The scoring adjustments correctly de-prioritize:
- Generated files (0.4x)
- Test files (0.5x)
- Tooling/scripts (0.35x)
- Documentation (0.6x)
The penalties appropriately guide results toward production code.
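The multipliers quoted above amount to a simple lookup applied at scoring time; this is a hypothetical shape for illustration, not the actual `searcher.ts` code:

```typescript
type ChunkKind = "source" | "generated" | "test" | "tooling" | "docs";

// Multipliers from the review: generated 0.4x, tests 0.5x, tooling 0.35x, docs 0.6x.
const STRUCTURE_PENALTY: Record<ChunkKind, number> = {
  source: 1.0,
  generated: 0.4,
  test: 0.5,
  tooling: 0.35,
  docs: 0.6,
};

// Production source keeps its full score; everything else is scaled down.
function applyStructureBoost(score: number, kind: ChunkKind): number {
  return score * STRUCTURE_PENALTY[kind];
}
```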
287-413: LGTM: Search initialization and dual retrieval are well-structured.

The method correctly:
- Handles abort signals at multiple checkpoints
- Builds safe SQL filters with escapeSqlString
- Performs parallel vector and FTS search
- Gracefully handles FTS failures with fallback
The empty FTS query guard on line 396 is a good defensive check.
415-530: LGTM: Fusion, reranking, and result assembly are well-designed.

The pipeline correctly:
- Fuses vector and FTS results with RRF
- Applies ColBERT reranking with fallback
- Blends rerank and fusion scores
- Deduplicates and diversifies per-file
- Normalizes scores and assigns confidence levels
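Reciprocal Rank Fusion itself is compact; a generic sketch (the `k = 60` constant is the common literature default, not necessarily what `searcher.ts` uses):

```typescript
// Each ranked list contributes 1 / (k + rank) for every document it contains;
// documents ranked well in multiple lists accumulate the highest fused score.
function rrfFuse(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, idx) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + idx + 1));
    });
  }
  return scores;
}

const fused = rrfFuse([
  ["a", "b", "c"], // vector-search order
  ["b", "c", "d"], // FTS order
]);
// "b" ends up on top: it is near the head of both lists.
```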
src/commands/serve.ts (10)
79-88: Good defensive registration for spawned server.

Registering the spawned server immediately with port 0 prevents race conditions where multiple concurrent sessions might spawn duplicate servers before the first one finishes indexing and starts listening.

50-92: LGTM: Server check and background spawn are correct.

The code properly:
- Checks for existing servers before spawning
- Uses detached process with unref() for background operation
- Registers spawned server early to prevent races
114-140: LGTM: Live indexing setup with write serialization.

The `dbWriteBarrier` pattern correctly serializes database writes to avoid LanceDB contention. The watch configuration provides good flexibility via environment variables.

192-322: Excellent two-phase update pattern.

The `flushPending` function correctly:
- Phase 1: Prepares updates outside the lock (hashing, embedding - slow operations)
- Phase 2: Applies under writer lock (DB updates - fast operations)
This minimizes lock hold time and maximizes throughput. The race detection on lines 280-295 (checking mtime/size before applying) is a nice safety measure.
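In outline the pattern looks like this; the names and injected signatures below are hypothetical stand-ins for the real `flushPending`, which works against LanceDB and a file-based writer lock:

```typescript
type Prepared = { path: string; payload: string };

async function twoPhaseFlush(
  pending: string[],
  prepare: (path: string) => Promise<Prepared>,        // slow: hash + embed
  withWriterLock: <T>(fn: () => Promise<T>) => Promise<T>,
  apply: (batch: Prepared[]) => Promise<void>,         // fast: DB upsert
): Promise<void> {
  // Phase 1: all slow work happens with no lock held.
  const batch = await Promise.all(pending.map((p) => prepare(p)));
  // Phase 2: the writer lock is held only for the quick DB apply.
  await withWriterLock(() => apply(batch));
}
```

Injecting the lock and the prepare/apply steps also makes the lock-ordering property easy to assert in a test.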
347-365: Good EMFILE error recovery.

The watcher error handler correctly detects EMFILE (too many open files) and falls back to polling mode. The `didLogWatchFallback` flag prevents log spam.
454-465: LGTM: Health endpoint provides useful status.
518-578: Excellent request lifecycle management.

The /search endpoint correctly:
- Limits payload size (1MB)
- Validates path to prevent traversal
- Sets up AbortController on both request abort and response close
- Applies configurable timeout (60s default)
- Passes abort signal through to searcher
The defensive checks for `!res.headersSent && !res.writableEnded` prevent double-response errors.
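The wiring can be reduced to a small helper; this is a schematic sketch (the real endpoint works with Node's `http` req/res objects and a configurable timeout), showing one AbortController fed by the three triggers:

```typescript
// Client abort, connection close, and a timeout all funnel into one signal,
// which is what gets passed down to the searcher.
function makeRequestSignal(
  req: { on(event: "aborted", cb: () => void): void },
  res: { on(event: "close", cb: () => void): void },
  timeoutMs: number,
): { signal: AbortSignal; dispose: () => void } {
  const ac = new AbortController();
  const timer = setTimeout(() => ac.abort(), timeoutMs);
  req.on("aborted", () => ac.abort());
  res.on("close", () => ac.abort());
  return { signal: ac.signal, dispose: () => clearTimeout(timer) };
}
```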
579-603: Good error handling with appropriate status codes.

The error handler correctly distinguishes:
- AbortError → 504 (timeout/cancelled)
- isWriting (indexing) → 503 (service unavailable)
- Generic errors → 500
647-672: LGTM: Trace endpoint correctly integrates GraphBuilder.

The endpoint properly parses options, builds the graph, and returns a structured JSON response.

742-791: LGTM: Shutdown sequence correctly cleans up resources.

The shutdown function properly:
- Unregisters server from registry
- Closes filesystem watchers
- Closes HTTP server (with timeout fallback)
- Closes VectorDB and MetaCache
- Calls gracefulExit()
The resource cleanup order and error handling are appropriate.

src/lib/graph/graph-builder.ts (5)
src/lib/graph/graph-builder.ts (5)
5-24: LGTM: Interface design is clean and well-structured.The separation of
GraphNode,CallGraph, andTraceOptionsprovides clear contracts for the graph builder API.
33-44: LGTM: findDefinitions correctly queries for symbol definitions.
50-72: LGTM: findCallers with self-definition filtering.The method correctly:
- Queries for chunks that reference the symbol
- Filters out self-definitions (where chunk also defines the symbol)
- Determines caller symbol with appropriate fallbacks
The extra fetch (limit + 20) accounts for filtered records.
103-143: LGTM: buildGraph correctly assembles the call graph.The method properly:
- Finds definitions with optional path filtering
- Selects first definition as center
- Gets callers and callees based on options flags
- Filters internal callees (have definitions in index)
145-173: LGTM: Helper methods handle LanceDB type coercion.The
recordToNodeandtoStringArrayhelpers correctly handle the various formats that LanceDB might return (arrays, objects with toArray(), etc.).src/commands/search.ts (6)
264-268: LGTM: Option naming is clear.

The change from `-m` to `-m, --max-count` provides both short and long forms, improving CLI usability.

335-364: Excellent defensive health check.

The fast health preflight (150-250ms timeout) prevents hung servers from adding multi-second latency to every query. This is a good UX improvement.

366-463: LGTM: Server request with proper timeout and formatting.

The server request flow correctly:
- Applies configurable timeout (700-1200ms)
- Handles response with multiple format options
- Filters results by minScore
- Supports compact, plain, and formatted output paths
467-481: Good defensive server cleanup.

Unregistering the server on `ECONNREFUSED` or health check failures (lines 471-472) prevents future queries from paying the timeout cost. This is a good optimization.

487-577: LGTM: Local search setup with proper initialization.

The fallback path correctly:
- Ensures setup and initializes VectorDB
- Checks for lock and warns about partial results
- Syncs if needed (first run or --sync flag)
- Creates FTS index after sync
579-651: LGTM: Local search execution and cleanup.

The search execution properly:
- Calls searcher with rerank enabled
- Filters by minScore
- Formats output (compact/plain/formatted)
- Cleans up VectorDB in finally block
- Calls gracefulExit
```diff
   async filterToInternal(symbols: string[]): Promise<string[]> {
     if (symbols.length === 0) return [];

-    // 2. Get Callers
-    const callers = await this.getCallers(symbol);
-    // 3. Get Callees (from center)
-    const callees = center ? center.calls : [];
     const table = await this.db.ensureTable();
     const internal: string[] = [];

     // Batch check which symbols have definitions
     // For efficiency, we use a single query with OR conditions
     // But LanceDB doesn't support OR in array_contains well, so we check individually
     // This is acceptable since callees are typically <20 per function
     for (const sym of symbols) {
       const whereClause = `array_contains(defined_symbols, '${escapeSqlString(sym)}')`;
       const records = await table.query().where(whereClause).limit(1).toArray();
       if (records.length > 0) {
         internal.push(sym);
       }
     }

-    return { center, callers, callees };
+    return internal;
   }
```
🧩 Analysis chain
🏁 Script executed:

```shell
cat -n src/lib/graph/graph-builder.ts | head -110
```

Repository: Ryandonofrio3/osgrep
Length of output: 4339

🏁 Script executed:

```shell
# Check for calls to filterToInternal to understand typical symbol counts
rg -n "filterToInternal" src/lib/graph/
```

Repository: Ryandonofrio3/osgrep
Length of output: 254

🏁 Script executed:

```shell
# Check LanceDB array_contains capabilities and OR support
rg -n "array_contains\|LanceDB\|lance" src/ -i | head -30
```

Repository: Ryandonofrio3/osgrep
Length of output: 46

🏁 Script executed:

```shell
# Look for any test files that might show typical usage
fd -e test.ts -e spec.ts src/lib/graph/
```

Repository: Ryandonofrio3/osgrep
Length of output: 425

🏁 Script executed:

```shell
cat -n src/lib/graph/graph-builder.ts | sed -n '103,160p'
```

Repository: Ryandonofrio3/osgrep
Length of output: 2257

🏁 Script executed:

```shell
# Check what rawCallees looks like and where it comes from
rg -n "rawCallees" src/lib/graph/graph-builder.ts -B2 -A2
```

Repository: Ryandonofrio3/osgrep
Length of output: 364

🏁 Script executed:

```shell
# Look for test files or usage examples showing callees counts
fd . src/lib/graph/ --type f
```

Repository: Ryandonofrio3/osgrep
Length of output: 94

🏁 Script executed:

```shell
# Check LanceDB documentation or any docs about OR limitations
fd -e md -e txt . | xargs rg -l "LanceDB\|array_contains\|OR" 2>/dev/null | head -5
```

Repository: Ryandonofrio3/osgrep
Length of output: 46

🏁 Script executed:

```shell
# Find VectorRecord type definition
rg -n "interface VectorRecord\|type VectorRecord" src/
```

Repository: Ryandonofrio3/osgrep
Length of output: 46

🏁 Script executed:

```shell
# Check for any stats, benchmarks, or test data
fd -e test.ts -e spec.ts -e test.js src/ --type f
```

Repository: Ryandonofrio3/osgrep
Length of output: 385

🏁 Script executed:

```shell
# Check package.json for LanceDB version and any docs
cat package.json | grep -A2 -B2 "lance\|LanceDB" -i
```

Repository: Ryandonofrio3/osgrep
Length of output: 234

🏁 Script executed:

```shell
# Search for any documentation about LanceDB limitations
fd -e md . | head -20
```

Repository: Ryandonofrio3/osgrep
Length of output: 111

🏁 Script executed:

```shell
cat -n src/lib/store/types.ts
```

Repository: Ryandonofrio3/osgrep
Length of output: 2478

🏁 Script executed:

```shell
cat -n src/lib/store/vector-db.ts | head -80
```

Repository: Ryandonofrio3/osgrep
Length of output: 2657

🏁 Script executed:

```shell
cat README.md
```

Repository: Ryandonofrio3/osgrep
Length of output: 10338
🌐 Web query:
LanceDB 0.22.3 array_contains OR conditions SQL WHERE clause
💡 Result:
Yes — LanceDB SQL supports array-contains operations and logical OR. Use the array-contains functions (array_contains_any / array_contains_all — docs sometimes show array_has_any/array_has_all for LABEL_LIST indexes) or combine array checks with OR. Example WHERE clauses:
- any of a set (preferred):
  `WHERE array_contains_any(tags, 'tagA', 'tagB')`
- explicit OR:
  `WHERE array_contains(tags, 'tagA') OR array_contains(tags, 'tagB')`
Also note: to accelerate these queries on List columns, create a LABEL_LIST scalar index (supports array_contains_any / array_contains_all). (lancedb.github.io)
Citations:
🌐 Web query:
LanceDB SQL WHERE OR operator array_contains support 2024 2025
💡 Result:
Short answer: Yes.
- LanceDB supports SQL WHERE with logical operators including OR (and AND/NOT) for vector and table queries (docs/current as of 2024–2025). [1]
- LanceDB also supports queries over list/array columns (functions exposed for array membership) and provides a LabelList scalar index to accelerate array queries (array_contains_all / array_contains_any or similarly named array_has_all/array_has_any in SDKs). Use a LabelList/Label_List index for best performance. [2][3]
Example (conceptual):
- table.search(...).where("(price > 20) OR (array_contains_any(tags, ['tag1','tag2']))")
References:
- LanceDB — Metadata Filtering / SQL filtering (AND / OR etc.). [turn0search2].
- LanceDB — Scalar Indexes (LabelList / array_contains_any & array_contains_all support). [turn0search0].
- LanceDB Python/SDK docs — LabelList / array list indexing details. [turn0search1].
🏁 Script executed:

```shell
# Check if array_contains_any is used anywhere else in codebase
rg -n "array_contains_any\|array_has_any" src/
```

Repository: Ryandonofrio3/osgrep
Length of output: 46

🏁 Script executed:

```shell
# Check if there are performance tests or benchmarks
fd . src/ -name "*bench*" -o -name "*perf*" -o -name "*metric*"
```

Repository: Ryandonofrio3/osgrep
Length of output: 294

🏁 Script executed:

```shell
# Look for any TODO or FIXME comments related to performance optimization
rg -n "TODO.*batch\|FIXME.*batch\|TODO.*performance\|FIXME.*performance" src/
```

Repository: Ryandonofrio3/osgrep
Length of output: 46
Remove sequential querying and use LanceDB's array_contains_any() for batch lookups.
LanceDB 0.22.3 supports OR conditions and the array_contains_any() function, contrary to the comment on lines 85–87. The current code makes N individual database queries (one per symbol), but a single query using array_contains_any(defined_symbols, [symbol1, symbol2, ...]) would be far more efficient. For typical callees (<20 symbols), this change would reduce database round-trips from ~20 to 1.
Update the implementation to construct a single query with all symbols instead of looping, and correct the misleading comment about LanceDB limitations.
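A single batched clause could look like the following sketch. The varargs shape follows the example quoted from the LanceDB docs above, and the exact function name and argument shape should still be verified against the installed LanceDB version; `escapeSqlString` here is a stand-in for the helper in `src/lib/utils/filter-builder.ts`:

```typescript
// Stand-in for the project's escapeSqlString helper.
function escapeSqlString(s: string): string {
  return s.replace(/'/g, "''");
}

// One WHERE clause covering all symbols, instead of one query per symbol.
function buildContainsAnyClause(column: string, symbols: string[]): string {
  const args = symbols.map((s) => `'${escapeSqlString(s)}'`).join(", ");
  return `array_contains_any(${column}, ${args})`;
}
```

A single `table.query().where(buildContainsAnyClause("defined_symbols", symbols))` would then replace the per-symbol loop.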
```typescript
    throw new Error(
      `Failed to load osgrep-core native binding. Run 'npm run build:release' in osgrep/osgrep-core/: ${e}`
    );
```
Update build instructions for bun.
The error message references npm run build:release, but the PR migrates to bun. Consider updating to bun run build:release for consistency.
Apply this diff:

```diff
-      `Failed to load osgrep-core native binding. Run 'npm run build:release' in osgrep/osgrep-core/: ${e}`
+      `Failed to load osgrep-core native binding. Run 'bun run build:release' in osgrep/osgrep-core/: ${e}`
```

🤖 Prompt for AI Agents
In src/lib/native/index.ts around lines 16 to 18, the thrown error message still
tells users to run "npm run build:release"; update that text to instruct "bun
run build:release" instead (or otherwise make it package-manager-agnostic) so
the message matches the project's migration to bun and guides users to the
correct build command.
…documentation. Update server response handling for initial sync state and improve output formatting for search results.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
README.md (1)
80-80: Default value inconsistency with code.

README states `-m <n>` defaults to `25`, but `src/commands/search.ts` line 267 sets the default to `"5"`. Align these values.
♻️ Duplicate comments (1)
README.md (1)
24-24: Outdated embeddings technology reference persists.

The AI summary indicates ONNX dependencies were replaced with Rust-native bindings (`osgrep-core`), but line 24 still mentions `onnxruntime`. This is similar to the past review comment but the reference was only partially updated.

```diff
-- **Local & Private:** 100% local embeddings via `onnxruntime`.
+- **Local & Private:** 100% local embeddings via native Rust bindings.
```
🧹 Nitpick comments (9)
plugins/osgrep/skills/osgrep/SKILL.md (1)
58-64: Add language specifier to fenced code block.

The output code block is missing a language identifier. Consider using `text` or `plaintext` for generic output.

**Output:**

````diff
-```
+```text
 handleRequest
   def: src/server/handler.ts:45
   calls: validateAuth routeRequest sendResponse
   called_by: src/index.ts:12 src/api/router.ts:87
````

src/commands/trace.ts (1)

176-187: Inconsistent callee deduplication between formatTree and formatPlainTree.

`formatPlainTree` deduplicates callees and shows counts (e.g., `foo (x3)`), but `formatTree` iterates over `graph.callees` directly, potentially showing duplicates. This creates inconsistent output between `--pretty` and `--plain` modes. Consider applying the same deduplication logic:

```diff
   if (hasCallees) {
     const branch = hasCallers ? "\u251c\u2500\u2500" : "\u2514\u2500\u2500";
     lines.push(`${branch} ${style.yellow("calls:")}`);
-    graph.callees.forEach((callee, i) => {
+
+    // Dedupe callees
+    const counts = new Map<string, number>();
+    for (const c of graph.callees) {
+      counts.set(c, (counts.get(c) || 0) + 1);
+    }
+    const calleeList = [...counts.entries()];
+
+    calleeList.forEach(([callee, count], i) => {
       const prefix = hasCallers ? "\u2502   " : "    ";
       const sym =
-        i === graph.callees.length - 1
+        i === calleeList.length - 1
           ? "\u2514\u2500\u2500"
           : "\u251c\u2500\u2500";
-      lines.push(`${prefix}${sym} ${callee}`);
+      const countStr = count > 1 ? style.dim(` (x${count})`) : "";
+      lines.push(`${prefix}${sym} ${callee}${countStr}`);
     });
   }
```

src/commands/serve.ts (3)
156-180: Consider consolidating ignore logic to reduce duplication.

`shouldIgnoreRelPath` manually reimplements patterns from `DEFAULT_IGNORE_PATTERNS`. While the inline string checks are faster for the hot watcher path, the duplicated logic may drift over time. Consider extracting shared constants or documenting that this is intentionally separate for performance.
387-392: Simplify filename type handling.

The nested type checks are defensive but verbose. Node's `fs.watch` can return `Buffer` on some platforms, so the `toString("utf-8")` fallback is correct. Consider simplifying:

```diff
-    const name =
-      typeof filename === "string"
-        ? filename
-        : typeof (filename as any)?.toString === "function"
-          ? String((filename as any).toString("utf-8"))
-          : "";
+    const name = typeof filename === "string"
+      ? filename
+      : Buffer.isBuffer(filename)
+        ? filename.toString("utf-8")
+        : "";
```
656-733: Add timeout handling for /trace endpoint.

The `/search` endpoint has timeout and AbortController support, but `/trace` does not. `buildGraph` could be slow for deep traversals. Consider adding similar timeout protection:

```diff
+  const ac = new AbortController();
+  const timeout = setTimeout(() => ac.abort(), 30000);
+
   req.on("end", async () => {
     if (aborted) return;
     try {
       // ... existing code ...
+    } catch (err) {
+      if (err instanceof Error && err.name === "AbortError") {
+        res.statusCode = 504;
+        res.setHeader("Content-Type", "application/json");
+        res.end(JSON.stringify({ error: "trace_timeout" }));
+        return;
+      }
       // ... existing error handling ...
+    } finally {
+      clearTimeout(timeout);
     }
   });
```

plugins/osgrep/hooks/start.js (4)
24-30: `findProjectRoot` currently reduces to just `resolve(startDir)`

Given the current implementation, the `.osgrep`/`.git` existence check does not influence the return value: the function always returns `start`. That's correct but a bit misleading to future readers and easy to let drift from `src/lib/utils/project-root.ts`. Consider either:

- Simplifying to a single `return _path.resolve(startDir || process.cwd());`, or
- Mirroring the TS utility's behavior (including any special handling like `PATHS.globalRoot`) if you intend this hook to share the same semantics.

32-42: Manual `servers.json` access can drift from `server-registry` behavior

`getServerForProject` reimplements registry lookup and directly parses `$HOME/.osgrep/servers.json`. This works, but it duplicates logic that is already centralized in `src/lib/utils/server-registry.ts` (via `listServers()`/`getServerForProject`), and any future changes to the registry path or shape will require remembering to update this hook as well. If feasible, consider:

- Reusing a single source of truth for computing the registry path and validating entries, or
- At minimum, adding a brief comment pointing at `src/lib/utils/server-registry.ts` so maintainers know to keep these behaviors in sync.

47-57: Optional: still report an already-running server when auto-serve is disabled

The `OSGREP_DISABLE_AUTO_SERVE === "1"` branch always returns a generic "auto-start disabled" message without checking for or reporting an already-running server for this project. That's probably fine for most cases, but it may be useful for users running `osgrep serve` manually to still see PID/port info in this message. If you want that behavior, you could:

- Compute `projectRoot`/`existing` even when auto-serve is disabled, and
- Extend `additionalContext` accordingly when a server is already up.

59-71: Background spawn logic is reasonable; be aware of potential double-spawn window

Using `getServerForProject(projectRoot)` before `spawn("osgrep", ["serve", "--background"])` is a good guard, and `detached` + `unref` is appropriate for a background daemon. There are a couple of minor considerations:

- There's an inherent race window where `servers.json` hasn't been updated yet by a just-started server, so a second fast invocation of this hook could still spawn another `osgrep serve`. If that's a concern in your environment, you might later introduce a lightweight lock or a cheap health check before spawning again.
- Since `logPath` is opened once and passed as stdio, you could optionally `fs.closeSync(out)` after `child.unref()` to be explicit about parent FD cleanup, though in this short-lived process it's unlikely to matter in practice.

None of these are correctness blockers; just things to keep in mind.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (6)
- README.md (4 hunks)
- plugins/osgrep/hooks/start.js (1 hunks)
- plugins/osgrep/skills/osgrep/SKILL.md (2 hunks)
- src/commands/search.ts (8 hunks)
- src/commands/serve.ts (10 hunks)
- src/commands/trace.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
plugins/osgrep/hooks/start.js (2)

- src/lib/utils/server-registry.ts (2): `isProcessRunning` (33-40), `getServerForProject` (66-70)
- src/lib/utils/project-root.ts (1): `findProjectRoot` (14-29)

src/commands/trace.ts (4)

- src/lib/graph/graph-builder.ts (2): `CallGraph` (13-17), `GraphBuilder` (26-173)
- src/lib/utils/project-root.ts (2): `findProjectRoot` (14-29), `ensureProjectPaths` (31-51)
- src/lib/store/vector-db.ts (1): `VectorDB` (20-302)
- src/lib/utils/exit.ts (1): `gracefulExit` (3-20)

src/commands/serve.ts (7)

- src/lib/setup/setup-helpers.ts (1): `ensureSetup` (25-56)
- src/lib/store/meta-cache.ts (2): `MetaCache` (12-60), `MetaEntry` (6-10)
- src/lib/utils/file-utils.ts (1): `isIndexableFile` (44-63)
- src/lib/utils/lock.ts (1): `acquireWriterLockWithRetry` (87-109)
- src/lib/index/ignore-patterns.ts (1): `DEFAULT_IGNORE_PATTERNS` (3-60)
- src/lib/utils/exit.ts (1): `gracefulExit` (3-20)
- src/lib/utils/project-root.ts (1): `findProjectRoot` (14-29)

src/commands/search.ts (1)

- src/lib/utils/server-registry.ts (1): `unregisterServer` (50-54)
🪛 LanguageTool
plugins/osgrep/skills/osgrep/SKILL.md
[style] ~68-~68: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...on and need to know what calls it - You need to understand what a function depends on -...
(REP_NEED_TO_VB)
🪛 markdownlint-cli2 (0.18.1)
README.md
196-196: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
plugins/osgrep/skills/osgrep/SKILL.md
59-59: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci
🔇 Additional comments (19)
plugins/osgrep/skills/osgrep/SKILL.md (1)
50-80: Documentation for trace command is comprehensive.

The trace command documentation clearly explains the feature with practical examples for `--callers`, `--callees`, and `--path` options. This aligns well with the implementation in `src/commands/trace.ts`.

src/commands/trace.ts (9)
7-13: LGTM! Inline ANSI escape sequences are a reasonable choice for minimal dependencies. The reset codes are correctly matched to their opening codes.

18-26: LGTM! Clean implementation for deduplicating and counting callees.
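The shape being praised here is essentially a count-preserving dedupe; a minimal reimplementation for illustration (not the actual `trace.ts` code):

```typescript
// Collapse repeated callee names while remembering how often each appeared;
// Map preserves first-seen order, which keeps the rendered output stable.
function countCallees(callees: string[]): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const c of callees) {
    counts.set(c, (counts.get(c) ?? 0) + 1);
  }
  return [...counts.entries()];
}

// countCallees(["foo", "bar", "foo", "foo"]) -> [["foo", 3], ["bar", 1]]
```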
31-45: LGTM! Grouping and sorting logic is correct.

50-54: LGTM! Straightforward counting logic for format selection.

60-124: LGTM! The adaptive format selection and tree rendering are well-implemented. The deduplication of callees in `formatPlainTree` is a nice touch.

129-156: LGTM! Compact format implementation is clean and reuses helper functions appropriately.

212-233: LGTM! JSON output preserves raw data appropriately, allowing consumers to handle deduplication as needed.
253-258: Verify: depth=0 falls back to 1 due to falsy OR.

`Number.parseInt(cmd.depth, 10) || 1` will treat an explicit `--depth 0` as `depth: 1` because `0` is falsy. If depth=0 is intentionally invalid, this is fine. Otherwise, consider using nullish coalescing:

```diff
-    depth: Number.parseInt(cmd.depth, 10) || 1,
+    depth: Number.parseInt(cmd.depth, 10) ?? 1,
```

Or add validation to reject depth < 1 explicitly with an error message.
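Worth noting when picking a fix: `??` only guards `null`/`undefined`, and `Number.parseInt` returns `NaN` (never `null`) for unparsable input, so the nullish-coalescing variant would not normalize bad values. A small validation helper (hypothetical, not the current `trace.ts` code) sidesteps both pitfalls:

```typescript
// `|| 1` silently bumps an explicit 0; `?? 1` never fires because parseInt
// yields NaN, not null/undefined. Explicit validation avoids both traps.
function parseDepth(raw: string): number {
  const n = Number.parseInt(raw, 10);
  if (!Number.isInteger(n) || n < 1) {
    throw new Error(`--depth must be a positive integer, got "${raw}"`);
  }
  return n;
}
```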
273-278: LGTM! Resource management is correct: `db.close()` in the finally block ensures cleanup regardless of success or failure, and `gracefulExit()` handles process termination appropriately.

src/commands/serve.ts (3)
78-88: Early registration for spawned child is a good race-condition mitigation.

Registering the child process immediately (before it finishes indexing) allows discovery/stop commands to find it. However, if the child fails to start or crashes during setup, this stale entry may linger until the next `serve` or `stop` command cleans it up.
330-331: Potential unhandled rejection in write barrier chain.

`dbWriteBarrier.then(apply, apply)` passes `apply` as both the fulfillment and rejection handler. If `apply` throws, the rejection propagates but the next chained `apply` call may not properly serialize. Consider wrapping in a catch to ensure the chain continues:

```diff
-    dbWriteBarrier = dbWriteBarrier.then(apply, apply);
+    dbWriteBarrier = dbWriteBarrier.then(apply).catch((err) => {
+      console.error("[serve] write barrier error:", err);
+    }).then(apply);
```

Or simply ensure `apply` never throws by wrapping its body in try/catch (which it already does), so the current pattern should work. Verify that any throw inside `apply` is caught.
872-901: Robust process termination handling.

The `stopPid` function properly handles edge cases: sending `SIGCONT` to resume stopped processes before `SIGTERM`, and escalating to `SIGKILL` after timeout. The polling interval (75ms) and timeouts (2s each) are reasonable.

src/commands/search.ts (3)
351-364: Health check scope block limits AbortController reuse.

The inner `{ ... }` block scopes `ac` and `timeout` to the health check, preventing accidental reuse. This is intentional and clean.
483-496: Proactive stale server cleanup is a good UX optimization.

Unregistering servers that fail health checks or refuse connections prevents repeated timeout penalties on subsequent searches.
504-506: Debug timing instrumentation is well-structured.

The `t`/`te` helper pattern keeps timing opt-in and non-invasive. The guards (`DEBUG_TIMING && ...`) ensure no overhead when disabled.

README.md (1)

312-315: Development commands updated to bun.

The migration from pnpm to bun is reflected correctly in the development section.
plugins/osgrep/hooks/start.js (2)
14-22: `isProcessRunning` implementation looks solid

The PID validation plus `process.kill(pid, 0)` pattern is standard and matches the behavior in `src/lib/utils/server-registry.ts`. No issues from a correctness standpoint.
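For reference, the pattern in question amounts to the following generic sketch (not the hook's exact code):

```typescript
// Signal 0 performs existence/permission checks without delivering a signal.
function isProcessRunning(pid: number): boolean {
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it; ESRCH means gone.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}
```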
77-79: Session message formatting is clear and user-friendly

The conditional `additionalContext` that distinguishes between "running (PID / Port)" and "starting (indexing in background)" is clear and conveys the important state to the caller.
```
handleRequest
  def: src/server/handler.ts:45
  calls: validateAuth routeRequest sendResponse
  called_by: src/index.ts:12 src/api/router.ts:87
```
Add language specifier to fenced code block.
Per markdown lint (MD040), fenced code blocks should have a language specified. Since this shows command output, use text or plaintext:
````diff
-```
+```text
 handleRequest
   def: src/server/handler.ts:45
   calls: validateAuth routeRequest sendResponse
   called_by: src/index.ts:12 src/api/router.ts:87
````
🧰 Tools

🪛 markdownlint-cli2 (0.18.1)

196-196: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents

In README.md around lines 196 to 201, the fenced code block lacks a language
specifier which violates MD040; update the block opener to include a language
such as "text" (e.g., ```text) so the fenced block is annotated appropriately
while leaving the block contents unchanged.
```diff
   .option(
-    "-m <max_count>, --max-count <max_count>",
+    "-m, --max-count <max_count>",
     "The maximum number of results to return (total)",
     "5",
   )
```
Option default mismatch with README.
The default here is "5", but README.md line 80 says -m <n> defaults to 25. Either update the code or the documentation.
🤖 Prompt for AI Agents
In src/commands/search.ts around lines 264 to 268, the CLI option default for
-m/--max-count is set to "5" but README.md documents a default of 25; update the
code to match the documented default by changing the option default to "25" (and
ensure any code that reads this option converts it to a number), or
alternatively update README.md to state the default is 5—pick one source of
truth and make the values consistent.
| try { | ||
| nativeWatcher?.close(); | ||
| } catch {} | ||
| nativeWatcher = null |
Missing semicolon.
- nativeWatcher = null
+ nativeWatcher = null;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| nativeWatcher = null | |
| nativeWatcher = null; |
🤖 Prompt for AI Agents
In src/commands/serve.ts around line 804, the assignment statement
"nativeWatcher = null" is missing a trailing semicolon; add the semicolon to
make it "nativeWatcher = null;" and ensure the file passes the project's
linter/formatter.
Actionable comments posted: 2
♻️ Duplicate comments (4)
README.md (2)
24-26: Update “Local & Private” blurb to reflect Rust-native core instead of onnxruntime

This bullet still calls out “100% local embeddings via `onnxruntime`”, but the PR (and past review) indicate embeddings now come from the Rust-native `osgrep-core` backend. Consider updating this line to explicitly mention the Rust core (and drop the onnxruntime reference) so the README matches the current architecture.
200-206: Add a language to the fenced trace output block (MD040)

The `osgrep trace` output example is a bare fenced block; markdownlint (MD040) wants a language. Since this is plain CLI output, `text` is appropriate:

````diff
-```
+```text
handleRequest
def: src/server/handler.ts:45
calls: validateAuth routeRequest sendResponse
called_by: src/index.ts:12 src/api/router.ts:87
````

src/commands/serve.ts (1)

`827-838`: **Missing semicolon after `nativeWatcher = null`**

This assignment is the only statement in the block without a trailing semicolon and was previously flagged; adding it keeps the file consistent with the rest of the style and avoids linter complaints:

```diff
- nativeWatcher = null
+ nativeWatcher = null;
```

src/commands/search.ts (1)
379-382: Align `-m/--max-count` default with README (5 vs 25 mismatch)

The CLI default here is `"5"`, but README’s `osgrep search` table still advertises a default of `25`. This mismatch was flagged previously and is still present. Either:

- Bump the default here (and the fallback in `limit` computation) to `"25"`, or
- Update the README to say the default is 5.
🧹 Nitpick comments (5)
src/lib/search/expander.ts (1)
514-565: Content-cleaning logic duplicated with `Searcher.mapRecordToChunk`

`mapRecordToChunk` here reimplements similar header-stripping/truncation and symbol array handling as `Searcher.mapRecordToChunk`. If both diverge over time, CLI/server vs expansion output can drift in subtle ways. Consider extracting a shared helper (or factoring into a small utility module) so both code paths stay in sync.

tests/expander.test.ts (2)
154-183: Strengthen “circular references” assertion to actually exercise the loop

The “circular references” test description implies guarding against cycles, but currently:

- Only a single `mockResolvedValueOnce` is set up, so depth-2 symbol expansion never actually sees the back-edge, and
- The assertion `expect(ids.size).toBe(expanded.expanded.length);` is tautologically true even when `expanded.expanded.length` is 0 or 1.

It would be more meaningful to (a) configure two successive query responses that actually create A ↔ B, and (b) assert that the same chunk isn’t added twice (e.g., by checking for a specific `id`/`path` appearing only once).
260-267: Neighbor expansion test’s core assertion is effectively a no-op
`expect(neighbors.length).toBeGreaterThanOrEqual(0);` will always pass. Since you’re already guarding with `if (neighbors.length > 0)`, you could instead assert `toBe(1)` (given the mock) or at least `toBeGreaterThan(0)` to make the test informative.

src/commands/serve.ts (1)
80-107: Verify `registerServer` behavior with “port: 0” early registration

You now register the background child (and the foreground server) with `port: 0` before the HTTP server actually binds, then register again with the real port once `listen` succeeds. This relies on `registerServer` deduping on PID or project root; otherwise you may accumulate duplicate entries (one with port 0, one with the real port) in the registry.

If `registerServer` doesn’t already replace existing entries for the same PID/project, consider updating it accordingly, or document that assumption here.

src/lib/search/searcher.ts (1)
393-407: Avoid treating empty normalized FTS queries as “failed” searches
`normalizeFtsQuery` can legitimately return an empty string for very short/stopword-heavy queries, but the code throws `new Error("Empty FTS query after normalization")`, which then logs as `[Searcher] FTS search failed: ...`. That makes normal input look like an error.

You might instead short‑circuit without logging in this case:

```ts
const ftsText = Searcher.normalizeFtsQuery(query);
if (!ftsText) {
  ftsResults = [];
} else {
  // existing FTS query logic
}
```

This preserves the graceful fallback without noisy warnings.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (8)
README.md(6 hunks)src/commands/agent/opencode.ts(0 hunks)src/commands/search.ts(11 hunks)src/commands/serve.ts(10 hunks)src/lib/search/expander.ts(1 hunks)src/lib/search/expansion-types.ts(1 hunks)src/lib/search/searcher.ts(11 hunks)tests/expander.test.ts(1 hunks)
💤 Files with no reviewable changes (1)
- src/commands/agent/opencode.ts
🧰 Additional context used
🧬 Code graph analysis (6)
tests/expander.test.ts (2)
src/lib/store/types.ts (1)
ChunkType (50-67)
src/lib/search/expander.ts (1)
Expander (107-567)
src/lib/search/expander.ts (3)
src/lib/search/expansion-types.ts (4)
ExpandOptions (23-32), ExpandedResult (71-82), ExpansionStats (53-66), ExpansionNode (37-48)
src/lib/store/vector-db.ts (1)
VectorDB (20-302)
src/lib/store/types.ts (1)
ChunkType (50-67)
src/commands/serve.ts (5)
src/lib/store/meta-cache.ts (2)
MetaCache (12-60), MetaEntry (6-10)
src/lib/store/types.ts (1)
VectorRecord (31-35)
src/lib/utils/file-utils.ts (1)
isIndexableFile (44-63)
src/lib/utils/lock.ts (1)
acquireWriterLockWithRetry (87-109)
src/lib/index/sync-helpers.ts (1)
InitialSyncProgress (13-22)
src/lib/search/expansion-types.ts (1)
src/lib/store/types.ts (1)
ChunkType (50-67)
src/commands/search.ts (4)
src/lib/search/expansion-types.ts (3)
ExpandOptions (23-32), ExpandedResult (71-82), ExpansionNode (37-48)
src/lib/output/formatter.ts (1)
formatResults (101-108)
src/lib/utils/server-registry.ts (1)
unregisterServer (50-54)
src/lib/utils/project-root.ts (2)
findProjectRoot (14-29), ensureProjectPaths (31-51)
src/lib/search/searcher.ts (5)
src/lib/index/ignore-patterns.ts (1)
GENERATED_FILE_PATTERNS (65-98)
src/lib/workers/orchestrator.ts (4)
encodeQuery (204-235), encodeQuery (352-354), rerank (237-330), rerank (356-362)
src/lib/utils/filter-builder.ts (1)
escapeSqlString (1-5)
src/lib/search/expansion-types.ts (2)
ExpandOptions (23-32), ExpandedResult (71-82)
src/lib/search/expander.ts (1)
Expander (107-567)
🪛 markdownlint-cli2 (0.18.1)
README.md
201-201: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci
🔇 Additional comments (3)
src/lib/search/expansion-types.ts (1)
18-101: Expansion type surface is well-structured and future-proof

The expansion model (strategies, nodes, stats, result container, and internal `ExpansionChunk`) looks coherent and matches how `Expander` and the CLI use it; the extra `"coexports"` strategy is fine to leave for future implementation.

src/lib/search/searcher.ts (1)
534-592: Searcher–Expander integration looks clean and opt‑in

The new `expand` and `searchAndExpand` methods keep expansion strictly opt‑in, reuse the same `VectorDB` instance, and thread `ExpandOptions` through without coupling search and expansion logic. This is a solid separation of concerns and should make it easy for both the CLI and HTTP server to wire in `--deep`/`deep` behavior consistently.

src/commands/search.ts (1)
398-402: `--deep` wiring from CLI through HTTP and local search looks consistent

The new `--deep` flag flows into `getDeepExpandOptions`, into the `/search` request body (`deep: options.deep`), and into local `Searcher.expand` via `expandOpts`. The default strategies (`["callers", "symbols"]`, depth 2) match both the README description and the server’s `expandOpts` construction, which keeps behavior aligned across local and daemon modes.
| const apply = async () => { | ||
| let lock: { release: () => Promise<void> } | null = null; | ||
| try { | ||
| isWriting = true; | ||
| lock = await acquireWriterLockWithRetry(paths.osgrepDir, { | ||
| maxRetries: 2, | ||
| retryDelayMs: 250, | ||
| }); | ||
|
|
||
| if (unlinks.length > 0) { | ||
| await vectorDb.deletePaths(unlinks); | ||
| for (const relPath of unlinks) { | ||
| metaCache.delete(relPath); | ||
| } | ||
| } | ||
|
|
||
| for (const item of prepared) { | ||
| // If the file changed again since we prepared, skip and let the next event handle it. | ||
| try { | ||
| const stats = await fs.promises.stat(item.absPath); | ||
| if ( | ||
| stats.mtimeMs !== item.meta.mtimeMs || | ||
| stats.size !== item.meta.size | ||
| ) { | ||
| pendingUpserts.add(item.relPath); | ||
| continue; | ||
| } | ||
| } catch (err) { | ||
| const code = (err as NodeJS.ErrnoException)?.code; | ||
| if (code === "ENOENT") { | ||
| pendingUnlinks.add(item.relPath); | ||
| } | ||
| continue; | ||
| } | ||
|
|
||
| await vectorDb.deletePaths([item.relPath]); | ||
| if (!item.shouldDelete && item.vectorsCount > 0) { | ||
| await vectorDb.insertBatch(item.vectors as VectorRecord[]); | ||
| } | ||
| metaCache.put(item.relPath, item.meta); | ||
| } | ||
|
|
||
| if (watchVerbose && (plannedUnlinks > 0 || plannedUpserts > 0)) { | ||
| console.log( | ||
| `[serve] live index applied: upserts=${plannedUpserts} unlinks=${plannedUnlinks}`, | ||
| ); | ||
| } | ||
| } finally { | ||
| if (lock) await lock.release(); | ||
| isWriting = false; | ||
| } | ||
| }; | ||
|
|
||
| dbWriteBarrier = dbWriteBarrier.then(apply, apply); | ||
| await dbWriteBarrier; | ||
|
|
Ensure all live-index writes complete before closing VectorDB on shutdown
flushPending chains writer work onto dbWriteBarrier and awaits it there, but shutdown closes the server and then calls vectorDb.close() without first awaiting dbWriteBarrier. If a flush is in progress when SIGINT/SIGTERM arrives, you can end up closing the DB while writes are still in flight.
Consider explicitly awaiting the barrier before closing the DB, e.g.:
- // Clean close of vectorDB
- try {
- await vectorDb.close();
- } catch (e) {
- console.error("Error closing vector DB:", e);
- }
+ // Wait for any pending live-index writes to finish, then close the DB
+ try {
+ await dbWriteBarrier;
+ } catch (e) {
+ console.error("Error waiting for pending writes:", e);
+ }
+ try {
+ await vectorDb.close();
+ } catch (e) {
+ console.error("Error closing vector DB:", e);
+ }This keeps shutdown graceful even under concurrent file-change load.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/commands/serve.ts around lines 278 to 333, shutdown currently calls
vectorDb.close() without ensuring pending live-index writes chained on
dbWriteBarrier have finished; this can close the DB while writes are in flight.
Update the shutdown sequence/handlers to await the current dbWriteBarrier (and
any final flushPending chain) before calling vectorDb.close() — i.e., ensure you
await dbWriteBarrier (and any final dbWriteBarrier = dbWriteBarrier.then(noop)
resolution) or call await flushPending() from shutdown so all apply() work
completes, then call vectorDb.close(), and only after close proceed to exit the
process.
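The underlying drain-before-close pattern, stripped of the serve.ts specifics (function names and the simulated write here are illustrative, not the actual implementation):

```typescript
// Writes are serialized on a promise chain; passing the task as both the
// fulfilled and rejected handler keeps the chain alive after a failure.
let writeBarrier: Promise<void> = Promise.resolve();
const applied: string[] = [];

function enqueueWrite(label: string): Promise<void> {
  const task = async () => {
    await new Promise((r) => setTimeout(r, 10)); // simulated DB write
    applied.push(label);
  };
  writeBarrier = writeBarrier.then(task, task);
  return writeBarrier;
}

async function shutdown(closeDb: () => Promise<void>): Promise<void> {
  try {
    await writeBarrier; // drain pending writes before touching the handle
  } catch {
    // a failed write should not block a clean close
  }
  await closeDb();
}

enqueueWrite("upsert a.ts");
enqueueWrite("unlink b.ts");
const done = shutdown(async () => {
  console.log(`closed after ${applied.length} writes`);
});
```

Because `shutdown` awaits the barrier as it stands at call time, both queued writes land before `closeDb` runs; any writes enqueued after shutdown begins are a separate policy decision (reject vs. drain again).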
| const stats: ExpansionStats = { | ||
| symbolsResolved: 0, | ||
| callersFound: 0, | ||
| neighborsAdded: 0, | ||
| totalChunks: results.length, | ||
| totalTokens: 0, | ||
| }; | ||
|
|
||
| // Track seen chunk IDs to avoid duplicates | ||
| const seen = new Set<string>(); | ||
| for (const r of results) { | ||
| const id = this.getChunkId(r); | ||
| if (id) seen.add(id); | ||
| stats.totalTokens += estimateTokens(r.text || ""); | ||
| } | ||
|
|
||
| // Token budget tracking | ||
| let budgetRemaining = maxTokens > 0 ? maxTokens - stats.totalTokens : Infinity; | ||
|
|
||
| let table: Table; |
Consider marking truncated when the initial token budget is already exhausted
If maxTokens > 0 and the original results alone exceed that budget, budgetRemaining becomes ≤ 0 so the BFS loop never runs, but truncated remains false. From a caller’s perspective that still represents “a limit was hit during expansion”. Consider setting truncated = true in that case so clients don’t misinterpret “no expansion” as “no limits hit”.
🤖 Prompt for AI Agents
In src/lib/search/expander.ts around lines 126 to 145, the code computes
budgetRemaining after summing tokens from initial results but does not mark
truncated when the initial results alone exceed maxTokens; update the logic to
set truncated = true when maxTokens > 0 and budgetRemaining <= 0 (i.e., initial
totalTokens exceed or meet the budget) so callers know a token limit was hit
before expansion begins; ensure this is done immediately after computing
budgetRemaining and before the BFS/expansion loop so existing flow and return
values reflect the truncated state.
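In isolation, the suggested check amounts to something like this (simplified sketch; the names stand in for the Expander internals and are not the actual API):

```typescript
interface BudgetState {
  budgetRemaining: number;
  truncated: boolean;
}

function initBudget(maxTokens: number, initialTokens: number): BudgetState {
  const budgetRemaining = maxTokens > 0 ? maxTokens - initialTokens : Infinity;
  // If the seed results alone meet or exceed the budget, report the limit
  // as hit even though the expansion loop never runs.
  const truncated = maxTokens > 0 && budgetRemaining <= 0;
  return { budgetRemaining, truncated };
}

console.log(initBudget(0, 500).truncated); // false — no budget configured
console.log(initBudget(1000, 400).truncated); // false — 600 tokens left
console.log(initBudget(300, 500).truncated); // true — already over budget
```

This keeps the `truncated` flag meaning “a token limit was hit”, regardless of whether the limit was hit before or during expansion.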
Summary by CodeRabbit
Removed Features
New Features
Search Improvements
Setup & Runtime
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.