
Writing a Rust Backend V1 #73

Open
Ryandonofrio3 wants to merge 19 commits into main from rust-backend

Conversation


@Ryandonofrio3 Ryandonofrio3 commented Dec 14, 2025

Summary by CodeRabbit

  • Removed Features

    • Skeleton command and several legacy model/verification/experiment utilities removed.
  • New Features

    • New native osgrep-core backend for faster embeddings & reranking.
    • Search expansion (--deep) that returns expanded results (symbols, callers, neighbors).
  • Search Improvements

    • Simplified CLI options (-m/--max-count); server-backed search with health checks, timeouts, and graceful local fallback.
  • Setup & Runtime

    • Native model pre-warm during setup/doctor; live indexing, watch mode, richer /health and /trace endpoints.
  • Documentation

    • README and help text updated to reflect commands and workflow changes; CI/release use native builds and bun.



coderabbitai Bot commented Dec 14, 2025

Walkthrough

Adds a native Rust core (osgrep-core) with N‑API bindings, replaces JS/ONNX worker pipelines with native embedding/rerank calls, removes the skeleton subsystem and many experiment/benchmark scripts, and migrates build tooling and CI from pnpm to bun.
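The platform-aware JS loader mentioned here typically resolves a prebuilt `.node` binary by platform and architecture. A hypothetical sketch of that selection logic — all file names below are assumptions for illustration, not the actual osgrep-core/index.js:

```typescript
// Hypothetical sketch of platform-aware native binary selection, in the
// style of napi-rs loaders. The binary names here are assumed, not taken
// from the actual osgrep-core package.
function nativeBinaryName(platform: string, arch: string): string {
  const supported: Record<string, string> = {
    "darwin-arm64": "osgrep-core.darwin-arm64.node",
    "darwin-x64": "osgrep-core.darwin-x64.node",
    "linux-x64": "osgrep-core.linux-x64-gnu.node",
    "linux-arm64": "osgrep-core.linux-arm64-gnu.node",
    "win32-x64": "osgrep-core.win32-x64-msvc.node",
  };
  const key = `${platform}-${arch}`;
  const name = supported[key];
  if (!name) {
    // Fail loudly with actionable guidance rather than a cryptic require error
    throw new Error(`Unsupported platform: ${key}. Try building from source.`);
  }
  return name;
}
```

In practice the loader would `require()` the resolved file, falling back through nearby candidate paths before giving up.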

Changes

Cohort / File(s) — Summary
Native Rust core & JS loader
osgrep-core/Cargo.toml, osgrep-core/build.rs, osgrep-core/package.json, osgrep-core/index.js, osgrep-core/index.d.ts, osgrep-core/src/lib.rs, osgrep-core/src/dense_ort.rs, osgrep-core/src/colbert_ort.rs
Add new Rust crate exposing init, dense/ColBERT embeddings, batch embed, query encoding, and rerank via N‑API; add platform-aware JS loader and TypeScript declarations.
Root package & CI/release
package.json, .github/workflows/ci.yml, .github/workflows/release.yml, .gitignore
Replace pnpm scripts with bun equivalents, remove benchmark scripts, add native build/publish steps, and expand ignored files.
Orchestrator & native wrapper
src/lib/workers/orchestrator.ts, src/lib/native/index.ts
Introduce TypeScript native binding wrapper and singleton orchestrator functions (processFile, encodeQuery, rerank) delegating to osgrep-core and mapping packed embeddings.
Worker & pool deletion
removed: src/lib/workers/pool.ts, src/lib/workers/process-child.ts, src/lib/workers/worker.ts, src/lib/workers/download-worker.ts
Remove custom worker pool, child process worker plumbing, and model-download worker; orchestration centralized and native-backed.
ONNX / embedding workers removed
removed: src/lib/workers/embeddings/colbert.ts, src/lib/workers/embeddings/granite.ts, src/lib/workers/colbert-tokenizer.ts, src/lib/workers/colbert-math.ts
Delete ONNX-based embedding, tokenizer, and similarity math modules (functionality migrated into native core).
Skeleton subsystem removed
removed: src/commands/skeleton.ts, src/lib/skeleton/* (index.ts, skeletonizer.ts, body-fields.ts, retriever.ts, summary-formatter.ts), scripts/verify_skeleton.ts, test_skeleton.py
Remove skeleton CLI command, TreeSitter-based skeletonizer, formatting helpers, stored-skeleton retrieval, and verification scripts/tests.
Trace & graph changes
src/commands/trace.ts, src/lib/graph/graph-builder.ts, src/lib/output/formatter.ts, removed: src/lib/output/json-formatter.ts
Extend trace command options and output formats; refactor graph-builder to CallGraph/TraceOptions and update formatter (JSON trace formatter removed).
Search & server command changes
src/commands/search.ts, src/commands/serve.ts, src/commands/setup.ts, src/index.ts
Simplify CLI options (remove skeleton flag, rename maxCount), add server-backed search flow with health/timeouts/watch/indexing, and setup now pre-warms native models via initNative; unregister skeleton command.
Searcher refactor
src/lib/search/searcher.ts, removed: src/lib/search/intent.ts
Replace intent/pool-based pipeline with orchestrator-native encode/rerank; add static ranking constants, simplify rerank flow, add expansion API, and remove intent detection module.
Indexing & syncer
src/lib/index/syncer.ts
Eliminate worker-pool retry loop; call orchestrator.processFile directly for file processing.
Store & schema changes
src/lib/store/types.ts, src/lib/store/vector-db.ts
Remove colbert_scale and pooled_colbert_48d from VectorRecord and DB schema; simplify doc_token_ids handling/defaults.
Setup helpers & model loader removed
src/lib/setup/setup-helpers.ts, removed: src/lib/setup/model-loader.ts
Remove model-download worker and downloadedModels tracking; ensureSetup no longer exposes models path or download status.
Output & formatting utilities
src/lib/output/formatter.ts, src/lib/utils/ansi.ts, src/lib/utils/formatter.ts
Centralize ANSI styling in new ansi util, update formatter to use shared style and language source; remove JSON formatter.
Utilities & git worktree support
src/lib/utils/git.ts, src/lib/utils/exit.ts, src/lib/index/chunker.ts, src/lib/index/grammar-loader.ts, tests: tests/chunking.test.ts, tests/git-worktree.test.ts
Add git worktree helpers and tests; adjust chunker header (“File: ” prefix); simplify gracefulExit (remove worker-pool teardown); update imports.
Doctor & list command updates
src/commands/utility/doctor.ts, src/commands/utility/list.ts
Doctor now calls initNative instead of checking model files; adjust import paths for list.
Expander & expansion types
src/lib/search/expander.ts, src/lib/search/expansion-types.ts, tests: tests/expander.test.ts
Add result expansion subsystem (symbols, callers, neighbors) with BFS/token-budget-aware expansion and unit tests.
Indexing experiments, scripts & docs cleanup
removed: experiments/*, scripts/*, LICENSE, NOTICE; modified: README.md, plugins/osgrep/skills/osgrep/SKILL.md
Remove experiments/benchmark scripts, delete LICENSE/NOTICE, update README and skill docs to reflect removed features and bun-based developer commands.
Misc path/import adjustments & tests
multiple files (e.g., src/commands/agent/*, src/lib/index/*, tests)
Adjust relative import paths, language source module moves, and update tests to match display text changes.
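
The expander cohort above describes BFS, token-budget-aware expansion over symbols, callers, and neighbors. A minimal illustrative sketch of how such a traversal can respect a token budget — all names here are assumptions, not the actual expander.ts API:

```typescript
// Illustrative BFS expansion bounded by a token budget. Nodes whose token
// cost would exceed the remaining budget are skipped, not dequeued later.
interface Node {
  id: string;
  tokens: number;      // estimated token cost of including this node
  neighbors: string[]; // adjacent symbols/callers in the graph
}

function expandWithBudget(
  graph: Map<string, Node>,
  seeds: string[],
  tokenBudget: number,
): string[] {
  const visited = new Set<string>(seeds);
  const queue = [...seeds];
  const selected: string[] = [];
  let used = 0;
  while (queue.length > 0) {
    const id = queue.shift()!;
    const node = graph.get(id);
    if (!node) continue;
    if (used + node.tokens > tokenBudget) continue; // over budget: skip node
    used += node.tokens;
    selected.push(id);
    for (const n of node.neighbors) {
      if (!visited.has(n)) {
        visited.add(n);
        queue.push(n);
      }
    }
  }
  return selected;
}
```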

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI / Client
    participant Orch as Orchestrator (TS)
    participant Native as osgrep-core (Rust N‑API)
    participant HF as HuggingFace Hub
    participant ONNX as ONNX Runtime (inside Native)

    CLI->>Orch: initNative()
    Orch->>Native: initModels(denseRepo,colbertRepo)
    Native->>HF: fetch model artifacts
    HF-->>Native: model files
    Native->>ONNX: create sessions / load tokenizers
    ONNX-->>Native: sessions ready
    Native-->>Orch: initialized
    Orch-->>CLI: ready

    CLI->>Orch: encodeQuery(text)
    Orch->>Native: encodeQueryColbert(query)
    Native->>ONNX: tokenize + infer
    ONNX-->>Native: query embeddings
    Native-->>Orch: embeddings
    Orch-->>CLI: embeddings

    CLI->>Orch: processFile(fileChunks)
    Orch->>Native: embedBatch(chunks)
    Native->>ONNX: batch tokenize + infer
    ONNX-->>Native: dense + colbert outputs
    Native-->>Orch: packed embeddings
    Orch-->>CLI: prepared chunks

    CLI->>Orch: rerank({queryEmb, docs, topK})
    Orch->>Native: rerankColbert(...)
    Native->>ONNX: maxsim / rerank
    ONNX-->>Native: scores
    Native-->>Orch: ranked indices/scores
    Orch-->>CLI: top-K results
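
The `maxsim` step in the diagram is ColBERT late interaction: each query token embedding is matched against its best-scoring document token, and the maxima are summed. A minimal TypeScript sketch of those semantics — illustrative only; the PR moves the real computation into the Rust core:

```typescript
// Illustrative ColBERT MaxSim: for each query token embedding, take the
// maximum dot product over all document token embeddings, then sum.
function maxSim(query: number[][], doc: number[][]): number {
  let score = 0;
  for (const q of query) {
    let best = -Infinity;
    for (const d of doc) {
      let dot = 0;
      for (let i = 0; i < q.length; i++) dot += q[i] * d[i];
      if (dot > best) best = dot;
    }
    score += best;
  }
  return score;
}
```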

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

Areas requiring extra attention:

  • osgrep-core Rust code, N‑API surface, memory/typed-array conversions, and cross-platform binary selection.
  • Orchestrator ↔ Native interop (buffer layout, async, error propagation).
  • Searcher scoring/rerank changes, expansion addition, and removal of intent/skeleton boosting behaviors.
  • Large deletion sweep—search for dangling imports/references to removed modules/commands.
  • CI/release pipeline and packaging/publishing of native artifacts.

Poem

🐰 I hopped from JS pools to Rusty ground,
New bindings hum as old scripts wind down.
Skeletons vanished, benchmarks took flight,
Native cores whisper vectors through the night.
I nibble code, then crown the search with light. 🌿🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 75.24%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title 'Writing a Rust Backend V1' directly reflects the main change, which introduces a comprehensive native Rust backend (osgrep-core) with ONNX-based embeddings, native model loading, N-API bindings, and significant refactoring of the TypeScript layer to use the native backend.
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch rust-backend



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tools/eval.ts (1)

65-115: Update eval cases to reference existing orchestrator functions only.

The eval cases at lines 67-115 reference removed modules that no longer exist:

  • Lines 67-85: src/lib/workers/pool.ts — 4 cases will fail (file deleted)
  • Lines 87-104: src/lib/workers/embeddings/granite.ts and colbert.ts — 4 cases will fail (files deleted)
  • Lines 107-115: src/lib/workers/orchestrator.ts — only encodeQuery exists; computeHybrid and references to pooled_colbert_48d do not exist, causing 1 additional case to fail

Remove or update the 9 failing eval cases to reference only existing functions in the current codebase.

🧹 Nitpick comments (6)
src/lib/setup/setup-helpers.ts (1)

21-24: Stale docstring references models.

The docstring mentions "ensures osgrep directories and models exist" but the model download logic has been removed. Update to reflect current behavior:

 /**
- * Idempotent helper that ensures osgrep directories and models exist.
+ * Idempotent helper that ensures osgrep directories exist.
  * Returns status about work performed so callers can decide what to show.
  */
src/lib/index/syncer.ts (1)

188-197: Function name is misleading after removing retry logic.

processFileWithRetry no longer retries anything. Consider renaming to processFileSafe or just inlining the call since it's now a simple wrapper:

-    const processFileWithRetry = async (
-      relPath: string,
-      absPath: string,
-    ): Promise<ProcessFileResult> => {
-      // No retries needed - native embedding is stable
-      return await processFile({
-        path: relPath,
-        absolutePath: absPath,
-      });
-    };

Then at line 262, replace:

-          const result = await processFileWithRetry(relPath, absPath);
+          const result = await processFile({
+            path: relPath,
+            absolutePath: absPath,
+          });
plan.md (1)

262-265: Add language specifier to fenced code blocks.

Per markdownlint, fenced code blocks should have a language specified for proper syntax highlighting. These appear to be file path listings:

-```
+```text
 src/
 ├── index.ts                    # CLI entry

Same applies to code blocks at lines 294 and 299.

src/lib/native/index.ts (2)

79-88: Consider zero-copy optimization with subarray views.

The current implementation copies each vector into a new Float32Array. For better performance, consider using subarray() to create zero-copy views, provided the underlying buffer is not reused:

const vectors: Float32Array[] = [];
for (let i = 0; i < result.count; i++) {
  const start = i * dim;
  vectors.push(result.embeddings.subarray(start, start + dim));
}

However, this creates views that share the same buffer. Only apply this optimization if the native code guarantees buffer stability.
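
To make the trade-off concrete, here is a small self-contained demonstration of how `subarray()` views share storage with the source while `slice()` produces an independent copy:

```typescript
// subarray() returns a zero-copy view over the same ArrayBuffer;
// slice() allocates new backing storage.
const packed = new Float32Array([1, 2, 3, 4, 5, 6]);
const dim = 3;

const view = packed.subarray(0, dim); // shares packed's buffer
const copy = packed.slice(0, dim);    // independent storage

packed[0] = 99; // the view observes this write; the copy does not
```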


165-171: Clarify the interim colbertOffset value.

Line 169 sets colbertOffset: 0 with a comment indicating it will be set later during storage. This creates an intermediate state where the offset is not meaningful. Consider adding a JSDoc comment to the HybridEmbedding interface explaining that colbertOffset is initialized to 0 and populated during storage to avoid confusion for callers.

src/lib/workers/orchestrator.ts (1)

209-211: Use embedDense instead of embedBatch for single query.

Line 210 uses embedBatch([text]) to get the dense embedding, but embedBatch computes both dense and ColBERT embeddings. Since line 214 separately calls encodeQueryColbert for query-specific ColBERT encoding, the ColBERT portion from embedBatch is discarded.

Apply this diff to use the more appropriate function:

     await this.ensureReady();
 
     // Get dense embedding
-    const hybrids = await embedBatch([text]);
-    const denseVector = hybrids[0].dense;
+    const denseVectors = await embedDense([text]);
+    const denseVector = denseVectors[0];
 
     // Get ColBERT query embedding
     const colbertFlat = await encodeQueryColbert(text);

Don't forget to add embedDense to the import at line 21:

-import { embedBatch, initNative, encodeQueryColbert, rerankColbert } from "../native";
+import { embedBatch, embedDense, initNative, encodeQueryColbert, rerankColbert } from "../native";
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 51906e2 and 30ffe4d.

⛔ Files ignored due to path filters (2)
  • bun.lockb is excluded by !**/bun.lockb
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (38)
  • package.json (1 hunks)
  • plan.md (1 hunks)
  • src/commands/doctor.ts (2 hunks)
  • src/commands/search.ts (0 hunks)
  • src/commands/serve.ts (0 hunks)
  • src/commands/setup.ts (2 hunks)
  • src/commands/skeleton.ts (0 hunks)
  • src/commands/trace.ts (0 hunks)
  • src/commands/verify.ts (0 hunks)
  • src/config.ts (0 hunks)
  • src/index.ts (0 hunks)
  • src/lib/graph/graph-builder.ts (0 hunks)
  • src/lib/index/syncer.ts (2 hunks)
  • src/lib/native/index.ts (1 hunks)
  • src/lib/output/formatter.ts (0 hunks)
  • src/lib/output/json-formatter.ts (0 hunks)
  • src/lib/search/intent.ts (0 hunks)
  • src/lib/search/searcher.ts (9 hunks)
  • src/lib/setup/model-loader.ts (0 hunks)
  • src/lib/setup/setup-helpers.ts (2 hunks)
  • src/lib/skeleton/body-fields.ts (0 hunks)
  • src/lib/skeleton/index.ts (0 hunks)
  • src/lib/skeleton/retriever.ts (0 hunks)
  • src/lib/skeleton/skeletonizer.ts (0 hunks)
  • src/lib/skeleton/summary-formatter.ts (0 hunks)
  • src/lib/store/types.ts (0 hunks)
  • src/lib/store/vector-db.ts (0 hunks)
  • src/lib/utils/exit.ts (0 hunks)
  • src/lib/workers/colbert-math.ts (0 hunks)
  • src/lib/workers/colbert-tokenizer.ts (0 hunks)
  • src/lib/workers/download-worker.ts (0 hunks)
  • src/lib/workers/embeddings/colbert.ts (0 hunks)
  • src/lib/workers/embeddings/granite.ts (0 hunks)
  • src/lib/workers/orchestrator.ts (8 hunks)
  • src/lib/workers/pool.ts (0 hunks)
  • src/lib/workers/process-child.ts (0 hunks)
  • src/lib/workers/worker.ts (0 hunks)
  • tools/eval.ts (1 hunks)
💤 Files with no reviewable changes (28)
  • src/lib/workers/embeddings/granite.ts
  • src/lib/skeleton/retriever.ts
  • src/commands/trace.ts
  • src/index.ts
  • src/commands/serve.ts
  • src/config.ts
  • src/lib/workers/embeddings/colbert.ts
  • src/lib/skeleton/skeletonizer.ts
  • src/commands/search.ts
  • src/lib/store/types.ts
  • src/lib/workers/worker.ts
  • src/lib/workers/process-child.ts
  • src/lib/workers/colbert-math.ts
  • src/lib/search/intent.ts
  • src/lib/utils/exit.ts
  • src/lib/store/vector-db.ts
  • src/lib/workers/pool.ts
  • src/commands/verify.ts
  • src/lib/output/json-formatter.ts
  • src/lib/output/formatter.ts
  • src/lib/skeleton/body-fields.ts
  • src/lib/setup/model-loader.ts
  • src/lib/workers/colbert-tokenizer.ts
  • src/lib/skeleton/summary-formatter.ts
  • src/commands/skeleton.ts
  • src/lib/skeleton/index.ts
  • src/lib/workers/download-worker.ts
  • src/lib/graph/graph-builder.ts
🧰 Additional context used
🧬 Code graph analysis (5)
src/lib/index/syncer.ts (1)
src/lib/workers/orchestrator.ts (2)
  • processFile (150-200)
  • processFile (324-328)
src/commands/setup.ts (1)
src/lib/native/index.ts (1)
  • initNative (36-52)
src/commands/doctor.ts (1)
src/lib/native/index.ts (1)
  • initNative (36-52)
src/lib/search/searcher.ts (2)
src/lib/workers/orchestrator.ts (4)
  • encodeQuery (202-233)
  • encodeQuery (330-332)
  • rerank (235-308)
  • rerank (334-340)
src/lib/utils/filter-builder.ts (1)
  • escapeSqlString (1-5)
src/lib/native/index.ts (1)
src/config.ts (2)
  • MODEL_IDS (4-7)
  • CONFIG (18-26)
🪛 LanguageTool
plan.md

[style] ~252-~252: To form a complete sentence, be sure to include a subject.
Context: ...lex role coloring and breadcrumb logic. Could be simplified to just show: - File path...

(MISSING_IT_THERE)

🪛 markdownlint-cli2 (0.18.1)
plan.md

262-262: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


294-294: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


299-299: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (19)
tools/eval.ts (1)

1-8: Import paths updated correctly.

The import paths are correctly updated to reference ../src/lib/... from the tools/ directory location.

src/lib/setup/setup-helpers.ts (1)

31-55: Setup simplification looks good.

The removal of model download logic and simplification to only manage directories aligns well with the migration to native model initialization via initNative().

src/commands/doctor.ts (1)

27-34: Good runtime health check via native initialization.

The doctor command now actually tests that native models can initialize rather than just checking for directory presence. The error handling provides actionable guidance with the "Try: osgrep setup" message.

src/lib/index/syncer.ts (1)

10-10: Import change aligns with native migration.

Switching from pool-based to direct orchestrator processFile is consistent with the architectural shift to native Rust embeddings.

src/lib/search/searcher.ts (4)

16-20: Good consolidation of tuning constants.

Moving these from environment variables to static class constants improves maintainability and reduces cognitive load. The values appear reasonable:

  • PRE_RERANK_K_MIN = 500 ensures sufficient candidate pool
  • RERANK_CANDIDATES_K = 80 balances quality vs performance
  • MAX_PER_FILE = 3 provides result diversity

140-171: Simplified structure boosting looks good.

The removal of intent-based and role-based boosting simplifies the scoring logic significantly. The remaining penalties are straightforward:

  • Anchors: 0.99 (minimal penalty)
  • Test paths: 0.5 (reasonable de-prioritization)
  • Docs/config: 0.6 (reasonable de-prioritization)
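
A hedged sketch of how such penalties might be applied — the multipliers match the summary above, but the function name and the path-matching rules are assumptions, not the actual searcher code:

```typescript
// Hypothetical path-based score multiplier. Anchors get a near-neutral
// 0.99, test paths 0.5, docs/config 0.6, everything else 1.0.
function structurePenalty(path: string, isAnchor: boolean): number {
  if (isAnchor) return 0.99;
  if (/(^|\/)(tests?|__tests__)\//.test(path) || /\.(test|spec)\./.test(path)) {
    return 0.5;
  }
  if (/\.(md|txt)$/.test(path) || /(^|\/)(docs?|config)\//.test(path)) {
    return 0.6;
  }
  return 1.0;
}
```

The multiplier would then be folded into the candidate's rerank score before final sorting.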

218-225: Clean API simplification.

Removing the intent parameter and renaming _search_options/_filters to options/filters improves the public API. The underscore prefix previously suggested these were unused or internal.


354-368: Verify rerank call matches orchestrator signature.

The rerank call passes query: queryMatrixRaw which is number[][] (from encodeQuery return type). The orchestrator's rerank method signature is async rerank(input: { query: number[][], docs: RerankDoc[], colbertDim: number }): Promise<number[]>, where RerankDoc is { colbert: Buffer | Int8Array | number[] }. The call correctly provides all three required parameters with matching types.

src/lib/native/index.ts (5)

14-31: LGTM! Clear lazy-loading pattern with helpful error guidance.

The singleton pattern for native module loading is well-implemented with appropriate error handling and actionable error messages.


110-121: LGTM! Clean packed format for ColBERT embeddings.

The function correctly wraps the native result in appropriate typed arrays for efficient storage and transmission.


184-190: LGTM! Straightforward query encoding.


40-48: Remove the await suggestion; initModels is synchronous.

Line 44 calls n.initModels() without await, which is correct. All native function calls in this module (including embedDense, embedColbertPacked, embedBatch, encodeQueryColbert, and rerankColbert) are invoked synchronously. The TypeScript project uses strict type checking, and the code compiles successfully, indicating that initModels is a synchronous function, not asynchronous. No changes are required.

Likely an incorrect or invalid review comment.


225-232: Verify native binding signature for rerankColbert in osgrep-core.

Lines 226-227 convert Float32Array and Int8Array to regular arrays using Array.from(). If the native binding accepts typed arrays directly, these conversions create unnecessary overhead for large embeddings.

Check the Rust binding definition in osgrep_v2 to determine if the signature accepts typed arrays. If so, remove the conversions:

   const result = n.rerankColbert(
-    Array.from(input.queryEmbedding),
-    Array.from(input.docEmbeddings),
+    input.queryEmbedding,
+    input.docEmbeddings,
     input.docLengths,
     input.docOffsets,
     input.candidateIndices,
     input.topK
   );
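
For context, `Array.from` on a typed array materializes a plain `number[]` copy detached from the original buffer, which is the overhead the comment is pointing at:

```typescript
// Array.from(typedArray) copies every element into a regular JS array;
// subsequent writes to the typed array are not reflected in the copy.
const emb = new Float32Array([0.25, -0.5, 1.0]);
const asPlain = Array.from(emb); // number[] copy

emb[0] = 42; // asPlain is unaffected
```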
src/lib/workers/orchestrator.ts (6)

12-28: LGTM! Clean imports and focused types.

The migration to native bindings simplifies the type surface by removing obsolete fields from RerankDoc.


63-73: LGTM! Efficient parallel initialization.

The ensureReady method correctly initializes both the chunker and native models in parallel with proper promise caching.


75-109: LGTM! Solid chunking implementation.

The chunk processing correctly handles anchor chunks and maintains proper indexing.


111-148: LGTM! Proper chunk preparation.

The method correctly formats chunks with all necessary metadata and uses the updated filePath parameter consistently.


150-200: LGTM! Streamlined file processing pipeline.

The refactored flow efficiently processes files by chunking and embedding in clear sequential steps, properly handling edge cases with the shouldDelete flag.


315-340: LGTM! Clean singleton pattern and convenient API wrappers.

The module-level exports provide an ergonomic interface while maintaining a single orchestrator instance.

Comment thread package.json Outdated
  "license": "Apache-2.0",
  "description": "Local grep-like search tool for your codebase.",
  "dependencies": {
    "osgrep-core": "file:../osgrep_v2",

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check the full package.json to understand the context
cat -n package.json | head -60

Repository: Ryandonofrio3/osgrep

Length of output: 2126


🏁 Script executed:

# Check if workspaces are configured
rg '"workspaces"' package.json -A 10

Repository: Ryandonofrio3/osgrep

Length of output: 46


🏁 Script executed:

# Check if osgrep_v2 directory exists and has a package.json
ls -la ../osgrep_v2/ 2>/dev/null | head -20

Repository: Ryandonofrio3/osgrep

Length of output: 46


🏁 Script executed:

# Check parent directory structure
ls -la .. | grep -E "^d.*osgrep"

Repository: Ryandonofrio3/osgrep

Length of output: 46


🏁 Script executed:

# Look for any build or publishing configuration
rg '"publishConfig"|"main"|"exports"' package.json

Repository: Ryandonofrio3/osgrep

Length of output: 85


🏁 Script executed:

# Check if osgrep-core is published to npm
npm view osgrep-core@latest 2>/dev/null | head -20 || echo "Package not found on npm or npm not available"

Repository: Ryandonofrio3/osgrep

Length of output: 46


🏁 Script executed:

# Double-check the prepublishOnly script and files array
cat package.json | grep -A 2 -E '"prepublishOnly"|"files"'

Repository: Ryandonofrio3/osgrep

Length of output: 162


Local file dependency will break npm distribution.

The dependency "osgrep-core": "file:../osgrep_v2" references a local path outside the package. This will cause installation failures for users installing via npm/pnpm since the relative path won't exist on their systems. The "files" array in package.json does not include this directory, so it cannot be bundled.

Consider one of:

  1. Publish osgrep-core to npm and reference by version
  2. Use a workspace configuration if this is a monorepo
  3. Bundle the native addon within this package
  4. Use a postinstall script to build/fetch the native dependency
🤖 Prompt for AI Agents
In package.json at line ~50 (the dependency entry "osgrep-core":
"file:../osgrep_v2"), the package references a local file path which will break
installs for external consumers; replace the file: reference with one of the
recommended approaches: publish osgrep-core to npm and change the dependency to
the published semantic version, convert the repo to a workspace/monorepo and
reference the package via workspace protocol (and ensure workspace config is
committed), bundle the compiled native addon into this package and add it to the
"files" array, or implement a postinstall script that builds or downloads the
native dependency during install; update package.json accordingly and ensure
CI/build scripts reflect the chosen approach.

Comment thread src/commands/setup.ts
Comment on lines +37 to +40
    // Pre-warm native models (downloads via HuggingFace Hub cache on first run)
    console.log("\nInitializing native models...");
    await initNative();
    console.log("✓ Native models ready");

⚠️ Potential issue | 🟡 Minor

Missing error handling for initNative() call.

Unlike doctor.ts which wraps initNative() in a try/catch with a helpful error message, the setup command will crash ungracefully if native initialization fails. Consider consistent error handling:

     // Pre-warm native models (downloads via HuggingFace Hub cache on first run)
     console.log("\nInitializing native models...");
-    await initNative();
-    console.log("✓ Native models ready");
+    try {
+      await initNative();
+      console.log("✓ Native models ready");
+    } catch (error) {
+      console.error("✗ Native model initialization failed:", error);
+      console.error("  Check network connectivity and disk space.");
+      process.exit(1);
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    // Pre-warm native models (downloads via HuggingFace Hub cache on first run)
    console.log("\nInitializing native models...");
    try {
      await initNative();
      console.log("✓ Native models ready");
    } catch (error) {
      console.error("✗ Native model initialization failed:", error);
      console.error("  Check network connectivity and disk space.");
      process.exit(1);
    }
🤖 Prompt for AI Agents
In src/commands/setup.ts around lines 37 to 40, the call to initNative() lacks
error handling and can crash the CLI; wrap the await initNative() call in a
try/catch, log a clear error message including the caught error details (use the
same logger pattern as doctor.ts or console.error), then exit with a non-zero
code (e.g., process.exit(1)) or rethrow after logging so the failure is handled
gracefully.

Comment on lines +254 to +281
// Pack all doc embeddings into a single buffer; offsets are element offsets
const packedChunks: Int8Array[] = [];
let totalElements = 0;

for (let i = 0; i < input.docs.length; i++) {
const doc = input.docs[i];
const col = doc.colbert;
let colbert: Int8Array;

let colbert: Int8Array;
if (col instanceof Int8Array) {
colbert = col;
} else if (Buffer.isBuffer(col)) {
colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (
col &&
typeof col === "object" &&
"type" in col &&
(col as any).type === "Buffer" &&
Array.isArray((col as any).data)
) {
// IPC serialization fallback (still copies, but unavoidable without SharedArrayBuffer)
colbert = new Int8Array((col as any).data);
} else if (Array.isArray(col)) {
colbert = new Int8Array(col);
} else {
colbert = new Int8Array(0);
}

const seqLen = Math.floor(colbert.length / input.colbertDim);
const docMatrix: Float32Array[] = [];
for (let i = 0; i < seqLen; i++) {
const start = i * input.colbertDim;
const row = new Float32Array(input.colbertDim);
for (let d = 0; d < input.colbertDim; d++) {
row[d] = (colbert[start + d] * doc.scale) / 127.0;
}
docMatrix.push(row);
}
const tokenIds =
Array.isArray(doc.token_ids) && doc.token_ids.length === seqLen
? doc.token_ids
: undefined;
return maxSim(queryMatrix, docMatrix, tokenIds);
const used = colbert.subarray(0, seqLen * input.colbertDim);

docOffsets.push(totalElements);
docLengths.push(seqLen);
candidateIndices.push(i);
packedChunks.push(used);
totalElements += used.length;
}

⚠️ Potential issue | 🟡 Minor

Clarify offset semantics: bytes vs. elements.

The implementation uses element offsets (line 276: docOffsets.push(totalElements), line 280: totalElements += used.length), but the RerankInput interface in src/lib/native/index.ts (line 203) documents docOffsets as "Byte offsets per doc."

For Int8Array, elements and bytes are equivalent (1 element = 1 byte), so the code is functionally correct. However, the inconsistent terminology is confusing.

Apply these changes for clarity:

In src/lib/native/index.ts, line 203:

   /** Token counts per doc */
   docLengths: number[];
-  /** Byte offsets per doc */
+  /** Element offsets per doc in docEmbeddings */
   docOffsets: number[];

In src/lib/workers/orchestrator.ts, line 254:

-    // Pack all doc embeddings into a single buffer; offsets are element offsets
+    // Pack all doc embeddings into a single buffer
     const packedChunks: Int8Array[] = [];
     let totalElements = 0;
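
A quick illustration of why the bytes-vs-elements distinction is harmless for `Int8Array` but would matter for wider element types:

```typescript
// For Int8Array, 1 element = 1 byte, so element and byte offsets coincide.
// For Float32Array, the byte offset is 4x the element offset.
const elementOffset = 4;

const i8Bytes = elementOffset * Int8Array.BYTES_PER_ELEMENT;
const f32Bytes = elementOffset * Float32Array.BYTES_PER_ELEMENT;
```

If the packed buffer were ever widened to `Float32Array`, the documented "byte offsets" and the computed element offsets would silently disagree by a factor of 4, which is why aligning the terminology now is worthwhile.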
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    // Pack all doc embeddings into a single buffer; offsets are element offsets
+    // Pack all doc embeddings into a single buffer
     const packedChunks: Int8Array[] = [];
     let totalElements = 0;
     for (let i = 0; i < input.docs.length; i++) {
       const doc = input.docs[i];
       const col = doc.colbert;
       let colbert: Int8Array;
       if (col instanceof Int8Array) {
         colbert = col;
       } else if (Buffer.isBuffer(col)) {
         colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
       } else if (
         col &&
         typeof col === "object" &&
         "type" in col &&
         (col as any).type === "Buffer" &&
         Array.isArray((col as any).data)
       ) {
         // IPC serialization fallback (still copies, but unavoidable without SharedArrayBuffer)
         colbert = new Int8Array((col as any).data);
       } else if (Array.isArray(col)) {
         colbert = new Int8Array(col);
       } else {
         colbert = new Int8Array(0);
       }
       const seqLen = Math.floor(colbert.length / input.colbertDim);
       const used = colbert.subarray(0, seqLen * input.colbertDim);
       docOffsets.push(totalElements);
       docLengths.push(seqLen);
       candidateIndices.push(i);
       packedChunks.push(used);
       totalElements += used.length;
     }
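To make the bytes-versus-elements point concrete, here is a small standalone sketch (not taken from the PR) showing where the two offset kinds coincide and where they diverge:

```typescript
// For Int8Array, one element is one byte, so element offsets and byte
// offsets coincide; for wider element types they differ by BYTES_PER_ELEMENT.
const i8 = new Int8Array([1, 2, 3, 4]);
const f32 = new Float32Array([1, 2, 3, 4]);

// Element offset of the third entry in each array:
const elementOffset = 2;

// Corresponding byte offsets:
const i8ByteOffset = elementOffset * i8.BYTES_PER_ELEMENT; // 2 * 1 = 2
const f32ByteOffset = elementOffset * f32.BYTES_PER_ELEMENT; // 2 * 4 = 8
```

This is why the current code is correct for Int8Array buffers, but the "byte offsets" wording would become wrong if the embedding element type ever widened.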


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
src/lib/workers/orchestrator.ts (1)

257-292: Offset semantics clarification needed (see past review).

As noted in a previous review, the code uses element offsets (line 285: docOffsets.push(totalElements)), but the RerankInput interface documents docOffsets as "Byte offsets per doc." For Int8Array, elements equal bytes, so the code is functionally correct, but the terminology should be clarified.

The previous review suggested updating the interface documentation in src/lib/native/index.ts to say "Element offsets per doc in docEmbeddings" instead of "Byte offsets per doc."

🧹 Nitpick comments (5)
src/lib/search/searcher.ts (3)

16-20: Constants are well-defined with reasonable defaults.

The static readonly constants provide clear tuning parameters for the search pipeline. Consider documenting the rationale for these specific values (e.g., why RERANK_CANDIDATES_K = 80 and FUSED_WEIGHT = 0.5) in comments for future maintainers.


21-75: FTS normalization looks good with one minor suggestion.

The stopword filtering and query normalization are well-implemented. The 16-token limit prevents pathological FTS parsing. One minor consideration: the minimum token length of 3 (line 70) could filter out meaningful short terms like "id", "io", or "db".
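If short terms matter for this codebase, one option is a small allowlist alongside the length threshold; a hypothetical sketch (keepToken and SHORT_TERM_ALLOWLIST are illustrative names, not the PR's API):

```typescript
// Hypothetical: keep tokens of length >= 3, plus a few known-meaningful
// short identifiers that the length filter would otherwise drop.
const SHORT_TERM_ALLOWLIST = new Set(["id", "io", "db", "fs", "ui"]);

function keepToken(token: string): boolean {
  const t = token.toLowerCase();
  return t.length >= 3 || SHORT_TERM_ALLOWLIST.has(t);
}
```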


418-438: Defensive handling for token_ids conversion.

The nested ternary for token_ids extraction (lines 423-428) handles multiple input formats correctly. However, the deep nesting reduces readability.

Consider extracting to a helper function:

+  private extractTokenIds(doc: any): number[] {
+    if (Array.isArray(doc.doc_token_ids)) {
+      return doc.doc_token_ids;
+    }
+    if (typeof doc.doc_token_ids?.toArray === "function") {
+      return (doc.doc_token_ids.toArray() ?? []).filter(
+        (v: unknown) => typeof v === "number"
+      );
+    }
+    return [];
+  }
tools/eval.ts (1)

533-540: Avoidance logic is correct but slightly redundant.

Line 536 checks avoidPath ? path.includes(avoidPath) : false, but avoidPath is always truthy due to the nullish coalescing on line 533. The condition avoidPath ? is technically redundant.

-  const avoidRank = response.data.findIndex((chunk) => {
-    const path = chunk.metadata?.path?.toLowerCase() || "";
-    return avoidPath ? path.includes(avoidPath) : false;
-  });
+  const avoidRank = response.data.findIndex((chunk) => {
+    const path = chunk.metadata?.path?.toLowerCase() || "";
+    return path.includes(avoidPath);
+  });
src/lib/native/index.ts (1)

79-91: Dense embedding extraction is correct but could use subarray.

The manual element-by-element copy works but Float32Array.prototype.subarray() followed by slice() would be more idiomatic and potentially faster.

   const vectors: Float32Array[] = [];
   for (let i = 0; i < result.count; i++) {
     const start = i * dim;
-    const vec = new Float32Array(dim);
-    for (let j = 0; j < dim; j++) {
-      vec[j] = result.embeddings[start + j];
-    }
+    const vec = result.embeddings.slice(start, start + dim);
     vectors.push(vec);
   }
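For reference, subarray() returns a view over the same ArrayBuffer while slice() copies, which is the trade-off behind this suggestion; a quick standalone sketch:

```typescript
const src = new Float32Array([1, 2, 3, 4]);

const view = src.subarray(0, 2); // shares src's ArrayBuffer
const copy = src.slice(0, 2);    // independent copy

src[0] = 99;
// view reflects the mutation; copy does not.
```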
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 30ffe4d and 7fabb4f.

📒 Files selected for processing (7)
  • src/commands/search.ts (5 hunks)
  • src/lib/native/index.ts (1 hunks)
  • src/lib/search/searcher.ts (10 hunks)
  • src/lib/store/types.ts (0 hunks)
  • src/lib/store/vector-db.ts (1 hunks)
  • src/lib/workers/orchestrator.ts (8 hunks)
  • tools/eval.ts (6 hunks)
💤 Files with no reviewable changes (1)
  • src/lib/store/types.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/store/vector-db.ts
🧰 Additional context used
🧬 Code graph analysis (1)
src/lib/native/index.ts (1)
src/config.ts (2)
  • MODEL_IDS (4-7)
  • CONFIG (18-26)
🔇 Additional comments (20)
src/lib/search/searcher.ts (4)

11-11: LGTM on orchestrator import migration.

The import correctly switches from worker-pool based encoding to the orchestrator-based encodeQuery and rerank functions, aligning with the native Rust backend migration.


195-231: Structure boost simplification is appropriate.

The removal of intent-based and complex type boosts aligns with the PR's goal of simplifying the search pipeline. The path-based penalties (tests: 0.5, tools: 0.35, docs: 0.6) are reasonable heuristics. The anchor penalty of 0.99 is minimal and appropriate for recall helpers.


278-289: Search method signature simplification looks correct.

The removal of the intent parameter and simplification of options aligns with the broader refactor. The default rerank: true is a sensible default for quality results.


481-499: Score normalization and confidence labeling are well-implemented.

The relative normalization against maxScore and the confidence thresholds (>0.8 High, >0.5 Medium, else Low) provide meaningful user-facing quality indicators. The guard against maxScore > 0 prevents division by zero.
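The scheme described above can be sketched as a standalone helper (the 0.8/0.5 thresholds are the ones under review; labelScores is an illustrative name):

```typescript
type Confidence = "High" | "Medium" | "Low";

// Normalize scores against the max and attach a confidence label.
function labelScores(scores: number[]): { score: number; confidence: Confidence }[] {
  const maxScore = Math.max(...scores, 0);
  return scores.map((s) => {
    // Guard against division by zero when all scores are <= 0.
    const rel = maxScore > 0 ? s / maxScore : 0;
    const confidence: Confidence = rel > 0.8 ? "High" : rel > 0.5 ? "Medium" : "Low";
    return { score: rel, confidence };
  });
}
```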

tools/eval.ts (3)

1-8: Import path corrections are appropriate.

The imports correctly navigate from tools/ to ../src/lib/ to access the library modules. This maintains consistency with the project structure.


17-18: Smart self-exclusion from eval results.

Using DEFAULT_AVOID_PATH = "tools/eval.ts" prevents the eval script itself from appearing in search results, which is a good practice since this file contains many query-related strings that could match incorrectly.


61-76: Eval cases correctly updated for native binding migration.

The test cases now reference src/lib/native/index.ts for native core-related functionality (loadNative, initNative, rerankColbert), aligning with the architectural shift from worker-based to native bindings.

src/commands/search.ts (4)

264-268: CLI option naming improvement.

The change from -m <max_count> to -m, --max-count <max_count> with a clear default of "5" improves CLI usability and self-documentation.


299-309: Options type correctly updated.

The maxCount: string type aligns with Commander's string-based option parsing. The removal of the skeleton option from the type reflects the broader skeleton functionality removal.


316-318: Robust limit validation.

The validation ensures a positive finite number with a sensible fallback to 5. This handles edge cases like NaN, negative numbers, and Infinity.


492-498: Consistent limit usage in local search path.

The limit variable is correctly passed to searcher.search(), matching the server-side behavior. The parameter ordering aligns with the updated Searcher.search signature.

src/lib/native/index.ts (3)

15-17: Module-level state is appropriate for singleton pattern.

The three state variables (native, initPromise, initialized) correctly implement the lazy-loading singleton pattern with initialization guards.


151-179: Hybrid embedding extraction handles all formats correctly.

The batch embedding function properly extracts dense vectors, ColBERT embeddings, and token IDs per document. The tokenCursor tracking ensures correct token ID slicing across documents.


231-261: Verify native rerankColbert parameter type expectations against encodeQueryColbert return type.

Line 235 converts input.queryEmbedding (typed as Float32Array) to Float64Array via Float64Array.from(input.queryEmbedding as any). This creates an inconsistency: encodeQueryColbert() at line 197 explicitly returns Float32Array, yet rerankColbert() immediately converts it to Float64Array.

According to NAPI-RS documentation, TypedArray types must match their Rust counterparts: mixing Float32Array and Float64Array does not trigger automatic reinterpretation, so the native function must expect one or the other. If the native rerankColbert function accepts Float32Array (matching encodeQueryColbert's output), the conversion introduces an unnecessary allocation and a needless widening of every element. The as any cast further obscures the intent.

Recommend verifying the Rust binding signature to confirm whether the conversion is required or should be removed to maintain type consistency.
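For context, Float64Array.from always allocates a new, wider buffer rather than reinterpreting the source; a minimal standalone sketch:

```typescript
const f32 = new Float32Array([0.1, 0.2, 0.3]);

// Allocates a fresh buffer with 8 bytes per element; nothing is shared
// with f32, and each f32 value is widened exactly to f64.
const f64 = Float64Array.from(f32);
```

The widening itself is lossless, so the cost is purely the extra allocation and copy; the open question is whether the native side actually requires f64 input.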

src/lib/workers/orchestrator.ts (6)

1-10: Clear module documentation.

The header comment clearly explains the module's responsibilities and the rationale for removing the worker pool (Rust ONNX Runtime is fast and stable).


64-74: Parallel initialization is appropriate.

Using Promise.all to initialize both the chunker and native models concurrently is efficient. The .finally() cleanup ensures initPromise is cleared even on failure.


183-201: Clean embedding integration.

The embedBatch call efficiently embeds all chunks in one native call. Converting ColBERT to Buffer for storage compatibility is appropriate. The doc_token_ids are correctly extracted from the hybrid embedding.


204-235: Query encoding correctly reshapes ColBERT output.

The function properly converts the flat ColBERT array into a 2D matrix. The calculation seqLen = colbertFlat.length / dim correctly determines the sequence length.


318-327: Score alignment preserves input order.

The mapping from result indices back to input order via candidateIndices.map ensures the Searcher receives scores in the expected order. This is critical for correct result ranking.


334-341: Singleton pattern is correct.

The lazy initialization of the orchestrator singleton is appropriate for this use case. The null check ensures only one instance is created.

Comment thread src/lib/native/index.ts
Comment on lines +36 to +52
export async function initNative(): Promise<void> {
if (initialized) return;
if (initPromise) return initPromise;

initPromise = (async () => {
const n = await loadNative();

if (!n.isInitialized()) {
n.initModels(MODEL_IDS.embed, MODEL_IDS.colbert);
}

initialized = true;
})();

await initPromise;
initPromise = null;
}

⚠️ Potential issue | 🟡 Minor

Idempotent initialization with race condition protection.

The initNative() function correctly handles concurrent calls by checking initPromise before starting initialization. However, setting initPromise = null on line 51 after awaiting could cause a race condition if another call starts during the brief window.

Consider keeping the promise cached or using a different pattern:

 export async function initNative(): Promise<void> {
   if (initialized) return;
   if (initPromise) return initPromise;
 
   initPromise = (async () => {
     const n = await loadNative();
 
     if (!n.isInitialized()) {
       n.initModels(MODEL_IDS.embed, MODEL_IDS.colbert);
     }
 
     initialized = true;
   })();
 
   await initPromise;
-  initPromise = null;
+  // Keep initPromise cached to handle concurrent calls during init
 }
🤖 Prompt for AI Agents
In src/lib/native/index.ts around lines 36–52, the code clears initPromise after
awaiting which opens a race where another caller could start a new init during
that brief gap; instead, keep initPromise set so concurrent callers continue
awaiting the same promise (i.e., remove the initPromise = null line) or
explicitly set it to a resolved Promise to represent completed initialization,
and wrap the initialization in try/catch so you only clear/reset initPromise on
failure (not on success) to allow retries on error but prevent concurrent
double-initialization.
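The pattern the prompt describes, cache the in-flight promise and reset it only on failure, can be sketched generically (makeInitOnce is a hypothetical helper, not the PR's API):

```typescript
// Concurrent callers share one in-flight promise; the cache is cleared
// only on failure so a later call can retry.
function makeInitOnce(init: () => Promise<void>): () => Promise<void> {
  let initPromise: Promise<void> | null = null;
  return () => {
    if (!initPromise) {
      initPromise = init().catch((err) => {
        initPromise = null; // allow retry after failure
        throw err;
      });
    }
    return initPromise;
  };
}
```

Because the promise stays cached on success, there is no window in which a second caller can kick off a duplicate initialization.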


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 15

♻️ Duplicate comments (1)
src/lib/native/index.ts (1)

36-52: Race condition when clearing initPromise after initialization.

Setting initPromise = null on line 51 after awaiting could cause a race condition if another call starts during that brief window. This was flagged in a previous review.

Apply the previously suggested fix:

   await initPromise;
-  initPromise = null;
+  // Keep initPromise cached to handle concurrent calls during init
 }
🧹 Nitpick comments (23)
.gitignore (1)

32-33: Minor: Remove redundant target directory pattern.

Line 33's **/target/ already matches osgrep-core/target/ from line 32, making line 32 redundant. Consider removing line 32 to keep the patterns concise.

Apply this diff to remove redundancy:

 # Native core (Rust / N-API) build outputs
-osgrep-core/target/
 **/target/
 osgrep-core/*.node
osgrep-core/tsconfig.json (1)

8-8: Consider removing unused jsx configuration.

The jsx: "react-jsx" option appears unnecessary for a native binding package that likely doesn't contain React components. Removing it would simplify the configuration.

osgrep-core/package.json (1)

9-17: Consider adding Linux ARM64 target for broader platform support.

The triples configuration is missing aarch64-unknown-linux-gnu, which is increasingly common for AWS Graviton instances, Docker containers on Apple Silicon, and other ARM-based Linux deployments.

       "additional": [
         "aarch64-apple-darwin",
         "x86_64-apple-darwin",
         "x86_64-unknown-linux-gnu",
-        "x86_64-pc-windows-msvc"
+        "x86_64-pc-windows-msvc",
+        "aarch64-unknown-linux-gnu"
       ]
osgrep-core/src/lib.rs (2)

37-55: TOCTOU race in initialization is benign but could be cleaner.

The pattern if MODEL.get().is_none() { MODEL.set(...) } has a race window where multiple threads could both pass the check and attempt initialization. While OnceCell::set safely returns an error if already set, this means redundant model loading could occur.

Consider using OnceCell::get_or_try_init for atomic initialization:

 pub fn init_models(dense_repo: String, colbert_repo: String) -> Result<()> {
-    // Initialize dense model
-    if DENSE_MODEL.get().is_none() {
-        let encoder = DenseEncoderOrt::load_from_hf(&dense_repo, 384)
-            .map_err(|e| Error::from_reason(format!("Failed to load dense model: {:?}", e)))?;
-        DENSE_MODEL.set(Mutex::new(encoder))
-            .map_err(|_| Error::from_reason("Dense model already initialized"))?;
-    }
+    DENSE_MODEL.get_or_try_init(|| {
+        DenseEncoderOrt::load_from_hf(&dense_repo, 384)
+            .map(Mutex::new)
+            .map_err(|e| Error::from_reason(format!("Failed to load dense model: {:?}", e)))
+    })?;

-    // Initialize ColBERT model
-    if COLBERT_MODEL.get().is_none() {
-        let encoder = ColbertEncoderOrt::load_from_hf(&colbert_repo, 48)
-            .map_err(|e| Error::from_reason(format!("Failed to load ColBERT model: {:?}", e)))?;
-        COLBERT_MODEL.set(Mutex::new(encoder))
-            .map_err(|_| Error::from_reason("ColBERT model already initialized"))?;
-    }
+    COLBERT_MODEL.get_or_try_init(|| {
+        ColbertEncoderOrt::load_from_hf(&colbert_repo, 48)
+            .map(Mutex::new)
+            .map_err(|e| Error::from_reason(format!("Failed to load ColBERT model: {:?}", e)))
+    })?;

     Ok(())
 }

254-265: Unnecessary clone and double mutex acquisition in embed_batch.

texts.clone() creates an extra allocation, and calling embed_dense then embed_colbert_packed acquires/releases each mutex separately. For better performance with large batches, consider inlining the logic to avoid the clone and reduce lock contention.

 pub fn embed_batch(texts: Vec<String>) -> Result<EmbedResult> {
-    let dense = embed_dense(texts.clone())?;
-    let colbert = embed_colbert_packed(texts)?;
+    // Inline to avoid clone and reduce lock acquisitions
+    let dense_model = DENSE_MODEL.get()
+        .ok_or_else(|| Error::from_reason("Models not initialized. Call init_models() first."))?;
+    let colbert_model = COLBERT_MODEL.get()
+        .ok_or_else(|| Error::from_reason("Models not initialized. Call init_models() first."))?;
+
+    let embeddings_f32 = {
+        let mut encoder = dense_model.lock()
+            .map_err(|e| Error::from_reason(format!("Failed to lock dense model: {:?}", e)))?;
+        encoder.encode_batch(texts.clone(), true)
+            .map_err(|e| Error::from_reason(format!("Dense encoding failed: {:?}", e)))?
+    };
+
+    let packed = {
+        let mut encoder = colbert_model.lock()
+            .map_err(|e| Error::from_reason(format!("Failed to lock ColBERT model: {:?}", e)))?;
+        encoder.encode_docs_packed(&texts)
+            .map_err(|e| Error::from_reason(format!("ColBERT encoding failed: {:?}", e)))?
+    };
+
+    let embeddings_i8: Vec<i8> = packed.embeddings.iter()
+        .map(|&x| (x * 127.0).clamp(-128.0, 127.0) as i8)
+        .collect();

     Ok(EmbedResult {
-        dense: dense.embeddings,
-        colbert_embeddings: colbert.embeddings,
-        colbert_token_ids: colbert.token_ids,
-        colbert_lengths: colbert.lengths,
-        colbert_offsets: colbert.offsets,
+        dense: embeddings_f32.iter().map(|&x| x as f64).collect(),
+        colbert_embeddings: embeddings_i8,
+        colbert_token_ids: packed.token_ids,
+        colbert_lengths: packed.lengths,
+        colbert_offsets: packed.offsets,
     })
 }

Note: The clone is still needed since both encoders need the texts, but this avoids acquiring each mutex twice when called through the individual functions.

osgrep-core/src/colbert_ort.rs (4)

76-79: Silent fallback for missing special tokens could mask configuration issues.

Using unwrap_or(&0) for CLS, SEP, MASK, and PAD tokens will silently default to token ID 0 if these tokens are missing from the vocabulary. This could lead to incorrect encoding behavior that's hard to debug. Consider returning an error or at least logging a warning when critical tokens are missing.

-        let cls_id = *vocab.get("[CLS]").unwrap_or(&0);
-        let sep_id = *vocab.get("[SEP]").unwrap_or(&0);
-        let mask_id = *vocab.get("[MASK]").unwrap_or(&0);
-        let pad_id = *vocab.get("[PAD]").unwrap_or(&mask_id); // Use MASK as PAD if no PAD
+        let cls_id = *vocab.get("[CLS]")
+            .ok_or_else(|| anyhow::anyhow!("Tokenizer missing [CLS] token"))?;
+        let sep_id = *vocab.get("[SEP]")
+            .ok_or_else(|| anyhow::anyhow!("Tokenizer missing [SEP] token"))?;
+        let mask_id = *vocab.get("[MASK]")
+            .ok_or_else(|| anyhow::anyhow!("Tokenizer missing [MASK] token"))?;
+        let pad_id = *vocab.get("[PAD]").unwrap_or(&mask_id); // Use MASK as PAD if no PAD

496-502: Token offset calculation is O(n) per document, leading to O(n²) complexity.

The token_offset is recomputed by summing all previous lengths for each document. Since PackedDocEmbeddings already stores offsets for embeddings, consider adding a token_offsets field to store precomputed token offsets, making lookups O(1).

 pub struct PackedDocEmbeddings {
     pub embeddings: Vec<f32>,
     pub token_ids: Vec<u32>,
     pub lengths: Vec<u32>,
     pub offsets: Vec<u32>,
+    /// Token ID offsets for each doc (for fast lookup)
+    pub token_offsets: Vec<u32>,
     pub hidden_size: usize,
 }

Then update encode_docs_packed to populate token_offsets alongside offsets, and simplify score_packed:

-            let token_offset: usize = packed.offsets[..doc_idx]
-                .iter()
-                .zip(&packed.lengths[..doc_idx])
-                .map(|(&off, &len)| len as usize)
-                .sum();
+            let token_offset = packed.token_offsets[doc_idx] as usize;

172-183: Consider extracting L2 normalization to a helper function.

The L2 normalization logic is duplicated in encode_query (lines 172-183) and encode_docs (lines 286-297). Extracting this to a helper would reduce duplication and make future changes easier.

/// L2 normalize a single token embedding in-place within the destination buffer
fn l2_normalize_token(
    src: &[f32],
    dst: &mut [f32],
    hidden_size: usize,
) {
    let mut sum_sq = 0.0f32;
    for d in 0..hidden_size {
        sum_sq += src[d] * src[d];
    }
    let norm = sum_sq.sqrt().max(1e-12);
    for d in 0..hidden_size {
        dst[d] = src[d] / norm;
    }
}

Also applies to: 286-297


65-68: Thread count differs from dense_ort.rs.

This uses 8 intra-op threads while dense_ort.rs uses 4. If intentional due to ColBERT's heavier workload, consider adding a comment explaining the choice. Otherwise, consider using a shared constant for consistency.

osgrep-core/test.mjs (2)

24-27: Smoke test lacks assertions.

The test logs values but doesn't assert correctness. If embedDense returns malformed data, the test would still print "All tests passed". Consider adding basic assertions.

   const dense = embedDense(texts);
   console.log(`Dense embed ${texts.length} texts in ${Date.now() - t0}ms`);
   console.log(`  count: ${dense.count}`);
-  console.log(`  embeddings length: ${dense.embeddings.length} (expected: ${texts.length * 384})`);
+  const expectedLength = texts.length * 384;
+  console.log(`  embeddings length: ${dense.embeddings.length} (expected: ${expectedLength})`);
+  if (dense.count !== texts.length) {
+    throw new Error(`Expected count ${texts.length}, got ${dense.count}`);
+  }
+  if (dense.embeddings.length !== expectedLength) {
+    throw new Error(`Expected embeddings length ${expectedLength}, got ${dense.embeddings.length}`);
+  }

4-5: Consider documenting model dimensions.

The expected dimension (384) is hardcoded on line 27. Consider adding a comment here noting the expected dimensions for each model, or querying the dimension from the API if available.

-const DENSE_REPO = 'onnx-community/granite-embedding-30m-english-ONNX';
-const COLBERT_REPO = 'ryandono/mxbai-edge-colbert-v0-17m-onnx-int8';
+const DENSE_REPO = 'onnx-community/granite-embedding-30m-english-ONNX';  // 384-dim
+const COLBERT_REPO = 'ryandono/mxbai-edge-colbert-v0-17m-onnx-int8';    // 48-dim ColBERT
osgrep-core/bench/rerank.ts (1)

51-57: Consider bounds checking for candidate preview.

If candidates[idx] is shorter than 80 characters, .slice(0, 80) is safe, but if idx is out of bounds due to a bug in the native code, this would throw. Consider adding a guard.

   for (let i = 0; i < result.indices.length; i++) {
     const idx = result.indices[i];
     const score = result.scores[i];
-    const preview = candidates[idx].slice(0, 80).replace(/\n/g, " ").trim();
+    const preview = idx < candidates.length 
+      ? candidates[idx].slice(0, 80).replace(/\n/g, " ").trim()
+      : "<invalid index>";
     console.log(`    [${i + 1}] idx=${idx}, score=${score.toFixed(4)}`);
     console.log(`        "${preview}..."`);
   }
osgrep-core/bench/rerank-ort.ts (1)

21-27: Redundant file enumeration.

walkFiles and isCodeFile are called here for logging, but loadRepoChunks internally calls them again. Consider either removing the redundant calls or refactoring loadRepoChunks to accept pre-computed file lists.

   // 2. Load chunks
   console.log("Loading code files and chunking...");
-  const files = walkFiles(REPO_DIR);
-  const codeFiles = files.filter(isCodeFile);
   const allChunks = loadRepoChunks(REPO_DIR, 80);
 
-  console.log(`  Files found: ${files.length}`);
-  console.log(`  Code files: ${codeFiles.length}`);
   console.log(`  Total chunks: ${allChunks.length}`);
   console.log(`  Candidates per query: ${NUM_CANDIDATES}`);
   console.log(`  Top-K: ${TOP_K}`);
osgrep-core/bench/dense.ts (1)

21-29: Same redundant file enumeration pattern.

Like in rerank-ort.ts, walkFiles and isCodeFile are called separately just for logging, while loadRepoChunks recomputes them internally. Consider refactoring for consistency across benchmarks.

osgrep-core/index.js (1)

19-31: Linux musl/gnu detection: prefer “try both” over brittle detection.
Reading the ldd binary to detect musl is fragile, and mis-detection forces a hard failure. Consider attempting musl load, then falling back to gnu (or vice versa) when arch === "x64".

Also applies to: 61-77
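A "try both" strategy can be factored into an ordered candidate list plus a guarded load loop; a sketch under the assumption that binding filenames follow the napi-rs naming convention (the names here are illustrative):

```typescript
// Hypothetical candidate binding filenames for linux-x64, in preference order.
function linuxX64Candidates(preferMusl: boolean): string[] {
  const musl = "osgrep-core.linux-x64-musl.node";
  const gnu = "osgrep-core.linux-x64-gnu.node";
  return preferMusl ? [musl, gnu] : [gnu, musl];
}

// Try each candidate in order; surface the last error only if all fail.
function loadFirst<T>(candidates: string[], load: (name: string) => T): T {
  let lastErr: unknown;
  for (const name of candidates) {
    try {
      return load(name);
    } catch (err) {
      lastErr = err; // fall through to the next candidate
    }
  }
  throw lastErr;
}
```

With this shape, a mis-detected libc costs one failed require instead of a hard failure.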

osgrep-core/bench/codeatlas-system.ts (1)

276-300: Watch out for memory blowups: storing embeddings as number[][] built from per-chunk slice() calls copies every embedding.
For larger repos, this can get expensive fast (many chunks × 384 dims × JS array overhead). Consider keeping a single flat Float32Array (or number[]) plus hiddenSize and computing cosine similarity via offsets, avoiding per-chunk array allocations.
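The flat-buffer approach can be sketched as follows (cosineAt is an illustrative name; it assumes embeddings are packed row-major with hiddenSize floats per chunk):

```typescript
// Cosine similarity between a query vector and one chunk of a packed buffer,
// computed via offsets without materializing a per-chunk array.
function cosineAt(
  query: Float32Array,
  packed: Float32Array,
  hiddenSize: number,
  chunk: number
): number {
  const base = chunk * hiddenSize;
  let dot = 0;
  let qNorm = 0;
  let dNorm = 0;
  for (let d = 0; d < hiddenSize; d++) {
    const q = query[d];
    const v = packed[base + d];
    dot += q * v;
    qNorm += q * q;
    dNorm += v * v;
  }
  const denom = Math.sqrt(qNorm) * Math.sqrt(dNorm);
  return denom > 0 ? dot / denom : 0;
}
```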

osgrep-core/src/dense_ort.rs (3)

21-66: Silent failure when downloading external data file.

Line 31 silently ignores errors when downloading model.onnx_data. While the comment says "ignore if not found," this could mask real download failures (network issues, auth problems) for models that require the external data file.

Consider logging a debug message or checking if the model actually needs the external data:

-        let _ = repo.get("onnx/model.onnx_data"); // External data, ignore if not found
+        // External data file - optional for some models
+        if let Err(_) = repo.get("onnx/model.onnx_data") {
+            // External data not found - this is expected for smaller models
+        }

43-56: Duplicate tokenizer configuration logic between load_from_hf and load.

The truncation and padding configuration (lines 46-56 and 82-91) is duplicated. Extract this into a helper function to maintain DRY.

fn configure_tokenizer(tokenizer: &mut Tokenizer, max_len: usize) -> anyhow::Result<()> {
    tokenizer.with_truncation(Some(tokenizers::TruncationParams {
        max_length: max_len,
        ..Default::default()
    })).map_err(|e| anyhow::anyhow!("Failed to set truncation: {}", e))?;

    tokenizer.with_padding(Some(tokenizers::PaddingParams {
        strategy: tokenizers::PaddingStrategy::BatchLongest,
        ..Default::default()
    }));

    Ok(())
}

Also applies to: 79-91


103-111: Consider avoiding the texts.clone() in tokenization.

The texts vector is cloned unnecessarily since encode_batch should accept references.

-    pub fn encode_batch(&mut self, texts: Vec<String>, normalize: bool) -> anyhow::Result<Vec<f32>> {
+    pub fn encode_batch(&mut self, texts: &[String], normalize: bool) -> anyhow::Result<Vec<f32>> {
         if texts.is_empty() {
             return Err(anyhow::anyhow!("Empty input texts"));
         }
 
         // Tokenize
         let encodings = self.tokenizer
-            .encode_batch(texts.clone(), true)
+            .encode_batch(texts.to_vec(), true)
             .map_err(|e| anyhow::anyhow!("Tokenization failed: {}", e))?;

Note: This depends on caller patterns; if texts ownership isn't needed downstream, accepting a slice avoids an allocation at call sites.

osgrep-core/index.d.ts (2)

3-8: Hardcoded dimension 384 in comments may become stale.

The comments reference [batch_size * 384] which couples documentation to a specific model configuration. Consider referencing VECTOR_DIM conceptually or noting it's model-dependent.

 export interface DenseResult {
-  /** Flat array of embeddings [batch_size * 384] */
+  /** Flat array of embeddings [batch_size * VECTOR_DIM] */
   embeddings: number[];

Also applies to: 28-39


10-19: Inconsistent token-ID field naming across interfaces.

The ColbertPackedResult interface uses tokenIds while EmbedResult uses colbertTokenIds. This inconsistency could cause confusion.

Consider aligning the naming: use tokenIds in both, or give both fields the colbert prefix.

src/lib/native/index.ts (2)

72-91: Dense embedding extraction is correct but could use Float32Array.from for clarity.

The manual loop works but Float32Array.prototype.subarray or slice could be more idiomatic.

   const vectors: Float32Array[] = [];
   for (let i = 0; i < result.count; i++) {
     const start = i * dim;
-    const vec = new Float32Array(dim);
-    for (let j = 0; j < dim; j++) {
-      vec[j] = result.embeddings[start + j];
-    }
+    // Create a copy of the slice (subarray would share the buffer)
+    const vec = new Float32Array(result.embeddings.slice(start, start + dim));
     vectors.push(vec);
   }

Note: This assumes result.embeddings is array-like with a slice method.


221-226: RerankResult interface duplicates osgrep-core/index.d.ts definition.

Both this file and osgrep-core/index.d.ts define RerankResult. Consider importing from the native binding types to maintain a single source of truth.

+import type { RerankResult as NativeRerankResult } from "osgrep-core";
+
+// Re-export for consumers
+export type RerankResult = NativeRerankResult;
-export interface RerankResult {
-  /** Original indices of top-k docs */
-  indices: number[];
-  /** MaxSim scores */
-  scores: number[];
-}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 7fabb4f and 1e936d2.

⛔ Files ignored due to path filters (3)
  • osgrep-core/Cargo.lock is excluded by !**/*.lock
  • osgrep-core/bun.lockb is excluded by !**/bun.lockb
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (24)
  • .gitignore (2 hunks)
  • osgrep-core/.gitignore (1 hunks)
  • osgrep-core/Cargo.toml (1 hunks)
  • osgrep-core/bench/PARITY.md (1 hunks)
  • osgrep-core/bench/_util.ts (1 hunks)
  • osgrep-core/bench/codeatlas-system.ts (1 hunks)
  • osgrep-core/bench/colbert-parity-rust.ts (1 hunks)
  • osgrep-core/bench/colbert-parity.ts (1 hunks)
  • osgrep-core/bench/dense-ort.ts (1 hunks)
  • osgrep-core/bench/dense.ts (1 hunks)
  • osgrep-core/bench/rerank-ort.ts (1 hunks)
  • osgrep-core/bench/rerank-preindexed.ts (1 hunks)
  • osgrep-core/bench/rerank.ts (1 hunks)
  • osgrep-core/build.rs (1 hunks)
  • osgrep-core/index.d.ts (1 hunks)
  • osgrep-core/index.js (1 hunks)
  • osgrep-core/package.json (1 hunks)
  • osgrep-core/src/colbert_ort.rs (1 hunks)
  • osgrep-core/src/dense_ort.rs (1 hunks)
  • osgrep-core/src/lib.rs (1 hunks)
  • osgrep-core/test.mjs (1 hunks)
  • osgrep-core/tsconfig.json (1 hunks)
  • package.json (1 hunks)
  • src/lib/native/index.ts (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • osgrep-core/.gitignore
  • osgrep-core/bench/PARITY.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • package.json
🧰 Additional context used
🧬 Code graph analysis (13)
osgrep-core/bench/dense.ts (1)
osgrep-core/bench/_util.ts (4)
  • cloneRepoIfMissing (8-15)
  • walkFiles (21-43)
  • isCodeFile (48-62)
  • loadRepoChunks (107-121)
osgrep-core/bench/rerank-preindexed.ts (1)
osgrep-core/bench/_util.ts (4)
  • cloneRepoIfMissing (8-15)
  • walkFiles (21-43)
  • isCodeFile (48-62)
  • loadRepoChunks (107-121)
osgrep-core/bench/codeatlas-system.ts (1)
osgrep-core/bench/_util.ts (3)
  • ChunkMeta (126-132)
  • chunkOverlapsSpan (198-219)
  • loadRepoChunksWithMeta (142-192)
osgrep-core/bench/rerank-ort.ts (1)
osgrep-core/bench/_util.ts (4)
  • cloneRepoIfMissing (8-15)
  • walkFiles (21-43)
  • isCodeFile (48-62)
  • loadRepoChunks (107-121)
osgrep-core/test.mjs (1)
src/lib/native/index.ts (2)
  • embedDense (72-91)
  • embedBatch (139-182)
osgrep-core/bench/rerank.ts (1)
osgrep-core/bench/_util.ts (3)
  • cloneRepoIfMissing (8-15)
  • loadRepoChunks (107-121)
  • pickCandidates (100-102)
osgrep-core/src/dense_ort.rs (2)
osgrep-core/src/colbert_ort.rs (3)
  • log_native (7-11)
  • outputs (163-163)
  • outputs (271-271)
osgrep-core/test.mjs (2)
  • texts (22-22)
  • start (15-15)
osgrep-core/bench/dense-ort.ts (3)
osgrep-core/build.rs (1)
  • main (3-5)
osgrep-core/bench/_util.ts (4)
  • cloneRepoIfMissing (8-15)
  • walkFiles (21-43)
  • isCodeFile (48-62)
  • loadRepoChunks (107-121)
osgrep-core/index.js (2)
  • process (14-14)
  • process (28-28)
osgrep-core/bench/colbert-parity.ts (1)
osgrep-core/bench/_util.ts (1)
  • loadRepoChunksWithMeta (142-192)
osgrep-core/index.d.ts (1)
src/lib/native/index.ts (5)
  • RerankResult (221-226)
  • embedDense (72-91)
  • encodeQueryColbert (192-198)
  • rerankColbert (231-261)
  • embedBatch (139-182)
osgrep-core/src/colbert_ort.rs (1)
osgrep-core/src/dense_ort.rs (3)
  • log_native (6-10)
  • load_from_hf (21-66)
  • hidden_size (205-207)
osgrep-core/src/lib.rs (3)
osgrep-core/src/colbert_ort.rs (1)
  • load_from_hf (37-103)
osgrep-core/src/dense_ort.rs (2)
  • load_from_hf (21-66)
  • hidden_size (205-207)
osgrep-core/index.d.ts (4)
  • DenseResult (3-8)
  • ColbertPackedResult (10-19)
  • RerankResult (21-26)
  • EmbedResult (28-39)
src/lib/native/index.ts (2)
src/config.ts (2)
  • MODEL_IDS (4-7)
  • CONFIG (18-26)
osgrep-core/index.d.ts (1)
  • RerankResult (21-26)
🔇 Additional comments (30)
.gitignore (1)

31-37: Excellent addition of Rust backend build outputs.

The new ignores properly account for the osgrep-core Rust crate structure and its native N-API bindings:

  • osgrep-core/*.node captures compiled native modules
  • osgrep-core/bench/opencode/ handles the optional nested benchmark repository
  • Patterns are syntactically correct and appropriately scoped
osgrep-core/build.rs (1)

1-5: LGTM!

Standard napi-rs build script setup. This correctly initializes the N-API build-time integration required for generating native bindings.

osgrep-core/Cargo.toml (1)

1-32: LGTM on the overall manifest structure.

The package configuration, dependency choices (napi-rs, tokenizers, hf-hub), and release profile optimizations (LTO, opt-level 3) are appropriate for a native performance core.

osgrep-core/tsconfig.json (1)

1-27: LGTM on the TypeScript configuration.

The ESNext/bundler-mode settings align well with modern Node.js native binding development. Strict mode with selective relaxation of unused-variable checks is reasonable for binding code.

osgrep-core/package.json (1)

1-26: LGTM on package manifest structure.

The ESM configuration, build scripts, and N-API settings are correctly structured for a native Rust binding package.

osgrep-core/src/lib.rs (2)

122-125: Verify quantization symmetry for negative values.

The quantization uses x * 127.0 clamped to [-128, 127], but dequantization (line 203) divides by 127.0. This creates slight asymmetry for values near -1.0 (they map to -127, not -128). This is a common trade-off, but ensure it aligns with your precision requirements.
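The asymmetry is easy to see in a small sketch (TypeScript here for illustration only; the actual Rust code may round differently):

```typescript
// Illustrative int8 quantization as described above: scale by 127,
// round, clamp to [-128, 127]; dequantization divides by 127.
function quantize(x: number): number {
  return Math.max(-128, Math.min(127, Math.round(x * 127)));
}

function dequantize(q: number): number {
  return q / 127;
}

// -1.0 maps to -127 and round-trips exactly; -128 is only reachable
// for out-of-range inputs (below roughly -1.004), which then
// dequantize to slightly less than -1.0.
```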


1-265: LGTM on overall architecture.

The native core design with global singleton models, proper mutex synchronization, and clean N-API bindings is well-structured. The separation of dense embeddings, ColBERT indexing, and reranking into distinct APIs provides a clear interface for the Node.js layer.

osgrep-core/src/colbert_ort.rs (1)

311-343: MaxSim implementation looks correct.

The handling of skiplist tokens and the edge case where all doc tokens are skipped (contributing 0 rather than a negative score) is reasonable for ColBERT scoring semantics.
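The described semantics can be sketched as follows (hypothetical shapes and float arrays for clarity; the Rust code operates on packed int8 buffers):

```typescript
// MaxSim sketch: for each query token, take the max dot-product over
// doc tokens whose ids are not in the skiplist; sum over query tokens.
// When every doc token is skipped, the query token contributes 0.
function maxSim(
  query: number[][],     // [qTokens][dim]
  doc: number[][],       // [dTokens][dim]
  docTokenIds: number[], // token id per doc row
  skiplist: Set<number>, // punctuation/special token ids to ignore
): number {
  let score = 0;
  for (const q of query) {
    let best = Number.NEGATIVE_INFINITY;
    for (let i = 0; i < doc.length; i++) {
      if (skiplist.has(docTokenIds[i])) continue;
      let dot = 0;
      for (let k = 0; k < q.length; k++) dot += q[k] * (doc[i][k] ?? 0);
      if (dot > best) best = dot;
    }
    // All doc tokens skipped: contribute 0 rather than -Infinity.
    score += best === Number.NEGATIVE_INFINITY ? 0 : best;
  }
  return score;
}
```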

osgrep-core/test.mjs (1)

7-42: Smoke test structure is appropriate.

The test exercises the core API surface (initialization, dense embedding, batch embedding) and provides useful timing information. The async/catch pattern is correct for error handling.

osgrep-core/bench/rerank.ts (1)

1-5: Imports and setup look correct.

The benchmark imports utilities from _util and native bindings from the index. The @ts-ignore is appropriate for build-time generated code.

osgrep-core/bench/rerank-ort.ts (2)

56-74: Benchmark loop implementation is clean.

The loop correctly times each query, accumulates results, and provides useful intermediate output. The use of Array.from ensures the typed arrays are converted to regular arrays for storage.


76-84: Summary metrics are well-designed.

Reporting total time, average time per query, and throughput provides useful performance characterization. The calculations are correct.
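The three metrics reduce to simple arithmetic; a hypothetical helper (names are illustrative, not taken from the benchmark):

```typescript
// Summary metrics as described: total time, average per query, and
// throughput in queries/sec.
function summarize(totalMs: number, queryCount: number) {
  return {
    totalMs,
    avgMsPerQuery: totalMs / queryCount,
    queriesPerSec: (queryCount / totalMs) * 1000,
  };
}
```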

osgrep-core/bench/dense.ts (2)

42-53: Batched processing with progress indicator is well-implemented.

The batch processing loop correctly handles the last partial batch via slice, and the progress indicator using \r provides a good user experience. The throughput calculation during progress is a nice touch.


55-66: Results summary provides useful metrics.

The final summary includes all relevant metrics (file count, chunk count, time, throughput, checksum). The checksum enables reproducibility validation across runs.

osgrep-core/bench/colbert-parity-rust.ts (1)

82-95: Verify colbertPreindexDocs() resets state per call (parity correctness depends on it).
This script assumes a fresh “local packed store per query”, but if the native layer appends docs, indices/ranking parity will be invalid.

osgrep-core/src/dense_ort.rs (4)

6-10: No-op logging function is consistent with codebase pattern.

This matches the same pattern used in colbert_ort.rs. Consider extracting to a shared module if more files adopt this pattern.


159-179: Mean pooling implementation is correct.

The mean pooling logic properly:

  • Accumulates embeddings weighted by attention mask
  • Uses max(1e-9) to avoid division by zero
  • Iterates correctly over batch, sequence, and hidden dimensions

The indexing i * max_len * self.hidden_size + j * self.hidden_size + k correctly accesses [batch, seq, hidden] layout.


182-193: L2 normalization implementation looks good.

Uses max(1e-12) to prevent division by zero, which is appropriate for numerical stability.
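The two steps described above — mask-weighted mean pooling with a max(1e-9) guard, then L2 normalization with max(1e-12) — can be sketched for a single sequence (TypeScript for illustration; the Rust code works over a flat [batch, seq, hidden] buffer):

```typescript
// Mask-weighted mean pooling over a flat [seqLen * hiddenSize] buffer,
// followed by L2 normalization. Epsilons mirror the review description.
function meanPoolAndNormalize(
  hidden: Float32Array, // [seqLen * hiddenSize]
  mask: number[],       // attention mask, length seqLen
  hiddenSize: number,
): Float32Array {
  const out = new Float32Array(hiddenSize);
  let maskSum = 0;
  for (let j = 0; j < mask.length; j++) {
    if (!mask[j]) continue;
    maskSum += 1;
    for (let k = 0; k < hiddenSize; k++) out[k] += hidden[j * hiddenSize + k];
  }
  const denom = Math.max(maskSum, 1e-9); // avoid division by zero
  for (let k = 0; k < hiddenSize; k++) out[k] /= denom;

  let norm = 0;
  for (let k = 0; k < hiddenSize; k++) norm += out[k] * out[k];
  norm = Math.max(Math.sqrt(norm), 1e-12); // numerical stability
  for (let k = 0; k < hiddenSize; k++) out[k] /= norm;
  return out;
}
```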


155-157: No verification needed. The try_extract_tensor::<f32>() API correctly returns a tuple (&Shape, &[T]) where index 1 is the data slice. This usage is consistent across dense_ort.rs and colbert_ort.rs and matches the documented API for ort 2.0.0-rc.10.

osgrep-core/bench/_util.ts (6)

21-43: walkFiles implementation is clean and handles common ignored directories.

Good use of a Set for O(1) lookup and recursive walking pattern.


48-62: Comprehensive code file extension list.

The extension set covers most common languages. Consider adding .dockerfile, .makefile, or Makefile (no extension) if broader coverage is needed, but the current list is reasonable.


84-95: readFileSafe appropriately guards against large files and handles errors.

The 1MB limit is sensible for code files. The catch block with empty object handles TypeScript's unknown error type.


100-102: pickCandidates is a simple slice - consider documenting that order is preserved.

The function name suggests selection but it's just taking the first k elements. This is fine for benchmarks but the behavior should be clear to callers.


198-218: chunkOverlapsSpan path matching logic handles CodeAtlas format variations well.

The three-way match (exact, chunk ends with span, span ends with chunk) handles common path format inconsistencies. The line overlap check is correct for inclusive ranges.
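The inclusive-range overlap test itself amounts to a one-liner (hypothetical helper mirroring the logic described, not the actual _util.ts code):

```typescript
// Inclusive 1-indexed line ranges [aStart, aEnd] and [bStart, bEnd]
// overlap iff each starts no later than the other ends.
function linesOverlap(aStart: number, aEnd: number, bStart: number, bEnd: number): boolean {
  return aStart <= bEnd && bStart <= aEnd;
}
```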


175-188: Off-by-one check in endLine calculation.

Line 184 uses Math.min(i + linesPerChunk, lines.length) which is correct for the last chunk. However, for non-last chunks, i + linesPerChunk would exceed the actual lines in the chunk by 1 since slice(i, i + linesPerChunk) is exclusive on the end.

For example, with linesPerChunk=80 and chunk starting at line 0:

  • chunkLines = lines.slice(0, 80) → lines 0-79 (80 lines)
  • endLine = Math.min(0 + 80, lines.length) → 80 (should be 79 for inclusive)
         if (text.trim().length > 0) {
           allChunks.push({
             id: chunkId++,
             file: relativePath,
             startLine: i + 1,  // 1-indexed
-            endLine: Math.min(i + linesPerChunk, lines.length),  // 1-indexed, inclusive
+            endLine: Math.min(i + chunkLines.length, lines.length),  // 1-indexed, inclusive
             text,
           });
         }

Likely an incorrect or invalid review comment.

osgrep-core/index.d.ts (1)

41-60: Type declarations align with native binding usage.

The function signatures match the usage in src/lib/native/index.ts and the flexible union types (Int8Array | number[], Uint32Array | number[]) accommodate both typed arrays and plain arrays from the native binding.

src/lib/native/index.ts (4)

19-31: loadNative correctly caches the native module.

The lazy loading pattern with caching avoids redundant dynamic imports. Error message is helpful for debugging build issues.


110-121: embedColbert properly converts native result to typed arrays.

Clean extraction pattern. The interface ColbertPacked correctly documents the quantized INT8 format.


139-182: embedBatch correctly assembles hybrid embeddings per document.

The token cursor tracking for tokenIds is well-handled. One observation: new Uint32Array(tokenSlice) on line 175 creates a copy, which is appropriate since subarray shares the buffer.
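The view-vs-copy distinction is easy to demonstrate:

```typescript
// subarray() returns a view over the same ArrayBuffer, while
// new Uint32Array(view) allocates independent storage.
const backing = new Uint32Array([1, 2, 3, 4]);
const view = backing.subarray(1, 3); // shares backing's buffer
const copy = new Uint32Array(view);  // independent copy

backing[1] = 99;
// view[0] now reads 99; copy[0] still reads 2.
```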


204-219: RerankInput interface is well-documented.

Clear JSDoc comments for each field help consumers understand the expected data format.

Comment thread osgrep-core/bench/_util.ts Outdated
Comment on lines +8 to +15
export function cloneRepoIfMissing(repoUrl: string, dir: string): void {
if (existsSync(dir)) {
console.log(`Repository already exists at ${dir}`);
return;
}
console.log(`Cloning ${repoUrl} to ${dir}...`);
execSync(`git clone --depth 1 ${repoUrl} ${dir}`, { stdio: "inherit" });
}

⚠️ Potential issue | 🟠 Major

Command injection vulnerability in cloneRepoIfMissing.

The repoUrl and dir parameters are directly interpolated into a shell command without sanitization. A malicious repo URL or directory name could execute arbitrary commands.

+import { spawnSync } from "child_process";
+
 export function cloneRepoIfMissing(repoUrl: string, dir: string): void {
   if (existsSync(dir)) {
     console.log(`Repository already exists at ${dir}`);
     return;
   }
   console.log(`Cloning ${repoUrl} to ${dir}...`);
-  execSync(`git clone --depth 1 ${repoUrl} ${dir}`, { stdio: "inherit" });
+  const result = spawnSync("git", ["clone", "--depth", "1", repoUrl, dir], { stdio: "inherit" });
+  if (result.status !== 0) {
+    throw new Error(`Failed to clone repository: ${result.error?.message ?? 'unknown error'}`);
+  }
 }

Using spawnSync with an array of arguments avoids shell interpretation.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export function cloneRepoIfMissing(repoUrl: string, dir: string): void {
if (existsSync(dir)) {
console.log(`Repository already exists at ${dir}`);
return;
}
console.log(`Cloning ${repoUrl} to ${dir}...`);
execSync(`git clone --depth 1 ${repoUrl} ${dir}`, { stdio: "inherit" });
}
import { spawnSync } from "child_process";
export function cloneRepoIfMissing(repoUrl: string, dir: string): void {
if (existsSync(dir)) {
console.log(`Repository already exists at ${dir}`);
return;
}
console.log(`Cloning ${repoUrl} to ${dir}...`);
const result = spawnSync("git", ["clone", "--depth", "1", repoUrl, dir], { stdio: "inherit" });
if (result.status !== 0) {
throw new Error(`Failed to clone repository: ${result.error?.message ?? 'unknown error'}`);
}
}
🤖 Prompt for AI Agents
In osgrep-core/bench/_util.ts around lines 8 to 15, the function
cloneRepoIfMissing interpolates repoUrl and dir into a shell command, creating a
command-injection risk; replace execSync of a single string with a call to
child_process.spawnSync passing the git executable and an argument array (e.g.,
["clone", "--depth", "1", repoUrl, dir]) and set stdio: "inherit" and shell:
false, and additionally validate/normalize repoUrl and dir (reject or sanitize
control characters and disallow shell metacharacters/absolute paths if not
expected) before spawning to ensure inputs cannot inject shell commands.

Comment thread osgrep-core/bench/codeatlas-system.ts Outdated
Comment on lines +19 to +27
// @ts-ignore - generated at build time
import {
initDenseOrt,
denseEmbedOrt,
initColbertOrt,
colbertPreindexDocs,
colbertRerankPreindexed,
} from "../index.js";


⚠️ Potential issue | 🔴 Critical

Blocker: imports don’t match osgrep-core/index.js exports (as in this PR).
This benchmark will fail on startup unless osgrep-core/index.js re-exports these names or the script is updated to the new API.

🤖 Prompt for AI Agents
In osgrep-core/bench/codeatlas-system.ts around lines 19 to 27, the named
imports (initDenseOrt, denseEmbedOrt, initColbertOrt, colbertPreindexDocs,
colbertRerankPreindexed) do not match the current exports from
osgrep-core/index.js; update this file so the imports match the actual exported
identifiers (or alternatively add re-exports for these names in
osgrep-core/index.js). Change the import list to the new API names exactly as
exported, or add corresponding export statements in index.js forwarding the real
implementations, and remove any stale or incorrect imports so the benchmark can
start without import errors.

Comment thread osgrep-core/bench/colbert-parity-rust.ts Outdated
Comment on lines +51 to +57
function topKOverlap(a: number[], b: number[], k: number): number {
const sa = new Set(a.slice(0, k));
const sb = new Set(b.slice(0, k));
let inter = 0;
for (const x of sa) if (sb.has(x)) inter++;
return inter / k;
}

⚠️ Potential issue | 🟡 Minor

Guard topKOverlap against k=0 (and short rankings).
return inter / k will produce NaN if --topk is 0 (0/0), and can skew results when either ranking has fewer than k items.

 function topKOverlap(a: number[], b: number[], k: number): number {
+  if (k <= 0) return 0;
   const sa = new Set(a.slice(0, k));
   const sb = new Set(b.slice(0, k));
   let inter = 0;
   for (const x of sa) if (sb.has(x)) inter++;
   return inter / k;
 }

Also applies to: 62-63, 99-100

Comment thread osgrep-core/bench/colbert-parity.ts Outdated
Comment on lines +18 to +26
// @ts-ignore - generated at build time
import {
initDenseOrt,
denseEmbedOrt,
initColbertOrt,
colbertPreindexDocs,
colbertRerankPreindexed,
} from "../index.js";


⚠️ Potential issue | 🔴 Critical

Blocker: ../index.js doesn’t export these *Ort functions (as in this PR).
Either update this tool to the new API names or add compatibility exports in osgrep-core/index.js.

🤖 Prompt for AI Agents
In osgrep-core/bench/colbert-parity.ts around lines 18 to 26, this file imports
initDenseOrt, denseEmbedOrt, initColbertOrt, colbertPreindexDocs, and
colbertRerankPreindexed from "../index.js", but ../index.js no longer exports
those names. Fix by either (A) updating these imports to the new exported API
names used in osgrep-core/index.js (replace each old name with the corresponding
new function name throughout this file), or (B) add compatibility exports in
osgrep-core/index.js that re-export the new implementations under the legacy
names (export { newInitDense as initDenseOrt, ... } for each of the five
functions) so existing callers keep working.

Comment thread osgrep-core/bench/rerank-preindexed.ts Outdated
}
}

main().catch(console.error);

⚠️ Potential issue | 🟠 Major

Return non-zero exit code on failure.
Same as other bench files.

-main().catch(console.error);
+main().catch((e) => {
+  console.error(e);
+  process.exit(1);
+});
📝 Committable suggestion


Suggested change
main().catch(console.error);
main().catch((e) => {
console.error(e);
process.exit(1);
});
🤖 Prompt for AI Agents
In osgrep-core/bench/rerank-preindexed.ts around line 94, the catch handler
currently only logs the error; update it to log the error and then exit the
process with a non-zero status so failures propagate to CI. Replace the simple
console.error catch with a handler that prints the error and calls
process.exit(1) (matching the pattern used in the other bench files).

Comment thread osgrep-core/Cargo.toml
hf-hub = "0.4"

# ONNX Runtime (the only ML backend we need)
ort = { version = "2.0.0-rc.10", default-features = true, features = ["download-binaries"] }

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

ort crate 2.0.0 stable version crates.io rust

💡 Result:

No — there is no stable ort 2.0.0 on crates.io. The crate has 2.0.0 release candidates (e.g. 2.0.0-rc.10) while the latest stable series remains in the 1.x line (latest published stable shown as 1.16.3). [1][2][3]

Sources:

  • lib.rs crate page (lists 2.0.0-rc.*). [1]
  • docs.rs listings for ort 2.0.0-rc.*. [2]
  • ort crate latest/stable docs (1.16.3). [3]

Consider pinning to a stable version instead of release candidate.

Using ort = "2.0.0-rc.10" pins to a release candidate version. The latest stable release is 1.16.3; no stable 2.0.0 has been published. Release candidates can introduce breaking changes before the stable release. Decide if the RC is acceptable for your stability requirements, or consider using 1.16.3 if the 2.0 features are not essential.

🤖 Prompt for AI Agents
In osgrep-core/Cargo.toml around line 25, the dependency is pinned to a
release-candidate ort = "2.0.0-rc.10" which is not a stable release; either
change the version to the latest stable ort = "1.16.3" (or another chosen stable
release) to ensure stability, or if 2.0.0-rc.10 is required, add a comment
explaining the rationale and any planned upgrade path and run cargo update /
cargo test to verify compatibility.

Comment thread osgrep-core/index.js
Comment on lines +106 to +124
// Clean API exports
export const {
// Initialization
initModels,
isInitialized,

// Dense embeddings (384-dim)
embedDense,

// ColBERT embeddings (48-dim per token, packed)
embedColbertPacked,

// ColBERT reranking
encodeQueryColbert,
rerankColbert,

// Convenience: both embeddings in one call
embedBatch,
} = nativeBinding;

⚠️ Potential issue | 🔴 Critical

Blocker: export surface doesn’t match the bench scripts added in this PR.
Given the bench files import initDenseOrt/denseEmbedOrt/initColbertOrt/colbertPreindexDocs/colbertRerankPreindexed/..., this module should either:

  1. re-export those legacy names as aliases to the new API, or
  2. the bench scripts should be updated to only use initModels/embedDense/encodeQueryColbert/rerankColbert/embedBatch.

Right now, the benches will crash on import.

 export const {
   // Initialization
   initModels,
   isInitialized,
@@
   embedBatch,
 } = nativeBinding;
+
+// ---------------------------------------------------------------------------
+// Back-compat aliases for bench scripts (remove once benches are updated)
+// ---------------------------------------------------------------------------
+export const initDenseOrt = nativeBinding.initDenseOrt ?? nativeBinding.initModels;
+export const denseEmbedOrt = nativeBinding.denseEmbedOrt ?? nativeBinding.embedDense;
+export const initColbertOrt = nativeBinding.initColbertOrt ?? nativeBinding.initModels;
+export const colbertPreindexDocs = nativeBinding.colbertPreindexDocs ?? nativeBinding.embedColbertPacked;
+export const colbertRerankPreindexed = nativeBinding.colbertRerankPreindexed ?? nativeBinding.rerankColbert;
+export const colbertRerankOrt = nativeBinding.colbertRerankOrt ?? nativeBinding.rerankColbert;
+export const denseEmbedChecksumOrt = nativeBinding.denseEmbedChecksumOrt;
📝 Committable suggestion


Suggested change
// Clean API exports
export const {
// Initialization
initModels,
isInitialized,
// Dense embeddings (384-dim)
embedDense,
// ColBERT embeddings (48-dim per token, packed)
embedColbertPacked,
// ColBERT reranking
encodeQueryColbert,
rerankColbert,
// Convenience: both embeddings in one call
embedBatch,
} = nativeBinding;
// Clean API exports
export const {
// Initialization
initModels,
isInitialized,
// Dense embeddings (384-dim)
embedDense,
// ColBERT embeddings (48-dim per token, packed)
embedColbertPacked,
// ColBERT reranking
encodeQueryColbert,
rerankColbert,
// Convenience: both embeddings in one call
embedBatch,
} = nativeBinding;
// ---------------------------------------------------------------------------
// Back-compat aliases for bench scripts (remove once benches are updated)
// ---------------------------------------------------------------------------
export const initDenseOrt = nativeBinding.initDenseOrt ?? nativeBinding.initModels;
export const denseEmbedOrt = nativeBinding.denseEmbedOrt ?? nativeBinding.embedDense;
export const initColbertOrt = nativeBinding.initColbertOrt ?? nativeBinding.initModels;
export const colbertPreindexDocs = nativeBinding.colbertPreindexDocs ?? nativeBinding.embedColbertPacked;
export const colbertRerankPreindexed = nativeBinding.colbertRerankPreindexed ?? nativeBinding.rerankColbert;
export const colbertRerankOrt = nativeBinding.colbertRerankOrt ?? nativeBinding.rerankColbert;
export const denseEmbedChecksumOrt = nativeBinding.denseEmbedChecksumOrt;
🤖 Prompt for AI Agents
In osgrep-core/index.js around lines 106 to 124, the exported names no longer
match the legacy bench script imports so the benches crash; add re-exports
(aliases) for the legacy names to the new API surface: export initDenseOrt ->
initModels, denseEmbedOrt -> embedDense, initColbertOrt -> initModels (or
another appropriate init if separate), colbertPreindexDocs ->
embedColbertPacked, colbertRerankPreindexed -> rerankColbert, and any other
legacy identifiers used by the benches, so existing bench imports continue to
work without changing the bench files.

Comment thread osgrep-core/src/lib.rs
Comment on lines +191 to +193
let hidden_size = 48usize;
let query_seq_len = query_embeddings.len() / hidden_size;


⚠️ Potential issue | 🟡 Minor

Potential division by zero if query_embeddings is empty.

If query_embeddings has zero length, hidden_size (48) would cause query_seq_len to be 0, which may lead to unexpected behavior downstream. Consider adding input validation.

     let hidden_size = 48usize;
+    if query_embeddings.is_empty() {
+        return Err(Error::from_reason("Query embeddings cannot be empty"));
+    }
     let query_seq_len = query_embeddings.len() / hidden_size;
📝 Committable suggestion


Suggested change
let hidden_size = 48usize;
let query_seq_len = query_embeddings.len() / hidden_size;
let hidden_size = 48usize;
if query_embeddings.is_empty() {
return Err(Error::from_reason("Query embeddings cannot be empty"));
}
let query_seq_len = query_embeddings.len() / hidden_size;
🤖 Prompt for AI Agents
In osgrep-core/src/lib.rs around lines 191 to 193, the code calculates
query_seq_len by dividing query_embeddings.len() by a fixed hidden_size (48)
without validating inputs; add validation to ensure query_embeddings is not
empty and its length is divisible by hidden_size, and handle failures by
returning a clear Err (or Result) or panicking with a descriptive message.
Specifically, check if query_embeddings.is_empty() and if query_embeddings.len()
% hidden_size != 0, and return an error (or bail) that identifies the problem
and includes the offending length and expected hidden_size so downstream code
won’t see an invalid sequence length.

Comment thread src/lib/native/index.ts
Comment on lines +231 to +261
export async function rerankColbert(input: RerankInput): Promise<RerankResult> {
await initNative();
const n = await loadNative();

const q = Float64Array.from(input.queryEmbedding as any);

const docs =
input.docEmbeddings instanceof Int8Array
? input.docEmbeddings
: new Int8Array(input.docEmbeddings as any);

const tokenIds =
input.docTokenIds instanceof Uint32Array
? input.docTokenIds
: Uint32Array.from(input.docTokenIds as any);

const result = n.rerankColbert(
q,
docs,
tokenIds,
input.docLengths,
input.docOffsets,
input.candidateIndices,
input.topK
);

return {
indices: Array.from(result.indices),
scores: Array.from(result.scores),
};
}

⚠️ Potential issue | 🟡 Minor

Type conversion in rerankColbert uses as any casts.

Lines 235, 240, and 245 use as any to convert between Float32Array and Float64Array. This works but loses type safety. The native binding expects Float64Array for query embeddings but the RerankInput interface declares Float32Array.

Consider aligning the interface type with what the native binding expects:

 export interface RerankInput {
-  /** Query ColBERT embedding from encodeQueryColbert */
-  queryEmbedding: Float32Array;
+  /** Query ColBERT embedding from encodeQueryColbert (will be converted to Float64Array) */
+  queryEmbedding: Float32Array | Float64Array;

Or document the conversion explicitly.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/lib/native/index.ts around lines 231 to 261, the function rerankColbert
uses unsafe "as any" casts at lines ~235, ~240 and ~245 to coerce embeddings and
token arrays to the native types; update the code to remove "as any" by either
(a) changing the RerankInput interface so queryEmbedding is declared as
Float64Array and docEmbeddings/docTokenIds have the exact typed arrays the
native binding expects, or (b) keep the interface but perform an explicit,
type-safe conversion/validation (e.g., check instance types and create new
Float64Array/Int8Array/Uint32Array copies where necessary) before calling
n.rerankColbert, and throw a clear error if conversion is not possible; ensure
no "as any" casts remain and that runtime conversions preserve numeric values
(Float32 -> Float64) and validate lengths/offsets before invoking the native
binding.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
osgrep-core/package.json (2)

12-23: N-API configuration looks good for initial release.

The platform triple setup with defaults: false and explicit targets is a pragmatic approach for V1 that avoids unnecessary cross-compilation overhead while covering the most common deployment targets (macOS, Linux glibc, Windows).

Consider adding aarch64-unknown-linux-gnu (ARM64 Linux) to the platform triples if this is deployed on cloud platforms supporting ARM instances, as it's increasingly common for cost/efficiency reasons.


1-30: Add package metadata fields (optional for V1 but recommended for publishing).

The package is missing standard metadata fields that improve discoverability and clarity for maintainers and users:

  • description: Brief summary of osgrep-core functionality
  • license: Important for OSS projects to clarify terms
  • repository: Helpful for users and tools that want to navigate to the source

These are not blockers for V1 but are best-practice for published packages.

 {
   "name": "osgrep-core",
   "version": "0.1.0",
+  "description": "Native Rust bindings for osgrep core embedding and reranking functionality",
   "type": "module",
   "main": "index.js",
   "types": "index.d.ts",
+  "license": "MIT",
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/Ryandonofrio3/osgrep.git"
+  },
   "files": [
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 1e936d2 and 70e48c1.

📒 Files selected for processing (1)
  • osgrep-core/package.json (1 hunks)
🔇 Additional comments (1)
osgrep-core/package.json (1)

5-6: Entry points exist; clarify intent for native binary declaration.

The declared entry points index.js and index.d.ts are present in the osgrep-core directory. However, the "files" field includes "*.node" (line 10) without any .node files currently present. If native bindings are expected to be built at package installation time, confirm this is intentional; otherwise, remove the "*.node" pattern from the declared files list.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/workers/orchestrator.ts (1)

155-183: Be explicit about “stored path” vs “read/chunk path” to avoid mismatched metadata.
absolutePath is used for reading/validation, but input.path is used for chunking (chunkFile) and persisted into PreparedChunk.path. If input.path can differ from the actual file identity (symlinks, callers passing mismatched values), chunks/anchors and stored records can drift.

If the intent is “store relative path, but chunk using the real path”, consider passing absolutePath into chunkFile() / buildAnchorChunk() while still storing input.path in PreparedChunk.path (or rename fields to make this explicit).

♻️ Duplicate comments (1)
src/lib/workers/orchestrator.ts (1)

244-327: Fix rerank() packing invariants before calling native: tokenIds length must equal sum(docLengths), and query flattening must respect colbertDim.
Two native-boundary hazards here:

  1. docLengths is derived from embeddings (seqLen), but tokenIds can be [] (missing) or shorter than seqLen. That makes packedTokenIds.length < sum(docLengths), which is a classic out-of-bounds read risk in native code.

  2. Query flattening currently uses each row.length; docs use input.colbertDim. If rows are ragged or not exactly colbertDim, you can send a query vector whose shape doesn’t match the doc shape.

Suggested patch:

-    const queryEmbedding: number[] = [];
-    for (const row of input.query) {
-      for (let i = 0; i < row.length; i++) {
-        queryEmbedding.push(row[i] ?? 0);
-      }
-    }
+    const queryEmbedding: number[] = [];
+    for (const row of input.query) {
+      for (let d = 0; d < input.colbertDim; d++) {
+        queryEmbedding.push(row?.[d] ?? 0);
+      }
+    }

     for (let i = 0; i < input.docs.length; i++) {
       const doc = input.docs[i];
       const col = doc.colbert;
@@
       const seqLen = Math.floor(colbert.length / input.colbertDim);
       const used = colbert.subarray(0, seqLen * input.colbertDim);

-      const tokenIdsRaw = doc.token_ids ?? [];
-      const tokenIds = Uint32Array.from(
-        tokenIdsRaw.slice(0, seqLen).map((v) => (Number.isFinite(v) ? v : 0)),
-      );
+      const tokenIdsRaw = doc.token_ids ?? [];
+      const tokenIds = new Uint32Array(seqLen);
+      for (let t = 0; t < seqLen; t++) {
+        const v = tokenIdsRaw[t];
+        tokenIds[t] = Number.isFinite(v) ? (v as number) : 0;
+      }

Also: your prior review note about offset terminology still applies (these are element offsets, which happen to equal byte offsets for Int8Array).
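Both invariants are cheap to assert at the JS boundary before crossing into native code. A minimal sketch — `validateRerankPacking` is a hypothetical helper, not part of the current orchestrator:

```typescript
// Hypothetical pre-flight check for the JS -> native rerank call. It asserts
// the two packing invariants discussed above before any native memory is read:
// packed token ids must cover exactly sum(docLengths) tokens, and the query
// must flatten to a whole number of colbertDim-sized rows.
function validateRerankPacking(
  packedTokenIds: Uint32Array,
  docLengths: number[],
  queryEmbedding: number[],
  colbertDim: number,
): void {
  const totalTokens = docLengths.reduce((sum, len) => sum + len, 0);
  if (packedTokenIds.length !== totalTokens) {
    throw new Error(
      `token packing mismatch: ${packedTokenIds.length} ids for ${totalTokens} tokens`,
    );
  }
  if (colbertDim <= 0 || queryEmbedding.length % colbertDim !== 0) {
    throw new Error(
      `query length ${queryEmbedding.length} is not a multiple of dim ${colbertDim}`,
    );
  }
}
```

Throwing here turns a silent out-of-bounds read in Rust into a loud, debuggable JS error.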

🧹 Nitpick comments (2)
.gitignore (2)

2-3: Consider removing redundant node_modules pattern.

The pattern node_modules/ (line 2) already matches node_modules directories at any depth in the git tree. The **/node_modules/ (line 3) pattern is redundant.

 dist/
 node_modules/
-**/node_modules/
 *.tsbuildinfo

48-49: Consider removing redundant target directory pattern.

The pattern **/target/ (line 49) already matches the osgrep-core/target/ case on line 48. The explicit path is redundant.

 # Native core (Rust / N-API) build outputs
-osgrep-core/target/
 **/target/
 osgrep-core/*.node
 osgrep-core/bench/
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 70e48c1 and 2303bca.

⛔ Files ignored due to path filters (2)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
  • public/bench.png is excluded by !**/*.png
📒 Files selected for processing (16)
  • .gitignore (1 hunks)
  • LICENSE (0 hunks)
  • NOTICE (0 hunks)
  • README.md (3 hunks)
  • experiments/mrr-sweep.ts (0 hunks)
  • experiments/quick_check.ts (0 hunks)
  • experiments/ranking-test.ts (0 hunks)
  • experiments/verify-fix.ts (0 hunks)
  • package.json (2 hunks)
  • plugins/osgrep/skills/osgrep/SKILL.md (1 hunks)
  • scripts/compare-engines.ts (0 hunks)
  • scripts/index-bench.sh (0 hunks)
  • scripts/sync-eval-cases.ts (0 hunks)
  • scripts/verify_skeleton.ts (0 hunks)
  • src/lib/workers/orchestrator.ts (8 hunks)
  • test_skeleton.py (0 hunks)
💤 Files with no reviewable changes (11)
  • scripts/index-bench.sh
  • LICENSE
  • experiments/quick_check.ts
  • experiments/ranking-test.ts
  • NOTICE
  • scripts/verify_skeleton.ts
  • scripts/compare-engines.ts
  • experiments/mrr-sweep.ts
  • scripts/sync-eval-cases.ts
  • test_skeleton.py
  • experiments/verify-fix.ts
🧰 Additional context used
🧬 Code graph analysis (1)
src/lib/workers/orchestrator.ts (4)
src/lib/index/chunker.ts (3)
  • TreeSitterChunker (211-761)
  • ChunkWithContext (60-65)
  • chunk (268-313)
src/lib/native/index.ts (4)
  • initNative (36-52)
  • embedBatch (139-182)
  • encodeQueryColbert (192-198)
  • rerankColbert (231-261)
src/lib/store/types.ts (2)
  • PreparedChunk (7-29)
  • VectorRecord (31-35)
src/config.ts (1)
  • CONFIG (18-26)
🔇 Additional comments (5)
.gitignore (1)

47-51: Rust backend additions are correct and well-organized.

The new entries for osgrep-core build artifacts (osgrep-core/*.node, osgrep-core/bench/, **/target/) align perfectly with the introduction of a native Rust/N-API binding layer. The ignore rules will keep compiled bindings, Rust build outputs, and benchmark directories out of version control.

package.json (1)

28-28: Verify bun availability in CI and development setup.

The prepublishOnly script now uses bun run build instead of pnpm. Ensure bun is available in your CI pipeline and documented as a development prerequisite. If developers are expected to use bun, verify the development documentation (README, CONTRIBUTING guides) clearly states this requirement.

README.md (1)

259-261: Ensure bun is documented as a development requirement.

The development commands have switched to bun. Verify that a CONTRIBUTING guide or setup documentation clearly states bun as a prerequisite (e.g., "Install bun from https://bun.sh/docs/installation"). This complements the change to use bun in the CI/build pipeline.

plugins/osgrep/skills/osgrep/SKILL.md (1)

49-56: No changes needed—documentation is correct.

The SKILL.md examples use -m 20 (lowercase) and --compact, both of which are valid osgrep flags. There is no -M uppercase variant; -m (lowercase) is the correct flag for maximum results. The documentation matches the actual osgrep CLI.

src/lib/workers/orchestrator.ts (1)

64-74: Init gating + singleton wrapper pattern looks solid.
ensureReady() correctly coalesces init, and the thin wrappers keep call sites simple.

Also applies to: 334-359

Comment thread package.json
Comment on lines 36 to 40
"files": [
"dist",
"plugins",
"README.md",
"LICENSE",
"NOTICE"
"README.md"
],

⚠️ Potential issue | 🔴 Critical

Add LICENSE file to the published distribution.

The files array should include the LICENSE file to comply with Apache-2.0 license distribution requirements. While "license": "Apache-2.0" is declared in the manifest, the actual LICENSE file must be included in npm packages for proper legal attribution.

Apply this diff to add the LICENSE file:

  "files": [
    "dist",
    "plugins",
-   "README.md"
+   "README.md",
+   "LICENSE"
  ],

Consider also adding the NOTICE file if it contains important attribution information.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"files": [
"dist",
"plugins",
"README.md",
"LICENSE",
"NOTICE"
"README.md"
],
"files": [
"dist",
"plugins",
"README.md",
"LICENSE"
],
🤖 Prompt for AI Agents
In package.json around lines 36 to 40, the "files" array omits the LICENSE file;
update the array to include "LICENSE" (and optionally "NOTICE" if present) so
the LICENSE is packaged and published; modify the files entry to add "LICENSE"
(and "NOTICE" if applicable) ensuring the repository root contains the LICENSE
file before publishing.

Comment thread package.json
"dependencies": {
"@clack/prompts": "^0.11.0",
"@huggingface/transformers": "^3.8.0",
"osgrep-core": "file:./osgrep-core",

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Check if osgrep-core directory exists locally
echo "=== Checking osgrep-core directory ==="
if [ -d "osgrep-core" ]; then
  echo "✓ osgrep-core directory found locally"
  if [ -f "osgrep-core/package.json" ]; then
    echo "✓ osgrep-core has package.json"
    head -20 osgrep-core/package.json
  fi
else
  echo "✗ osgrep-core directory NOT found locally"
fi

# Check package.json for workspaces configuration
echo -e "\n=== Checking workspace configuration ==="
if [ -f "package.json" ]; then
  rg '"workspaces"' package.json -A 10 || echo "No workspaces config found"
fi

# Check for postinstall/postbuild scripts
echo -e "\n=== Checking scripts in package.json ==="
rg '"(postinstall|postbuild|prepare|build)"' package.json -A 1 || echo "No postinstall/postbuild found"

# Check files array in package.json
echo -e "\n=== Checking files array ==="
rg '"files"' package.json -A 15 || echo "No files array found"

# Check main package.json for osgrep-core reference
echo -e "\n=== Checking osgrep-core dependency ==="
rg '"osgrep-core"' package.json -B 2 -A 2

Repository: Ryandonofrio3/osgrep

Length of output: 1518


🏁 Script executed:

#!/bin/bash

# Check npm registry for osgrep-core package
echo "=== Checking if osgrep-core is published to npm ==="
npm view osgrep-core 2>&1 | head -20

# Check root package.json for complete build/install workflow
echo -e "\n=== Checking root package.json scripts ==="
rg '"scripts"' package.json -A 20

# Check if there's a build workflow that handles both packages
echo -e "\n=== Checking for build configuration files ==="
ls -la | grep -E "tsconfig|webpack|vite|esbuild|build"
find . -maxdepth 2 -name "*.json" -type f | head -20

Repository: Ryandonofrio3/osgrep

Length of output: 1341


Fix osgrep-core distribution before publishing to npm—this will cause installation failures.

The dependency "osgrep-core": "file:./osgrep-core" references a local path and cannot be distributed via npm. Verification confirms:

  • osgrep-core is not published to npm (404 not found)
  • The files array does not include osgrep-core directory
  • No workspace configuration exists
  • The prepublishOnly script only runs TypeScript compilation and does not build osgrep-core

When external users install from npm, the package will fail because the relative file: reference will point to a non-existent directory.

Choose one of these approaches:

  1. Use npm workspaces (recommended for monorepos): Add "workspaces": ["osgrep-core"] to root package.json
  2. Publish osgrep-core separately: Publish osgrep-core to npm and reference by version (e.g., "^0.1.0")
  3. Bundle osgrep-core: Add osgrep-core to the files array, build native bindings in prepublishOnly, and include prebuilt .node files
  4. Use postinstall: Add a postinstall script that builds or downloads osgrep-core
🤖 Prompt for AI Agents
package.json around line 44: the dependency "osgrep-core": "file:./osgrep-core"
references a local path which will break installs from npm; fix by either (A)
enable npm workspaces for the monorepo by adding a "workspaces": ["osgrep-core"]
entry to the root package.json and remove the file: reference, or (B) publish
osgrep-core to npm and change the dependency to its published version (e.g.,
"^0.1.0"), or (C) bundle osgrep-core into the package by adding it to the
"files" array and updating prepublishOnly to build native artifacts (.node) into
the published package, or (D) add a postinstall script that builds or downloads
osgrep-core on install; pick one approach and update package.json scripts/fields
accordingly and ensure the chosen flow is covered by CI before publishing.
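For reference, the workspace approach (option 1) is roughly this shape in the root package.json; the `workspace:*` protocol is how bun/pnpm link a local package and is rewritten to a concrete version on publish. A sketch, not the repo's actual config:

```json
{
  "workspaces": ["osgrep-core"],
  "dependencies": {
    "osgrep-core": "workspace:*"
  }
}
```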

Comment thread README.md
Comment on lines 20 to +26
```diff
 Natural-language search that works like `grep`. Fast, local, and built for coding agents.

 - **Semantic:** Finds concepts ("where do transactions get created?"), not just strings.
-- **Call Graph Tracing:** Map dependencies with `trace` to see who calls what.
 - **Role Detection:** Distinguishes `ORCHESTRATION` (high-level logic) from `DEFINITION` (types/classes).
 - **Local & Private:** 100% local embeddings via `onnxruntime-node`.
 - **Auto-Isolated:** Each repository gets its own index automatically.
-- **Agent-Ready:** Native output with symbols, roles, and call graphs.
+- **Agent-Ready:** Native output with symbols and roles.
```

⚠️ Potential issue | 🟠 Major

Update outdated embeddings technology reference.

Line 24 still references onnxruntime-node for embeddings, but this dependency has been removed from package.json (replaced with Rust-native bindings via osgrep-core). The README should reflect the new architecture.

Apply this diff to update the embeddings description:

- **Local & Private:** 100% local embeddings via `onnxruntime-node`.
+ **Local & Private:** 100% local embeddings via Rust-native bindings.
📝 Committable suggestion


Suggested change
```diff
 Natural-language search that works like `grep`. Fast, local, and built for coding agents.

 - **Semantic:** Finds concepts ("where do transactions get created?"), not just strings.
 - **Role Detection:** Distinguishes `ORCHESTRATION` (high-level logic) from `DEFINITION` (types/classes).
-- **Local & Private:** 100% local embeddings via `onnxruntime-node`.
+- **Local & Private:** 100% local embeddings via Rust-native bindings.
 - **Auto-Isolated:** Each repository gets its own index automatically.
 - **Agent-Ready:** Native output with symbols and roles.
```
🤖 Prompt for AI Agents
In README.md around lines 20 to 26, update the embeddings/technology reference:
replace the outdated mention of `onnxruntime-node` with the current Rust-native
bindings via `osgrep-core`. Edit the bullet "Local & Private" to state that
embeddings are produced locally using the Rust-native `osgrep-core` bindings (or
similar phrasing), and ensure wording reflects that `onnxruntime-node` is no
longer used and the package.json now depends on `osgrep-core`.

Comment on lines +216 to 229
```diff
     const colbertFlat = await encodeQueryColbert(text);
     const dim = CONFIG.COLBERT_DIM;
     const seqLen = colbertFlat.length / dim;

     // Reshape to matrix
     const matrix: number[][] = [];
-    for (let s = 0; s < seq; s++) {
-      let sumSq = 0;
-      const offset = s * dim;
-      for (let d = 0; d < dim; d++) {
-        const val = data[offset + d];
-        sumSq += val * val;
-      }
-      const norm = Math.sqrt(sumSq);
-      const row: number[] = [];
-      if (norm > 1e-9) {
-        for (let d = 0; d < dim; d++) {
-          row.push(data[offset + d] / norm);
-        }
-      } else {
-        for (let d = 0; d < dim; d++) {
-          row.push(data[offset + d]);
-        }
-      }
-      matrix.push(row);
-    }
-
-    // Compute pooled embedding (mean of tokens)
-    const pooled = new Float32Array(dim);
-    for (const row of matrix) {
-      for (let d = 0; d < dim; d++) {
-        pooled[d] += row[d];
-      }
-    }
-    // Normalize pooled
-    let sumSq = 0;
-    for (let d = 0; d < dim; d++) {
-      pooled[d] /= matrix.length || 1;
-      sumSq += pooled[d] * pooled[d];
-    }
-    const norm = Math.sqrt(sumSq);
-    if (norm > 1e-9) {
-      for (let d = 0; d < dim; d++) {
-        pooled[d] /= norm;
-      }
-    }
+    for (let s = 0; s < seqLen; s++) {
+      const row: number[] = [];
+      for (let d = 0; d < dim; d++) {
+        row.push(colbertFlat[s * dim + d]);
+      }
+      matrix.push(row);
+    }
```


⚠️ Potential issue | 🟡 Minor

Floor seqLen (and ideally assert divisibility) when reshaping query embeddings.
const seqLen = colbertFlat.length / dim; can be non-integer; the loop will then run one extra iteration and index past the end of the typed array.

-    const seqLen = colbertFlat.length / dim;
+    const seqLen = Math.floor(colbertFlat.length / dim);
+    // Optional: if (colbertFlat.length % dim !== 0) throw new Error("Invalid ColBERT query embedding length");
🤖 Prompt for AI Agents
In src/lib/workers/orchestrator.ts around lines 216 to 229, seqLen is computed
as colbertFlat.length / dim which may be non-integer and cause out-of-bounds
access when reshaping; compute seqLen = Math.floor(colbertFlat.length / dim) and
add an assertion or explicit check that colbertFlat.length % dim === 0 (throwing
or logging a clear error if not) so you either fail fast on bad input or handle
the remainder intentionally before running the nested loops.
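A defensive version of that reshape, as a sketch — it assumes the flat buffer and dimension from the surrounding code, and the helper name is illustrative:

```typescript
// Reshape a flat ColBERT embedding into seqLen rows of dim values,
// failing fast if the buffer length is not an exact multiple of dim.
function reshapeColbert(colbertFlat: Float32Array, dim: number): number[][] {
  if (dim <= 0 || colbertFlat.length % dim !== 0) {
    throw new Error(
      `invalid ColBERT buffer: length ${colbertFlat.length}, dim ${dim}`,
    );
  }
  const seqLen = colbertFlat.length / dim; // exact integer after the check
  const matrix: number[][] = [];
  for (let s = 0; s < seqLen; s++) {
    const row: number[] = [];
    for (let d = 0; d < dim; d++) {
      row.push(colbertFlat[s * dim + d]);
    }
    matrix.push(row);
  }
  return matrix;
}
```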


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (6)
.gitignore (3)

2-3: Redundant node_modules patterns are ok, but you can simplify to one.
Keeping both node_modules/ and **/node_modules/ works; if you want to reduce noise, **/node_modules/ alone is typically sufficient.


9-10: Deduplicate .osgrep/ vs .osgrep to avoid ambiguity.
You currently ignore both .osgrep/ (directory) and .osgrep (file). If only the directory exists, drop the file pattern; if both can exist, consider a clarifying comment.

Also applies to: 52-52


38-40: .env* may unintentionally ignore .env.example / .env.sample.
If you rely on committed env templates, tighten this (e.g., ignore .env, .env.local, .env.*.local explicitly) and keep .env.example tracked.

.github/workflows/release.yml (1)

56-56: Avoid redundant napi-rs CLI installation.

The @napi-rs/cli is installed in both build-native (line 56) and publish (line 92). While idempotent, this adds unnecessary steps and time to the publish job.

Install napi-rs CLI once in a shared setup step or cache it across jobs. Alternatively, if job parallelism is important, this redundancy is acceptable—document the rationale with a comment.

Confirm whether these jobs can share a cached global npm installation or if separate installs are intentional for build isolation.

Also applies to: 92-92

.github/workflows/ci.yml (1)

54-57: Node setup may be redundant after bun setup.

Both bun (lines 51-52) and Node (lines 54-57) are configured in the ci job. Bun ships a Node-compatible runtime, so the separate Node setup may be unnecessary unless a script specifically requires the Node binary, a pinned Node version, or npm/pnpm tooling.

Verify whether the explicit Node 20 setup is required. If bun's built-in Node is sufficient for typecheck/test/build scripts, this step can be removed to simplify the workflow.

Confirm whether the osgrep project requires a specific npm/pnpm version or global tools that are not available through bun's Node.

osgrep-core/package.json (1)

39-44: Platform package version pinning is strict; ensure consistency with release workflow.

The optional dependencies on platform-specific packages (e.g., @osgrep-core/darwin-arm64@0.1.0) use exact version constraints. Verify that the release workflow publishes platform packages with matching versions, and confirm this version synchronization is documented or automated.

Confirm that the .github/workflows/release.yml build-native and publish jobs ensure all platform-specific packages are published at the same version (0.1.0) as declared in this package.json. If versions drift, npm resolution will fail.

Consider documenting the version constraint strategy (e.g., via a comment or CONTRIBUTING.md) to prevent future version mismatches between the meta package and platform packages.
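A version-sync check along these lines could run in CI; the field names follow the package.json described above, but the helper itself is hypothetical:

```typescript
// Return a list of @osgrep-core/* optional dependencies whose pinned
// version does not match the meta package's own version, so drift can
// fail the build before publish.
function findVersionDrift(pkg: {
  version: string;
  optionalDependencies?: Record<string, string>;
}): string[] {
  const drift: string[] = [];
  for (const [name, range] of Object.entries(pkg.optionalDependencies ?? {})) {
    if (name.startsWith("@osgrep-core/") && range !== pkg.version) {
      drift.push(`${name}@${range} (expected ${pkg.version})`);
    }
  }
  return drift;
}
```

Running this over `osgrep-core/package.json` in the publish job and exiting non-zero on any drift would make the version constraint self-enforcing.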

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 2303bca and 95d88fb.

📒 Files selected for processing (9)
  • .github/workflows/ci.yml (1 hunks)
  • .github/workflows/release.yml (2 hunks)
  • .gitignore (1 hunks)
  • osgrep-core/package.json (1 hunks)
  • src/lib/index/chunker.ts (1 hunks)
  • src/lib/index/grammar-loader.ts (1 hunks)
  • src/lib/output/formatter.ts (2 hunks)
  • src/lib/output/json-formatter.ts (0 hunks)
  • src/lib/utils/formatter.ts (1 hunks)
💤 Files with no reviewable changes (1)
  • src/lib/output/json-formatter.ts
✅ Files skipped from review due to trivial changes (1)
  • src/lib/utils/formatter.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/output/formatter.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-native
🔇 Additional comments (8)
.gitignore (3)

24-29: Double-check ignoring public/ (often a tracked asset directory).
If public/ contains shipped/static assets, ignoring it will leave new assets untracked for other devs and CI. If it's truly dev-only generated output, this is fine; just worth confirming.


42-45: Lockfile policy: consider explicitly addressing bun.lockb.
You’re ignoring npm/pnpm/yarn lockfiles “(using bun)”; make sure the intended bun lockfile behavior is explicit (committed vs ignored).


47-51: Native/Rust output ignores look correct; **/target/ is broad—confirm scope.
If the repo includes non-Rust tooling that also uses target/, this could hide relevant artifacts; otherwise it’s a good catch-all for Rust builds.

src/lib/index/chunker.ts (1)

4-4: LGTM! Consistent with the import path refactoring.

The import path update from ../core/languages to ../store/languages is consistent with the change in grammar-loader.ts and aligns with the repository-wide restructuring.

src/lib/index/grammar-loader.ts (1)

5-5: All imports from the old path have been successfully updated.

The search for remaining references to ../core/languages returned no results, confirming that the refactoring is complete. The import in this file correctly uses the new path ../store/languages.

.github/workflows/release.yml (2)

108-123: Strong tag and version verification in place.

The workflow correctly verifies that the tag commit is an ancestor of main (lines 108–114) and that the git tag matches the package.json version (lines 116–123). This prevents publishing mismatched or out-of-branch releases.


60-60: napi-rs CLI command syntax is correct and properly supports cross-compilation.

The napi build --platform --release --target ${{ matrix.target }} command uses the documented napi-rs v3.x CLI flags correctly. The --target flag is directly passed to cargo and supports all five target triples specified in the workflow matrix (aarch64-apple-darwin, x86_64-apple-darwin, x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, x86_64-pc-windows-msvc).

.github/workflows/ci.yml (1)

65-75: Bun-based workflow replaces pnpm successfully.

All three scripts referenced in the workflow are properly defined in package.json: typecheck (tsc --noEmit), test (vitest run), and build (tsc). The transition from pnpm to bun is clean and functional.

Comment on lines +100 to +105
run: |
cd osgrep-core
for dir in artifacts/bindings-*/; do
mv "$dir"*.node . 2>/dev/null || true
done
rm -rf artifacts

⚠️ Potential issue | 🟠 Major

Artifact movement script lacks error visibility.

The defensive || true silently ignores errors when moving .node files. If artifacts fail to copy due to unexpected directory structure or permissions, the publish job will still proceed with missing binaries.

Add explicit verification that artifacts were moved:

  cd osgrep-core
  for dir in artifacts/bindings-*/; do
-   mv "$dir"*.node . 2>/dev/null || true
+   if ! mv "$dir"*.node . 2>/dev/null; then
+     echo "Warning: Failed to move artifacts from $dir"
+   fi
  done
  rm -rf artifacts
- ls -la *.node
+ if ! ls *.node 1>/dev/null 2>&1; then
+   echo "Error: No .node files found in osgrep-core after moving artifacts"
+   exit 1
+ fi
+ ls -la *.node

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
.github/workflows/release.yml around lines 100 to 105: the current loop uses "||
true" which hides failures when moving .node artifacts; replace the silent
ignore with explicit checks: attempt to move files, capture whether any .node
files were moved, emit a clear error message if none were moved (including
listing the artifacts/bindings-*/ directories for diagnosis) and exit non‑zero
to fail the job; ensure permissions errors are surfaced by removing the "||
true" and adding a conditional that fails the step when the moved-file count is
zero.

Comment thread osgrep-core/package.json
{
"name": "osgrep-core",
"version": "0.1.0",
"description": "Native Rust core for osgrep (embeddings + reranking via ONNX)",

⚠️ Potential issue | 🟡 Minor

Update package description to reflect actual functionality.

The description mentions "ONNX" but based on the PR objective and summary, this is a native Rust core. The ONNX reference may be outdated or a copy-paste error from a previous iteration.

Update the description to accurately reflect the package's purpose:

- "description": "Native Rust core for osgrep (embeddings + reranking via ONNX)",
+ "description": "Native Rust core for osgrep (embeddings + reranking)",
📝 Committable suggestion


Suggested change
"description": "Native Rust core for osgrep (embeddings + reranking via ONNX)",
"description": "Native Rust core for osgrep (embeddings + reranking)",
🤖 Prompt for AI Agents
In osgrep-core/package.json around line 4, the package description incorrectly
references "ONNX"; update the "description" field to accurately describe the
package as the native Rust core for osgrep (e.g., mentioning embeddings and
reranking implemented in Rust) by replacing the current string with a concise,
factual description that removes any mention of ONNX.

Comment thread osgrep-core/package.json
Comment on lines +14 to +17
"files": [
"index.js",
"index.d.ts"
],

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Check for index.js and index.d.ts files in osgrep-core directory
echo "=== Checking for index.js and index.d.ts in osgrep-core ==="
ls -la osgrep-core/index.{js,d.ts} 2>&1

echo ""
echo "=== Checking osgrep-core/package.json content ==="
# Show lines 14-17 (files whitelist)
echo "Lines 14-17 (files whitelist):"
sed -n '14,17p' osgrep-core/package.json

echo ""
echo "Lines 36-37:"
sed -n '36,37p' osgrep-core/package.json

echo ""
echo "Lines 39-44:"
sed -n '39,44p' osgrep-core/package.json

echo ""
echo "=== Full package.json for context ==="
cat osgrep-core/package.json

Repository: Ryandonofrio3/osgrep

Length of output: 1937


🌐 Web query:

@napi-rs/cli latest stable version and version history

💡 Result:

Latest stable: @napi-rs/cli v3.4.0 (released 2025-10-23). (napi.rs)

Recent version history (selected releases):

  • 3.4.0 — 2025-10-23. (napi.rs)
  • 3.2.0 — 2025-09-11. (napi.rs)
  • 3.1.5 — 2025-08-16. (napi.rs)
  • 3.1.2 / 3.1.1 / 3.1.0 — Aug 2025 (Aug 8 / Aug 8 / Aug 7). (napi.rs)
  • 3.0.0 — 2025-07-17 (NAPI-RS v3 milestone; many 3.0.0-alpha builds in June–July 2025). (napi.rs)

Sources / full changelog: NAPI-RS changelog (napi-cli) and the npm package page. (napi.rs)

If you want the complete list of all published versions (including all alphas) or links to individual changelog entries, I can fetch and paste them.


Update @napi-rs/cli to stable version 3.4.0.

The @napi-rs/cli dependency is pinned to ^3.0.0-alpha.63, an alpha version from early development. Stable version 3.4.0 (released October 2025) is available and should be used instead. Pinning to an alpha version may introduce unexpected breaking changes and prevent automatic updates to stable releases. Update to version ^3.4.0 or at minimum ^3.0.0.

The files whitelist (lines 14–17) correctly references existing files (index.js and index.d.ts).

🤖 Prompt for AI Agents
In osgrep-core/package.json around lines 14–17 the "files" array is fine, but
the @napi-rs/cli dependency must be upgraded from the pinned alpha to a stable
release: change the package.json entry for @napi-rs/cli to use "^3.4.0" (or at
minimum "^3.0.0"), then run your package manager (npm/yarn/pnpm) to update the
lockfile and reinstall, rebuild any native artifacts, and run tests to verify
nothing breaks.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (4)
README.md (1)

20-27: README “Local & Private” still looks out of sync with the Rust-native binding story.
Given the PR introduces osgrep-core + N-API, this bullet should likely describe “Rust-native core/bindings” (and only mention onnxruntime if it’s intentionally user-relevant).

#!/bin/bash
# Verify what the JS/TS side actually depends on and whether onnxruntime is still user-visible.
rg -n "onnxruntime|onnxruntime-node|osgrep-core|napi|node-addon-api" -S .
rg -n "\"dependencies\"|\"optionalDependencies\"|\"peerDependencies\"" -S package.json osgrep-core/package.json 2>/dev/null || true
src/lib/native/index.ts (2)

27-44: Race condition in initialization remains.

The initPromise = null in .finally() still creates a window for duplicate initialization attempts. While n.isInitialized() provides some protection, concurrent callers during init could still trigger redundant work.

   initPromise = (async () => {
     const n = await loadNative();

     if (!n.isInitialized()) {
       n.initModels(MODEL_IDS.embed, MODEL_IDS.colbert);
     }

     initialized = true;
-  })().finally(() => {
-    initPromise = null;
-  });
+  })();

   return initPromise;
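One way to keep single-flight init without reopening the race is to cache the promise permanently and clear it only on failure. A generic sketch — `makeSingleFlight` is illustrative, not the module's actual API:

```typescript
// Single-flight wrapper: concurrent callers share one in-flight promise.
// The cached promise is cleared only when init fails, so later callers can
// retry; on success it stays cached and init never runs twice.
function makeSingleFlight(init: () => Promise<void>): () => Promise<void> {
  let inFlight: Promise<void> | null = null;
  return () => {
    if (!inFlight) {
      inFlight = init().catch((err) => {
        inFlight = null; // permit a retry after a failed init
        throw err;
      });
    }
    return inFlight;
  };
}
```

Wrapping the native init this way preserves the `ensureReady()` surface while removing the window where the cached promise is null mid-flight.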

223-253: Type conversions still use as any casts.

The as any casts at lines 227, 232, and 237 remain. While functionally correct, they bypass TypeScript's type checking. Consider updating the RerankInput interface to explicitly allow the types that are actually accepted, or add runtime type guards.

 export interface RerankInput {
-  queryEmbedding: Float32Array;
-  docEmbeddings: Int8Array;
-  docTokenIds: Uint32Array;
+  queryEmbedding: Float32Array | Float64Array;
+  docEmbeddings: Int8Array | Buffer | ArrayBufferView;
+  docTokenIds: Uint32Array | number[];
src/lib/workers/orchestrator.ts (1)

216-229: Floor seqLen to prevent potential out-of-bounds access.

seqLen is computed as colbertFlat.length / dim which may be non-integer if the embedding length is not perfectly divisible. This could cause the loop to access past the array bounds.

-    const seqLen = colbertFlat.length / dim;
+    const seqLen = Math.floor(colbertFlat.length / dim);
🧹 Nitpick comments (4)
src/lib/index/chunker.ts (1)

106-136: Avoid duplicating the file label (and handle empty filePath).
Right now displayText includes both // File: ... and (usually) a breadcrumb starting with File: .... Also, filePath === "" produces // File: while breadcrumb uses unknown.

 export function formatChunkText(
   chunk: ChunkWithContext,
   filePath: string,
 ): { content: string; displayText: string } {
   const breadcrumb = [...chunk.context];
   const fileLabel = `File: ${filePath || "unknown"}`;
   const hasFileLabel = breadcrumb.some(
     (entry) => typeof entry === "string" && entry.startsWith("File: "),
   );
-  if (!hasFileLabel) {
-    breadcrumb.unshift(fileLabel);
-  }
+  // If we always emit a `// File:` header line, keep breadcrumbs focused on symbols/structure.
+  // (Retain existing file labels if already present.)
 
   const sections: string[] = [];
 
   // 1. File path (always first)
-  sections.push(`// File: ${filePath}`);
+  sections.push(`// File: ${filePath || "unknown"}`);
 
   // 2. Imports (if available)
   if (chunk.imports && chunk.imports.length > 0) {
     sections.push(
       chunk.imports
         .map((imp) => (imp.startsWith("import") ? imp : `import ${imp}`))
         .join("\n"),
     );
   }
 
   // 3. Breadcrumb
-  const header = breadcrumb.length > 0 ? breadcrumb.join(" > ") : fileLabel;
+  const header =
+    breadcrumb.length > 0
+      ? breadcrumb.filter((b) => !String(b).startsWith("File: ")).join(" > ")
+      : fileLabel;
   sections.push(header);
tests/git-worktree.test.ts (1)

40-44: Minor: Consider try-catch in cleanup for robustness.

If rmSync throws for one directory, cleanup of subsequent directories won't occur. For test isolation, consider wrapping in try-catch:

   afterEach(() => {
     for (const dir of created.splice(0)) {
-      fs.rmSync(dir, { recursive: true, force: true });
+      try {
+        fs.rmSync(dir, { recursive: true, force: true });
+      } catch { /* ignore cleanup errors */ }
     }
   });
src/lib/workers/orchestrator.ts (1)

64-74: Minor race condition in initialization pattern.

Setting initPromise = null in finally creates a brief window where concurrent callers could start duplicate initialization. However, since this is an internal method and initNative() itself guards against double-init, the practical impact is minimal.

src/lib/native/index.ts (1)

196-211: Consider clarifying offset semantics in documentation.

Line 205 documents docOffsets as "Byte offsets per doc" which is technically correct for Int8Array (1 element = 1 byte), but "element offsets" would be more universally accurate and consistent with usage in orchestrator.ts.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 95d88fb and 31797a0.

📒 Files selected for processing (14)
  • README.md (3 hunks)
  • src/commands/serve.ts (2 hunks)
  • src/commands/utility/doctor.ts (2 hunks)
  • src/commands/utility/list.ts (1 hunks)
  • src/index.ts (1 hunks)
  • src/lib/index/chunker.ts (2 hunks)
  • src/lib/native/index.ts (1 hunks)
  • src/lib/output/formatter.ts (1 hunks)
  • src/lib/utils/ansi.ts (1 hunks)
  • src/lib/utils/formatter.ts (1 hunks)
  • src/lib/utils/git.ts (2 hunks)
  • src/lib/workers/orchestrator.ts (8 hunks)
  • tests/chunking.test.ts (1 hunks)
  • tests/git-worktree.test.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/index.ts
  • src/commands/serve.ts
🧰 Additional context used
🧬 Code graph analysis (4)
tests/git-worktree.test.ts (1)
src/lib/utils/git.ts (3)
  • isWorktree (22-46)
  • getGitCommonDir (48-82)
  • getMainRepoRoot (87-96)
src/lib/workers/orchestrator.ts (4)
src/lib/index/chunker.ts (3)
  • ChunkWithContext (60-65)
  • chunk (268-313)
  • formatChunkText (106-151)
src/lib/native/index.ts (4)
  • initNative (27-44)
  • embedBatch (131-174)
  • encodeQueryColbert (184-190)
  • rerankColbert (223-253)
src/lib/store/types.ts (2)
  • PreparedChunk (7-29)
  • VectorRecord (31-35)
src/config.ts (1)
  • CONFIG (18-26)
src/commands/utility/doctor.ts (1)
src/lib/native/index.ts (1)
  • initNative (27-44)
src/lib/native/index.ts (2)
src/config.ts (2)
  • MODEL_IDS (4-7)
  • CONFIG (18-26)
osgrep-core/index.d.ts (1)
  • RerankResult (21-26)
🪛 GitHub Check: ci
src/commands/utility/doctor.ts

[failure] 6-6:
Cannot find module '../lib/native' or its corresponding type declarations.


[failure] 5-5:
Cannot find module '../config' or its corresponding type declarations.

🔇 Additional comments (24)
src/commands/utility/list.ts (1)

4-5: No action needed. The import paths have been correctly updated as part of the reorganization. gracefulExit is exported from src/lib/utils/exit.ts, and both ensureProjectPaths and findProjectRoot are exported from src/lib/utils/project-root.ts, matching the import statements at lines 4-5.

tests/chunking.test.ts (1)

63-76: Test expectation update matches new header format.
Looks consistent with formatChunkText() now emitting // File: {path}.

README.md (1)

256-262: Bun workflow + license line updates look fine.

Also applies to: 272-275

src/lib/utils/formatter.ts (1)

1-19: Import and shared ANSI style alignment looks good; extension format handling is correct.

The detectLanguage function correctly uses path.extname() which returns extensions with leading dots (e.g., ".ts"), and this matches the format stored in the LANGUAGES array (e.g., [".ts", ".tsx", ".js"]). The lookup via getLanguageByExtension() works as intended.

src/lib/output/formatter.ts (1)

1-11: The extension handling is correct; no issue here.

The code properly passes path.extname() result (which includes the leading dot like .ts) to getLanguageByExtension(), which expects extensions in this format. The LANGUAGES array stores extensions with leading dots (.ts, .js, etc.) and the function normalizes input to lowercase before matching.

src/lib/utils/ansi.ts (1)

1-10: LGTM!

Clean utility that correctly respects TTY detection and the NO_COLOR convention. ANSI codes and reset sequences are accurate.

tests/git-worktree.test.ts (4)

7-14: LGTM!

Helper functions are concise and appropriate for test fixture setup.


16-35: LGTM!

The fixture correctly simulates a git worktree structure with the commondir file pointing to the main repository's .git directory.


46-67: LGTM!

Test cases correctly verify the three main scenarios: normal repos, gitfile-based worktrees, and expected return values using realpathSync.native for consistency with the implementation.


69-85: LGTM!

Good handling of the Windows platform limitation for symlinks. The conditional skip ensures the test suite passes on all platforms while still testing symlink behavior where supported.

src/lib/utils/git.ts (5)

4-10: LGTM!

Good defensive helper that gracefully handles non-existent paths or inaccessible symlinks by falling back to the original path.


12-20: LGTM!

Correctly parses the git gitfile format with appropriate error handling.


22-46: LGTM!

Correctly handles both standard worktree detection via gitfile and the less common symlinked .git directory scenario. The commondir check is the correct discriminator for worktree detection.


48-82: LGTM with minor observation.

The implementation correctly handles both gitfile and symlink-based worktrees. The fallback at line 66 for missing commondir in gitfile scenarios is reasonable for non-standard setups, though it differs from the directory case which returns null when commondir is absent.


84-96: LGTM!

Clean implementation that correctly derives the main repository root by taking the parent of the common git directory.

src/lib/workers/orchestrator.ts (5)

1-51: LGTM!

Clean module structure with appropriate imports. The removal of the scale field from RerankDoc aligns with the native reranking implementation.


76-149: LGTM!

Chunking and preparation logic is well-structured. Good use of crypto.randomUUID() for chunk IDs and proper context window handling for prev/next chunks.


151-202: LGTM!

Clean file processing pipeline with proper validation, chunking, and native embedding. Progress callbacks are appropriately placed throughout the flow.


297-330: LGTM!

Packing logic and result mapping are correctly implemented. The score alignment back to input order via candidateIndices is the expected behavior for the Searcher.
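The candidate-index alignment described above follows a simple scatter pattern; a schematic version (function and parameter names are illustrative, not the Searcher's actual API):

```rust
// Scatter reranker scores (computed over a candidate subset) back into the
// full input order; non-candidate positions keep a sentinel score.
// Names here are hypothetical, not the Searcher's real interface.
fn align_scores(n: usize, candidate_indices: &[usize], scores: &[f32]) -> Vec<f32> {
    let mut out = vec![f32::NEG_INFINITY; n];
    for (&idx, &s) in candidate_indices.iter().zip(scores) {
        out[idx] = s;
    }
    out
}

fn main() {
    // Candidates 2 and 0 were reranked; inputs 1 and 3 were not candidates.
    let aligned = align_scores(4, &[2, 0], &[0.9, 0.5]);
    println!("{:?}", aligned);
}
```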


337-362: LGTM!

Clean singleton pattern with well-designed convenience wrappers for the public API surface.

src/lib/native/index.ts (4)

64-83: LGTM!

Correctly splits the flat embedding buffer into isolated per-text vectors.


89-113: LGTM!

Clean interface for packed ColBERT embeddings with proper typed array handling.


131-174: LGTM!

Efficient hybrid embedding extraction with proper cursor management for token IDs. The comment about colbertOffset being set during storage is helpful.


184-190: LGTM!

Clean wrapper for query encoding.

Comment thread src/commands/utility/doctor.ts Outdated
Comment on lines +266 to 278
let colbert: Int8Array;
if (col instanceof Int8Array) {
colbert = col;
} else if (Buffer.isBuffer(col)) {
colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (
col &&
typeof col === "object" &&
"type" in col &&
(col as any).type === "Buffer" &&
Array.isArray((col as any).data)
) {
// IPC serialization fallback (still copies, but unavoidable without SharedArrayBuffer)
colbert = new Int8Array((col as any).data);
} else if (ArrayBuffer.isView(col)) {
// Handles Uint8Array and other typed arrays (e.g. from LanceDB)
colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (Array.isArray(col)) {
colbert = new Int8Array(col);
} else {
colbert = new Int8Array(0);
}

⚠️ Potential issue | 🟡 Minor

Potential data corruption when handling non-Int8 typed arrays.

The ArrayBuffer.isView branch at lines 271-273 will accept any typed array (e.g., Float32Array, Uint16Array) and reinterpret its bytes as Int8Array. This could produce incorrect embeddings if the source isn't already Int8Array or Uint8Array.

       } else if (Buffer.isBuffer(col)) {
         colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
-      } else if (ArrayBuffer.isView(col)) {
-        // Handles Uint8Array and other typed arrays (e.g. from LanceDB)
+      } else if (col instanceof Uint8Array) {
+        // Handles Uint8Array (e.g. from LanceDB)
         colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
       } else if (Array.isArray(col)) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let colbert: Int8Array;
if (col instanceof Int8Array) {
colbert = col;
} else if (Buffer.isBuffer(col)) {
colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (
col &&
typeof col === "object" &&
"type" in col &&
(col as any).type === "Buffer" &&
Array.isArray((col as any).data)
) {
// IPC serialization fallback (still copies, but unavoidable without SharedArrayBuffer)
colbert = new Int8Array((col as any).data);
} else if (ArrayBuffer.isView(col)) {
// Handles Uint8Array and other typed arrays (e.g. from LanceDB)
colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (Array.isArray(col)) {
colbert = new Int8Array(col);
} else {
colbert = new Int8Array(0);
}
let colbert: Int8Array;
if (col instanceof Int8Array) {
colbert = col;
} else if (Buffer.isBuffer(col)) {
colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (col instanceof Uint8Array) {
// Handles Uint8Array (e.g. from LanceDB)
colbert = new Int8Array(col.buffer, col.byteOffset, col.byteLength);
} else if (Array.isArray(col)) {
colbert = new Int8Array(col);
} else {
colbert = new Int8Array(0);
}

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 8

🧹 Nitpick comments (5)
src/commands/utility/doctor.ts (1)

5-8: Avoid import-time native crashes by lazy-loading ../../lib/native inside the action.

If ../../lib/native ever loads/validates the N-API binary at module import time, osgrep doctor (and potentially osgrep --help) can crash before your try/catch runs. Consider dynamically importing within the try so failures are contained.

-import { initNative } from "../../lib/native";
+// NOTE: keep native load inside the command action so failures are catchable.
     try {
-      await initNative();
+      const { initNative } = await import("../../lib/native");
+      await initNative();
       console.log("✅ Native models initialized");
     } catch (e) {
osgrep-core/src/colbert_ort.rs (1)

373-373: Consider making batch size configurable.

The hardcoded batch_size = 64 works well for most cases, but making it configurable (e.g., via a method parameter with a default) would allow tuning for different hardware or workload characteristics.

osgrep-core/src/dense_ort.rs (3)

132-170: Avoid avoidable clones + per-row allocations in the hot path

  • encode_batch(texts: Vec<String>, ...) immediately does texts.clone() just for tokenization.
  • Mean pooling allocates sum_hidden = vec![...] for every row.

This is likely on your query-time critical path; consider changing the API to accept &[String] (or impl IntoIterator<Item = impl AsRef<str>>) and reusing buffers for pooling.

-    pub fn encode_batch(&mut self, texts: Vec<String>, normalize: bool) -> anyhow::Result<Vec<f32>> {
+    pub fn encode_batch(&mut self, texts: &[String], normalize: bool) -> anyhow::Result<Vec<f32>> {
         if texts.is_empty() {
             return Err(anyhow::anyhow!("Empty input texts"));
         }
 
         // Tokenize
-        let encodings = self.tokenizer
-            .encode_batch(texts.clone(), true)
+        let encodings = self.tokenizer
+            .encode_batch(texts.to_vec(), true) // consider a non-cloning form if supported by your tokenizers version
             .map_err(|e| anyhow::anyhow!("Tokenization failed: {}", e))?;
@@
-        for i in 0..batch_size {
-            let mut sum_hidden = vec![0.0f32; self.hidden_size];
+        let mut sum_hidden = vec![0.0f32; self.hidden_size];
+        for i in 0..batch_size {
+            sum_hidden.fill(0.0);
             let mut sum_mask = 0.0f32;

Also applies to: 188-207
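For reference, the masked mean pooling discussed here can be sketched in miniature (assumptions: per-token hidden states as rows and a 0/1 attention mask; this is a simplified stand-in for the crate's ORT tensor handling):

```rust
// Sketch of masked mean pooling: average the hidden states of non-padding
// tokens (mask == 1). Simplified stand-in for the ORT output handling.
fn mean_pool(hidden: &[Vec<f32>], mask: &[u8]) -> Vec<f32> {
    let dim = hidden.first().map(|h| h.len()).unwrap_or(0);
    let mut sum = vec![0.0f32; dim];
    let mut count = 0.0f32;
    for (h, &m) in hidden.iter().zip(mask) {
        if m == 1 {
            for (s, v) in sum.iter_mut().zip(h) {
                *s += v;
            }
            count += 1.0;
        }
    }
    let denom = count.max(1.0); // guard against an all-masked row
    for s in sum.iter_mut() {
        *s /= denom;
    }
    sum
}

fn main() {
    let hidden = vec![vec![2.0, 4.0], vec![4.0, 8.0], vec![100.0, 100.0]];
    let mask = [1u8, 1, 0]; // last token is padding and is excluded
    println!("{:?}", mean_pool(&hidden, &mask));
}
```

The buffer-reuse suggestion above amounts to hoisting `sum` out of the per-row loop and calling `fill(0.0)` instead of reallocating.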


24-82: De-duplicate session/tokenizer construction between load_from_hf() and load()
The two constructors are ~90% identical (session builder + tokenizer config). A small private helper (e.g., fn build_session(model_path: impl AsRef<Path>) -> ... and fn build_tokenizer(tokenizer_path: ...) -> ...) will reduce drift and make it easier to evolve options (threads, EPs, max_len) consistently.

Also applies to: 85-130


42-56: Hardcoded intra-op threads (with_intra_threads(4)) is a portability/perf footgun
This will underutilize big machines and oversubscribe small ones (and may fight JS-layer concurrency). Consider defaulting to num_cpus (or making it configurable from the caller / env).

Also applies to: 91-105
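A portable default can be derived at runtime instead of hardcoding; a minimal sketch, where `OSGREP_ORT_THREADS` is a hypothetical override (not an existing option in the crate):

```rust
use std::thread;

// Sketch: derive an intra-op thread count at runtime instead of hardcoding 4/8.
// OSGREP_ORT_THREADS is a hypothetical env override, not an existing option.
fn intra_threads() -> usize {
    if let Ok(v) = std::env::var("OSGREP_ORT_THREADS") {
        if let Ok(n) = v.parse::<usize>() {
            if n > 0 {
                return n;
            }
        }
    }
    // Fall back to the host's available parallelism, capped so the native
    // pool does not oversubscribe alongside JS-layer concurrency.
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(4)
        .clamp(1, 8)
}

fn main() {
    println!("intra-op threads: {}", intra_threads());
}
```

The resulting value would then feed `.with_intra_threads(...)` in both session builders, keeping the two files consistent.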

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 31797a0 and 816901b.

📒 Files selected for processing (6)
  • .gitignore (1 hunks)
  • osgrep-core/Cargo.toml (1 hunks)
  • osgrep-core/src/colbert_ort.rs (1 hunks)
  • osgrep-core/src/dense_ort.rs (1 hunks)
  • src/commands/agent/mcp.ts (1 hunks)
  • src/commands/utility/doctor.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • osgrep-core/Cargo.toml
🧰 Additional context used
🧬 Code graph analysis (3)
src/commands/utility/doctor.ts (1)
src/lib/native/index.ts (1)
  • initNative (27-44)
osgrep-core/src/dense_ort.rs (1)
osgrep-core/src/colbert_ort.rs (4)
  • log_native (10-14)
  • load_from_hf (40-120)
  • outputs (180-180)
  • outputs (288-288)
osgrep-core/src/colbert_ort.rs (1)
osgrep-core/src/dense_ort.rs (4)
  • log_native (9-13)
  • load_from_hf (24-82)
  • hidden_size (234-236)
  • outputs (185-185)
🔇 Additional comments (13)
.gitignore (4)

25-29: Verify that entire directories (scripts/, tools/, experiments/, public/) are intentionally ignored.

Ignoring complete directories can mask important source code from version control. Based on the PR summary mentioning removal of "skeleton/trace/verify CLI commands" and "many worker, embedding, and experiment scripts," please confirm these directories have been deleted or reorganized and are not critical for the project.

Please verify:

  1. Have the scripts/, tools/, experiments/, and public/ directories been deleted entirely from the codebase, or are they being relocated?
  2. If they still exist, do they contain build artifacts or generated files that should be ignored, or is there important source code that should remain tracked?

3-3: Pattern clarity: **/node_modules/ is correct but adds scope.

The addition of **/node_modules/ on line 3 complements the root-level node_modules/ on line 2. This ensures node_modules in any subdirectory (e.g., osgrep-core/node_modules/) is ignored, which is appropriate given the introduction of a Rust subproject with potential Node bindings.


48-52: Rust and N-API build artifacts are properly ignored.

The new patterns for osgrep-core build outputs are well-structured:

  • osgrep-core/target/ → Cargo build artifacts
  • **/target/ → Fallback for any Rust project in subdirectories
  • osgrep-core/*.node → Compiled N-API bindings
  • osgrep-core/bench/ → Benchmark outputs

42-47: Migration to bun is reflected correctly in lockfile ignores.

The removal of pnpm-lock.yaml, package-lock.json, yarn.lock from tracked files and addition of bun.lockb correctly reflects the tooling migration. Ensure bun.lock (human-readable lockfile) is tracked if your project uses it, or confirm bun.lockb is the only lockfile being generated.

Please verify that the build tooling migration includes updating CI/CD workflows, package installation scripts, and documentation to use bun instead of pnpm.

src/commands/agent/mcp.ts (1)

10-12: All updated import paths resolve correctly—no issues found.

The three modules exist at their expected locations and the import paths are correct:

  • syncer.tssrc/lib/index/syncer.ts
  • setup-helpers.tssrc/lib/setup/setup-helpers.ts
  • project-root.tssrc/lib/utils/project-root.ts
osgrep-core/src/colbert_ort.rs (8)

1-14: LGTM!

The imports are appropriate for ColBERT-ORT functionality, and the no-op logging pattern matches the approach in dense_ort.rs.


16-22: LGTM!

The constants are well-defined with clear documentation about the MaxSim cost implications.


24-37: LGTM!

The struct design is well-organized with appropriate types for each field. The use of Option for marker tokens and HashSet for skip IDs demonstrates good design choices.


122-207: LGTM!

The query encoding logic is well-implemented with proper handling of optional marker tokens, careful truncation accounting for special tokens, and correct L2 normalization with numerical stability protection.
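The normalization pattern this comment refers to can be illustrated with a small sketch (the epsilon guard is a common form; the exact constant used in the crate may differ):

```rust
// L2-normalize a vector in place, guarding against division by (near-)zero.
// The epsilon value here is illustrative; the crate's constant may differ.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    let denom = norm.max(1e-12);
    for x in v.iter_mut() {
        *x /= denom;
    }
}

fn main() {
    let mut v = [3.0f32, 4.0];
    l2_normalize(&mut v);
    println!("{:?}", v); // unit-length vector
}
```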


209-325: LGTM!

The document encoding implementation properly handles batching, dynamic padding, and preserves token IDs for skiplist filtering. The L2 normalization is correctly applied per-token.


327-360: LGTM!

The MaxSim scoring algorithm is correctly implemented with proper skiplist filtering and numerical stability.
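For readers unfamiliar with MaxSim, a minimal reference sketch (per-token embeddings assumed L2-normalized; `skip_ids` stands in for the skiplist — this is the textbook form, not the crate's exact code):

```rust
use std::collections::HashSet;

// MaxSim sketch: for each query token, take the best dot product against the
// document's tokens (ignoring skiplisted token IDs), then sum over the query.
// Embeddings are assumed L2-normalized, so the dot product is cosine similarity.
fn maxsim(query: &[Vec<f32>], doc: &[(u32, Vec<f32>)], skip_ids: &HashSet<u32>) -> f32 {
    query
        .iter()
        .map(|q| {
            doc.iter()
                .filter(|(id, _)| !skip_ids.contains(id))
                .map(|(_, d)| q.iter().zip(d).map(|(a, b)| a * b).sum::<f32>())
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .filter(|s| s.is_finite()) // a fully skiplisted doc contributes nothing
        .sum()
}

fn main() {
    let query = vec![vec![1.0, 0.0]];
    let doc = vec![(7u32, vec![1.0, 0.0]), (9u32, vec![0.0, 1.0])];
    let skip: HashSet<u32> = [9].into_iter().collect();
    println!("score = {}", maxsim(&query, &doc, &skip));
}
```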


455-495: LGTM!

The packed encoding implementation efficiently concatenates embeddings for storage and reuse. The offset calculation correctly tracks the running position in the flat embeddings array.
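The offset bookkeeping noted here (element counts, not bytes) can be shown in miniature:

```rust
// Pack per-document embeddings into one flat buffer, recording the element
// offset (f32 index, not a byte offset) at which each document starts.
fn pack(docs: &[Vec<f32>]) -> (Vec<f32>, Vec<u32>) {
    let mut flat = Vec::new();
    let mut offsets = Vec::with_capacity(docs.len());
    for d in docs {
        offsets.push(flat.len() as u32); // running element count
        flat.extend_from_slice(d);
    }
    (flat, offsets)
}

fn main() {
    let (flat, offsets) = pack(&[vec![1.0, 2.0], vec![3.0]]);
    println!("flat={:?} offsets={:?}", flat, offsets);
}
```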


560-608: LGTM!

The fast query-time reranking path correctly uses pre-encoded documents and properly maps scores back to the original document indices.

Comment thread .gitignore
.DS_Store

# Local osgrep data
.osgrep/

⚠️ Potential issue | 🟡 Minor

Redundant entry: remove line 53 .osgrep.

Line 53 (.osgrep) is a duplicate of the new entry on line 9 (.osgrep/). The trailing slash is more precise as it targets only directories. Remove the old entry to avoid confusion.

-.osgrep

Also applies to: 53-53

🤖 Prompt for AI Agents
In .gitignore around lines 9 and 53, there is a duplicate entry for `.osgrep`
(line 9 has `.osgrep/` and line 53 has `.osgrep`); remove the redundant line 53
entry so only the more precise directory pattern `.osgrep/` remains, ensuring no
other references to the same pattern are left elsewhere in the file.

Comment thread osgrep-core/src/colbert_ort.rs Outdated
.build(),
])?
.with_optimization_level(GraphOptimizationLevel::Level3)?
.with_intra_threads(8)?

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check intra_threads usage across the codebase
rg -n 'intra_threads' --type=rust -C 2

# Also search for any configuration or documentation about thread settings
rg -n 'thread' --type=rust osgrep-core/src/ | grep -i 'intra\|config\|session'

Repository: Ryandonofrio3/osgrep

Length of output: 2561


🏁 Script executed:

#!/bin/bash
# Check for comments around the thread configuration in both files
echo "=== colbert_ort.rs context ===" 
sed -n '70,90p' osgrep-core/src/colbert_ort.rs

echo -e "\n=== dense_ort.rs context ===" 
sed -n '40,60p' osgrep-core/src/dense_ort.rs

# Search for any comments about thread tuning or performance in the codebase
rg -i 'colbert|thread|performance|tuning' osgrep-core/src/*.rs | head -20

Repository: Ryandonofrio3/osgrep

Length of output: 3587


Add a comment explaining the thread count difference.

colbert_ort.rs uses intra_threads(8) while dense_ort.rs uses intra_threads(4). This difference is consistent across both CoreML and CPU paths in each file, but lacks documentation. Add a comment explaining whether this is intentional based on model characteristics (e.g., different model sizes/architecture) or performance tuning rationale.

🤖 Prompt for AI Agents
In osgrep-core/src/colbert_ort.rs around line 78, add a short comment next to
the .with_intra_threads(8)? call explaining why this file uses 8 intra-thread
workers whereas dense_ort.rs uses 4 (e.g., intentional difference due to model
size/architecture, different layer parallelism, or performance
tuning/benchmarks); if the choice is not intentional, note that it should be
unified and add a TODO referencing where to run/record a quick benchmark or link
to an issue so maintainers can validate and standardize the thread counts across
both files.

let cls_id = *vocab.get("[CLS]").unwrap_or(&0);
let sep_id = *vocab.get("[SEP]").unwrap_or(&0);
let mask_id = *vocab.get("[MASK]").unwrap_or(&0);
let pad_id = *vocab.get("[PAD]").unwrap_or(&mask_id); // Use MASK as PAD if no PAD

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Execute the provided verification script
rg -n -C3 'PAD.*MASK|pad.*fallback' --type=rust

# Also check for pad_id/mask_id patterns
rg -n -C3 'pad_id.*mask_id' --type=rust -g '**/*test*.rs'

Repository: Ryandonofrio3/osgrep

Length of output: 860


🏁 Script executed:

# Get context around line 96 in colbert_ort.rs
head -120 osgrep-core/src/colbert_ort.rs | tail -40

Repository: Ryandonofrio3/osgrep

Length of output: 1557


🏁 Script executed:

# Search for where mask_id comes from and how it's used
rg -n 'mask_id' osgrep-core/src/colbert_ort.rs -B5 -A5

Repository: Ryandonofrio3/osgrep

Length of output: 2078


🏁 Script executed:

# Check if there are any comments or documentation about token handling
rg -n 'vocab|token' osgrep-core/src/colbert_ort.rs | head -30

Repository: Ryandonofrio3/osgrep

Length of output: 1820


🏁 Script executed:

# Search for where pad_id is used
rg -n 'pad_id' osgrep-core/src/colbert_ort.rs

Repository: Ryandonofrio3/osgrep

Length of output: 252


🏁 Script executed:

# Search across entire codebase for pad_id usage
rg -n 'pad_id' --type=rust

Repository: Ryandonofrio3/osgrep

Length of output: 376


🏁 Script executed:

# Check how padding is actually handled in the code
rg -n 'pad|padding' osgrep-core/src/colbert_ort.rs -i

Repository: Ryandonofrio3/osgrep

Length of output: 578


🏁 Script executed:

# Check if there are any related tests or documentation in the repo
find . -name "*.md" -o -name "*.txt" | xargs grep -l "pad\|PAD" 2>/dev/null | head -10

Repository: Ryandonofrio3/osgrep

Length of output: 151


Remove unused pad_id field or implement its actual usage.

The pad_id variable is assigned at line 96 with a fallback to mask_id, but it is never used in the encoding or processing logic. The code pads with mask_id directly (line 160) or leaves positions as 0 (line 274), making pad_id dead code. Either remove this field and its assignment, or implement the intended padding logic using pad_id instead of mask_id if semantically required for the model.

pub token_ids: Vec<u32>,
/// Number of tokens per document
pub lengths: Vec<u32>,
/// Byte offsets into embeddings buffer for each doc (for fast lookup)

⚠️ Potential issue | 🟡 Minor

Correct the misleading comment.

The comment states "Byte offsets into embeddings buffer" but the offsets are actually element indices (f32 counts), not byte offsets. This is evident from line 481 where all_embeddings.len() (an element count) is stored.

Apply this diff to correct the comment:

-    /// Byte offsets into embeddings buffer for each doc (for fast lookup)
+    /// Element offsets into embeddings buffer for each doc (for fast lookup)
📝 Committable suggestion


Suggested change
/// Byte offsets into embeddings buffer for each doc (for fast lookup)
/// Element offsets into embeddings buffer for each doc (for fast lookup)
🤖 Prompt for AI Agents
In osgrep-core/src/colbert_ort.rs around line 449, the comment "Byte offsets
into embeddings buffer for each doc (for fast lookup)" is misleading because the
values are element indices (counts of f32 elements) not byte offsets; update the
comment to clearly state these are element offsets (e.g., "Element offsets (f32
indices) into embeddings buffer for each doc (for fast lookup)") so it
accurately reflects that values like all_embeddings.len() are stored as element
counts.

Comment on lines +515 to +519
let token_offset: usize = packed.offsets[..doc_idx]
.iter()
.zip(&packed.lengths[..doc_idx])
.map(|(&off, &len)| len as usize)
.sum();

🛠️ Refactor suggestion | 🟠 Major

Simplify the token offset calculation.

The code zips offsets[..doc_idx] with lengths[..doc_idx] but then only uses the lengths in the map function (the &off binding is unused). This is unnecessarily complex.

Apply this diff to simplify:

-            let token_offset: usize = packed.offsets[..doc_idx]
-                .iter()
-                .zip(&packed.lengths[..doc_idx])
-                .map(|(&off, &len)| len as usize)
-                .sum();
+            let token_offset: usize = packed.lengths[..doc_idx]
+                .iter()
+                .map(|&len| len as usize)
+                .sum();

Or even more concisely:

-            let token_offset: usize = packed.offsets[..doc_idx]
-                .iter()
-                .zip(&packed.lengths[..doc_idx])
-                .map(|(&off, &len)| len as usize)
-                .sum();
+            let token_offset: usize = packed.lengths[..doc_idx]
+                .iter()
+                .sum::<u32>() as usize;
📝 Committable suggestion


Suggested change
let token_offset: usize = packed.offsets[..doc_idx]
.iter()
.zip(&packed.lengths[..doc_idx])
.map(|(&off, &len)| len as usize)
.sum();
let token_offset: usize = packed.lengths[..doc_idx]
.iter()
.map(|&len| len as usize)
.sum();
🤖 Prompt for AI Agents
In osgrep-core/src/colbert_ort.rs around lines 515 to 519, the token_offset
calculation zips offsets[..doc_idx] with lengths[..doc_idx] but only uses the
lengths, so remove the unnecessary zip and unused binding and compute
token_offset by summing the lengths slice up to doc_idx (casting each length to
usize) — replace the zip/map with a direct iterator over
packed.lengths[..doc_idx].iter().map(|&len| len as usize).sum() (or equivalent)
to simplify and clarify the calculation.

Comment on lines +30 to +36
// Download model and tokenizer files
// ONNX model is in onnx/ subdirectory for onnx-community repos
// Also download the external data file if it exists
let model_path = repo.get("onnx/model.onnx")?;
let _ = repo.get("onnx/model.onnx_data"); // External data, ignore if not found
let tokenizer_path = repo.get("tokenizer.json")?;


⚠️ Potential issue | 🟠 Major

Don’t silently ignore model.onnx_data download failures (can mask the real root cause)

let _ = repo.get("onnx/model.onnx_data") will swallow all errors (not just “not found”), which makes external-data model failures harder to debug and can hide transient network/auth issues. Consider logging the error (at least) and/or only ignoring a “not found” case.

-        let _ = repo.get("onnx/model.onnx_data"); // External data, ignore if not found
+        match repo.get("onnx/model.onnx_data") {
+            Ok(path) => log_native(format!("[ORT] Downloaded external data: {:?}", path)),
+            Err(e) => {
+                // Many models won't have external data; still surface unexpected failures in logs.
+                log_native(format!("[ORT] External data not downloaded (may be absent): {e}"));
+            }
+        }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In osgrep-core/src/dense_ort.rs around lines 30–36, the call `let _ =
repo.get("onnx/model.onnx_data")` swallows all errors; change it to only ignore
a “not found”/missing-file error and for any other error either log the error
with context (including the path and error) or return it to the caller.
Concretely, call repo.get and match the Result: if the error indicates not
found, continue silently; otherwise log the error (using the crate logger) with
the path and error details and propagate it (or return Err) so
transient/network/auth issues aren’t hidden.

Comment thread osgrep-core/src/dense_ort.rs Outdated
Comment on lines +39 to +57
// Initialize ONNX Runtime session
// On macOS, use CoreML for GPU acceleration with CPU fallback
#[cfg(target_os = "macos")]
let session = Session::builder()?
.with_execution_providers([
CoreMLExecutionProvider::default()
.with_subgraphs(true) // Enable CoreML for subgraphs
.build(),
])?
.with_optimization_level(GraphOptimizationLevel::Level3)?
.with_intra_threads(4)?
.commit_from_file(&model_path)?;

#[cfg(not(target_os = "macos"))]
let session = Session::builder()?
.with_optimization_level(GraphOptimizationLevel::Level3)?
.with_intra_threads(4)?
.commit_from_file(&model_path)?;


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the dense_ort.rs file to understand the full context
fd dense_ort.rs

Repository: Ryandonofrio3/osgrep

Length of output: 92


🏁 Script executed:

# Check for the file and read it
fd -e rs dense_ort

Repository: Ryandonofrio3/osgrep

Length of output: 92


🏁 Script executed:

# List all rust files in osgrep-core/src
find osgrep-core/src -name "*.rs" -type f 2>/dev/null | head -20

Repository: Ryandonofrio3/osgrep

Length of output: 146


🏁 Script executed:

# Check Cargo.toml to see ort version
find . -name Cargo.toml -type f | head -5

Repository: Ryandonofrio3/osgrep

Length of output: 88


🏁 Script executed:

# Read the dense_ort.rs file
cat -n osgrep-core/src/dense_ort.rs

Repository: Ryandonofrio3/osgrep

Length of output: 10847


🏁 Script executed:

# Read the Cargo.toml
cat osgrep-core/Cargo.toml

Repository: Ryandonofrio3/osgrep

Length of output: 943


🌐 Web query:

ort 2.0.0-rc.10 session inputs API rust

💡 Result:

Here's a concise summary of how to provide session inputs with ort 2.0.0-rc.10 (Rust), with minimal examples.

Key points

  • Use the inputs! macro to build inputs easily (either positional array or named map). [1]
  • Session.run accepts any type convertible into SessionInputs<'i, 'v, N>; common usage is session.run(ort::inputs![...])? or session.run(ort::inputs!{"name" => value})?. [1][2]
  • You can pass Tensor, TensorRef, Value, Vec, arrays, or HashMap<String,Value> depending on whether you build statically or dynamically. For async/timeout control use run_with_options / run_async with RunOptions. [1]

Minimal examples

  • Positional array:
    let outputs = session.run(ort::inputs![Tensor::from_array(([1, 3, 224, 224], image_vec))?])?;
  • Named inputs:
    let outputs = session.run(ort::inputs!{
        "input_ids" => Tensor::from_array(([1, seq_len], ids_vec))?,
        "attention_mask" => Tensor::from_array(([1, seq_len], mask_vec))?
    })?;
  • Using TensorRef (avoid extra copy):
    let outputs = session.run(ort::inputs![TensorRef::from_array_view(&ndarray_array)?])?;

References

  • Session API (Session.run / run_with_options, Session.inputs/outputs). [1]
  • inputs! macro docs and examples. [2]
  • ort 2.0.0-rc.10 introduction / getting started. [3]

Sources
[1] Session (ort crate) API docs. [2] inputs! macro docs. [3] ort 2.0.0-rc.10 guide (getting started).


🌐 Web query:

ort rust 2.0.0-rc.10 extract tensor shape

💡 Result:

Use the OrtOwnedTensor’s underlying ndarray view (it implements Deref to an ndarray::ArrayView) and call shape() or dim().

Example:

let outputs: Vec<OrtOwnedTensor<f32,_>> = session.run(inputs)?;
let out = &outputs[0];           // OrtOwnedTensor
let shape: &[usize] = out.shape(); // e.g. &[1,3,224,224]
let dim = out.ndim();            // number of axes

References: OrtOwnedTensor docs (deref to ndarray::ArrayView) and DynArrayRef shape helper. [1][2]


🌐 Web query:

ort 2.0.0-rc.10 session.inputs name field rust

💡 Result:

Yes — in ort 2.0.0-rc.10 Session.inputs is a Vec<Input> and Input has a pub name: String. Example:

let session = Session::builder()?.commit_from_file("model.onnx")?;
for input in &session.inputs {
    println!("input name = {}", input.name);
}

Docs: Input struct (shows name field). [1] Session struct (shows inputs: Vec<Input>). [2]

Sources
[1] https://docs.rs/ort/latest/ort/session/struct.Input.html
[2] https://docs.rs/ort/latest/ort/session/struct.Session.html


🌐 Web query:

ort crate rust Session inputs input metadata

💡 Result:

Short answer: the ort crate’s Session exposes its inputs as Session.inputs: Vec<Input>, where Input has name, input_type (TensorElementDataType), and dimensions (a Vec of optional dimension sizes) plus helper methods; model-level metadata is available via Session::metadata(). Use the inputs! macro or Value/Tensor helpers to build run inputs. Example:

Code:

  • Inspect inputs / metadata:
    let mut session = Session::builder()?.with_model_from_file("model.onnx")?;
    for input in &session.inputs {
        println!("name={} type={:?} dims={:?}", input.name, input.input_type, input.dimensions());
    }
    let meta = session.metadata()?;
    println!("model name: {:?}", meta.name());

  • Run using inputs! macro:
    let outputs = session.run(ort::inputs! { "input_name" => Tensor::from_array(([1,3,224,224], data))? })?;

Sources:

  • Session (fields & methods). [1]
  • Input struct (name, input_type, dimensions + dimensions() iterator). [2]
  • inputs! macro (convenience macro for building inputs). [3]

References:
[1] docs.rs/ort :: Session.
[2] docs.rs/ort :: session.rs (Input struct).
[3] docs.rs/ort :: inputs! macro.


Add runtime validation for ORT input names and output tensor shape

The encode_batch() method unconditionally passes token_type_ids (line 181), but many embedding models don't accept this input. Additionally, there is no validation that the output tensor has the expected shape [batch, seq, hidden] before indexing into it.

Add a conditional check for the token_type_ids input by inspecting self.session.inputs, and validate the embeddings tensor shape after extraction:

         // Run inference
-        let outputs = self.session.run(ort::inputs![
-            "input_ids" => input_ids,
-            "attention_mask" => attention_mask_tensor,
-            "token_type_ids" => token_type_ids
-        ])?;
+        // Some models do not accept token_type_ids
+        let outputs = if self.session.inputs.iter().any(|i| i.name == "token_type_ids") {
+            self.session.run(ort::inputs![
+                "input_ids" => input_ids,
+                "attention_mask" => attention_mask_tensor,
+                "token_type_ids" => token_type_ids
+            ])?
+        } else {
+            self.session.run(ort::inputs![
+                "input_ids" => input_ids,
+                "attention_mask" => attention_mask_tensor
+            ])?
+        };
 
         // Get the last_hidden_state output (typically first output)
         let embeddings_tensor = outputs[0].try_extract_tensor::<f32>()?;
+        // Validate output shape before indexing
+        let shape = embeddings_tensor.shape();
+        if shape.len() != 3 || shape[0] != batch_size || shape[1] != max_len || shape[2] != self.hidden_size {
+            return Err(anyhow::anyhow!(
+                "Unexpected embeddings shape {:?}; expected [{}, {}, {}]",
+                shape, batch_size, max_len, self.hidden_size
+            ));
+        }
         let embeddings_data: &[f32] = embeddings_tensor.1;

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +27 to 34
    try {
      await initNative();
      console.log("✅ Native models initialized");
    } catch (e) {
      const msg = e instanceof Error ? e.message : String(e);
      console.log(`❌ Native init failed: ${msg}`);
      console.log("   Try: osgrep setup");
    }

⚠️ Potential issue | 🟡 Minor

Return a non-zero exit code (and use console.error) when native init fails.

Right now failures are only logged; consumers (CI/scripts) can’t reliably detect an unhealthy state. Consider setting process.exitCode = 1 (or accumulating a healthy flag) and emitting errors to stderr.

     } catch (e) {
       const msg = e instanceof Error ? e.message : String(e);
-      console.log(`❌ Native init failed: ${msg}`);
-      console.log("   Try: osgrep setup");
+      console.error(`❌ Native init failed: ${msg}`);
+      console.error("   Try: osgrep setup");
+      process.exitCode = 1;
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    try {
      await initNative();
      console.log("✅ Native models initialized");
    } catch (e) {
      const msg = e instanceof Error ? e.message : String(e);
      console.log(`❌ Native init failed: ${msg}`);
      console.log("   Try: osgrep setup");
    }
    try {
      await initNative();
      console.log("✅ Native models initialized");
    } catch (e) {
      const msg = e instanceof Error ? e.message : String(e);
      console.error(`❌ Native init failed: ${msg}`);
      console.error("   Try: osgrep setup");
      process.exitCode = 1;
    }
🤖 Prompt for AI Agents
In src/commands/utility/doctor.ts around lines 27 to 34, the catch block only
logs failures to stdout and doesn't signal an error to callers; change the catch
to write messages to stderr using console.error (both the "Native init failed:
<msg>" and the "Try: osgrep setup" hint) and set process.exitCode = 1 (or
otherwise ensure the command returns a non-zero exit) so CI/scripts can detect
the failure.

…n osgrep. Added process checks for running servers, improved project root detection, and introduced file watching for live indexing. Updated search command to handle server responses and added tracing functionality. Enhanced ignore patterns for generated files in search results.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (7)
osgrep-core/src/colbert_ort.rs (3)

498-502: Simplify the unnecessary zip operation in token offset calculation.

The past review comment on this issue remains unaddressed. The code zips offsets[..doc_idx] with lengths[..doc_idx] but only uses the lengths in the map function (the &off binding is unused).

Apply this diff to simplify:

-            let token_offset: usize = packed.offsets[..doc_idx]
-                .iter()
-                .zip(&packed.lengths[..doc_idx])
-                .map(|(&off, &len)| len as usize)
-                .sum();
+            let token_offset: usize = packed.lengths[..doc_idx]
+                .iter()
+                .map(|&len| len as usize)
+                .sum();

Or even more concisely:

-            let token_offset: usize = packed.offsets[..doc_idx]
-                .iter()
-                .zip(&packed.lengths[..doc_idx])
-                .map(|(&off, &len)| len as usize)
-                .sum();
+            let token_offset: usize = packed.lengths[..doc_idx]
+                .iter()
+                .sum::<u32>() as usize;

79-79: Remove unused pad_id field or implement its usage.

The past review comment on this issue remains unaddressed. The pad_id is assigned here (with fallback to mask_id) but is never referenced anywhere in the encoding logic. The code uses mask_id directly for padding (line 143) or leaves positions as 0. Either remove this dead code or implement the intended padding logic using pad_id.

Apply this diff to remove the unused field:

     let cls_id = *vocab.get("[CLS]").unwrap_or(&0);
     let sep_id = *vocab.get("[SEP]").unwrap_or(&0);
     let mask_id = *vocab.get("[MASK]").unwrap_or(&0);
-    let pad_id = *vocab.get("[PAD]").unwrap_or(&mask_id); // Use MASK as PAD if no PAD

     // ColBERT marker tokens (may not exist in all tokenizers)

And remove from struct definition (line 29):

     mask_id: u32,
-    pad_id: u32,
     query_marker_id: Option<u32>,

And from struct initialization (line 98):

         mask_id,
-        pad_id,
         query_marker_id,

And from logging (line 85-86):

-        log_native(format!("[ColBERT-ORT] Token IDs: CLS={}, SEP={}, MASK={}, PAD={}",
-            cls_id, sep_id, mask_id, pad_id));
+        log_native(format!("[ColBERT-ORT] Token IDs: CLS={}, SEP={}, MASK={}",
+            cls_id, sep_id, mask_id));

432-432: Correct the misleading comment about offsets.

The past review comment on this issue remains unaddressed. The comment states "Byte offsets into embeddings buffer" but the values are actually element indices (f32 counts), not byte offsets. This is evident from line 464 where all_embeddings.len() (an element count) is stored as an offset.

Apply this diff to correct the comment:

-    /// Byte offsets into embeddings buffer for each doc (for fast lookup)
+    /// Element offsets into embeddings buffer for each doc (for fast lookup)
osgrep-core/src/dense_ort.rs (2)

31-31: Silently ignoring all errors for external data download can hide real issues.

The past review comment on this line remains unaddressed. Using let _ = swallows all errors, not just "file not found". This makes debugging difficult when external data downloads fail due to network issues, authentication problems, or other transient failures.

Consider logging the error at minimum:

-        let _ = repo.get("onnx/model.onnx_data"); // External data, ignore if not found
+        match repo.get("onnx/model.onnx_data") {
+            Ok(path) => log_native(format!("[ORT] Downloaded external data: {:?}", path)),
+            Err(e) => {
+                // Many models won't have external data; log unexpected failures for debugging
+                log_native(format!("[ORT] External data not downloaded (may be absent): {e}"));
+            }
+        }

149-153: Runtime failure on models that don't accept token_type_ids input.

The past review comment on this issue remains unaddressed. Many embedding models don't accept the token_type_ids input, which will cause this code to fail at runtime. You need to conditionally check which inputs the model accepts by inspecting self.session.inputs.

Apply this fix to conditionally pass token_type_ids:

-        // Run inference
-        let outputs = self.session.run(ort::inputs![
-            "input_ids" => input_ids,
-            "attention_mask" => attention_mask_tensor,
-            "token_type_ids" => token_type_ids
-        ])?;
+        // Run inference - some models don't accept token_type_ids
+        let outputs = if self.session.inputs.iter().any(|i| i.name == "token_type_ids") {
+            self.session.run(ort::inputs![
+                "input_ids" => input_ids,
+                "attention_mask" => attention_mask_tensor,
+                "token_type_ids" => token_type_ids
+            ])?
+        } else {
+            self.session.run(ort::inputs![
+                "input_ids" => input_ids,
+                "attention_mask" => attention_mask_tensor
+            ])?
+        };
src/lib/native/index.ts (2)

45-47: Race condition remains from previous review.

The .finally(() => { initPromise = null; }) on line 46 still creates a race window. If a caller arrives after initPromise resolves but before the finally block completes, it will see initialized = true and return immediately (line 28). However, if another caller arrives during the brief window when initPromise is being set to null, it could start a new initialization.

As noted in the past review comment, you should either keep initPromise cached permanently or wrap initialization in try/catch to only reset on failure.


240-250: Type casts remain from previous review.

Lines 240, 245, and 250 still use as any casts. While instanceof checks are present, the as any casts bypass type safety.

As noted in the past review comment, consider:

  1. Updating the RerankInput interface to match what the native binding expects, or
  2. Performing explicit, type-safe conversions with validation

The Float32Array → Float64Array conversion on line 240 is necessary but could be documented.
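
For reference, the widening itself is a one-line element-wise copy; every f32 value is exactly representable as an f64, so the conversion is lossless:

```typescript
// Element-wise widening copy; f32 -> f64 is exact, so no precision is lost.
function widenToF64(src: Float32Array): Float64Array {
  return Float64Array.from(src);
}
```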

🧹 Nitpick comments (3)
plugins/osgrep/hooks/start.js (1)

61-61: Consider configurable log path.

The log path is hardcoded to /tmp/osgrep.log. Consider using a user-specific path (e.g., ~/.osgrep/logs/hook.log) to avoid permission issues and conflicts with other users on shared systems.

src/commands/trace.ts (1)

7-13: Consider centralizing ANSI utilities.

The inline ANSI style helpers duplicate functionality that could be in a shared utility (e.g., src/lib/utils/ansi.ts mentioned in the PR). While functional, centralizing would improve maintainability.

src/commands/serve.ts (1)

613-690: Consider adding timeout/abort handling to /trace.

The /trace endpoint lacks the timeout and AbortController handling present in /search (lines 518-578). While graph queries may be faster, adding timeout protection would make the endpoint more resilient to pathological cases.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 816901b and b261fdb.

⛔ Files ignored due to path filters (1)
  • bun.lockb is excluded by !**/bun.lockb
📒 Files selected for processing (14)
  • osgrep-core/Cargo.toml (1 hunks)
  • osgrep-core/index.d.ts (1 hunks)
  • osgrep-core/src/colbert_ort.rs (1 hunks)
  • osgrep-core/src/dense_ort.rs (1 hunks)
  • plugins/osgrep/hooks/start.js (1 hunks)
  • plugins/osgrep/skills/osgrep/SKILL.md (2 hunks)
  • src/commands/search.ts (8 hunks)
  • src/commands/serve.ts (11 hunks)
  • src/commands/trace.ts (1 hunks)
  • src/index.ts (1 hunks)
  • src/lib/graph/graph-builder.ts (1 hunks)
  • src/lib/index/ignore-patterns.ts (1 hunks)
  • src/lib/native/index.ts (1 hunks)
  • src/lib/search/searcher.ts (10 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/index.ts
  • osgrep-core/Cargo.toml
  • plugins/osgrep/skills/osgrep/SKILL.md
  • osgrep-core/index.d.ts
🧰 Additional context used
🧬 Code graph analysis (7)
plugins/osgrep/hooks/start.js (3)
src/lib/utils/server-registry.ts (2)
  • isProcessRunning (33-40)
  • getServerForProject (66-70)
plugins/osgrep/hooks/stop.js (4)
  • pid (32-32)
  • fs (1-1)
  • data (31-31)
  • cwd (25-25)
src/lib/utils/project-root.ts (1)
  • findProjectRoot (14-29)
src/commands/trace.ts (3)
src/lib/graph/graph-builder.ts (2)
  • CallGraph (13-17)
  • GraphBuilder (26-173)
src/lib/utils/project-root.ts (2)
  • findProjectRoot (14-29)
  • ensureProjectPaths (31-51)
src/lib/utils/exit.ts (1)
  • gracefulExit (3-20)
src/commands/search.ts (3)
src/lib/output/formatter.ts (1)
  • formatResults (101-108)
src/lib/utils/server-registry.ts (1)
  • unregisterServer (50-54)
src/lib/store/vector-db.ts (1)
  • VectorDB (20-302)
src/lib/native/index.ts (2)
src/config.ts (2)
  • MODEL_IDS (4-7)
  • CONFIG (18-26)
osgrep-core/index.d.ts (1)
  • RerankResult (21-26)
src/commands/serve.ts (8)
src/lib/utils/server-registry.ts (5)
  • isProcessRunning (33-40)
  • registerServer (42-48)
  • unregisterServer (50-54)
  • listServers (56-64)
  • getServerForProject (66-70)
src/lib/store/meta-cache.ts (2)
  • MetaCache (12-60)
  • MetaEntry (6-10)
src/lib/store/types.ts (1)
  • VectorRecord (31-35)
src/lib/utils/file-utils.ts (1)
  • isIndexableFile (44-63)
src/lib/workers/orchestrator.ts (2)
  • processFile (151-202)
  • processFile (346-350)
src/lib/utils/lock.ts (1)
  • acquireWriterLockWithRetry (87-109)
src/lib/index/ignore-patterns.ts (1)
  • DEFAULT_IGNORE_PATTERNS (3-60)
src/lib/graph/graph-builder.ts (1)
  • GraphBuilder (26-173)
src/lib/graph/graph-builder.ts (2)
src/lib/utils/filter-builder.ts (2)
  • escapeSqlString (1-5)
  • normalizePath (12-14)
src/lib/store/types.ts (1)
  • VectorRecord (31-35)
osgrep-core/src/colbert_ort.rs (1)
osgrep-core/src/dense_ort.rs (3)
  • log_native (6-10)
  • load_from_hf (21-66)
  • hidden_size (205-207)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci
🔇 Additional comments (39)
src/lib/native/index.ts (4)

70-89: LGTM: embedDense implementation is correct.

The function properly initializes, calls the native binding, and splits the flat embedding array into per-document vectors.


108-119: LGTM: embedColbert correctly wraps native result.

The function properly wraps the packed ColBERT result in typed arrays.


137-183: LGTM: embedBatch correctly assembles hybrid embeddings.

The function properly extracts dense vectors, ColBERT embeddings, and token IDs from the packed native result, with correct offset/length tracking for each document.


193-202: LGTM: encodeQueryColbert is straightforward.

plugins/osgrep/hooks/start.js (4)

14-42: Helper duplication is acceptable for hook isolation.

The isProcessRunning, findProjectRoot, and getServerForProject functions duplicate logic from src/lib/utils/server-registry.ts and src/lib/utils/project-root.ts. This is acceptable since this hook runs in a separate Node.js process and cannot import TypeScript modules.


47-57: LGTM: Clean early exit for disabled auto-serve.

The environment variable check provides a clear escape hatch for users who don't want automatic server startup.


59-71: LGTM: Server spawn logic is correct.

The code correctly checks for an existing server before spawning, and uses the proper detached + unref pattern for background processes.


73-83: LGTM: Response structure is clear and informative.

src/lib/index/ignore-patterns.ts (1)

62-98: LGTM: Comprehensive generated file patterns.

The regex patterns provide good coverage of common generated/auto-generated code across multiple languages and frameworks. The case-insensitive matching is appropriate.

src/commands/trace.ts (4)

19-41: LGTM: formatPlain provides clean agent-friendly output.


46-92: LGTM: formatTree provides clean visual hierarchy.

The Unicode box-drawing characters create a clear visual structure for the call graph.


97-118: LGTM: formatJson provides complete structured data.


130-163: LGTM: Trace command correctly integrates GraphBuilder.

The command properly:

  • Handles project root discovery with fallback
  • Passes parsed options to GraphBuilder
  • Selects output format based on flags and TTY status
  • Cleans up resources in finally block
src/lib/search/searcher.ts (5)

17-62: LGTM: Search tuning constants are reasonable.

The static constants provide sensible defaults for the search pipeline. The FTS stopwords list covers common English words appropriately.


64-76: LGTM: FTS normalization is well-designed.

The normalization pipeline (lowercase, tokenize, filter short/stopwords, limit tokens) provides good query preparation for full-text search.
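
A minimal illustration of that pipeline — the stopword set, minimum token length, and token cap here are made-up values, not osgrep's actual constants:

```typescript
// Hypothetical stopword set; the real list in searcher.ts is larger.
const STOPWORDS = new Set(["the", "a", "is", "of"]);

// Lowercase, tokenize, drop short/stopword tokens, cap the token count.
function normalizeFtsQuery(q: string, maxTokens = 8): string {
  return q
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((t) => t.length >= 3 && !STOPWORDS.has(t))
    .slice(0, maxTokens)
    .join(" ");
}
```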


198-242: LGTM: Structure boost penalties align with search intent.

The scoring adjustments correctly de-prioritize:

  • Generated files (0.4x)
  • Test files (0.5x)
  • Tooling/scripts (0.35x)
  • Documentation (0.6x)

The penalties appropriately guide results toward production code.


287-413: LGTM: Search initialization and dual retrieval are well-structured.

The method correctly:

  • Handles abort signals at multiple checkpoints
  • Builds safe SQL filters with escapeSqlString
  • Performs parallel vector and FTS search
  • Gracefully handles FTS failures with fallback

The empty FTS query guard on line 396 is a good defensive check.


415-530: LGTM: Fusion, reranking, and result assembly are well-designed.

The pipeline correctly:

  • Fuses vector and FTS results with RRF
  • Applies ColBERT reranking with fallback
  • Blends rerank and fusion scores
  • Deduplicates and diversifies per-file
  • Normalizes scores and assigns confidence levels
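
For readers unfamiliar with RRF, the fusion step boils down to summing reciprocal ranks across the two result lists; a generic sketch (k = 60 is the conventional default, not necessarily what osgrep uses):

```typescript
// Reciprocal Rank Fusion: score(id) = sum over rankings of 1 / (k + rank).
function rrfFuse(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, idx) => {
      const rank = idx + 1; // 1-based rank within this ranking
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}
```

Items appearing near the top of both the vector and FTS lists accumulate the highest fused scores.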
src/commands/serve.ts (10)

79-88: Good defensive registration for spawned server.

Registering the spawned server immediately with port 0 prevents race conditions where multiple concurrent sessions might spawn duplicate servers before the first one finishes indexing and starts listening.


50-92: LGTM: Server check and background spawn are correct.

The code properly:

  • Checks for existing servers before spawning
  • Uses detached process with unref() for background operation
  • Registers spawned server early to prevent races

114-140: LGTM: Live indexing setup with write serialization.

The dbWriteBarrier pattern correctly serializes database writes to avoid LanceDB contention. The watch configuration provides good flexibility via environment variables.


192-322: Excellent two-phase update pattern.

The flushPending function correctly:

  1. Phase 1: Prepares updates outside the lock (hashing, embedding - slow operations)
  2. Phase 2: Applies under writer lock (DB updates - fast operations)

This minimizes lock hold time and maximizes throughput. The race detection on lines 280-295 (checking mtime/size before applying) is a nice safety measure.
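
The two-phase shape can be sketched generically like this — all of the names (`prepare`, `statMtime`, `withWriterLock`, `apply`) are stand-ins for the real osgrep internals, not its API:

```typescript
type Prepared = { path: string; mtimeMs: number; rows: number[] };

async function flushPending(
  pending: string[],
  prepare: (path: string) => Promise<Prepared>, // slow: read + hash + embed
  statMtime: (path: string) => number,          // cheap freshness probe
  withWriterLock: (fn: () => Promise<void>) => Promise<void>,
  apply: (p: Prepared) => Promise<void>,        // fast: DB upsert
): Promise<number> {
  // Phase 1: do all slow work outside the lock.
  const prepared = await Promise.all(pending.map(prepare));

  let applied = 0;
  // Phase 2: apply under the lock, skipping entries that changed meanwhile.
  await withWriterLock(async () => {
    for (const p of prepared) {
      if (statMtime(p.path) !== p.mtimeMs) continue; // raced: drop, re-queue later
      await apply(p);
      applied++;
    }
  });
  return applied;
}
```

The freshness re-check in phase 2 mirrors the mtime/size race detection noted above.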


347-365: Good EMFILE error recovery.

The watcher error handler correctly detects EMFILE (too many open files) and falls back to polling mode. The didLogWatchFallback flag prevents log spam.


454-465: LGTM: Health endpoint provides useful status.


518-578: Excellent request lifecycle management.

The /search endpoint correctly:

  • Limits payload size (1MB)
  • Validates path to prevent traversal
  • Sets up AbortController on both request abort and response close
  • Applies configurable timeout (60s default)
  • Passes abort signal through to searcher

The defensive checks for !res.headersSent && !res.writableEnded prevent double-response errors.


579-603: Good error handling with appropriate status codes.

The error handler correctly distinguishes:

  • AbortError → 504 (timeout/cancelled)
  • isWriting (indexing) → 503 (service unavailable)
  • Generic errors → 500

647-672: LGTM: Trace endpoint correctly integrates GraphBuilder.

The endpoint properly parses options, builds the graph, and returns a structured JSON response.


742-791: LGTM: Shutdown sequence correctly cleans up resources.

The shutdown function properly:

  1. Unregisters server from registry
  2. Closes filesystem watchers
  3. Closes HTTP server (with timeout fallback)
  4. Closes VectorDB and MetaCache
  5. Calls gracefulExit()

The resource cleanup order and error handling are appropriate.

src/lib/graph/graph-builder.ts (5)

5-24: LGTM: Interface design is clean and well-structured.

The separation of GraphNode, CallGraph, and TraceOptions provides clear contracts for the graph builder API.


33-44: LGTM: findDefinitions correctly queries for symbol definitions.


50-72: LGTM: findCallers with self-definition filtering.

The method correctly:

  • Queries for chunks that reference the symbol
  • Filters out self-definitions (where chunk also defines the symbol)
  • Determines caller symbol with appropriate fallbacks

The extra fetch (limit + 20) accounts for filtered records.


103-143: LGTM: buildGraph correctly assembles the call graph.

The method properly:

  • Finds definitions with optional path filtering
  • Selects first definition as center
  • Gets callers and callees based on options flags
  • Filters internal callees (have definitions in index)

145-173: LGTM: Helper methods handle LanceDB type coercion.

The recordToNode and toStringArray helpers correctly handle the various formats that LanceDB might return (arrays, objects with toArray(), etc.).

src/commands/search.ts (6)

264-268: LGTM: Option naming is clear.

The change from -m to -m, --max-count provides both short and long forms, improving CLI usability.


335-364: Excellent defensive health check.

The fast health preflight (150-250ms timeout) prevents hung servers from adding multi-second latency to every query. This is a good UX improvement.
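
The preflight idea in miniature — a hypothetical helper using fetch with an AbortController-based timeout; the `/health` path and timeout value are illustrative, not the exact osgrep implementation:

```typescript
// Probe the server's health endpoint with a short deadline; a hung server
// costs at most timeoutMs instead of the full request timeout.
async function isServerHealthy(base: string, timeoutMs = 200): Promise<boolean> {
  const ctrl = new AbortController();
  const timer = setTimeout(() => ctrl.abort(), timeoutMs);
  try {
    const res = await fetch(`${base}/health`, { signal: ctrl.signal });
    return res.ok;
  } catch {
    return false; // refused, DNS failure, or aborted on timeout
  } finally {
    clearTimeout(timer);
  }
}
```

A `false` result lets the caller fall back to local search immediately rather than waiting on a dead server.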


366-463: LGTM: Server request with proper timeout and formatting.

The server request flow correctly:

  • Applies configurable timeout (700-1200ms)
  • Handles response with multiple format options
  • Filters results by minScore
  • Supports compact, plain, and formatted output paths

467-481: Good defensive server cleanup.

Unregistering the server on ECONNREFUSED or health check failures (lines 471-472) prevents future queries from paying the timeout cost. This is a good optimization.


487-577: LGTM: Local search setup with proper initialization.

The fallback path correctly:

  • Ensures setup and initializes VectorDB
  • Checks for lock and warns about partial results
  • Syncs if needed (first run or --sync flag)
  • Creates FTS index after sync

579-651: LGTM: Local search execution and cleanup.

The search execution properly:

  • Calls searcher with rerank enabled
  • Filters by minScore
  • Formats output (compact/plain/formatted)
  • Cleans up VectorDB in finally block
  • Calls gracefulExit

Comment on lines +78 to 97
  async filterToInternal(symbols: string[]): Promise<string[]> {
    if (symbols.length === 0) return [];

-    // 2. Get Callers
-    const callers = await this.getCallers(symbol);
-
-    // 3. Get Callees (from center)
-    const callees = center ? center.calls : [];
    const table = await this.db.ensureTable();
    const internal: string[] = [];

    // Batch check which symbols have definitions
    // For efficiency, we use a single query with OR conditions
    // But LanceDB doesn't support OR in array_contains well, so we check individually
    // This is acceptable since callees are typically <20 per function
    for (const sym of symbols) {
      const whereClause = `array_contains(defined_symbols, '${escapeSqlString(sym)}')`;
      const records = await table.query().where(whereClause).limit(1).toArray();
      if (records.length > 0) {
        internal.push(sym);
      }
    }

-    return { center, callers, callees };
    return internal;
  }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n src/lib/graph/graph-builder.ts | head -110

Repository: Ryandonofrio3/osgrep

Length of output: 4339


🏁 Script executed:

# Check for calls to filterToInternal to understand typical symbol counts
rg -n "filterToInternal" src/lib/graph/

Repository: Ryandonofrio3/osgrep

Length of output: 254


🏁 Script executed:

# Check LanceDB array_contains capabilities and OR support
rg -n "array_contains\|LanceDB\|lance" src/ -i | head -30

Repository: Ryandonofrio3/osgrep

Length of output: 46


🏁 Script executed:

# Look for any test files that might show typical usage
fd -e test.ts -e spec.ts src/lib/graph/

Repository: Ryandonofrio3/osgrep

Length of output: 425


🏁 Script executed:

cat -n src/lib/graph/graph-builder.ts | sed -n '103,160p'

Repository: Ryandonofrio3/osgrep

Length of output: 2257


🏁 Script executed:

# Check what rawCallees looks like and where it comes from
rg -n "rawCallees" src/lib/graph/graph-builder.ts -B2 -A2

Repository: Ryandonofrio3/osgrep

Length of output: 364


🏁 Script executed:

```sh
# Look for test files or usage examples showing callees counts
fd . src/lib/graph/ --type f
```

Length of output: 94


🏁 Script executed:

```sh
# Check LanceDB documentation or any docs about OR limitations
fd -e md -e txt . | xargs rg -l "LanceDB\|array_contains\|OR" 2>/dev/null | head -5
```

Length of output: 46


🏁 Script executed:

```sh
# Find VectorRecord type definition
rg -n "interface VectorRecord\|type VectorRecord" src/
```

Length of output: 46


🏁 Script executed:

```sh
# Check for any stats, benchmarks, or test data
fd -e test.ts -e spec.ts -e test.js src/ --type f
```

Length of output: 385


🏁 Script executed:

```sh
# Check package.json for LanceDB version and any docs
cat package.json | grep -A2 -B2 "lance\|LanceDB" -i
```

Length of output: 234


🏁 Script executed:

```sh
# Search for any documentation about LanceDB limitations
fd -e md . | head -20
```

Length of output: 111


🏁 Script executed:

```sh
cat -n src/lib/store/types.ts
```

Length of output: 2478


🏁 Script executed:

```sh
cat -n src/lib/store/vector-db.ts | head -80
```

Length of output: 2657


🏁 Script executed:

```sh
cat README.md
```

Length of output: 10338


🌐 Web query:

LanceDB 0.22.3 array_contains OR conditions SQL WHERE clause

💡 Result:

Yes — LanceDB SQL supports array-contains operations and logical OR. Use the array-contains functions (array_contains_any / array_contains_all — docs sometimes show array_has_any/array_has_all for LABEL_LIST indexes) or combine array checks with OR. Example WHERE clauses:

  • any of a set (preferred):
    WHERE array_contains_any(tags, 'tagA', 'tagB')

  • explicit OR:
    WHERE array_contains(tags, 'tagA') OR array_contains(tags, 'tagB')

Also note: to accelerate these queries on List columns, create a LABEL_LIST scalar index (supports array_contains_any / array_contains_all). (lancedb.github.io)


🌐 Web query:

LanceDB SQL WHERE OR operator array_contains support 2024 2025

💡 Result:

Short answer: Yes.

  • LanceDB supports SQL WHERE with logical operators including OR (and AND/NOT) for vector and table queries (docs/current as of 2024–2025). [1]
  • LanceDB also supports queries over list/array columns (functions exposed for array membership) and provides a LabelList scalar index to accelerate array queries (array_contains_all / array_contains_any or similarly named array_has_all/array_has_any in SDKs). Use a LabelList/Label_List index for best performance. [2][3]

Example (conceptual):

  • table.search(...).where("(price > 20) OR (array_contains_any(tags, ['tag1','tag2']))")

References:

  1. LanceDB — Metadata Filtering / SQL filtering (AND / OR etc.). [turn0search2].
  2. LanceDB — Scalar Indexes (LabelList / array_contains_any & array_contains_all support). [turn0search0].
  3. LanceDB Python/SDK docs — LabelList / array list indexing details. [turn0search1].

🏁 Script executed:

```sh
# Check if array_contains_any is used anywhere else in codebase
rg -n "array_contains_any\|array_has_any" src/
```

Length of output: 46


🏁 Script executed:

```sh
# Check if there are performance tests or benchmarks
fd . src/ -name "*bench*" -o -name "*perf*" -o -name "*metric*"
```

Length of output: 294


🏁 Script executed:

```sh
# Look for any TODO or FIXME comments related to performance optimization
rg -n "TODO.*batch\|FIXME.*batch\|TODO.*performance\|FIXME.*performance" src/
```

Length of output: 46


Remove sequential querying and use LanceDB's array_contains_any() for batch lookups.

LanceDB 0.22.3 supports OR conditions and the array_contains_any() function, contrary to the comment on lines 85–87. The current code makes N individual database queries (one per symbol), but a single query using array_contains_any(defined_symbols, [symbol1, symbol2, ...]) would be far more efficient. For typical callees (<20 symbols), this change would reduce database round-trips from ~20 to 1.

Update the implementation to construct a single query with all symbols instead of looping, and correct the misleading comment about LanceDB limitations.
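A minimal sketch of the batched shape (hypothetical helper; it assumes the same table and escapeSqlString used above, and the array_contains_any syntax should be verified against the installed LanceDB version):

```typescript
// Hypothetical batched lookup: one array_contains_any query instead of
// one query per symbol. TableLike mirrors only the API surface used above.
type TableLike = {
  query(): {
    where(clause: string): {
      toArray(): Promise<Array<{ defined_symbols: string[] }>>;
    };
  };
};

async function filterToInternalBatch(
  table: TableLike,
  escapeSqlString: (s: string) => string,
  symbols: string[],
): Promise<string[]> {
  if (symbols.length === 0) return [];
  const list = symbols.map((s) => `'${escapeSqlString(s)}'`).join(", ");
  // One round-trip for all callees instead of one per symbol.
  const records = await table
    .query()
    .where(`array_contains_any(defined_symbols, [${list}])`)
    .toArray();
  const found = new Set(records.flatMap((r) => r.defined_symbols));
  return symbols.filter((s) => found.has(s));
}
```

The post-filter against found keeps the return limited to the symbols actually asked for, even when a matching record defines additional symbols.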

Comment thread src/lib/native/index.ts
Comment on lines +16 to +18
```typescript
throw new Error(
  `Failed to load osgrep-core native binding. Run 'npm run build:release' in osgrep/osgrep-core/: ${e}`
);
```

⚠️ Potential issue | 🟡 Minor

Update build instructions for bun.

The error message references npm run build:release, but the PR migrates to bun. Consider updating to bun run build:release for consistency.

Apply this diff:

-      `Failed to load osgrep-core native binding. Run 'npm run build:release' in osgrep/osgrep-core/: ${e}`
+      `Failed to load osgrep-core native binding. Run 'bun run build:release' in osgrep/osgrep-core/: ${e}`
🤖 Prompt for AI Agents
In src/lib/native/index.ts around lines 16 to 18, the thrown error message still
tells users to run "npm run build:release"; update that text to instruct "bun
run build:release" instead (or otherwise make it package-manager-agnostic) so
the message matches the project's migration to bun and guides users to the
correct build command.

…documentation. Update server response handling for initial sync state and improve output formatting for search results.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
README.md (1)

80-80: Default value inconsistency with code.

README states -m <n> defaults to 25, but src/commands/search.ts line 267 sets the default to "5". Align these values.

♻️ Duplicate comments (1)
README.md (1)

24-24: Outdated embeddings technology reference persists.

The AI summary indicates ONNX dependencies were replaced with Rust-native bindings (osgrep-core), but line 24 still mentions onnxruntime. This is similar to the past review comment but the reference was only partially updated.

-- **Local & Private:** 100% local embeddings via `onnxruntime`.
+- **Local & Private:** 100% local embeddings via native Rust bindings.
🧹 Nitpick comments (9)
plugins/osgrep/skills/osgrep/SKILL.md (1)

58-64: Add language specifier to fenced code block.

The output code block is missing a language identifier. Consider using text or plaintext for generic output.

 **Output:**
-```
+```text
 handleRequest
   def: src/server/handler.ts:45
   calls: validateAuth routeRequest sendResponse
   called_by: src/index.ts:12 src/api/router.ts:87

src/commands/trace.ts (1)

176-187: Inconsistent callee deduplication between formatTree and formatPlainTree.

formatPlainTree deduplicates callees and shows counts (e.g., foo (x3)), but formatTree iterates over graph.callees directly, potentially showing duplicates. This creates inconsistent output between --pretty and --plain modes.

Consider applying the same deduplication logic:

```diff
   if (hasCallees) {
     const branch = hasCallers ? "\u251c\u2500\u2500" : "\u2514\u2500\u2500";
     lines.push(`${branch} ${style.yellow("calls:")}`);
-    graph.callees.forEach((callee, i) => {
+    
+    // Dedupe callees
+    const counts = new Map<string, number>();
+    for (const c of graph.callees) {
+      counts.set(c, (counts.get(c) || 0) + 1);
+    }
+    const calleeList = [...counts.entries()];
+    
+    calleeList.forEach(([callee, count], i) => {
       const prefix = hasCallers ? "\u2502   " : "    ";
       const sym =
-        i === graph.callees.length - 1
+        i === calleeList.length - 1
           ? "\u2514\u2500\u2500"
           : "\u251c\u2500\u2500";
-      lines.push(`${prefix}${sym} ${callee}`);
+      const countStr = count > 1 ? style.dim(` (x${count})`) : "";
+      lines.push(`${prefix}${sym} ${callee}${countStr}`);
     });
   }
```
src/commands/serve.ts (3)

156-180: Consider consolidating ignore logic to reduce duplication.

shouldIgnoreRelPath manually reimplements patterns from DEFAULT_IGNORE_PATTERNS. While the inline string checks are faster for the hot watcher path, the duplicated logic may drift over time. Consider extracting shared constants or documenting that this is intentionally separate for performance.


387-392: Simplify filename type handling.

The nested type checks are defensive but verbose. Node's fs.watch can return Buffer on some platforms, so the toString("utf-8") fallback is correct. Consider simplifying:

-              const name =
-                typeof filename === "string"
-                  ? filename
-                  : typeof (filename as any)?.toString === "function"
-                    ? String((filename as any).toString("utf-8"))
-                    : "";
+              const name = typeof filename === "string"
+                ? filename
+                : Buffer.isBuffer(filename)
+                  ? filename.toString("utf-8")
+                  : "";
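A standalone check of the simplified branch (illustrative helper name, not the PR's code) covering the three shapes fs.watch can hand back:

```typescript
// Normalize fs.watch's filename argument: string, Buffer (on some
// platforms), or null when no name is available.
function normalizeFilename(filename: unknown): string {
  if (typeof filename === "string") return filename;
  if (Buffer.isBuffer(filename)) return filename.toString("utf-8");
  return "";
}
```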

656-733: Add timeout handling for /trace endpoint.

The /search endpoint has timeout and AbortController support, but /trace does not. buildGraph could be slow for deep traversals. Consider adding similar timeout protection:

+            const ac = new AbortController();
+            const timeout = setTimeout(() => ac.abort(), 30000);
+
             req.on("end", async () => {
               if (aborted) return;
               try {
                 // ... existing code ...
+              } catch (err) {
+                if (err instanceof Error && err.name === "AbortError") {
+                  res.statusCode = 504;
+                  res.setHeader("Content-Type", "application/json");
+                  res.end(JSON.stringify({ error: "trace_timeout" }));
+                  return;
+                }
                 // ... existing error handling ...
+              } finally {
+                clearTimeout(timeout);
               }
             });
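The same timeout shape reduced to a self-contained helper (names are illustrative, not from the PR); callers would map the rejection to a 504:

```typescript
// Run `work` with an AbortSignal that fires after `ms` milliseconds.
// The timer is always cleared, whether the work resolves or rejects.
async function withTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const ac = new AbortController();
  const timer = setTimeout(() => ac.abort(), ms);
  try {
    return await work(ac.signal);
  } finally {
    clearTimeout(timer);
  }
}
```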
plugins/osgrep/hooks/start.js (4)

24-30: findProjectRoot currently reduces to just resolve(startDir)

Given the current implementation, the .osgrep / .git existence check does not influence the return value: the function always returns start. That’s correct but a bit misleading to future readers and easy to let drift from src/lib/utils/project-root.ts.

Consider either:

  • Simplifying to a single return _path.resolve(startDir || process.cwd());, or
  • Mirroring the TS utility’s behavior (including any special handling like PATHS.globalRoot) if you intend this hook to share the same semantics.

32-42: Manual servers.json access can drift from server-registry behavior

getServerForProject reimplements registry lookup and directly parses $HOME/.osgrep/servers.json. This works, but it duplicates logic that is already centralized in src/lib/utils/server-registry.ts (via listServers() / getServerForProject), and any future changes to the registry path or shape will require remembering to update this hook as well.

If feasible, consider:

  • Reusing a single source of truth for computing the registry path and validating entries, or
  • At minimum, adding a brief comment pointing at src/lib/utils/server-registry.ts so maintainers know to keep these behaviors in sync.

47-57: Optional: still report an already-running server when auto-serve is disabled

The OSGREP_DISABLE_AUTO_SERVE === "1" branch always returns a generic “auto-start disabled” message without checking for or reporting an already-running server for this project. That’s probably fine for most cases, but it may be useful for users running osgrep serve manually to still see PID/port info in this message.

If you want that behavior, you could:

  • Compute projectRoot / existing even when auto-serve is disabled, and
  • Extend additionalContext accordingly when a server is already up.

59-71: Background spawn logic is reasonable; be aware of potential double-spawn window

Using getServerForProject(projectRoot) before spawn("osgrep", ["serve", "--background"]) is a good guard, and detached + unref is appropriate for a background daemon.

There are a couple of minor considerations:

  • There’s an inherent race window where servers.json hasn’t been updated yet by a just-started server, so a second fast invocation of this hook could still spawn another osgrep serve. If that’s a concern in your environment, you might later introduce a lightweight lock or a cheap health check before spawning again.
  • Since logPath is opened once and passed as stdio, you could optionally fs.closeSync(out) after child.unref() to be explicit about parent FD cleanup, though in this short-lived process it’s unlikely to matter in practice.

None of these are correctness blockers; just things to keep in mind.
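If a cheap health check ever becomes necessary, one possible shape (hypothetical; the endpoint and timeout are illustrative, and the real hook would read the port from servers.json) is:

```typescript
import http from "node:http";

// Resolve true only if the registered server answers /health with 200
// within the timeout; any error, refusal, or hang counts as not running.
function isHealthy(port: number, timeoutMs = 300): Promise<boolean> {
  return new Promise((resolve) => {
    const req = http.get(
      { host: "127.0.0.1", port, path: "/health", timeout: timeoutMs },
      (res) => {
        res.resume(); // drain the body so the socket is released
        resolve(res.statusCode === 200);
      },
    );
    req.on("timeout", () => {
      req.destroy();
      resolve(false);
    });
    req.on("error", () => resolve(false));
  });
}
```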

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between b261fdb and d230535.

📒 Files selected for processing (6)
  • README.md (4 hunks)
  • plugins/osgrep/hooks/start.js (1 hunks)
  • plugins/osgrep/skills/osgrep/SKILL.md (2 hunks)
  • src/commands/search.ts (8 hunks)
  • src/commands/serve.ts (10 hunks)
  • src/commands/trace.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
plugins/osgrep/hooks/start.js (2)
src/lib/utils/server-registry.ts (2)
  • isProcessRunning (33-40)
  • getServerForProject (66-70)
src/lib/utils/project-root.ts (1)
  • findProjectRoot (14-29)
src/commands/trace.ts (4)
src/lib/graph/graph-builder.ts (2)
  • CallGraph (13-17)
  • GraphBuilder (26-173)
src/lib/utils/project-root.ts (2)
  • findProjectRoot (14-29)
  • ensureProjectPaths (31-51)
src/lib/store/vector-db.ts (1)
  • VectorDB (20-302)
src/lib/utils/exit.ts (1)
  • gracefulExit (3-20)
src/commands/serve.ts (7)
src/lib/setup/setup-helpers.ts (1)
  • ensureSetup (25-56)
src/lib/store/meta-cache.ts (2)
  • MetaCache (12-60)
  • MetaEntry (6-10)
src/lib/utils/file-utils.ts (1)
  • isIndexableFile (44-63)
src/lib/utils/lock.ts (1)
  • acquireWriterLockWithRetry (87-109)
src/lib/index/ignore-patterns.ts (1)
  • DEFAULT_IGNORE_PATTERNS (3-60)
src/lib/utils/exit.ts (1)
  • gracefulExit (3-20)
src/lib/utils/project-root.ts (1)
  • findProjectRoot (14-29)
src/commands/search.ts (1)
src/lib/utils/server-registry.ts (1)
  • unregisterServer (50-54)
🪛 LanguageTool
plugins/osgrep/skills/osgrep/SKILL.md

[style] ~68-~68: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...on and need to know what calls it - You need to understand what a function depends on -...

(REP_NEED_TO_VB)

🪛 markdownlint-cli2 (0.18.1)
README.md

196-196: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

plugins/osgrep/skills/osgrep/SKILL.md

59-59: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci
🔇 Additional comments (19)
plugins/osgrep/skills/osgrep/SKILL.md (1)

50-80: Documentation for trace command is comprehensive.

The trace command documentation clearly explains the feature with practical examples for --callers, --callees, and --path options. This aligns well with the implementation in src/commands/trace.ts.

src/commands/trace.ts (9)

7-13: LGTM!

Inline ANSI escape sequences are a reasonable choice for minimal dependencies. The reset codes are correctly matched to their opening codes.


18-26: LGTM!

Clean implementation for deduplicating and counting callees.


31-45: LGTM!

Grouping and sorting logic is correct.


50-54: LGTM!

Straightforward counting logic for format selection.


60-124: LGTM!

The adaptive format selection and tree rendering are well-implemented. The deduplication of callees in formatPlainTree is a nice touch.


129-156: LGTM!

Compact format implementation is clean and reuses helper functions appropriately.


212-233: LGTM!

JSON output preserves raw data appropriately, allowing consumers to handle deduplication as needed.


253-258: Verify: depth=0 falls back to 1 due to falsy OR.

Number.parseInt(cmd.depth, 10) || 1 will treat an explicit --depth 0 as depth: 1 because 0 is falsy. If depth=0 is intentionally invalid, this is fine. Note that nullish coalescing (?? 1) would not help: Number.parseInt returns NaN, not null or undefined, for invalid input, so the fallback would never fire. Guard against NaN explicitly instead:

-        depth: Number.parseInt(cmd.depth, 10) || 1,
+        depth: Number.isNaN(Number.parseInt(cmd.depth, 10))
+          ? 1
+          : Number.parseInt(cmd.depth, 10),

Or add validation to reject depth < 1 explicitly with an error message.
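A quick standalone comparison of the variants (illustrative helpers, not the CLI code) shows why the NaN check is the robust option:

```typescript
// `|| 1` coerces an explicit 0 to the default; `?? 1` never fires because
// Number.parseInt yields NaN (not null/undefined) for bad input; an
// explicit NaN/range guard handles both.
const viaOr = (raw: string) => Number.parseInt(raw, 10) || 1;
const viaNullish = (raw: string) => Number.parseInt(raw, 10) ?? 1;
const viaGuard = (raw: string) => {
  const n = Number.parseInt(raw, 10);
  return Number.isNaN(n) || n < 1 ? 1 : n;
};

console.log(viaOr("0")); // 1 — explicit zero silently becomes the default
console.log(viaNullish("abc")); // NaN — the ?? fallback never runs
console.log(viaGuard("3")); // 3
```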


273-278: LGTM!

Resource management is correct—db.close() in the finally block ensures cleanup regardless of success or failure, and gracefulExit() handles process termination appropriately.

src/commands/serve.ts (3)

78-88: Early registration for spawned child is a good race-condition mitigation.

Registering the child process immediately (before it finishes indexing) allows discovery/stop commands to find it. However, if the child fails to start or crashes during setup, this stale entry may linger until the next serve or stop command cleans it up.


330-331: Potential unhandled rejection in write barrier chain.

dbWriteBarrier.then(apply, apply) passes apply as both the fulfillment and rejection handler, which keeps the chain serialized even across a rejection. The residual risk is that if apply itself throws, the barrier settles rejected and the error is never logged. Consider appending a catch so the chain always settles fulfilled:

-        dbWriteBarrier = dbWriteBarrier.then(apply, apply);
+        dbWriteBarrier = dbWriteBarrier.then(apply, apply).catch((err) => {
+          console.error("[serve] write barrier error:", err);
+        });

Or simply ensure apply never throws by wrapping its body in try/catch (which it already does), so the current pattern should work. Verify that any throw inside apply is caught.
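The serialization claim can be sanity-checked with a minimal standalone chain (illustrative names, not serve.ts):

```typescript
// Each job chains on both fulfillment and rejection, so a throwing job
// never blocks the jobs queued after it.
const order: string[] = [];
let barrier: Promise<void> = Promise.resolve();

function schedule(name: string, fail = false): Promise<void> {
  const job = async () => {
    order.push(`start:${name}`);
    if (fail) throw new Error(name);
    order.push(`end:${name}`);
  };
  barrier = barrier.then(job, job);
  // Hand callers a settled view so the rejection is never unhandled.
  return barrier.catch(() => {});
}

schedule("a", true);
schedule("b").then(() => {
  console.log(order.join(" ")); // start:a start:b end:b
});
```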


872-901: Robust process termination handling.

The stopPid function properly handles edge cases: sending SIGCONT to resume stopped processes before SIGTERM, and escalating to SIGKILL after timeout. The polling interval (75ms) and timeouts (2s each) are reasonable.

src/commands/search.ts (3)

351-364: Health check scope block limits AbortController reuse.

The inner { ... } block scopes ac and timeout to the health check, preventing accidental reuse. This is intentional and clean.


483-496: Proactive stale server cleanup is a good UX optimization.

Unregistering servers that fail health checks or refuse connections prevents repeated timeout penalties on subsequent searches.


504-506: Debug timing instrumentation is well-structured.

The t/te helper pattern keeps timing opt-in and non-invasive. The guards (DEBUG_TIMING && ...) ensure no overhead when disabled.

README.md (1)

312-315: Development commands updated to bun.

The migration from pnpm to bun is reflected correctly in the development section.

plugins/osgrep/hooks/start.js (2)

14-22: isProcessRunning implementation looks solid

The PID validation plus process.kill(pid, 0) pattern is standard and matches the behavior in src/lib/utils/server-registry.ts. No issues from a correctness standpoint.


77-79: Session message formatting is clear and user-friendly

The conditional additionalContext that distinguishes between “running (PID / Port)” and “starting (indexing in background)” is clear and conveys the important state to the caller.

Comment thread README.md
Comment on lines +196 to +201
```
handleRequest
def: src/server/handler.ts:45
calls: validateAuth routeRequest sendResponse
called_by: src/index.ts:12 src/api/router.ts:87
```

⚠️ Potential issue | 🟡 Minor

Add language specifier to fenced code block.

Per markdown lint (MD040), fenced code blocks should have a language specified. Since this shows command output, use text or plaintext:

-```
+```text
 handleRequest
   def: src/server/handler.ts:45
   calls: validateAuth routeRequest sendResponse
   called_by: src/index.ts:12 src/api/router.ts:87

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

196-196: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

In README.md around lines 196 to 201, the fenced code block lacks a language
specifier which violates MD040; update the block opener to include a language
such as "text" (e.g., ```text) so the fenced block is annotated appropriately
while leaving the block contents unchanged.


Comment thread src/commands/search.ts
Comment on lines 264 to 268
         .option(
-          "-m <max_count>, --max-count <max_count>",
+          "-m, --max-count <max_count>",
           "The maximum number of results to return (total)",
           "5",
         )

⚠️ Potential issue | 🟡 Minor

Option default mismatch with README.

The default here is "5", but README.md line 80 says -m <n> defaults to 25. Either update the code or the documentation.

🤖 Prompt for AI Agents
In src/commands/search.ts around lines 264 to 268, the CLI option default for
-m/--max-count is set to "5" but README.md documents a default of 25; update the
code to match the documented default by changing the option default to "25" (and
ensure any code that reads this option converts it to a number), or
alternatively update README.md to state the default is 5—pick one source of
truth and make the values consistent.

Comment thread src/commands/serve.ts
```typescript
try {
  nativeWatcher?.close();
} catch {}
nativeWatcher = null
```

⚠️ Potential issue | 🟡 Minor

Missing semicolon.

-        nativeWatcher = null
+        nativeWatcher = null;
🤖 Prompt for AI Agents
In src/commands/serve.ts around line 804, the assignment statement
"nativeWatcher = null" is missing a trailing semicolon; add the semicolon to
make it "nativeWatcher = null;" and ensure the file passes the project's
linter/formatter.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (4)
README.md (2)

24-26: Update “Local & Private” blurb to reflect Rust-native core instead of onnxruntime

This bullet still calls out “100% local embeddings via onnxruntime”, but the PR (and past review) indicate embeddings now come from the Rust-native osgrep-core backend. Consider updating this line to explicitly mention the Rust core (and drop the onnxruntime reference) so the README matches the current architecture.


200-206: Add a language to the fenced trace output block (MD040)

The osgrep trace output example is a bare fenced block; markdownlint (MD040) wants a language. Since this is plain CLI output, text is appropriate:

-```
+```text
 handleRequest
   def: src/server/handler.ts:45
   calls: validateAuth routeRequest sendResponse
   called_by: src/index.ts:12 src/api/router.ts:87

src/commands/serve.ts (1)

827-838: Missing semicolon after nativeWatcher = null

This assignment is the only statement in the block without a trailing semicolon and was previously flagged; adding it keeps the file consistent with the rest of the style and avoids linter complaints:

-        nativeWatcher = null
+        nativeWatcher = null;
src/commands/search.ts (1)

379-382: Align -m/--max-count default with README (5 vs 25 mismatch)

The CLI default here is "5", but README’s osgrep search table still advertises a default of 25. This mismatch was flagged previously and is still present. Either:

  • Bump the default here (and the fallback in limit computation) to "25", or
  • Update the README to say the default is 5.

Right now users can’t rely on the documented default.

🧹 Nitpick comments (5)
src/lib/search/expander.ts (1)

514-565: Content-cleaning logic duplicated with Searcher.mapRecordToChunk

mapRecordToChunk here reimplements similar header-stripping/truncation and symbol array handling as Searcher.mapRecordToChunk. If both diverge over time, CLI/server vs expansion output can drift in subtle ways. Consider extracting a shared helper (or factoring into a small utility module) so both code paths stay in sync.

tests/expander.test.ts (2)

154-183: Strengthen “circular references” assertion to actually exercise the loop

The “circular references” test description implies guarding against cycles, but currently:

  • Only a single mockResolvedValueOnce is set up, so depth-2 symbol expansion never actually sees the back-edge, and
  • The assertion expect(ids.size).toBe(expanded.expanded.length); is tautologically true even when expanded.expanded.length is 0 or 1.

It would be more meaningful to (a) configure two successive query responses that actually create A ↔ B, and (b) assert that the same chunk isn’t added twice (e.g., by checking for a specific id/path appearing only once).
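For reference, the invariant such a test should pin down can be captured with a plain visited-set sketch (assumed shapes; the Expander's real traversal may differ):

```typescript
// Breadth-first symbol expansion with a visited set: an A <-> B back-edge
// is ignored on the second visit, so no chunk is added twice.
type Chunk = { id: string; symbols: string[] };

function expandSymbols(
  graph: Map<string, Chunk[]>, // symbol -> chunks defining it
  start: string,
  maxDepth: number,
): Chunk[] {
  const seen = new Set<string>();
  const out: Chunk[] = [];
  let frontier = [start];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const sym of frontier) {
      for (const chunk of graph.get(sym) ?? []) {
        if (seen.has(chunk.id)) continue; // cycle guard
        seen.add(chunk.id);
        out.push(chunk);
        next.push(...chunk.symbols);
      }
    }
    frontier = next;
  }
  return out;
}
```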


260-267: Neighbor expansion test’s core assertion is effectively a no-op

expect(neighbors.length).toBeGreaterThanOrEqual(0); will always pass. Since you’re already guarding with if (neighbors.length > 0), you could instead assert toBe(1) (given the mock) or at least toBeGreaterThan(0) to make the test informative.

src/commands/serve.ts (1)

80-107: Verify registerServer behavior with “port: 0” early registration

You now register the background child (and the foreground server) with port: 0 before the HTTP server actually binds, then register again with the real port once listen succeeds. This relies on registerServer deduping on PID or project root; otherwise you may accumulate duplicate entries (one with port 0, one with the real port) in the registry.

If registerServer doesn’t already replace existing entries for the same PID/project, consider updating it accordingly, or document that assumption here.

src/lib/search/searcher.ts (1)

393-407: Avoid treating empty normalized FTS queries as “failed” searches

normalizeFtsQuery can legitimately return an empty string for very short/stopword-heavy queries, but the code throws new Error("Empty FTS query after normalization"), which then logs as [Searcher] FTS search failed: .... That makes normal input look like an error.

You might instead short‑circuit without logging in this case:

const ftsText = Searcher.normalizeFtsQuery(query);
if (!ftsText) {
  ftsResults = [];
} else {
  // existing FTS query logic
}

This preserves the graceful fallback without noisy warnings.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between d230535 and d0cf7b2.

📒 Files selected for processing (8)
  • README.md (6 hunks)
  • src/commands/agent/opencode.ts (0 hunks)
  • src/commands/search.ts (11 hunks)
  • src/commands/serve.ts (10 hunks)
  • src/lib/search/expander.ts (1 hunks)
  • src/lib/search/expansion-types.ts (1 hunks)
  • src/lib/search/searcher.ts (11 hunks)
  • tests/expander.test.ts (1 hunks)
💤 Files with no reviewable changes (1)
  • src/commands/agent/opencode.ts
🧰 Additional context used
🧬 Code graph analysis (6)
tests/expander.test.ts (2)
src/lib/store/types.ts (1)
  • ChunkType (50-67)
src/lib/search/expander.ts (1)
  • Expander (107-567)
src/lib/search/expander.ts (3)
src/lib/search/expansion-types.ts (4)
  • ExpandOptions (23-32)
  • ExpandedResult (71-82)
  • ExpansionStats (53-66)
  • ExpansionNode (37-48)
src/lib/store/vector-db.ts (1)
  • VectorDB (20-302)
src/lib/store/types.ts (1)
  • ChunkType (50-67)
src/commands/serve.ts (5)
src/lib/store/meta-cache.ts (2)
  • MetaCache (12-60)
  • MetaEntry (6-10)
src/lib/store/types.ts (1)
  • VectorRecord (31-35)
src/lib/utils/file-utils.ts (1)
  • isIndexableFile (44-63)
src/lib/utils/lock.ts (1)
  • acquireWriterLockWithRetry (87-109)
src/lib/index/sync-helpers.ts (1)
  • InitialSyncProgress (13-22)
src/lib/search/expansion-types.ts (1)
src/lib/store/types.ts (1)
  • ChunkType (50-67)
src/commands/search.ts (4)
src/lib/search/expansion-types.ts (3)
  • ExpandOptions (23-32)
  • ExpandedResult (71-82)
  • ExpansionNode (37-48)
src/lib/output/formatter.ts (1)
  • formatResults (101-108)
src/lib/utils/server-registry.ts (1)
  • unregisterServer (50-54)
src/lib/utils/project-root.ts (2)
  • findProjectRoot (14-29)
  • ensureProjectPaths (31-51)
src/lib/search/searcher.ts (5)
src/lib/index/ignore-patterns.ts (1)
  • GENERATED_FILE_PATTERNS (65-98)
src/lib/workers/orchestrator.ts (4)
  • encodeQuery (204-235)
  • encodeQuery (352-354)
  • rerank (237-330)
  • rerank (356-362)
src/lib/utils/filter-builder.ts (1)
  • escapeSqlString (1-5)
src/lib/search/expansion-types.ts (2)
  • ExpandOptions (23-32)
  • ExpandedResult (71-82)
src/lib/search/expander.ts (1)
  • Expander (107-567)
🪛 markdownlint-cli2 (0.18.1)
README.md

201-201: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci
🔇 Additional comments (3)
src/lib/search/expansion-types.ts (1)

18-101: Expansion type surface is well-structured and future-proof

The expansion model (strategies, nodes, stats, result container, and internal ExpansionChunk) looks coherent and matches how Expander and the CLI use it; the extra "coexports" strategy is fine to leave for future implementation.

src/lib/search/searcher.ts (1)

534-592: Searcher–Expander integration looks clean and opt‑in

The new expand and searchAndExpand methods keep expansion strictly opt‑in, reuse the same VectorDB instance, and thread ExpandOptions through without coupling search and expansion logic. This is a solid separation of concerns and should make it easy for both the CLI and HTTP server to wire in --deep / deep behavior consistently.

src/commands/search.ts (1)

398-402: --deep wiring from CLI through HTTP and local search looks consistent

The new --deep flag flows into getDeepExpandOptions, into the /search request body (deep: options.deep), and into local Searcher.expand via expandOpts. The default strategies (["callers", "symbols"], depth 2) match both the README description and the server’s expandOpts construction, which keeps behavior aligned across local and daemon modes.

Comment thread src/commands/serve.ts
Comment on lines +278 to +333
```typescript
const apply = async () => {
  let lock: { release: () => Promise<void> } | null = null;
  try {
    isWriting = true;
    lock = await acquireWriterLockWithRetry(paths.osgrepDir, {
      maxRetries: 2,
      retryDelayMs: 250,
    });

    if (unlinks.length > 0) {
      await vectorDb.deletePaths(unlinks);
      for (const relPath of unlinks) {
        metaCache.delete(relPath);
      }
    }

    for (const item of prepared) {
      // If the file changed again since we prepared, skip and let the next event handle it.
      try {
        const stats = await fs.promises.stat(item.absPath);
        if (
          stats.mtimeMs !== item.meta.mtimeMs ||
          stats.size !== item.meta.size
        ) {
          pendingUpserts.add(item.relPath);
          continue;
        }
      } catch (err) {
        const code = (err as NodeJS.ErrnoException)?.code;
        if (code === "ENOENT") {
          pendingUnlinks.add(item.relPath);
        }
        continue;
      }

      await vectorDb.deletePaths([item.relPath]);
      if (!item.shouldDelete && item.vectorsCount > 0) {
        await vectorDb.insertBatch(item.vectors as VectorRecord[]);
      }
      metaCache.put(item.relPath, item.meta);
    }

    if (watchVerbose && (plannedUnlinks > 0 || plannedUpserts > 0)) {
      console.log(
        `[serve] live index applied: upserts=${plannedUpserts} unlinks=${plannedUnlinks}`,
      );
    }
  } finally {
    if (lock) await lock.release();
    isWriting = false;
  }
};

dbWriteBarrier = dbWriteBarrier.then(apply, apply);
await dbWriteBarrier;
```


⚠️ Potential issue | 🟠 Major

Ensure all live-index writes complete before closing VectorDB on shutdown

flushPending chains writer work onto dbWriteBarrier and awaits it there, but shutdown closes the server and then calls vectorDb.close() without first awaiting dbWriteBarrier. If a flush is in progress when SIGINT/SIGTERM arrives, you can end up closing the DB while writes are still in flight.

Consider explicitly awaiting the barrier before closing the DB, e.g.:

-        // Clean close of vectorDB
-        try {
-          await vectorDb.close();
-        } catch (e) {
-          console.error("Error closing vector DB:", e);
-        }
+        // Wait for any pending live-index writes to finish, then close the DB
+        try {
+          await dbWriteBarrier;
+        } catch (e) {
+          console.error("Error waiting for pending writes:", e);
+        }
+        try {
+          await vectorDb.close();
+        } catch (e) {
+          console.error("Error closing vector DB:", e);
+        }

This keeps shutdown graceful even under concurrent file-change load.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/commands/serve.ts around lines 278 to 333, shutdown currently calls
vectorDb.close() without ensuring pending live-index writes chained on
dbWriteBarrier have finished; this can close the DB while writes are in flight.
Update the shutdown sequence/handlers to await the current dbWriteBarrier (and
any final flushPending chain) before calling vectorDb.close() — i.e., ensure you
await dbWriteBarrier (and any final dbWriteBarrier = dbWriteBarrier.then(noop)
resolution) or call await flushPending() from shutdown so all apply() work
completes, then call vectorDb.close(), and only after close proceed to exit the
process.
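The write-barrier pattern at issue here can be reduced to a small, self-contained sketch. The names below (enqueue, shutdown, closeDb) are illustrative, not the PR's actual API; the point is that shutdown must drain the barrier before the store closes:

```typescript
type Task = () => Promise<void>;

let barrier: Promise<void> = Promise.resolve();

// Serialize write batches by chaining each one onto the barrier. Passing the
// task as both fulfillment and rejection handler keeps the chain alive even
// if an earlier batch threw.
function enqueue(task: Task): Promise<void> {
  barrier = barrier.then(task, task);
  return barrier;
}

// Drain the barrier before closing the store so no write is in flight.
async function shutdown(closeDb: () => Promise<void>): Promise<void> {
  try {
    await barrier;
  } catch (e) {
    console.error("Error waiting for pending writes:", e);
  }
  await closeDb();
}
```

Because `barrier` always points at the most recently chained promise, awaiting it in `shutdown` is sufficient: any flush enqueued before the signal handler runs completes before `closeDb` is called.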

Comment on lines +126 to +145
const stats: ExpansionStats = {
  symbolsResolved: 0,
  callersFound: 0,
  neighborsAdded: 0,
  totalChunks: results.length,
  totalTokens: 0,
};

// Track seen chunk IDs to avoid duplicates
const seen = new Set<string>();
for (const r of results) {
  const id = this.getChunkId(r);
  if (id) seen.add(id);
  stats.totalTokens += estimateTokens(r.text || "");
}

// Token budget tracking
let budgetRemaining = maxTokens > 0 ? maxTokens - stats.totalTokens : Infinity;

let table: Table;

⚠️ Potential issue | 🟡 Minor

Consider marking truncated when the initial token budget is already exhausted

If maxTokens > 0 and the original results alone exceed that budget, budgetRemaining becomes ≤ 0 so the BFS loop never runs, but truncated remains false. From a caller’s perspective that still represents “a limit was hit during expansion”. Consider setting truncated = true in that case so clients don’t misinterpret “no expansion” as “no limits hit”.

🤖 Prompt for AI Agents
In src/lib/search/expander.ts around lines 126 to 145, the code computes
budgetRemaining after summing tokens from initial results but does not mark
truncated when the initial results alone exceed maxTokens; update the logic to
set truncated = true when maxTokens > 0 and budgetRemaining <= 0 (i.e., initial
totalTokens exceed or meet the budget) so callers know a token limit was hit
before expansion begins; ensure this is done immediately after computing
budgetRemaining and before the BFS/expansion loop so existing flow and return
values reflect the truncated state.
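The suggested fix amounts to one extra check after computing the budget. A hedged sketch, where computeBudget is an illustrative helper rather than the PR's actual code:

```typescript
// Treat an already-exhausted initial budget as a truncation, so callers can
// tell "no expansion ran because of the limit" from "no limit was hit".
function computeBudget(
  initialTokens: number,
  maxTokens: number,
): { budgetRemaining: number; truncated: boolean } {
  const unlimited = maxTokens <= 0; // 0 (or negative) means "no budget"
  const budgetRemaining = unlimited ? Infinity : maxTokens - initialTokens;
  // Flag truncation immediately when the seed results alone meet or exceed
  // the budget, before any BFS/expansion work begins.
  const truncated = !unlimited && budgetRemaining <= 0;
  return { budgetRemaining, truncated };
}
```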
