feat(embeddings): add GITNEXUS_EMBEDDING_OMIT_DIMENSIONS for provider…#2047

Open
CheckPickerUpper wants to merge 3 commits into abhigyanpatwari:main from CheckPickerUpper:feat/embedding-omit-dimensions

@CheckPickerUpper

Summary

Adds an opt-in GITNEXUS_EMBEDDING_OMIT_DIMENSIONS flag so embedding providers that reject the dimensions request field (e.g. Voyage) can be used, while the returned vector length is still validated against GITNEXUS_EMBEDDING_DIMS.

Motivation / context

GITNEXUS_EMBEDDING_DIMS is overloaded: it is both the dimensions field sent in the request and the length the returned vector is validated against. Some OpenAI-compatible endpoints reject the dimensions field with a 400 while still returning a fixed-size vector. Voyage, for example, returns 400 "Argument 'dimensions' is not supported by our API". That leaves no working configuration: setting DIMS triggers the 400, and leaving it unset makes validation expect the 384 default and reject the provider's real 1024-d vector.
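
The dead end described above can be made concrete with a small sketch. The helper names here (`requestBody`, `expectedLength`, `DimsConfig`) are hypothetical and are not taken from the gitnexus source; only the 384 default, the 1024-d Voyage vector, and the two roles of GITNEXUS_EMBEDDING_DIMS come from this description.

```typescript
// Illustrative only: names are hypothetical, not the actual gitnexus code.
const DEFAULT_DIMS = 384;

interface DimsConfig {
  dimensions?: number; // parsed from GITNEXUS_EMBEDDING_DIMS when set
}

// Role 1: the value is sent as the `dimensions` request field whenever it is set.
function requestBody(
  config: DimsConfig,
  model: string,
  input: string[],
): Record<string, unknown> {
  const body: Record<string, unknown> = { model, input };
  if (config.dimensions !== undefined) body.dimensions = config.dimensions;
  return body;
}

// Role 2: the same value (or the default) is the expected vector length.
function expectedLength(config: DimsConfig): number {
  return config.dimensions ?? DEFAULT_DIMS;
}

// Stand-in for a 1024-d vector returned by a fixed-size provider.
const providerVector = new Array<number>(1024).fill(0);

// DIMS=1024: the length would match, but the body carries `dimensions`,
// which a field-rejecting provider answers with a 400.
const withDims: DimsConfig = { dimensions: 1024 };
console.log("dimensions" in requestBody(withDims, "voyage-code-3", ["x"])); // true
console.log(providerVector.length === expectedLength(withDims)); // true

// DIMS unset: the request is accepted, but validation expects 384.
const unset: DimsConfig = {};
console.log("dimensions" in requestBody(unset, "voyage-code-3", ["x"])); // false
console.log(providerVector.length === expectedLength(unset)); // false
```

Neither setting produces both an accepted request and a passing length check, which is the gap the new flag is meant to close.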

Areas touched

  • gitnexus/ (CLI / core / MCP server)
  • gitnexus-web/ (Vite / React UI)
  • .github/ (workflows, actions)
  • eval/ or other tooling
  • Docs / agent config only

Scope & constraints

In scope

  • New opt-in GITNEXUS_EMBEDDING_OMIT_DIMENSIONS env flag (1/true) read in the HTTP embedding client.
  • Suppresses only the sent dimensions field, on both the batch (httpEmbed) and single-query (httpEmbedQuery) paths.
  • GITNEXUS_EMBEDDING_DIMS is still honoured to validate the returned vector length.
  • README env-var docs plus two unit tests.

Explicitly out of scope / not done here

  • No change to the local (transformers.js) embedding path.
  • No provider auto-detection. The flag is explicit and off by default.
  • No change to default behaviour: with the flag unset, the request body is byte-for-byte identical to today.

Implementation notes

readConfig() gains an omitDimensionsField boolean derived from the env var. Both call sites pass undefined for the request dimensions when it is set, so the field is omitted; the existing validation (config.dimensions ?? DEFAULT_DIMS) is untouched, so a mismatched response is still rejected.
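
A minimal sketch of the shape described above, for readers who have not opened the diff. It is an approximation under stated assumptions rather than the code in this PR: `readConfig`, `omitDimensionsField`, `DEFAULT_DIMS`, and the env var names come from this description, while `EmbeddingConfig`, `serializeRequest`, `assertVectorLength`, the exact flag parsing, and the request body layout are hypothetical.

```typescript
const DEFAULT_DIMS = 384;

interface EmbeddingConfig {
  dimensions?: number;
  omitDimensionsField: boolean;
}

type Env = Record<string, string | undefined>;

// Assumption: the flag is on only for the literal strings "1" or "true".
function readConfig(env: Env): EmbeddingConfig {
  const rawDims = env.GITNEXUS_EMBEDDING_DIMS;
  const parsed = rawDims === undefined ? NaN : Number.parseInt(rawDims, 10);
  const flag = env.GITNEXUS_EMBEDDING_OMIT_DIMENSIONS;
  return {
    dimensions: Number.isInteger(parsed) && parsed > 0 ? parsed : undefined,
    omitDimensionsField: flag === "1" || flag === "true",
  };
}

// Passing `undefined` is enough to drop the field: JSON.stringify skips
// properties whose value is undefined.
function serializeRequest(
  config: EmbeddingConfig,
  model: string,
  input: string[],
): string {
  const dimensions = config.omitDimensionsField ? undefined : config.dimensions;
  return JSON.stringify({ model, input, dimensions });
}

// Validation ignores the flag and still compares against the configured
// (or default) length.
function assertVectorLength(config: EmbeddingConfig, vector: number[]): void {
  const expected = config.dimensions ?? DEFAULT_DIMS;
  if (vector.length !== expected) {
    throw new Error(`expected ${expected} dimensions, got ${vector.length}`);
  }
}

const config = readConfig({
  GITNEXUS_EMBEDDING_DIMS: "1024",
  GITNEXUS_EMBEDDING_OMIT_DIMENSIONS: "1",
});
console.log(serializeRequest(config, "voyage-code-3", ["hello"]));
// {"model":"voyage-code-3","input":["hello"]}
assertVectorLength(config, new Array<number>(1024).fill(0)); // passes
```

With the flag unset, the same `serializeRequest` call would include `"dimensions":1024`, which matches the claim that default behaviour does not change.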

Testing & verification

  • npx vitest run test/unit/http-embedder.test.ts (in gitnexus/): 29/29 pass, including two new cases asserting the dimensions field is omitted when GITNEXUS_EMBEDDING_OMIT_DIMENSIONS is set (batch and single-query paths) and still sent when it is not.
  • Type-check via the project tsconfig.json (in gitnexus/): clean, 0 type errors.
  • Manual end-to-end against Voyage: GITNEXUS_EMBEDDING_URL=https://api.voyageai.com/v1, model voyage-code-3, GITNEXUS_EMBEDDING_DIMS=1024, GITNEXUS_EMBEDDING_OMIT_DIMENSIONS=1. gitnexus analyze then gitnexus query both succeed. Without the flag, analyze fails with the Voyage 400.

Risk & rollout

Opt-in and backward-compatible. With the flag unset, behaviour is unchanged: no migration, no forced re-index. Only users switching to a field-rejecting provider set the flag and re-run gitnexus analyze.

Checklist

  • PR body meets repo minimum length
  • If AGENTS.md / overlays changed: N/A, not touched
  • No secrets, tokens, or machine-specific paths committed

…s that reject the dimensions field

GITNEXUS_EMBEDDING_DIMS drives both the `dimensions` field sent in the request
and the length the returned vector is validated against. Some OpenAI-compatible
endpoints (e.g. Voyage) reject the `dimensions` field with a 400 while still
returning a fixed-size vector, so they cannot be configured today: setting DIMS
triggers the 400, and leaving it unset makes validation expect the 384 default
and reject the real vector.

Add an opt-in GITNEXUS_EMBEDDING_OMIT_DIMENSIONS flag that suppresses only the
sent field. DIMS is still honoured for validation. Default behaviour is unchanged.
@vercel

vercel Bot commented Jun 4, 2026

@CheckPickerUpper is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.
