Compatibility: add EMBEDDING_SEND_DIM / EMBEDDING_TOKEN_LIMIT and Cohere rerank chunking options #18

@joonsoome

LightRAG’s latest update (+1.4.9.9) introduced new embedding and rerank configuration options. We should verify that embed-rerank is compatible with them and add support where it is missing, so that LightRAG integrations keep working.

New options to cover:

  • EMBEDDING_SEND_DIM: control whether to send the dimensions parameter for OpenAI/Gemini embeddings (Gemini requires true).
  • EMBEDDING_TOKEN_LIMIT: token cap used for automatic embedding truncation.
  • Cohere rerank chunking:
    • RERANK_ENABLE_CHUNKING
    • RERANK_MAX_TOKENS_PER_DOC
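
As a starting point for discussion, here is a minimal sketch of how the four variables could be read from the environment. The class `CompatSettings`, the helper names, and the accepted boolean spellings are assumptions for illustration and are not taken from LightRAG or from the current embed-rerank code. Unset variables are returned as `None` so the caller can keep its existing behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Accepted spellings are an assumption; adjust to match the project's conventions.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def _opt_bool(env: Mapping[str, str], key: str) -> Optional[bool]:
    """Return True/False if the variable is set, None if it is unset or empty."""
    raw = env.get(key)
    if raw is None or raw.strip() == "":
        return None
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{key} must be a boolean, got {raw!r}")


def _opt_positive_int(env: Mapping[str, str], key: str) -> Optional[int]:
    """Return a positive int if the variable is set, None if it is unset or empty."""
    raw = env.get(key)
    if raw is None or raw.strip() == "":
        return None
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{key} must be positive, got {raw!r}")
    return value


@dataclass(frozen=True)
class CompatSettings:  # hypothetical container, not an existing class
    embedding_send_dim: Optional[bool]
    embedding_token_limit: Optional[int]
    rerank_enable_chunking: Optional[bool]
    rerank_max_tokens_per_doc: Optional[int]


def load_compat_settings(env: Optional[Mapping[str, str]] = None) -> CompatSettings:
    """Read the four options; None means 'unset, keep current behavior'."""
    env = os.environ if env is None else env
    return CompatSettings(
        embedding_send_dim=_opt_bool(env, "EMBEDDING_SEND_DIM"),
        embedding_token_limit=_opt_positive_int(env, "EMBEDDING_TOKEN_LIMIT"),
        rerank_enable_chunking=_opt_bool(env, "RERANK_ENABLE_CHUNKING"),
        rerank_max_tokens_per_doc=_opt_positive_int(env, "RERANK_MAX_TOKENS_PER_DOC"),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test.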

Tasks:

  • Audit current embedding/rerank config and request handling for these options (or equivalent aliases).
  • If missing, add settings/env aliases and wire them into:
    • Embedding request handling (dimensions on/off + default token limit).
    • Cohere rerank handling (optional chunking by token limit, deterministic aggregation back to doc-level scores).
  • Update README/docs with new configuration keys and defaults.
  • Add a minimal test or repro checklist for these options.
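
The rerank task (chunk by token limit, then aggregate deterministically back to doc-level scores) could look roughly like the sketch below. Several things here are assumptions rather than a description of how LightRAG or Cohere do it: tokens are approximated by whitespace splitting, the doc-level score is the maximum chunk score, ties are broken by original document index, and `score_fn` is a hypothetical stand-in for whatever backend call scores a batch of texts against the query.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer signature: (query, texts) -> one score per text.
ScoreFn = Callable[[str, Sequence[str]], List[float]]


def chunk_by_tokens(text: str, max_tokens: int) -> List[str]:
    """Split text into chunks of at most max_tokens whitespace tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = text.split()
    if not tokens:
        return [text]  # keep empty documents so every doc gets a score
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


def rerank_with_chunking(
    query: str,
    docs: Sequence[str],
    score_fn: ScoreFn,
    max_tokens: int,
) -> List[Tuple[int, float]]:
    """Score chunks, then reduce to one score per document (max over chunks)."""
    chunks: List[str] = []
    owners: List[int] = []  # owners[i] is the index of the doc chunk i came from
    for doc_index, doc in enumerate(docs):
        for chunk in chunk_by_tokens(doc, max_tokens):
            chunks.append(chunk)
            owners.append(doc_index)

    scores = score_fn(query, chunks)
    if len(scores) != len(chunks):
        raise ValueError("score_fn must return one score per chunk")

    best = [float("-inf")] * len(docs)
    for owner, score in zip(owners, scores):
        if score > best[owner]:
            best[owner] = score

    # Sort by score descending, then by original index so ties are stable.
    return sorted(enumerate(best), key=lambda pair: (-pair[1], pair[0]))
```

Taking the maximum is only one possible aggregation; a mean or a length-weighted mean would also be deterministic, and the choice should be documented alongside the new keys.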

Acceptance criteria:

  • LightRAG can set these env vars without errors and with expected behavior.
  • Default behavior is unchanged when the options are unset.
  • Documentation is updated with clear examples.
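
For the embedding path, the "default behavior is unchanged" criterion can be made checkable with a small pure function like the one below. This is a sketch under stated assumptions: the payload keys follow the OpenAI embeddings request shape (`model`, `input`, `dimensions`), truncation uses whitespace tokens instead of a real tokenizer, and `legacy_send_dim` is a hypothetical parameter standing in for whatever the service does today.

```python
from typing import Any, Dict, List, Optional, Sequence


def truncate_tokens(text: str, token_limit: Optional[int]) -> str:
    """Keep at most token_limit whitespace tokens; None leaves the text untouched."""
    if token_limit is None:
        return text
    tokens = text.split()
    if len(tokens) <= token_limit:
        return text
    return " ".join(tokens[:token_limit])


def build_embedding_payload(
    model: str,
    texts: Sequence[str],
    dimensions: int,
    send_dim: Optional[bool],      # value of EMBEDDING_SEND_DIM, None if unset
    token_limit: Optional[int],    # value of EMBEDDING_TOKEN_LIMIT, None if unset
    legacy_send_dim: bool = False, # hypothetical: the pre-existing behavior
) -> Dict[str, Any]:
    """Build the request body; unset options fall back to the legacy behavior."""
    inputs: List[str] = [truncate_tokens(t, token_limit) for t in texts]
    payload: Dict[str, Any] = {"model": model, "input": inputs}
    include_dim = legacy_send_dim if send_dim is None else send_dim
    if include_dim:
        payload["dimensions"] = dimensions
    return payload
```

A repro checklist can then compare the payload with all options unset against the payload produced before the change.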

Labels

enhancement (New feature or request)
