@roomote
Contributor

@roomote roomote bot commented Nov 9, 2025

This PR attempts to address Issue #9142. Feedback and guidance are welcome.

Problem

When using Qwen3-Embedding via Alibaba Cloud's OpenAI-compatible interface, users encounter an HTTP 400 error stating that the batch size should not be larger than 10. The embedder was building batches based only on token limits, without considering the limit on the number of items per request.

Solution

  • Added a configurable maxBatchSize parameter to the OpenAICompatibleEmbedder class
  • Implemented automatic detection of Qwen models (including text-embedding-v* patterns used by Alibaba Cloud)
  • Set automatic batch size limit of 10 items for detected Qwen models
  • Modified the batching logic to respect both token limits AND item count limits
  • Added comprehensive test coverage for the new functionality
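The combined check described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `BatchLimits`, `estimateTokens`, and `createBatches` are hypothetical names, and the 4-characters-per-token estimate is an assumption made only to keep the example self-contained.

```typescript
// Hypothetical limits object; the real embedder stores these differently.
interface BatchLimits {
  maxBatchTokens: number; // token budget per request (assumed name)
  maxBatchSize?: number; // item cap per request; undefined means no cap
}

// Crude token estimate (about 4 characters per token), for illustration only.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Split texts into batches that respect both the token budget and the item cap.
function createBatches(texts: string[], limits: BatchLimits): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  let currentTokens = 0;

  for (const text of texts) {
    const tokens = estimateTokens(text);
    const overTokens = currentTokens + tokens > limits.maxBatchTokens;
    const overItems =
      limits.maxBatchSize !== undefined && current.length >= limits.maxBatchSize;

    // Close the current batch when adding this text would break either limit.
    if (current.length > 0 && (overTokens || overItems)) {
      batches.push(current);
      current = [];
      currentTokens = 0;
    }

    // Note: a single oversized text still becomes its own batch in this sketch.
    current.push(text);
    currentTokens += tokens;
  }

  if (current.length > 0) batches.push(current);
  return batches;
}
```

With an item cap of 10 and a generous token budget, 25 short texts would be split into batches of 10, 10, and 5 items.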

Changes

  1. Modified src/services/code-index/embedders/openai-compatible.ts:
    • Added maxBatchSize property and constructor parameter
    • Added isQwenModel() method to detect Qwen and Alibaba Cloud models
    • Updated batching logic to check both token and item count limits
  2. Added src/services/code-index/embedders/__tests__/openai-compatible-batch-size.spec.ts:
    • Tests for automatic batch size limiting with Qwen models
    • Tests for custom batch size configuration
    • Tests for various model name patterns
    • Tests for interaction between token and batch size limits
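One way the configured value and the automatic detection might interact is sketched below. The helper `resolveMaxBatchSize` and the constant `QWEN_MAX_BATCH_SIZE` are hypothetical, and the precedence shown (an explicit value wins over auto-detection) is an assumption that the description does not confirm. The detection mirrors only the two substring checks visible in this PR's diff; the actual `isQwenModel()` may differ.

```typescript
// Item cap reported by the provider's HTTP 400 error.
const QWEN_MAX_BATCH_SIZE = 10;

// Mirrors the two substring checks quoted from the diff; the real method may include more.
function isQwenModel(modelId: string): boolean {
  const lowerModelId = modelId.toLowerCase();
  return lowerModelId.includes("qwen") || lowerModelId.includes("text-embedding-v");
}

// Assumption: an explicitly configured value takes precedence over auto-detection.
function resolveMaxBatchSize(modelId: string, configured?: number): number | undefined {
  if (configured !== undefined) return configured;
  return isQwenModel(modelId) ? QWEN_MAX_BATCH_SIZE : undefined;
}
```

Under these assumptions, a model ID such as `text-embedding-v4` resolves to a cap of 10, while an ID matching neither check leaves the cap undefined so that only the token limit applies.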

Testing

  • All new tests pass ✅
  • All existing tests pass ✅
  • No regression in existing functionality

Fixes #9142


Important

Adds batch size limiting for Qwen models in OpenAICompatibleEmbedder with automatic detection and comprehensive tests.

  • Behavior:
    • Adds maxBatchSize parameter to OpenAICompatibleEmbedder to limit batch size for Qwen models.
    • Automatically detects Qwen models and sets batch size limit to 10.
    • Updates batching logic in openai-compatible.ts to respect both token and item count limits.
  • Testing:
    • Adds openai-compatible-batch-size.spec.ts with tests for Qwen model detection, batch size limits, and custom batch size configuration.
    • Tests interaction between token and batch size limits.
  • Misc:
    • Implements isQwenModel() method in openai-compatible.ts to identify Qwen models.

This description was created by Ellipsis for d08c5c2.

…e interface

- Added configurable maxBatchSize parameter to OpenAICompatibleEmbedder
- Automatically detect and limit batch size to 10 for Qwen models
- Respect both token limits and batch size limits when batching
- Added comprehensive tests for batch size handling

Fixes #9142
@roomote roomote bot requested review from cte, jr and mrubens as code owners November 9, 2025 06:39
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and bug (Something isn't working) labels Nov 9, 2025
@roomote
Contributor Author

roomote bot commented Nov 9, 2025

Review complete. Found 1 issue that should be addressed:

  • Fix overly broad model detection pattern in isQwenModel() to avoid false positives

Comment on lines +516 to +517
lowerModelId.includes("qwen") ||
lowerModelId.includes("text-embedding-v") || // Alibaba Cloud uses text-embedding-v4

The pattern lowerModelId.includes("text-embedding-v") is too broad and will incorrectly match non-Qwen models. For example, it would match hypothetical OpenAI models like text-embedding-v1 or custom models like my-text-embedding-v2. Since Alibaba Cloud specifically uses text-embedding-v3 and text-embedding-v4 (with a digit after the 'v'), the pattern should be more precise to avoid false positives.

Suggested change
- lowerModelId.includes("qwen") ||
- lowerModelId.includes("text-embedding-v") || // Alibaba Cloud uses text-embedding-v4
+ lowerModelId.includes("qwen") ||
+ /text-embedding-v\d/.test(lowerModelId) || // Alibaba Cloud uses text-embedding-v4
+ lowerModelId.includes("text-embedding-3d")
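
It may help to check what the suggested pattern accepts. The regex `/text-embedding-v\d/` only adds the requirement of a digit after the `v`, so both examples named in the comment (`text-embedding-v1` and `my-text-embedding-v2`) would still match. The sketch below compares it with a hypothetical anchored variant. `matchesSuggested` and `matchesAnchored` are illustrative names, and the anchored pattern is an assumption that could reject legitimate IDs carrying a prefix or suffix.

```typescript
// Pattern from the suggested change: substring match plus one digit.
function matchesSuggested(modelId: string): boolean {
  return /text-embedding-v\d/.test(modelId.toLowerCase());
}

// Hypothetical stricter variant: the whole ID must be "text-embedding-v" plus digits.
function matchesAnchored(modelId: string): boolean {
  return /^text-embedding-v\d+$/.test(modelId.toLowerCase());
}

const candidates = [
  "text-embedding-v4",
  "text-embedding-v1",
  "my-text-embedding-v2",
  "text-embedding-v",
  "text-embedding-3-small",
];

// Print how each candidate ID fares under both patterns.
for (const id of candidates) {
  console.log(id, matchesSuggested(id), matchesAnchored(id));
}
```

Anchoring excludes prefixed names such as `my-text-embedding-v2`, but no name-based pattern can tell apart two providers that use the same model ID, so `text-embedding-v1` still matches either way.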

@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Nov 9, 2025

Labels

bug (Something isn't working), Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.), size:L (This PR changes 100-499 lines, ignoring generated files.)

Projects

Status: Triage

Development

Successfully merging this pull request may close these issues.

[BUG] Error when using Qwen3-Embedding via OpenAI-compatible interface

3 participants