fix: add batch size limiting for Qwen embeddings via OpenAI-compatible interface #9143

roomote · 2025-11-09T06:39:24Z

This PR attempts to address Issue #9142. Feedback and guidance are welcome.

Problem

When using Qwen3-Embedding via Alibaba Cloud's OpenAI-compatible interface, users encounter an HTTP 400 error stating that the batch size should not be larger than 10. The embedding service was sending batches based only on token limits, without considering the item count limit.

Solution

- Added a configurable maxBatchSize parameter to the OpenAICompatibleEmbedder class
- Implemented automatic detection of Qwen models (including the text-embedding-v* patterns used by Alibaba Cloud)
- Set an automatic batch size limit of 10 items for detected Qwen models
- Modified the batching logic to respect both token limits and item count limits
- Added test coverage for the new functionality
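
For readers who want to see the dual-limit idea in isolation, here is a minimal sketch of batching that closes a batch when either the token budget or the item cap would be exceeded. It is not the code from this PR: `BatchLimits`, `estimateTokens`, and `createBatches` are hypothetical names, and the 4-characters-per-token estimate is an assumption standing in for whatever token counting the embedder actually uses.

```typescript
// Hypothetical helper types and names; not taken from openai-compatible.ts.
interface BatchLimits {
	maxBatchTokens: number // token budget per request
	maxBatchSize?: number // optional cap on items per request (10 for Qwen in this PR)
}

// Rough estimate (about 4 characters per token). An assumption, not a real tokenizer.
function estimateTokens(text: string): number {
	return Math.ceil(text.length / 4)
}

function createBatches(texts: string[], limits: BatchLimits): string[][] {
	const batches: string[][] = []
	let current: string[] = []
	let currentTokens = 0

	for (const text of texts) {
		const tokens = estimateTokens(text)
		const overTokens = currentTokens + tokens > limits.maxBatchTokens
		const overItems = limits.maxBatchSize !== undefined && current.length >= limits.maxBatchSize

		// Close the current batch as soon as either limit would be exceeded.
		if (current.length > 0 && (overTokens || overItems)) {
			batches.push(current)
			current = []
			currentTokens = 0
		}
		current.push(text)
		currentTokens += tokens
	}
	if (current.length > 0) batches.push(current)
	return batches
}

// 25 short chunks with an item cap of 10 and a generous token budget.
const chunks = Array.from({ length: 25 }, (_, i) => `chunk ${i}`)
const sizes = createBatches(chunks, { maxBatchTokens: 1000, maxBatchSize: 10 }).map((b) => b.length)
console.log(sizes) // [ 10, 10, 5 ]
```

In this sketch a single text larger than the token budget still ends up in a batch of its own; how the real embedder treats oversized items is not shown in this thread.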

Changes

Modified src/services/code-index/embedders/openai-compatible.ts:
- Added maxBatchSize property and constructor parameter
- Added isQwenModel() method to detect Qwen and Alibaba Cloud models
- Updated batching logic to check both token and item count limits
Added src/services/code-index/embedders/__tests__/openai-compatible-batch-size.spec.ts:
- Tests for automatic batch size limiting with Qwen models
- Tests for custom batch size configuration
- Tests for various model name patterns
- Tests for interaction between token and batch size limits
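
As a rough illustration of how a configured maxBatchSize and the automatic Qwen limit could interact, the sketch below resolves an effective item cap from a model ID. The names `looksLikeQwenModel` and `resolveMaxBatchSize` are hypothetical, the detection mirrors only the two substring checks quoted in the review comment on this PR, and the precedence (an explicit value wins over auto-detection) is an assumption rather than something this description states.

```typescript
// Limit reported by the service in the HTTP 400 error described above.
const QWEN_MAX_BATCH_SIZE = 10

// Mirrors the two substring checks quoted in the review; the real isQwenModel() may differ.
function looksLikeQwenModel(modelId: string): boolean {
	const lowerModelId = modelId.toLowerCase()
	return lowerModelId.includes("qwen") || lowerModelId.includes("text-embedding-v")
}

// Assumed precedence: explicit configuration first, then auto-detection, otherwise no item cap.
function resolveMaxBatchSize(modelId: string, configured?: number): number | undefined {
	if (configured !== undefined && configured > 0) return configured
	return looksLikeQwenModel(modelId) ? QWEN_MAX_BATCH_SIZE : undefined
}

console.log(resolveMaxBatchSize("Qwen3-Embedding-8B")) // 10
console.log(resolveMaxBatchSize("text-embedding-3-small")) // undefined
console.log(resolveMaxBatchSize("text-embedding-v4", 5)) // 5
```

Returning undefined for unrecognized models keeps the token limit as the only constraint, which matches the behavior described for non-Qwen models before this change.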

Testing

All new tests pass ✅
All existing tests pass ✅
No regression in existing functionality

Fixes #9142

Important

Adds batch size limiting for Qwen models in OpenAICompatibleEmbedder with automatic detection and comprehensive tests.

Behavior:
- Adds maxBatchSize parameter to OpenAICompatibleEmbedder to limit batch size for Qwen models.
- Automatically detects Qwen models and sets batch size limit to 10.
- Updates batching logic in openai-compatible.ts to respect both token and item count limits.
Testing:
- Adds openai-compatible-batch-size.spec.ts with tests for Qwen model detection, batch size limits, and custom batch size configuration.
- Tests interaction between token and batch size limits.
Misc:
- Implements isQwenModel() method in openai-compatible.ts to identify Qwen models.

This description was created by the ellipsis-dev bot for d08c5c2. It will automatically update as commits are pushed.

…e interface
- Added configurable maxBatchSize parameter to OpenAICompatibleEmbedder
- Automatically detect and limit batch size to 10 for Qwen models
- Respect both token limits and batch size limits when batching
- Added comprehensive tests for batch size handling

Fixes #9142

roomote · 2025-11-09T06:39:50Z

Review complete. Found 1 issue that should be addressed:

Fix overly broad model detection pattern in isQwenModel() to avoid false positives

roomote · 2025-11-09T06:42:43Z

src/services/code-index/embedders/openai-compatible.ts

+			lowerModelId.includes("qwen") ||
+			lowerModelId.includes("text-embedding-v") || // Alibaba Cloud uses text-embedding-v4


The pattern lowerModelId.includes("text-embedding-v") is too broad and will match non-Qwen models: any model ID containing that substring qualifies, whatever follows the 'v'. Alibaba Cloud specifically uses text-embedding-v3 and text-embedding-v4 (a digit after the 'v'), so the pattern should at least require that digit. Note that requiring a digit alone still matches IDs such as a hypothetical text-embedding-v1 or a custom my-text-embedding-v2; excluding those would need an anchored or version-specific pattern.

Suggested change

-			lowerModelId.includes("qwen") ||
-			lowerModelId.includes("text-embedding-v") || // Alibaba Cloud uses text-embedding-v4
+			lowerModelId.includes("qwen") ||
+			/text-embedding-v\d/.test(lowerModelId) || // Alibaba Cloud uses text-embedding-v4
 			lowerModelId.includes("text-embedding-3d")
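
To make the trade-off concrete, the sketch below runs the original substring check, the suggested digit pattern, and a stricter anchored variant over a few sample model IDs. The anchored pattern and all sample IDs other than text-embedding-v4 are illustrative assumptions; in particular, it is not confirmed here that Alibaba Cloud never adds prefixes or suffixes to these names, so the anchored form may be too strict in practice.

```typescript
// Original check from the diff: plain substring match.
const substringMatch = (id: string): boolean => id.toLowerCase().includes("text-embedding-v")

// Suggested change: require a digit right after the "v".
const digitMatch = (id: string): boolean => /text-embedding-v\d/.test(id.toLowerCase())

// Hypothetical stricter variant: the whole ID must be text-embedding-v3 or text-embedding-v4.
const anchoredMatch = (id: string): boolean => /^text-embedding-v[34]$/.test(id.toLowerCase())

const samples = [
	"text-embedding-v4", // substring: true,  digit: true,  anchored: true
	"text-embedding-v1", // substring: true,  digit: true,  anchored: false
	"my-text-embedding-v2", // substring: true,  digit: true,  anchored: false
	"text-embedding-vision", // substring: true,  digit: false, anchored: false
	"text-embedding-3-small", // substring: false, digit: false, anchored: false
]

for (const id of samples) {
	console.log(id, substringMatch(id), digitMatch(id), anchoredMatch(id))
}
```

The digit pattern only rules out IDs where a non-digit follows the 'v'. A false positive here costs little beyond smaller batches, so the looser pattern may still be an acceptable choice.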

roomote bot requested review from cte, jr and mrubens as code owners November 9, 2025 06:39

github-project-automation bot added this to Roo Code Roadmap Nov 9, 2025

github-project-automation bot moved this to Triage in Roo Code Roadmap Nov 9, 2025

github-project-automation bot moved this to New in Roo Code Roadmap Nov 9, 2025

dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and bug (Something isn't working) labels Nov 9, 2025

hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Nov 9, 2025

roomote bot mentioned this pull request Nov 9, 2025

[BUG] Error when using Qwen3-Embedding via OpenAI-compatible interface #9142

Open
