Fix llm_id validation for image2text Gemini usage.#14320
lokinhh wants to merge 5 commits into infiniflow:main
Conversation
Allow chat assistant llm_id validation to accept image2text models and resolve chat model config using actual llm type so Gemini vision models no longer fail with misleading doesn't-exist errors. Made-with: Cursor
📝 Walkthrough

LLM ID validation and dialog model-resolution now try multiple candidate model types (`chat` and `image2text`).
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant DialogSvc as DialogService<br/>(_get_dialog_chat_model_config)
    participant ModelRepo as ModelConfigRepo
    participant TenantSvc as TenantService
    rect rgba(200,200,255,0.5)
        Client->>DialogSvc: request chat (dialog with llm_id?)
        DialogSvc->>ModelRepo: query model_config by (provided llm_id, provided model_type)
        alt found
            ModelRepo-->>DialogSvc: model_config
        else not found
            DialogSvc->>ModelRepo: query model_config by (llm_id, alternate model_type)
            alt found
                ModelRepo-->>DialogSvc: model_config
            else not found
                DialogSvc->>TenantSvc: resolve tenant_llm_id / tenant default
                TenantSvc-->>DialogSvc: tenant candidate llm_type(s)/ids
                DialogSvc->>ModelRepo: query model_config by tenant id/type
                ModelRepo-->>DialogSvc: model_config or none
            end
        end
        alt model_config found
            DialogSvc-->>Client: resolved model_config + llm_type
        else none
            DialogSvc-->>Client: error (not found)
        end
    end
```
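The fallback sequence in the diagram above can be sketched as a candidate-type loop. Everything below (`resolve_model_config`, `DictRepo`, the record shape) is an illustrative stand-in, not RAGFlow's actual API:

```python
# Candidate model types tried in order; the caller's preferred type goes first.
CANDIDATE_TYPES = ("chat", "image2text")

def resolve_model_config(repo, llm_id, preferred_type):
    """Try the preferred model_type first, then the remaining candidates."""
    ordered = [preferred_type] + [t for t in CANDIDATE_TYPES if t != preferred_type]
    for model_type in ordered:
        config = repo.get(llm_id, model_type)
        if config is not None:
            return config, model_type
    return None, None

# Minimal in-memory repo for demonstration.
class DictRepo:
    def __init__(self, records):
        self._records = records  # {(llm_id, model_type): config}

    def get(self, llm_id, model_type):
        return self._records.get((llm_id, model_type))

# A Gemini vision model registered only under image2text still resolves
# even though the caller asked for "chat" first.
repo = DictRepo({("gemini-1.5-pro", "image2text"): {"max_tokens": 8192}})
config, resolved = resolve_model_config(repo, "gemini-1.5-pro", "chat")
print(resolved)  # prints "image2text"
```

The tenant-service fallback from the diagram would slot in after the loop exhausts both candidate types.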
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (1)
api/db/services/dialog_service.py (1)
256-306: Recommended: reuse the helper (or at least its type resolution) in `async_chat` too.

Lines 501–505 of `async_chat` (outside this diff) still open-code the same `llm_type == "image2text"` dispatch that the new helper centralizes. Leaving that copy in place defeats much of the DRY benefit of introducing `_get_dialog_chat_model_config` and will silently diverge if fallback/priority rules change here. Consider either calling the helper in `async_chat` or factoring out a small `_resolve_dialog_model_type(dialog)` that both call sites share. Not blocking for this PR since that region is untouched, but worth a follow-up.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: eab15055-956d-4655-9b45-5ebfe49e6f48
📒 Files selected for processing (2)
api/apps/restful_apis/chat_api.py
api/db/services/dialog_service.py
Retry both chat and image2text model types when resolving dialog llm_id to prevent false lookup failures when type detection and tenant model records are temporarily inconsistent. Made-with: Cursor
Reuse a single resolver for model config and llm type in async_chat paths so chat/image2text fallback rules stay consistent across runtime call sites. Made-with: Cursor
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/db/services/dialog_service.py`:
- Around line 233-251: _get_dialog_chat_model_config currently returns only a
config while callers (e.g., async_chat and places around lines 266/316/511)
still re-derive a model type via TenantLLMService.llm_id2llm_type, causing
mismatches when the fallback chooses a different type; change
_get_dialog_chat_model_config to return a tuple (resolved_type, config), update
all call sites to use the returned resolved_type before any attachment/image
handling or request-shape decisions (remove direct calls to
TenantLLMService.llm_id2llm_type where you now have the resolved_type), ensure
async_chat uses this helper instead of its own single-type lookup, and add
logging inside the helper to record fallback attempts and the final
resolved_type so resolution choices are auditable.
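A minimal sketch of the tuple-returning variant described above, with fallback logging so resolution choices are auditable; `lookup` and the dialog shape are hypothetical placeholders for the real repository calls:

```python
import logging

def get_dialog_chat_model_config(dialog, lookup):
    """Return (resolved_type, config), logging each fallback attempt."""
    for model_type in ("chat", "image2text"):
        config = lookup(dialog["llm_id"], model_type)
        if config is not None:
            logging.info("resolved %s as %s", dialog["llm_id"], model_type)
            return model_type, config
        logging.info("no %s record for %s, trying next type", model_type, dialog["llm_id"])
    raise LookupError(f"llm_id {dialog['llm_id']!r} not found under any candidate type")

# Callers use the returned type directly instead of re-deriving it.
configs = {("gemini-1.5-pro", "image2text"): {"max_tokens": 8192}}
llm_type, cfg = get_dialog_chat_model_config(
    {"llm_id": "gemini-1.5-pro"}, lambda i, t: configs.get((i, t))
)
```

Returning the type alongside the config removes the need for call sites to invoke a separate `llm_id2llm_type`-style lookup that could disagree with the fallback's choice.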
Return resolved model type together with model config and reuse it across async chat paths so attachment handling and request shape stay consistent with fallback selection. Made-with: Cursor
🧹 Nitpick comments (1)
api/db/services/dialog_service.py (1)
518-540: Duplicate model-config resolution per chat request.

`_resolve_dialog_chat_model(dialog)` at Line 518 fully resolves `llm_model_config`, then `get_models(dialog)` at Line 540 calls `_get_dialog_chat_model_config(dialog)` again at Line 323 to build `chat_mdl`. That's two DB round-trips (plus the `llm_id2llm_type` lookup inside each) for the same dialog on every request. Consider threading the already-resolved config/type into `get_models` to avoid the re-lookup.

♻️ Proposed direction

```diff
-def get_models(dialog):
+def get_models(dialog, chat_model_config=None):
     embd_mdl, chat_mdl, rerank_mdl, tts_mdl = None, None, None, None
     ...
-    chat_model_config = _get_dialog_chat_model_config(dialog)
+    if chat_model_config is None:
+        chat_model_config = _get_dialog_chat_model_config(dialog)
     chat_mdl = LLMBundle(dialog.tenant_id, chat_model_config)
```

And at the `async_chat` call site:

```diff
-    kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(dialog)
+    kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(
+        dialog,
+        chat_model_config=llm_model_config,
+    )
```
♻️ Duplicate comments (1)
api/db/services/dialog_service.py (1)
554-576: ⚠️ Potential issue | 🟡 Minor
Avoid resolving the chat model twice in the same request.

Line 554 resolves `llm_model_config`/`llm_type`, but Line 576 calls `get_models(dialog)`, which resolves the chat config again via `_get_dialog_chat_model_config()`. If model records change mid-request or duplicate model-type records resolve differently, `llm_type`, `max_tokens`, Langfuse metadata, and the actual `chat_mdl` can diverge.

Consider passing the already-resolved chat config into `get_models()` or returning the resolved config/type from `get_models()` so the request uses one resolution result end-to-end.

🐛 Proposed direction

```diff
-def get_models(dialog):
+def get_models(dialog, chat_model_config=None):
 @@
-    chat_model_config, _ = _get_dialog_chat_model_config(dialog)
+    if chat_model_config is None:
+        chat_model_config, _ = _get_dialog_chat_model_config(dialog)
     chat_mdl = LLMBundle(dialog.tenant_id, chat_model_config)
```

```diff
-    kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(dialog)
+    kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(
+        dialog,
+        chat_model_config=llm_model_config,
+    )
```
Use OpenAI-compatible typed multimodal blocks for Gemini to avoid LiteLLM validation errors, and pass pre-resolved llm config/type into get_models to prevent duplicate resolution in async chat. Made-with: Cursor
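As a rough illustration of the typed multimodal blocks this commit refers to, the content-parts structure below follows the OpenAI chat-completions message format; the helper name is made up and not RAGFlow's actual code:

```python
def build_multimodal_message(text, image_urls):
    """Build one user message with typed content parts (OpenAI chat format)."""
    parts = [{"type": "text", "text": text}]
    # Each image becomes its own typed block rather than raw bytes or a bare URL.
    parts += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    return {"role": "user", "content": parts}

msg = build_multimodal_message("Describe this image.", ["data:image/png;base64,AAAA"])
```

Sending explicitly typed blocks like this, instead of an untyped content list, is what keeps strict validators (such as LiteLLM's request schema checks) from rejecting Gemini requests.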
🧹 Nitpick comments (1)
api/db/services/dialog_service.py (1)
345-362: Override semantics in `get_models` are all-or-nothing; consider decoupling.

The guard `if llm_model_config is None or llm_type is None` means that if a caller passes only one of the two (e.g., a resolved `llm_model_config` but forgets `llm_type`, or vice versa), the helper silently re-resolves both and discards the caller's explicit value. Today the only non-default caller (`async_chat`, Lines 581–583) passes both, so this is latent, but it's an easy foot-gun for future callers.

♻️ Proposed tweak

```diff
-    if llm_model_config is None or llm_type is None:
-        llm_model_config, llm_type = _get_dialog_chat_model_config(dialog)
+    if llm_model_config is None and llm_type is None:
+        llm_model_config, llm_type = _get_dialog_chat_model_config(dialog)
+    elif llm_model_config is None or llm_type is None:
+        raise ValueError("llm_model_config and llm_type must be provided together")
```
Verify each finding against the current code and only fix it if needed. In `@api/db/services/dialog_service.py` around lines 345 - 362, The current guard in get_models uses "if llm_model_config is None or llm_type is None" which re-resolves and overwrites both values even when only one is missing; change this to resolve defaults but only fill the missing pieces: call _get_dialog_chat_model_config(dialog) when either is None, capture its two-return values (e.g., resolved_config, resolved_type), then set llm_model_config = resolved_config only if llm_model_config is None and set llm_type = resolved_type only if llm_type is None, then proceed to create chat_mdl = LLMBundle(dialog.tenant_id, llm_model_config).
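An alternative to rejecting partial overrides outright is to fill only the missing piece. A sketch under that assumption, where `resolve` stands in for `_get_dialog_chat_model_config` and all names are illustrative:

```python
def fill_missing_resolution(llm_model_config, llm_type, resolve):
    """Resolve defaults only for the values the caller did not supply."""
    if llm_model_config is None or llm_type is None:
        resolved_config, resolved_type = resolve()
        # Keep any explicitly passed value; only fill the gaps.
        if llm_model_config is None:
            llm_model_config = resolved_config
        if llm_type is None:
            llm_type = resolved_type
    return llm_model_config, llm_type

# A caller that passes only the config keeps it, and the type is resolved.
cfg, typ = fill_missing_resolution(
    {"llm_factory": "Gemini"}, None, lambda: ({"llm_factory": "fallback"}, "chat")
)
```

This variant trades the explicit `ValueError` for permissiveness; whether silently mixing a passed config with a freshly resolved type is safe depends on how tightly the two must agree in the real code.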
What problem does this PR solve?
RAGFlow currently validates chat assistant llm_id too strictly against model_type=chat, causing code 102 ("llm_id ... doesn't exist") when users configure Gemini as image2text. This PR allows llm_id validation to accept both chat and image2text, and resolves runtime model config based on the actual llm type, so Gemini vision models can be used without misleading validation errors.
Type of change
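The relaxed validation described above can be sketched as follows; `known_models`, the set-of-pairs schema, and the function name are illustrative stand-ins, not RAGFlow's actual model registry:

```python
# Chat assistants may bind to either a plain chat model or a vision model.
ALLOWED_CHAT_LLM_TYPES = {"chat", "image2text"}

def validate_llm_id(llm_id, known_models):
    """Accept an llm_id if it exists under any allowed model type."""
    types = {t for (mid, t) in known_models if mid == llm_id}
    if not (types & ALLOWED_CHAT_LLM_TYPES):
        # The strict single-type check produced this code-102 style error
        # even when the model existed under image2text.
        raise ValueError(f"llm_id {llm_id!r} doesn't exist")
    return True

models = {("gemini-1.5-pro", "image2text"), ("gpt-4o", "chat")}
print(validate_llm_id("gemini-1.5-pro", models))  # prints "True"
```

With the old single-type check (`{"chat"}` only), the same Gemini entry would have been rejected with the misleading "doesn't exist" message even though the record was present.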