Description
Docling's api_image_request() fails when VLM APIs return responses via OpenAI's tool calling format (used by NVIDIA NemoRetriever Parse) because OpenAiChatMessage.content is required but tool calling responses don't include it.
Error:
pydantic_core.ValidationError: 1 validation error for OpenAiApiResponse
choices.0.message.content
  Field required [type=missing]
Tool calling response format:
{
  "choices": [{
    "message": {
      "role": "assistant",
      "tool_calls": [{
        "function": {
          "name": "markdown_no_bbox",
          "arguments": "[{\"text\": \"Extracted text\"}]"
        }
      }]
      // No "content" field
    }
  }]
}
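To make the shape concrete, here is a minimal sketch of pulling the text out of such a message. It assumes the arguments string decodes to a list of objects that each carry a "text" key, as in the sample above; other tools may use a different schema. text_from_message is a hypothetical helper name, not part of Docling.

```python
import json


def text_from_message(message: dict) -> str:
    """Return the text of an OpenAI-style chat message dict.

    Assumption: tool-call arguments are a JSON list of objects
    with a "text" key, matching the sample response in this issue.
    """
    tool_calls = message.get("tool_calls") or []
    if tool_calls:
        # "arguments" is itself a JSON-encoded string and needs a second decode.
        arguments = json.loads(tool_calls[0]["function"]["arguments"])
        return "\n".join(item["text"] for item in arguments if "text" in item)

    content = message.get("content")
    if content is not None:
        return content.strip()

    raise ValueError("Response has neither content nor tool_calls")


# The message from the sample response, rebuilt as a Python dict.
sample = {
    "role": "assistant",
    "tool_calls": [
        {
            "function": {
                "name": "markdown_no_bbox",
                "arguments": json.dumps([{"text": "Extracted text"}]),
            }
        }
    ],
}

print(text_from_message(sample))
```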
Current Workaround (Monkey Patch)
We have to replace the entire api_image_request() function to handle tool_calls:
import json

import requests


def nvidia_parse_compatible_api_request(image, prompt, url, timeout=20, headers=None, **params):
    """Replacement that handles both content and tool_calls responses."""
    # ... standard request code ...
    r = requests.post(str(url), headers=headers, json=payload, timeout=timeout)
    r.raise_for_status()
    response_data = json.loads(r.text)
    message = response_data['choices'][0]['message']

    # Handle tool_calls (the key difference)
    if 'tool_calls' in message and message['tool_calls']:
        arguments = json.loads(message['tool_calls'][0]['function']['arguments'])
        # Extract text from tool response...
        return extracted_text
    elif 'content' in message:
        return message['content'].strip()
    else:
        raise ValueError("Response has neither content nor tool_calls")


# Must patch before importing Docling
import docling.utils.api_image_request as api_module
api_module.api_image_request = nvidia_parse_compatible_api_request
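The ordering matters because of how Python binds imported names. Below is a small sketch using stand-in modules; fake_api, early_consumer and late_consumer are hypothetical, and it assumes the Docling callers use a `from ... import api_image_request` style import. A module that imported the function before the patch keeps its reference to the original.

```python
import sys
import types


def original_request():
    return "original"


# Hypothetical stand-in for the library module that defines the function.
api_module = types.ModuleType("fake_api")
api_module.api_image_request = original_request
sys.modules["fake_api"] = api_module

# A consumer that ran its `from ... import` before the patch was applied.
early = types.ModuleType("early_consumer")
exec("from fake_api import api_image_request", early.__dict__)

# Apply the monkey patch by rebinding the module attribute.
api_module.api_image_request = lambda: "patched"

# A consumer that runs its `from ... import` after the patch.
late = types.ModuleType("late_consumer")
exec("from fake_api import api_image_request", late.__dict__)

print(early.api_image_request())  # still the original function
print(late.api_image_request())   # sees the patched function
```

Callers that look the function up through the module at call time (api_module.api_image_request(...)) would see the patch regardless of import order.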
I think this could be fixed with minimal changes (untested) by making content optional and adding tool_calls extraction similar to our monkey patch:
File: docling/datamodel/base_models.py (line 335)
class OpenAiChatMessage(BaseModel):
    role: str
    content: Optional[str] = None  # Make optional
    tool_calls: Optional[List[dict]] = None  # Add tool_calls field
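As a rough check of the idea, here is a sketch with simplified stand-in models. ChatMessage, Choice and ApiResponse are hypothetical and omit the other fields of the real Docling classes, and it has not been run against Docling itself. With content optional, a tool calling response validates instead of raising.

```python
import json
from typing import List, Optional

from pydantic import BaseModel


# Simplified, hypothetical stand-ins for the Docling response models.
class ChatMessage(BaseModel):
    role: str
    content: Optional[str] = None            # absent in tool calling responses
    tool_calls: Optional[List[dict]] = None  # absent in plain content responses


class Choice(BaseModel):
    message: ChatMessage


class ApiResponse(BaseModel):
    choices: List[Choice]


tool_call_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "function": {
                            "name": "markdown_no_bbox",
                            "arguments": json.dumps([{"text": "Extracted text"}]),
                        }
                    }
                ],
            }
        }
    ]
}

# Validation succeeds because "content" now defaults to None.
parsed = ApiResponse(**tool_call_response)
message = parsed.choices[0].message
print(message.content)
print(message.tool_calls[0]["function"]["name"])
```

One thing to watch: once the field is optional, any caller that does message.content.strip() would need a None check first.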
Are there any downsides to this? If not, I can submit a PR.