Skip to content

Add Support for Claude-4.5-Sonnet & Gemini-3 in GAIA Workflow #591

@Lemaqwq

Description

@Lemaqwq

Add Support for Claude-4.5-Sonnet & Gemini-3 in GAIA Workflow

Summary

When running run_gaia_workforce_claude.py with ModelType.CLAUDE_4_SONNET (claude-sonnet-4-5-20250929), two critical errors occur during task execution that prevent successful completion of GAIA benchmark tasks.

Environment

  • Model: claude-sonnet-4-5-20250929 (Claude 4.5 Sonnet)
  • OpenAI SDK: 2.15.0
  • Framework: CAMEL-AI (modified 0.2.46)
  • Python: 3.10

Error 1: Format Validation Error (JSON Parsing)

Symptom

The model returns plain text instead of valid JSON when structured output (response_format) is expected.

Log

2026-01-23 11:23:05,195 - camel.camel.models.openai_compatible_model - WARNING - Format validation error: 1 validation error for TaskResult
  Invalid JSON: expected ident at line 1 column 2 [type=json_invalid, input_value="I'll help you find Eliud...g for this information.", input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/json_invalid. Attempting fallback with JSON format.

Analysis

The model response starts with "I'll help you find Eliud..." (conversational text) instead of a JSON object. This suggests:

  • The response_format parameter may not be properly enforced for Claude models via OpenAI-compatible endpoint
  • The beta.chat.completions.parse method behaves differently for Anthropic models

Error 2: Empty Content Block (BadRequestError 400)

Symptom

API returns 400 error stating text content blocks must be non-empty.

Log

openai.BadRequestError: Error code: 400 - {'error': {'message': 'messages: text content blocks must be non-empty (request id: 20260123195408753875355OP3qL5yB)', 'type': '<nil>', 'param': '', 'code': None}}

Stack Trace

File "/Users/lihengchen/Github/owl/camel/models/openai_compatible_model.py", line 235, in _arequest_parse
    return await self._async_client.beta.chat.completions.parse(
File ".../openai/resources/chat/completions/completions.py", line 1670, in parse
    return await self._post(
...
openai.BadRequestError: Error code: 400 - {'error': {'message': 'messages: text content blocks must be non-empty'...

Analysis

Somewhere in the message construction pipeline, an empty text content block is being sent to the API. This could be caused by:

  1. Tool call results being processed incorrectly
  2. Message history containing empty assistant responses
  3. Incompatibility between OpenAI SDK v2.x beta.chat.completions.parse and Anthropic's API

Task Context

The error occurred during this GAIA benchmark task:

Worker node 5375996352 (A helpful assistant that can search the web, extract webpage content,
simulate browser actions, and retrieve relevant information.) get task e1fc63a2-da7a-432f-be78-7c4a95598703.0:
Search for Eliud Kipchoge's world record marathon time and determine his pace (speed in km/h or m/s).

Suggested Investigation

  1. Check if beta.chat.completions.parse is the correct method for Anthropic models
  2. Add validation to ensure no empty content blocks are sent
  3. Verify response_format parameter compatibility with Claude 4.5 Sonnet
  4. Consider using Anthropic's native SDK instead of OpenAI-compatible endpoint for structured outputs

Related Files

File Line Description
camel/models/openai_compatible_model.py 235 _arequest_parse method using beta.chat.completions.parse
camel/agents/chat_agent.py 839 _aget_model_response method
run_gaia_workforce_claude.py - Main entry point with model configuration

Reproduction Steps

  1. Configure run_gaia_workforce_claude.py to use ModelType.CLAUDE_4_SONNET
  2. Run the GAIA benchmark: python run_gaia_workforce_claude.py
  3. Observe errors during web search tasks requiring structured output

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions