feat: Add XSearchTool for X/Twitter search with xAI models #4165
colesmcintosh wants to merge 1 commit into pydantic:main
| from_date: datetime | str | None = None | ||
| """Date filter for start date. Accepts datetime object or ISO8601 string (e.g., '2024-01-01').""" | ||
|
|
||
| to_date: datetime | str | None = None | ||
| """Date filter for end date. Accepts datetime object or ISO8601 string (e.g., '2024-12-31').""" |
Consider accepting only datetime (or date) here instead of datetime | str. Accepting str on a plain dataclass means invalid strings (e.g., from_date='not-a-date') won't produce an error until the model is actually invoked — at which point _parse_date in xai.py will call datetime.fromisoformat() and raise a confusing error far from the user's code.
If you do want to accept strings for convenience, the conversion/validation should happen in __post_init__ on this class (normalizing to datetime eagerly), rather than deferring it to model-specific code. This keeps the model code simpler and ensures consistent validation regardless of which model handles this tool.
@DouweM would appreciate your input on the preferred approach here — just datetime, or datetime | str with eager validation?
| """ | ||
| OpenAI announced their latest model updates, while Anthropic shared research on AI safety... | ||
| """ |
This is a single-line output, so it should use the #> format for consistency with the first example and other docs examples:
| """ | |
| OpenAI announced their latest model updates, while Anthropic shared research on AI safety... | |
| """ | |
| #> OpenAI announced their latest model updates, while Anthropic shared research on AI safety... |
You'll also need to update the corresponding entry in tests/test_examples.py to match.
| def _parse_date(value: datetime | str) -> datetime: | ||
| """Parse a date value to datetime object. | ||
|
|
||
| Args: | ||
| value: A datetime object or ISO8601 formatted string (e.g., '2024-01-01'). | ||
|
|
||
| Returns: | ||
| A datetime object. | ||
| """ | ||
| if isinstance(value, datetime): | ||
| return value | ||
| return datetime.fromisoformat(value) |
Related to my comment on the from_date/to_date field types: if str is accepted, this parsing logic should live in XSearchTool.__post_init__ rather than here in the model-specific code. That way the tool normalizes its own fields eagerly, and this function can be removed entirely.
tests/models/test_xai.py
Outdated
| assert len(tools) == 1 | ||
| # The x_search tool should have from_date and to_date configured | ||
| x_search_config = tools[0].get('x_search', {}) | ||
| assert x_search_config is not None |
This assertion is too weak — it only checks the config dict is not None but doesn't verify that the from_date and to_date values were actually passed through correctly. Compare with test_xai_builtin_x_search_tool_with_handles which at least checks the handle values. Consider asserting the actual date values in the tool config, and using a snapshot() assertion like the test_xai_builtin_web_search_tool test does for full message history verification.
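A small sketch of why the `is not None` check cannot fail in practice: `.get('x_search', {})` falls back to `{}`, which is never `None`. The payload shapes below are hypothetical (the real structure produced by the xAI SDK may differ); only the date string format is taken from the snapshot mentioned elsewhere in this review.

```python
# Hypothetical tool payloads, for illustration only.
EXPECTED_CONFIG = {
    'from_date': '2024-01-01T00:00:00Z',
    'to_date': '2024-12-31T00:00:00Z',
}
tools_with_dates = [{'x_search': dict(EXPECTED_CONFIG)}]
tools_missing_config = [{}]


def weak_check(tools: list) -> bool:
    # The `{}` default means this is True even when the key is absent.
    config = tools[0].get('x_search', {})
    return config is not None


def strict_check(tools: list) -> bool:
    # Comparing against the expected values fails if the dates were dropped.
    config = tools[0].get('x_search', {})
    return config == EXPECTED_CONFIG
```

Here `weak_check` passes for both payloads, while `strict_check` passes only when the dates actually made it through.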
tests/models/test_xai.py
Outdated
| assert result.output == 'Found posts about PydanticAI.' | ||
|
|
||
| # Verify the builtin tool call and result appear in message history | ||
| messages = result.all_messages() | ||
| assert len(messages) == 2 | ||
| assert isinstance(messages[0], ModelRequest) | ||
| assert isinstance(messages[1], ModelResponse) |
The existing test_xai_builtin_web_search_tool test uses a full snapshot() assertion on result.all_messages() which verifies the complete message structure including BuiltinToolCallPart, BuiltinToolReturnPart, tool call IDs, provider names, usage details, etc. These x_search tests only check a few individual parts. For consistency and thoroughness, please use snapshot() on the full result.all_messages() like the web_search tests do — this catches regressions in the full response processing pipeline.
tests/models/test_xai.py
Outdated
|
|
||
| async def test_xai_builtin_x_search_tool_with_date_range(allow_model_requests: None): | ||
| """Test xAI's built-in x_search tool with date filtering.""" | ||
| from datetime import datetime |
Nit: this should be a module-level import rather than a function-level import. timezone from datetime is already imported at the top of the file (line 20), so you can add datetime there too.
| from datetime import datetime |
| if model_settings.get('xai_include_x_search_output'): | ||
| include.append(chat_pb2.IncludeOption.INCLUDE_OPTION_X_SEARCH_CALL_OUTPUT) |
The existing test_xai_include_settings test (around line 4275 in test_xai.py) verifies all include options are correctly passed through. It needs to be updated to also cover xai_include_x_search_output and verify INCLUDE_OPTION_X_SEARCH_CALL_OUTPUT appears in the snapshot. Without this, the new include option lacks integration test coverage.
| class XSearchTool(AbstractBuiltinTool): | ||
| """A builtin tool that allows your agent to search X/Twitter for information. | ||
|
|
||
| This tool provides real-time access to X/Twitter posts, user profiles, and threads. |
The xAI docs describe x_search as searching X/Twitter posts — there's no mention of "user profiles" or "threads" as distinct search capabilities. This line overclaims what the tool does. I'd suggest simplifying it:
| This tool provides real-time access to X/Twitter posts, user profiles, and threads. | |
| This tool provides access to X/Twitter posts and content. |
Also, the WebSearchTool links to provider docs for individual parameters (e.g. see <https://docs.x.ai/docs/guides/tools/search-tools#web-search-parameters>). Please add a similar link to the xAI x_search docs in this docstring and in the per-parameter docstrings below, so users can find authoritative details without us needing to keep our docs in sync with theirs.
| # from_date/to_date are normalized to datetime in XSearchTool.__post_init__ | ||
| from_date = builtin_tool.from_date if isinstance(builtin_tool.from_date, datetime) else None | ||
| to_date = builtin_tool.to_date if isinstance(builtin_tool.to_date, datetime) else None |
Since XSearchTool.__post_init__ already normalizes date → datetime, these isinstance checks are always true when the value is not None. You can simplify to:
| # from_date/to_date are normalized to datetime in XSearchTool.__post_init__ | |
| from_date = builtin_tool.from_date if isinstance(builtin_tool.from_date, datetime) else None | |
| to_date = builtin_tool.to_date if isinstance(builtin_tool.to_date, datetime) else None | |
| # from_date/to_date are normalized to datetime in XSearchTool.__post_init__ | |
| from_date = builtin_tool.from_date | |
| to_date = builtin_tool.to_date |
The comment is already there explaining the normalization, so the redundant runtime check is just noise.
docs/builtin-tools.md
Outdated
|
|
||
| ## X Search Tool | ||
|
|
||
| The [`XSearchTool`][pydantic_ai.builtin_tools.XSearchTool] allows your agent to search X/Twitter for real-time posts and content. This tool is exclusive to xAI models. |
Please add a link to the official xAI x_search documentation here, similar to how other tool sections link to provider docs. Something like:
The [`XSearchTool`][pydantic_ai.builtin_tools.XSearchTool] allows your agent to search X/Twitter for real-time posts and content. This tool is exclusive to xAI models. See the [xAI X Search documentation](https://docs.x.ai/developers/tools/x-search) for more details.
This follows the project convention of linking to provider docs for features rather than re-explaining them, and prevents the docs from going stale when xAI updates their API.
docs/builtin-tools.md
Outdated
| allowed_x_handles=['OpenAI', 'AnthropicAI', 'xaboratory'], # Only search posts from these handles (max 10) | ||
| from_date=datetime(2024, 1, 1), # Filter posts from this date | ||
| to_date=datetime(2024, 12, 31), # Filter posts until this date | ||
| enable_image_understanding=True, # Enable image analysis | ||
| enable_video_understanding=True, # Enable video analysis |
These inline comments just restate what the parameter names already make obvious, and there's a full parameter reference table right below this example. Per the project's docs guidelines, examples should be stripped of boilerplate and focus on the feature being demonstrated. I'd drop all the inline comments:
| allowed_x_handles=['OpenAI', 'AnthropicAI', 'xaboratory'], # Only search posts from these handles (max 10) | |
| from_date=datetime(2024, 1, 1), # Filter posts from this date | |
| to_date=datetime(2024, 12, 31), # Filter posts until this date | |
| enable_image_understanding=True, # Enable image analysis | |
| enable_video_understanding=True, # Enable video analysis | |
| allowed_x_handles=['OpenAI', 'AnthropicAI', 'xaboratory'], | |
| from_date=datetime(2024, 1, 1), | |
| to_date=datetime(2024, 12, 31), | |
| enable_image_understanding=True, | |
| enable_video_understanding=True, |
| [ | ||
| ModelRequest( | ||
| parts=[ | ||
| UserPromptPart( |
There's a test_xai_builtin_web_search_tool_stream test for the web search streaming path, but no equivalent streaming test for x_search. Since the streaming code path shares some but not all logic with the non-streaming path, a test_xai_builtin_x_search_tool_stream test would be valuable to ensure the streaming response processing also handles x_search tool calls correctly.
Looking forward to this merging. Great work, @colesmcintosh!
| # Normalize date to datetime for downstream consumers | ||
| if isinstance(self.from_date, date) and not isinstance(self.from_date, datetime): | ||
| self.from_date = datetime(self.from_date.year, self.from_date.month, self.from_date.day) | ||
| if isinstance(self.to_date, date) and not isinstance(self.to_date, datetime): | ||
| self.to_date = datetime(self.to_date.year, self.to_date.month, self.to_date.day) |
📝 Info: SDK x_search() correctly handles naive datetimes as UTC
The docs example at line 193 creates naive datetime(2024, 1, 1) objects for from_date/to_date. The xAI SDK's x_search() function converts these via Timestamp.FromDatetime(), which treats naive datetimes as UTC. The test snapshot at tests/models/test_xai.py:2175-2176 confirms serialization as '2024-01-01T00:00:00Z'. This is correct behavior but worth noting: users who intend a specific timezone should pass timezone-aware datetimes. The docstring doesn't mention this UTC assumption.
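To illustrate the UTC assumption using only the standard library: the helper below is a hypothetical mirror of the behavior described above (naive values taken as UTC), not the SDK's code, and the fixed UTC-5 offset is just a stand-in for a user's local zone.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2024, 1, 1)
explicit_utc = datetime(2024, 1, 1, tzinfo=timezone.utc)
# Midnight in a fixed UTC-5 zone, standing in for a user's local time.
local_midnight = datetime(2024, 1, 1, tzinfo=timezone(timedelta(hours=-5)))


def as_utc_assuming_naive_is_utc(value: datetime) -> datetime:
    """Hypothetical helper: treat naive datetimes as UTC, convert aware ones."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)
```

Under this assumption the naive value and the explicit UTC value are the same instant, while the UTC-5 midnight lands five hours later in UTC. Users who mean local midnight therefore need a timezone-aware value.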
| @dataclass(kw_only=True) | ||
| class XSearchTool(AbstractBuiltinTool): | ||
| """A builtin tool that allows your agent to search X/Twitter for information. | ||
|
|
||
| This tool provides real-time access to X/Twitter posts, user profiles, and threads. | ||
|
|
||
| This tool is exclusive to xAI models. | ||
|
|
||
| Supported by: | ||
|
|
||
| * xAI | ||
| """ | ||
|
|
||
| allowed_x_handles: list[str] | None = None | ||
| """If provided, only posts from these X handles will be included (max 10).""" | ||
|
|
||
| excluded_x_handles: list[str] | None = None | ||
| """If provided, posts from these X handles will be excluded (max 10).""" | ||
|
|
||
| from_date: datetime | date | None = None | ||
| """Date filter for start date.""" | ||
|
|
||
| to_date: datetime | date | None = None | ||
| """Date filter for end date.""" | ||
|
|
||
| enable_image_understanding: bool = False | ||
| """Enable image analysis from X posts.""" | ||
|
|
||
| enable_video_understanding: bool = False | ||
| """Enable video analysis from X content.""" | ||
|
|
||
| kind: str = 'x_search' | ||
| """The kind of tool.""" | ||
|
|
||
| def __post_init__(self) -> None: | ||
| if self.allowed_x_handles is not None and self.excluded_x_handles is not None: | ||
| raise ValueError('Cannot specify both allowed_x_handles and excluded_x_handles') | ||
| if self.allowed_x_handles and len(self.allowed_x_handles) > 10: | ||
| raise ValueError('allowed_x_handles cannot contain more than 10 handles') | ||
| if self.excluded_x_handles and len(self.excluded_x_handles) > 10: | ||
| raise ValueError('excluded_x_handles cannot contain more than 10 handles') | ||
| # Normalize date to datetime for downstream consumers | ||
| if isinstance(self.from_date, date) and not isinstance(self.from_date, datetime): | ||
| self.from_date = datetime(self.from_date.year, self.from_date.month, self.from_date.day) | ||
| if isinstance(self.to_date, date) and not isinstance(self.to_date, datetime): | ||
| self.to_date = datetime(self.to_date.year, self.to_date.month, self.to_date.day) |
🚩 XSearchTool now exposed in CLI without xAI-only guard
Adding XSearchTool to SUPPORTED_BUILTIN_TOOLS (via BUILTIN_TOOL_TYPES auto-registration at builtin_tools.py:70) means it also appears in SUPPORTED_CLI_TOOL_IDS (at _cli/__init__.py:59-61), since XSearchTool is not in BUILTIN_TOOLS_REQUIRING_CONFIG. This means --builtin-tool x_search will be offered to CLI users regardless of what model they select.
This is not a bug because the model-level validation at models/__init__.py:766-774 correctly rejects unsupported builtin tools with a clear UserError. The user will see something like Builtin tool(s) ['XSearchTool'] not supported by this model. However, this differs from tools like MCPServerTool which are excluded from the CLI because they require configuration. Since XSearchTool is xAI-exclusive, it may be worth considering whether it should also be excluded from the generic CLI tool list to avoid user confusion.
| print(result.output) | ||
| """ | ||
| OpenAI announced their latest model updates, while Anthropic shared research on AI safety... | ||
| """ |
This was flagged in an earlier review and is still unresolved: this is a single-line output, so it should use the #> format for consistency with the first example and other doc examples:
print(result.output)
#> OpenAI announced their latest model updates, while Anthropic shared research on AI safety...
This may not be true since these are auto-generated, and the line was long enough to make our logic wrap it.
| |----------|-----------|-------| | ||
| | xAI | ✅ | Full feature support including date filtering and handle filtering. | | ||
| | All other providers | ❌ | Not supported | | ||
|
|
This provider support table is redundant — the sentence right above already says "This tool is exclusive to xAI models." A 2-row table with one "Not supported" row for "All other providers" doesn't add information. I'd remove the table entirely and let the prose + the "Supported by" section in the docstring do the work, similar to how other single-provider features are documented.
| excluded_x_handles: list[str] | None = None | ||
| """If provided, posts from these X handles will be excluded (max 10).""" | ||
|
|
||
| from_date: datetime | date | None = None |
Accepting date here adds complexity (the __post_init__ normalization, the cast in xai.py) without much benefit. The xAI SDK's x_search() function only accepts datetime, and the normalization creates naive datetimes (no timezone) which has subtle implications for timezone-aware users.
I'd simplify to just datetime | None — this matches the SDK's actual type and avoids the normalization logic. Users who have a date can do datetime(d.year, d.month, d.day) themselves, which makes the timezone choice explicit. This was the direction @DouweM indicated in his earlier review too ("Just datetime/date!" — I read "date" there as the module name, not datetime.date).
Heh, I did mean "datetime or date" there, but let's just support what the X API does.
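If the fields end up as `datetime | None`, a caller holding a `date` can convert it in one line while stating the timezone explicitly. `start_of_day` below is a hypothetical user-side helper, not part of this PR, and defaulting to UTC is an assumption made for the sketch.

```python
from datetime import date, datetime, time, timezone, tzinfo


def start_of_day(d: date, tz: tzinfo = timezone.utc) -> datetime:
    """Hypothetical helper: midnight of `d` in an explicitly chosen timezone."""
    return datetime.combine(d, time.min, tzinfo=tz)
```

For example, `start_of_day(date(2024, 1, 1))` gives a timezone-aware midnight in UTC that can be passed as `from_date`.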
| # from_date/to_date are normalized from date to datetime in XSearchTool.__post_init__ | ||
| tools.append( | ||
| x_search( | ||
| from_date=cast(datetime | None, builtin_tool.from_date), |
The cast(datetime | None, ...) calls are needed because from_date/to_date are typed as datetime | date | None but __post_init__ normalizes date to datetime. If you simplify the field types to just datetime | None (as suggested on builtin_tools.py), these casts become unnecessary.
tests/models/test_xai.py
Outdated
| final_response = response | ||
|
|
||
| assert final_response is not None | ||
| builtin_calls = [p for p in final_response.parts if isinstance(p, BuiltinToolCallPart)] |
This streaming test uses manual field assertions instead of a snapshot() on result.all_messages(). The non-streaming test_xai_builtin_x_search_tool test uses full snapshot assertions, and so does the existing test_xai_builtin_web_search_tool_stream test. Please use snapshot() on the full message structure here for consistency and to catch regressions in all fields (timestamps, usage, provider details, etc.).
@DouweM can I get a review on this pls
| from xai_sdk.chat import assistant, file, image, system, tool, tool_result, user | ||
| from xai_sdk.proto import chat_pb2, sample_pb2, usage_pb2 | ||
| from xai_sdk.tools import code_execution, get_tool_call_type, mcp, web_search # x_search not yet supported | ||
| from xai_sdk.tools import code_execution, get_tool_call_type, mcp, web_search, x_search |
🚩 x_search import assumes SDK version with x_search support
The import at pydantic_ai_slim/pydantic_ai/models/xai.py:62 adds x_search to the from xai_sdk.tools import ... line. The previous code had a comment # x_search not yet supported, implying the SDK already exported this function but pydantic-ai hadn't integrated it. If users have an older version of xai-sdk installed that doesn't export x_search, the entire xai module will fail to import with a confusing error about xai-sdk not being installed. This is the standard pattern for this module though (all SDK imports are in a single try/except block).
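The failure mode can be reproduced with the standard library alone. In this sketch `json` stands in for an installed SDK that lacks the newer name, and the error message is a placeholder; the real module's wording may differ.

```python
def import_sdk_tools():
    """Mimic a single try/except guarding every SDK import."""
    try:
        # `json` is installed, but it exports no `x_search`, standing in for
        # an older SDK release that predates the function.
        from json import x_search  # type: ignore  # noqa: F401
    except ImportError as exc:
        # Hypothetical message; it blames a missing package even though the
        # package is present and only one name is missing.
        raise ImportError('Please install `some-sdk` to use this model') from exc
    return x_search
```

Calling `import_sdk_tools()` raises the "please install" error, and the real reason (the missing `x_search` name) is only visible on the chained `__cause__`.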
@DouweM this is ready for review
| """ | ||
|
|
||
| effort: Literal['high', 'medium', 'low'] | ||
| effort: Literal['xhigh', 'high', 'medium', 'low', 'minimal', 'none'] |
🚩 OpenRouterReasoning effort type expanded beyond what OpenRouter may support
The effort field type on OpenRouterReasoning at pydantic_ai_slim/pydantic_ai/models/openrouter.py:201 was changed from Literal['high', 'medium', 'low'] to Literal['xhigh', 'high', 'medium', 'low', 'minimal', 'none']. While the unified thinking mapping at pydantic_ai_slim/pydantic_ai/models/openrouter.py:539-545 only produces 'low', 'medium', 'high', users who explicitly set openrouter_reasoning={'effort': 'xhigh'} or 'none' would now pass type checking but may get API errors from OpenRouter if those values aren't supported. This is borderline — it could be intentional to match the full ThinkingLevel range — but it's worth verifying against OpenRouter's actual API.
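One way to list which values need checking is to diff the widened `Literal` against the set the API is known to accept. The accepted set below is an assumption (it reuses the previous `Literal` values from this diff), not a statement about what OpenRouter supports today.

```python
from typing import Literal, get_args

Effort = Literal['xhigh', 'high', 'medium', 'low', 'minimal', 'none']

# ASSUMPTION: placeholder for the values confirmed against the provider's API.
# It reuses the previous Literal from this diff and must be verified.
ASSUMED_API_EFFORTS = frozenset({'high', 'medium', 'low'})


def efforts_needing_verification() -> list[str]:
    """Return Literal members that type-check but are not in the assumed set."""
    return [effort for effort in get_args(Effort) if effort not in ASSUMED_API_EFFORTS]
```

Any value this returns would pass type checking but could still be rejected at request time if the provider does not accept it.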
2d52b7d to 673ab81
Add the xAI-only XSearchTool built-in integration, including request mapping, docs, and test coverage. This keeps the PR focused on the X/Twitter search feature without the extra branch history.
673ab81 to bd5ba31
Summary
- Add `XSearchTool` as a new built-in tool for searching X/Twitter posts with xAI models
- Handle filtering (`allowed_x_handles`/`excluded_x_handles`, max 10 each)
- Date range filtering (`from_date`/`to_date` as datetime or ISO8601 string)
- Image and video understanding (`enable_image_understanding`/`enable_video_understanding`)
- `xai_include_x_search_output` model setting

Closes #3896