Add Wikipedia Omi integration app #7413
Added a small follow-up commit (9786102) to clean MediaWiki search snippets more generally before returning them in tool responses. Validation rerun:
Greptile Summary
This PR adds a new standalone Wikipedia integration plugin for Omi, exposing three no-auth chat tools (article search, article summary, and random article) as a FastAPI service deployable to Railway.
Confidence Score: 3/5
The app is straightforward and the core logic is sound, but a type-coercion gap in _safe_limit will cause unhandled 500 errors whenever the caller passes limit as a string rather than an integer. The _safe_limit function assumes its input is already an int or None, but the endpoint accepts a raw dict payload where limit could arrive as a string. The call to min(limit, MAX_LIMIT) then raises an uncaught TypeError before the try/except httpx.HTTPError block, returning a 500 instead of a graceful error response. plugins/omi-wikipedia-app/main.py warrants a close look, specifically _safe_limit, the snippet HTML-stripping logic, and the tool manifest parameter schemas.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Omi as Omi Platform
participant App as Wikipedia App
participant Wiki as Wikipedia API
Omi->>App: GET /.well-known/omi-tools.json
App-->>Omi: Tool manifest (3 tools)
Note over Omi,App: Tool invocation
Omi->>App: POST /tools/search_articles
App->>Wiki: "GET /w/api.php?action=query&list=search"
Wiki-->>App: Search results (with HTML snippets)
App-->>Omi: "ChatToolResponse {result: formatted list}"
Omi->>App: POST /tools/get_article_summary
App->>Wiki: "GET /api/rest_v1/page/summary/{title}"
Wiki-->>App: Summary JSON
App-->>Omi: "ChatToolResponse {result: formatted summary}"
Omi->>App: POST /tools/get_random_article
App->>Wiki: "GET /w/api.php?action=query&list=random"
Wiki-->>App: Random title
App->>Wiki: "GET /api/rest_v1/page/summary/{title}"
Wiki-->>App: Summary JSON
App-->>Omi: "ChatToolResponse {result: random article summary}"
Reviews (1): Last reviewed commit: "Clean Wikipedia search snippets"
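The two-step random-article flow in the sequence diagram above can be sketched as follows. This is a minimal illustration, not the code in main.py: the function name, the injected fetch_json callable, and the output format are hypothetical, and the host pattern https://{language}.wikipedia.org is an assumption. Only the two request paths come from the diagram.

```python
from typing import Any, Callable, Dict, Optional
from urllib.parse import quote

# Hypothetical fetcher signature: (url, query params or None) -> decoded JSON.
FetchJson = Callable[[str, Optional[Dict[str, Any]]], Dict[str, Any]]


def random_article_summary(fetch_json: FetchJson, language: str = "en") -> str:
    """Pick a random title, then fetch and format its summary."""
    # Assumed host pattern; a real service should validate `language` first.
    base = f"https://{language}.wikipedia.org"

    # Step 1: ask the Action API for one random main-namespace page.
    listing = fetch_json(
        f"{base}/w/api.php",
        {
            "action": "query",
            "list": "random",
            "rnnamespace": 0,
            "rnlimit": 1,
            "format": "json",
        },
    )
    title = listing["query"]["random"][0]["title"]

    # Step 2: fetch the REST summary for that title (spaces become underscores).
    path_title = quote(title.replace(" ", "_"), safe="")
    summary = fetch_json(f"{base}/api/rest_v1/page/summary/{path_title}", None)

    extract = summary.get("extract") or "No summary available."
    return f"{summary.get('title', title)}: {extract}"
```

Injecting the fetcher keeps the sketch testable without network access; in the app the same two calls would presumably go through httpx inside the existing try/except block.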
def _safe_limit(limit: Optional[int]) -> int:
    if limit is None:
_safe_limit is called before the try block in search_articles, so a non-integer limit value (e.g., {"limit": "5"} sent by an LLM) will cause min("5", 10) to raise TypeError: '<' not supported between instances of 'int' and 'str'. Because the exception occurs outside the except httpx.HTTPError handler, FastAPI will return an unhandled 500 instead of a graceful ChatToolResponse(error=...). Add an int() coercion with a fallback so that any non-numeric value is treated as the default.
- def _safe_limit(limit: Optional[int]) -> int:
-     if limit is None:
+ def _safe_limit(limit: Optional[int]) -> int:
+     if limit is None:
+         return 5
+     try:
+         limit = int(limit)
+     except (TypeError, ValueError):
+         return 5
+     return max(1, min(limit, MAX_LIMIT))
"action": "query",
"list": "search",
"srsearch": query,
"srlimit": limit,
Wikipedia's search API returns snippets that can contain arbitrary HTML entities (&amp;amp;, &amp;lt;, &amp;quot;, etc.) in addition to the <span class="searchmatch"> tags. The current stripping only removes those specific span tags, so entities like &amp;amp; will appear literally in the tool output returned to the user. Using html.unescape after the tag removal resolves this cleanly.
-     "srlimit": limit,
+ import html
+ snippet = html.unescape(
+     (item.get("snippet") or "").replace("<span class=\"searchmatch\">", "").replace("</span>", "")
+ )
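A more general cleanup, along the lines of the follow-up commit that cleans snippets "more generally", could strip any tag rather than only the searchmatch span. The sketch below is an illustration under that assumption; clean_snippet and _TAG_RE are hypothetical names, and the regex is a simple tag stripper, not a full HTML parser.

```python
import html
import re
from typing import Optional

# Matches any run that looks like an HTML tag, e.g. <span class="searchmatch">.
_TAG_RE = re.compile(r"<[^>]+>")


def clean_snippet(snippet: Optional[str]) -> str:
    """Strip tags, decode HTML entities, and collapse whitespace."""
    text = _TAG_RE.sub("", snippet or "")
    # Decode entities only after tag removal, so an escaped "&lt;b&gt;"
    # survives as literal text instead of being stripped as a tag.
    text = html.unescape(text)
    return " ".join(text.split())


if __name__ == "__main__":
    raw = '<span class="searchmatch">Tom</span> &amp; Jerry &quot;cartoon&quot;'
    print(clean_snippet(raw))
```

Running the tag removal before html.unescape matches the order in the suggestion above.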
    return {"status": "ok"}


@app.get("/.well-known/omi-tools.json")
async def get_omi_tools_manifest():
    return {
        "tools": [
            {
                "name": "search_articles",
                "description": "Search Wikipedia articles by keyword. Use this when the user asks about a topic, person, place, event, concept, or wants matching encyclopedia articles.",
                "endpoint": "/tools/search_articles",
                "method": "POST",
                "parameters": {
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Search query, such as a topic, person, place, event, or concept.",
The parameters objects in the manifest are missing the "type": "object" field required by JSON Schema. Without it, strict parsers may reject the schema, and the Omi platform or toolchain may fail to validate or display the tool parameters correctly.
-     return {"status": "ok"}
- @app.get("/.well-known/omi-tools.json")
- async def get_omi_tools_manifest():
-     return {
-         "tools": [
-             {
-                 "name": "search_articles",
-                 "description": "Search Wikipedia articles by keyword. Use this when the user asks about a topic, person, place, event, concept, or wants matching encyclopedia articles.",
-                 "endpoint": "/tools/search_articles",
-                 "method": "POST",
-                 "parameters": {
-                     "properties": {
-                         "query": {
-                             "type": "string",
-                             "description": "Search query, such as a topic, person, place, event, or concept.",
+                 "parameters": {
+                     "type": "object",
+                     "properties": {
+                         "query": {
+                             "type": "string",
+                             "description": "Search query, such as a topic, person, place, event, or concept.",
+                         },
+                         "language": {
+                             "type": "string",
+                             "description": "Wikipedia language code. Defaults to en.",
+                         },
+                         "limit": {
+                             "type": "integer",
+                             "description": "Maximum results to return. Defaults to 5, maximum 10.",
+                         },
+                     },
+                     "required": ["query"],
+                 },
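A quick way to catch this class of omission is a small structural check over the manifest dict. The sketch below is hypothetical (find_schema_problems is not part of the PR) and only checks the two things discussed here: that each parameters object declares "type": "object" and that every required name appears under properties. It assumes tools without a parameters key are acceptable.

```python
from typing import Any, Dict, List


def find_schema_problems(manifest: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in the tool parameter schemas."""
    problems: List[str] = []
    for tool in manifest.get("tools", []):
        name = tool.get("name", "<unnamed>")
        params = tool.get("parameters")
        if params is None:
            continue  # assumption: a tool may declare no parameters at all
        if params.get("type") != "object":
            problems.append(f'{name}: parameters lacks "type": "object"')
        properties = params.get("properties", {})
        for required in params.get("required", []):
            if required not in properties:
                problems.append(f"{name}: required field {required!r} is not in properties")
    return problems


if __name__ == "__main__":
    manifest = {
        "tools": [
            {
                "name": "search_articles",
                "parameters": {
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            }
        ]
    }
    for problem in find_schema_problems(manifest):
        print(problem)
```

The same function could be pointed at the JSON returned by GET /.well-known/omi-tools.json in a TestClient-based check.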
Addressed the latest Greptile feedback in commit 398eafc:

- made _safe_limit coerce string values like "2" instead of raising TypeError
- kept invalid/empty limits graceful by falling back to the default
- added type: object to all tool manifest parameter schemas

Validation rerun:
- python -m py_compile plugins/omi-wikipedia-app/main.py
- git diff --check
- FastAPI TestClient manifest checks
- explicit _safe_limit checks for string, zero, oversized, invalid, and empty inputs
- live search_articles smoke check with limit passed as a string
Summary
Validation
Refs #3120