Add Wikipedia Omi integration app#7413

Open
absalonCRC wants to merge 3 commits into BasedHardware:main from absalonCRC:omi-wikipedia-app

@absalonCRC

Summary

  • Add a standalone no-auth Wikipedia integration app under plugins/omi-wikipedia-app.
  • Expose Omi chat tools for article search, exact-title summaries, and random article discovery.
  • Include README, requirements, Procfile, Railway config, and runtime metadata for deployment.

Validation

  • /opt/homebrew/bin/python3.11 -m py_compile plugins/omi-wikipedia-app/main.py
  • git diff --check
  • /tmp/omi-wiki-venv311/bin/python -m pip install -r plugins/omi-wikipedia-app/requirements.txt
  • FastAPI TestClient check for /health and /.well-known/omi-tools.json
  • Live tool smoke checks: search_articles, get_article_summary, get_random_article
  • gitleaks detect --source plugins/omi-wikipedia-app --no-git --redact --verbose

Refs #3120

@absalonCRC
Author

Added a small follow-up commit (9786102) to clean MediaWiki search snippets more generally before returning them in tool responses. Validation rerun:

  • python -m py_compile plugins/omi-wikipedia-app/main.py
  • git diff --check
  • manifest + snippet cleanup check through FastAPI TestClient
  • live search_articles smoke check
  • gitleaks detect --source plugins/omi-wikipedia-app --no-git --redact --verbose

@greptile-apps
Contributor

greptile-apps Bot commented May 20, 2026

Greptile Summary

This PR adds a new standalone Wikipedia integration plugin for Omi, exposing three no-auth chat tools (article search, article summary, and random article) as a FastAPI service deployable to Railway.

  • main.py implements the three /tools/* POST endpoints, a /.well-known/omi-tools.json manifest, health and root routes, and helper utilities for safe language/limit normalization and Wikipedia API calls via httpx.
  • Deployment files (Procfile, railway.toml, runtime.txt, requirements.txt) are complete and consistent with each other.

Confidence Score: 3/5

The app is straightforward and the core logic is sound, but a type-coercion gap in _safe_limit will cause unhandled 500 errors whenever the caller passes limit as a string rather than an integer.

The _safe_limit function assumes its input is already an int or None, but the endpoint accepts a raw dict payload where limit could arrive as a string. The call to min(limit, MAX_LIMIT) then raises an uncaught TypeError before the try/except httpx.HTTPError block, returning a 500 instead of a graceful error response.
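
The failure mode described above can be reproduced without FastAPI or httpx. The sketch below is an assumption-laden stand-in, not the PR's code: `unsafe_limit`, `handle`, and `UpstreamError` are hypothetical names, `UpstreamError` substitutes for `httpx.HTTPError`, and `MAX_LIMIT = 10` is inferred from the manifest text ("maximum 10").

```python
from typing import Optional

MAX_LIMIT = 10  # assumed from the manifest description "maximum 10"


class UpstreamError(Exception):
    """Hypothetical stand-in for httpx.HTTPError (keeps the sketch stdlib-only)."""


def unsafe_limit(limit: Optional[int]) -> int:
    # No coercion: a string such as "5" reaches min() unchanged.
    if limit is None:
        return 5
    return max(1, min(limit, MAX_LIMIT))


def handle(payload: dict) -> dict:
    # The limit is normalized BEFORE the try block, so a TypeError raised
    # here is not covered by the except clause below.
    limit = unsafe_limit(payload.get("limit"))
    try:
        # The real handler would call the Wikipedia API at this point.
        return {"result": f"would fetch {limit} results"}
    except UpstreamError as exc:
        return {"error": str(exc)}
```

Calling `handle({"limit": "5"})` lets the `TypeError` escape the handler, which is the situation a web framework would turn into a 500 response.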

plugins/omi-wikipedia-app/main.py warrants a close look, specifically _safe_limit, the snippet HTML-stripping logic, and the tool manifest parameter schemas.

Important Files Changed

Filename | Overview
plugins/omi-wikipedia-app/main.py | Core FastAPI app implementing three Wikipedia chat tools; has a type-coercion gap in _safe_limit that causes unhandled 500s when limit arrives as a string, incomplete HTML entity decoding in search snippets, and missing "type":"object" in the manifest parameter schemas.
plugins/omi-wikipedia-app/requirements.txt | Pins fastapi, uvicorn, pydantic, and httpx to specific versions; correct and complete.
plugins/omi-wikipedia-app/Procfile | Standard Procfile using uvicorn with $PORT; correct for Railway/Heroku deployment.
plugins/omi-wikipedia-app/railway.toml | Railway deployment config with Nixpacks builder and ON_FAILURE restart policy; no issues.
plugins/omi-wikipedia-app/runtime.txt | Specifies Python 3.11.9 runtime; no issues.
plugins/omi-wikipedia-app/README.md | Clear README covering features, chat tools, local development, and deployment; no issues.

Sequence Diagram

sequenceDiagram
    participant Omi as Omi Platform
    participant App as Wikipedia App
    participant Wiki as Wikipedia API

    Omi->>App: GET /.well-known/omi-tools.json
    App-->>Omi: Tool manifest (3 tools)

    Note over Omi,App: Tool invocation

    Omi->>App: POST /tools/search_articles
    App->>Wiki: "GET /w/api.php?action=query&list=search"
    Wiki-->>App: Search results (with HTML snippets)
    App-->>Omi: "ChatToolResponse {result: formatted list}"

    Omi->>App: POST /tools/get_article_summary
    App->>Wiki: "GET /api/rest_v1/page/summary/{title}"
    Wiki-->>App: Summary JSON
    App-->>Omi: "ChatToolResponse {result: formatted summary}"

    Omi->>App: POST /tools/get_random_article
    App->>Wiki: "GET /w/api.php?action=query&list=random"
    Wiki-->>App: Random title
    App->>Wiki: "GET /api/rest_v1/page/summary/{title}"
    Wiki-->>App: Summary JSON
    App-->>Omi: "ChatToolResponse {result: random article summary}"
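
The two upstream requests in the diagram could be built along the following lines. This is a minimal sketch, not the PR's implementation: `safe_language`, `search_url`, and `summary_url` are hypothetical helpers, the language-code pattern is an assumption about what "safe language normalization" might mean, and `format=json` is added here although the diagram does not show it.

```python
import re
from urllib.parse import quote, urlencode

# Assumed shape of a Wikipedia subdomain code, e.g. "en", "simple", "zh-min-nan".
_LANG_RE = re.compile(r"[a-z]{2,12}(-[a-z]{1,12}){0,2}")


def safe_language(language) -> str:
    """Return a lowercase language code, falling back to 'en' if it looks unsafe."""
    code = str(language or "en").strip().lower()
    return code if _LANG_RE.fullmatch(code) else "en"


def search_url(query: str, language=None, limit: int = 5) -> str:
    """Build the MediaWiki Action API search URL (action=query, list=search)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": limit,
        "format": "json",
    }
    return f"https://{safe_language(language)}.wikipedia.org/w/api.php?{urlencode(params)}"


def summary_url(title: str, language=None) -> str:
    """Build the REST summary URL; the title is percent-encoded, including '/'."""
    encoded = quote(title.replace(" ", "_"), safe="")
    return f"https://{safe_language(language)}.wikipedia.org/api/rest_v1/page/summary/{encoded}"
```

Validating the language code before interpolating it into the hostname keeps a caller-supplied value such as `en.evil.example` from redirecting the request to another host.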

Reviews (1): Last reviewed commit: "Clean Wikipedia search snippets" | Re-trigger Greptile

Comment thread plugins/omi-wikipedia-app/main.py Outdated
Comment on lines +37 to +40


def _safe_limit(limit: Optional[int]) -> int:
    if limit is None:

P1 _safe_limit is called before the try block in search_articles, so a non-integer limit value (e.g., {"limit": "5"} sent by an LLM) will cause min("5", 10) to raise TypeError: '<' not supported between instances of 'int' and 'str'. Because the exception occurs outside the except httpx.HTTPError handler, FastAPI will return an unhandled 500 instead of a graceful ChatToolResponse(error=...). Add an int() coercion with a fallback so any non-numeric string is treated as the default.

Suggested change
def _safe_limit(limit: Optional[int]) -> int:
    if limit is None:
        return 5
    try:
        limit = int(limit)
    except (TypeError, ValueError):
        return 5
    return max(1, min(limit, MAX_LIMIT))

"action": "query",
"list": "search",
"srsearch": query,
"srlimit": limit,

P2 Wikipedia's search API returns snippets that can contain arbitrary HTML entities (&amp;, &lt;, &quot;, etc.) in addition to the <span class="searchmatch"> tags. The current stripping only removes those specific span tags, so entities like &amp; will appear literally in the tool output returned to the user. Using html.unescape after the tag removal resolves this cleanly.

Suggested change
"srlimit": limit,
import html
snippet = html.unescape(
    (item.get("snippet") or "").replace("<span class=\"searchmatch\">", "").replace("</span>", "")
)
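
A slightly more general cleanup than replacing the two span tags might look like the sketch below. It is an illustration under stated assumptions rather than the code in the PR: `clean_snippet` is a hypothetical helper, and the regex is a simple tag stripper that is only adequate for the short, well-formed snippets the search API returns, not for arbitrary HTML.

```python
import html
import re
from typing import Optional

# Matches any simple HTML tag such as <span class="searchmatch"> or </span>.
_TAG_RE = re.compile(r"<[^>]+>")


def clean_snippet(raw: Optional[str]) -> str:
    """Strip markup from a search snippet, then decode HTML entities."""
    # Remove tags first so that entities decoded afterwards (e.g. &lt;b&gt;)
    # stay as literal text instead of being mistaken for markup.
    text = _TAG_RE.sub("", raw or "")
    text = html.unescape(text)
    # Collapse runs of whitespace left behind by removed tags.
    return " ".join(text.split())
```

The order matters: decoding entities before stripping tags would turn escaped text like `&lt;b&gt;` into something the tag regex then deletes.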

Comment on lines +105 to +121
    return {"status": "ok"}


@app.get("/.well-known/omi-tools.json")
async def get_omi_tools_manifest():
    return {
        "tools": [
            {
                "name": "search_articles",
                "description": "Search Wikipedia articles by keyword. Use this when the user asks about a topic, person, place, event, concept, or wants matching encyclopedia articles.",
                "endpoint": "/tools/search_articles",
                "method": "POST",
                "parameters": {
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Search query, such as a topic, person, place, event, or concept.",

P2 The parameters objects in the manifest are missing the "type": "object" field required by JSON Schema. Without it, strict parsers may reject the schema, and the Omi platform or toolchain may fail to validate or display the tool parameters correctly.

Suggested change
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Search query, such as a topic, person, place, event, or concept.",
                        },
                        "language": {
                            "type": "string",
                            "description": "Wikipedia language code. Defaults to en.",
                        },
                        "limit": {
                            "type": "integer",
                            "description": "Maximum results to return. Defaults to 5, maximum 10.",
                        },
                    },
                    "required": ["query"],
                },
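
A small check along these lines could catch the missing field before deployment. It is a sketch only: `find_schema_problems` is a hypothetical helper, it assumes the manifest shape shown in this thread (a top-level "tools" list whose entries carry "name" and "parameters"), and it is not a full JSON Schema validator.

```python
from typing import List


def find_schema_problems(manifest: dict) -> List[str]:
    """Return human-readable problems found in each tool's parameter schema."""
    problems = []
    for tool in manifest.get("tools", []):
        name = tool.get("name", "<unnamed>")
        params = tool.get("parameters", {})
        # The review asks for an explicit "type": "object" on every schema.
        if params.get("type") != "object":
            problems.append(f"{name}: parameters.type should be 'object'")
        # Every required name should also be declared under "properties".
        properties = params.get("properties", {})
        for required in params.get("required", []):
            if required not in properties:
                problems.append(f"{name}: required parameter '{required}' is not declared")
    return problems
```

An empty list means every tool passed both checks; the same function could be pointed at the JSON returned by /.well-known/omi-tools.json in a test.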

@absalonCRC
Author

Addressed the latest Greptile feedback in commit 398eafc:

  • made _safe_limit coerce string values like "2" instead of raising TypeError
  • kept invalid/empty limits graceful by falling back to the default
  • added type: object to all tool manifest parameter schemas

Validation rerun:

  • python -m py_compile plugins/omi-wikipedia-app/main.py
  • git diff --check
  • FastAPI TestClient manifest checks
  • explicit _safe_limit checks for string, zero, oversized, invalid, and empty inputs
  • live search_articles smoke check with limit passed as a string
