Add Hacker News Omi integration app #7412
Greptile Summary
Adds a new standalone FastAPI plugin under
Confidence Score: 3/5
The app is functional for single-user / low-traffic use but will misbehave under concurrent load due to blocking network calls on the async event loop. Every route handler calls synchronous
plugins/omi-hacker-news-app/main.py: the route handlers and
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Omi as Omi Client
participant App as FastAPI App
participant Algolia as HN Algolia API
Omi->>App: GET /.well-known/omi-tools.json
App-->>Omi: tools manifest (3 tools)
Omi->>App: "POST /tools/get_front_page {limit}"
App->>Algolia: "GET /search?tags=front_page&hitsPerPage=N"
Algolia-->>App: "{hits: [...]}"
App-->>Omi: "{result: "Current HN front page..."}"
Omi->>App: "POST /tools/search_stories {query, sort_by, limit}"
alt "sort_by == date"
App->>Algolia: "GET /search_by_date?query=...&tags=story"
else relevance (default)
App->>Algolia: "GET /search?query=...&tags=story"
end
Algolia-->>App: "{hits: [...]}"
App-->>Omi: "{result: "HN stories for query"}"
Omi->>App: "POST /tools/get_discussion {item_id, comment_limit}"
App->>Algolia: "GET /items/{item_id}"
Algolia-->>App: "{title, author, children: [...]}"
App-->>Omi: "{result: "title by author points comments"}"
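The endpoint choice in the search_stories branch of the diagram can be sketched as follows. This is a minimal illustration, not the plugin's code: build_search_url is a hypothetical helper, the hitsPerPage parameter on search calls is an assumption carried over from the front-page request, and the base URL is the public HN Algolia API root.

```python
from urllib.parse import urlencode

# Public HN Algolia API root; the plugin may configure this differently.
HN_API_BASE = "https://hn.algolia.com/api/v1"


def build_search_url(query: str, sort_by: str = "relevance", limit: int = 5) -> str:
    """Hypothetical helper: pick the endpoint the way the diagram describes."""
    # "date" goes to /search_by_date; anything else falls back to relevance search.
    endpoint = "/search_by_date" if sort_by == "date" else "/search"
    # hitsPerPage is an assumption here; the diagram only shows query and tags.
    params = {"query": query, "tags": "story", "hitsPerPage": limit}
    return f"{HN_API_BASE}{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("rust", sort_by="date", limit=3))
    print(build_search_url("rust"))
```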
Reviews (1): Last reviewed commit: "Improve Hacker News text cleanup"
        },
    ]
}


@app.post("/tools/get_front_page", tags=["chat_tools"], response_model=ChatToolResponse)
async def get_front_page(payload: dict[str, Any]):
    try:
        limit = _safe_limit(payload.get("limit"))
        data = _request_json("/search", {"tags": "front_page", "hitsPerPage": limit})
        hits = data.get("hits", [])[:limit]

        if not hits:
Blocking synchronous I/O inside async route handlers
All three route handlers (get_front_page, search_stories, get_discussion) call _request_json(), which uses the synchronous requests.get(). Inside async def FastAPI handlers that run on the asyncio event loop, a blocking network call stalls the entire event loop until the external HTTP response arrives. Under any concurrent load, every in-flight request queues behind the HN API call, eliminating the concurrency benefit of async. Swap to httpx.AsyncClient with await client.get(...), or wrap the call with await asyncio.to_thread(_request_json, ...) at the minimum.
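As a rough illustration of the minimal fix mentioned above, the sketch below offloads a blocking call with asyncio.to_thread. It is not the plugin's code: request_json_blocking is a hypothetical stand-in for _request_json, and time.sleep simulates the latency of the HN API call so the example runs without network access.

```python
import asyncio
import time
from typing import Any


def request_json_blocking(path: str, params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for _request_json; sleep simulates network latency."""
    time.sleep(0.2)
    count = int(params.get("hitsPerPage", 1))
    return {"path": path, "hits": [{"title": "example"}] * count}


async def fetch_front_page(limit: int) -> dict[str, Any]:
    # asyncio.to_thread runs the blocking function in a worker thread,
    # so the event loop stays free to serve other requests in the meantime.
    return await asyncio.to_thread(
        request_json_blocking, "/search", {"tags": "front_page", "hitsPerPage": limit}
    )


async def main() -> tuple[list[dict[str, Any]], float]:
    start = time.perf_counter()
    # Three concurrent calls overlap instead of queueing behind each other.
    results = await asyncio.gather(*(fetch_front_page(n) for n in (1, 2, 3)))
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(f"{len(results)} calls finished in {elapsed:.2f}s")
```

Calling request_json_blocking directly inside the coroutines would run the three simulated requests one after another. A native async client such as httpx.AsyncClient avoids the thread hop entirely, at the cost of an extra dependency.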
    text = unescape(value)
    text = (
        text.replace("<p>", "\n")
        .replace("<pre>", "\n")
        .replace("<code>", "`")
        .replace("</code>", "`")
    )
    text = re.sub(r"<[^>]+>", "", text)
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
Incomplete HTML stripping leaves raw tags in tool output
The HN Algolia API returns comments with a range of HTML elements beyond the 8 replaced here — most commonly <a href="...">…</a>, <br>, <b>, <strong>, and <span>. These pass through _clean_text unchanged, so tool responses will contain literal HTML markup. A simple re.sub(r"<[^>]+>", "", text) after handling structural tags would cover the general case.
-    text = unescape(value)
-    text = (
-        text.replace("<p>", "\n")
-        .replace("<pre>", "\n")
-        .replace("<code>", "`")
-        .replace("</code>", "`")
-    )
-    text = re.sub(r"<[^>]+>", "", text)
-    text = re.sub(r"[ \t]+", " ", text)
-    text = re.sub(r"\n{3,}", "\n\n", text)
-    return text.strip()
+    import re
+    text = unescape(value)
+    text = (
+        text.replace("<p>", "\n")
+        .replace("</p>", "")
+        .replace("<br>", "\n")
+        .replace("<br/>", "\n")
+        .replace("<br />", "\n")
+        .replace("<pre>", "\n")
+        .replace("</pre>", "")
+        .replace("<code>", "`")
+        .replace("</code>", "`")
+        .replace("<i>", "")
+        .replace("</i>", "")
+        .replace("<b>", "")
+        .replace("</b>", "")
+        .replace("<strong>", "")
+        .replace("</strong>", "")
+        .replace("<em>", "")
+        .replace("</em>", "")
+    )
+    text = re.sub(r"<[^>]+>", "", text)
+    return text.strip()
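A more compact variant of the same idea, offered only as a sketch: map structural tags to newlines or backticks with a few regular expressions, strip whatever tags remain, and decode entities last. clean_text is a hypothetical stand-in for _clean_text, the set of tags treated as line breaks is an assumption, and decoding entities after stripping is a choice made here (so escaped text such as &lt;b&gt; survives as literal text), not something the PR states.

```python
import re
from html import unescape
from typing import Optional

# Tags that become a line break instead of vanishing (assumed set).
_BREAK_TAGS = re.compile(r"<\s*(?:p|br|pre|li)\b[^>]*>", re.IGNORECASE)
# Opening and closing <code> tags both become a backtick.
_CODE_TAGS = re.compile(r"</?\s*code\b[^>]*>", re.IGNORECASE)
# Everything else: <a href=...>, <i>, <span>, closing tags, and so on.
_ANY_TAG = re.compile(r"<[^>]+>")


def clean_text(value: Optional[str]) -> str:
    """Hypothetical stand-in for _clean_text: HTML fragment to plain text."""
    if not value:
        return ""
    text = _BREAK_TAGS.sub("\n", value)
    text = _CODE_TAGS.sub("`", text)
    text = _ANY_TAG.sub("", text)
    # Decode entities only after tags are gone, so escaped markup stays as text.
    text = unescape(text)
    text = re.sub(r"[ \t]+", " ", text)      # collapse runs of spaces and tabs
    text = re.sub(r" *\n *", "\n", text)     # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)   # cap consecutive blank lines
    return text.strip()


if __name__ == "__main__":
    sample = 'See <a href="https://example.com">this</a><p>Use <code>x &lt; y</code>'
    print(clean_text(sample))
```

Regex-based stripping is adequate for the small, predictable tag set in HN comments; arbitrary HTML would call for a real parser such as the standard library's html.parser.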
        title = item.get("title") or "(untitled)"
        author = item.get("author") or "unknown"
        points = item.get("points") or 0
        url = item.get("url") or f"https://news.ycombinator.com/item?id={item_id}"
comment_limit: 0 is silently coerced to 5 instead of 1
The expression payload.get("comment_limit") or 5 treats a caller-supplied 0 as falsy and substitutes 5, which is inconsistent with get_front_page and search_stories where _safe_limit(0) would correctly clamp to 1. Using payload.get("comment_limit") alone and letting _safe_limit handle the None case would be consistent.
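The pitfall is easy to reproduce in isolation. In the sketch below, clamp_limit is a hypothetical stand-in for _safe_limit, and the default of 5 and maximum of 20 are assumptions chosen for illustration.

```python
from typing import Any, Optional

DEFAULT_LIMIT = 5   # assumed default, not taken from the plugin
MAX_LIMIT = 20      # assumed upper bound, not taken from the plugin


def clamp_limit(value: Optional[int]) -> int:
    """Hypothetical stand-in for _safe_limit: None -> default, otherwise clamp."""
    if value is None:
        return DEFAULT_LIMIT
    return max(1, min(int(value), MAX_LIMIT))


payload: dict[str, Any] = {"comment_limit": 0}

# `or` replaces every falsy value, so an explicit 0 looks like a missing key.
with_or = clamp_limit(payload.get("comment_limit") or 5)

# Passing the raw value lets the helper tell "missing" (None) apart from 0.
without_or = clamp_limit(payload.get("comment_limit"))

if __name__ == "__main__":
    print(with_or, without_or)
```

Keeping the default inside the helper means all three tools treat a missing, zero, or oversized limit the same way.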
Addressed the Greptile feedback in commit 0e67b8d:

- switched HN API calls from synchronous requests to httpx.AsyncClient so FastAPI handlers do not block the event loop
- expanded comment/text cleanup for common HN HTML tags including links, br, strong/bold wrappers, lists, pre/code
- fixed comment_limit=0 handling to clamp consistently through _safe_limit

Validation run locally with Python 3.11 runtime deps:

- python -m py_compile plugins/omi-hacker-news-app/main.py
- mocked async endpoint checks for front page/search/discussion
- live get_front_page smoke check against the HN Algolia API
Added one more hardening commit (edfabee):

- made _safe_limit coerce string values like "2" instead of raising TypeError
- kept invalid/empty limits graceful by falling back to the default
- added type: object to all tool manifest parameter schemas

Validation rerun:

- python -m py_compile plugins/omi-hacker-news-app/main.py
- git diff --check
- FastAPI TestClient manifest checks
- explicit _safe_limit checks for string, zero, oversized, invalid, and empty inputs
- live get_front_page smoke check with limit passed as a string
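The hardened behaviour described in this comment might look roughly like the sketch below. It is an illustration under stated assumptions rather than the code from edfabee: safe_limit is a hypothetical name, and the default of 5 and maximum of 20 are placeholders.

```python
from typing import Any

DEFAULT_LIMIT = 5   # placeholder default, not taken from the plugin
MAX_LIMIT = 20      # placeholder maximum, not taken from the plugin


def safe_limit(value: Any, default: int = DEFAULT_LIMIT, maximum: int = MAX_LIMIT) -> int:
    """Hypothetical sketch: coerce a loosely typed limit into the range 1..maximum."""
    # Missing values and booleans (which are ints in Python) fall back to the default.
    if value is None or isinstance(value, bool):
        return default
    try:
        # Going through str() accepts both 2 and "2" and rejects "", "abc", "2.5".
        number = int(str(value).strip())
    except ValueError:
        return default
    # Zero and negative values clamp up to 1; oversized values clamp down.
    return max(1, min(number, maximum))


if __name__ == "__main__":
    for candidate in ("2", 0, 999, "abc", "", None):
        print(repr(candidate), "->", safe_limit(candidate))
```

Falling back to the default on unparseable input keeps the tool usable when a chat client sends a malformed argument; raising a validation error instead would be the stricter alternative.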
Summary
Adds a standalone Hacker News integration app for Omi under plugins/omi-hacker-news-app.

The app exposes three unauthenticated chat tools:

- get_front_page for current Hacker News front page stories
- search_stories for topic/company/project searches
- get_discussion for a Hacker News item plus top-level comments

This is intentionally small and deployable as a standalone FastAPI service with no required environment variables.
Related to #3120 and the integration-app bounty discussion.
Verification
- python3 -m py_compile main.py
- get_front_page smoke check against the Hacker News Algolia API

Payment fallback: PayPal cultofrozen@gmail.com