Add Hacker News Omi integration app #7412

Open
absalonCRC wants to merge 4 commits into BasedHardware:main from absalonCRC:omi-hacker-news-app

absalonCRC commented May 20, 2026

Summary

Adds a standalone Hacker News integration app for Omi under plugins/omi-hacker-news-app.

The app exposes three unauthenticated chat tools:

  • get_front_page for current Hacker News front page stories
  • search_stories for topic/company/project searches
  • get_discussion for a Hacker News item plus top-level comments

This is intentionally small and deployable as a standalone FastAPI service with no required environment variables.

Related to #3120 and the integration-app bounty discussion.

Verification

  • python3 -m py_compile main.py
  • Direct endpoint checks with mocked Hacker News API responses
  • Live get_front_page smoke check against the Hacker News Algolia API

Payment fallback: PayPal cultofrozen@gmail.com

Contributor

greptile-apps Bot commented May 20, 2026

Greptile Summary

Adds a new standalone FastAPI plugin under plugins/omi-hacker-news-app that exposes three unauthenticated Omi chat tools for reading the HN front page, searching stories, and fetching item discussions via the HN Algolia API. The integration is self-contained with no required environment variables.

  • Blocking I/O in async handlers: all three route handlers call the synchronous requests.get() directly on the event loop, which will serialize concurrent requests — replacing with httpx.AsyncClient or asyncio.to_thread is needed before production use.
  • Incomplete HTML sanitization: _clean_text only handles 8 specific tags; HN comments routinely include <a>, <br>, <b>, <strong>, and others that will appear as raw HTML in tool responses.
  • Minor default inconsistency: comment_limit=0 in get_discussion is coerced to 5 via the or 5 pattern, unlike the other endpoints where _safe_limit would clamp it to 1.

Confidence Score: 3/5

The app is functional for single-user / low-traffic use but will misbehave under concurrent load due to blocking network calls on the async event loop.

Every route handler calls synchronous requests.get() inside async def, which blocks the asyncio event loop for the duration of each HN API round-trip. Under any meaningful concurrency this serializes all requests, making the service unresponsive. The HTML-stripping gap and comment_limit default inconsistency are quality issues but don't affect correctness as significantly. The blocking I/O issue should be resolved before this is deployed behind a real Omi integration.

plugins/omi-hacker-news-app/main.py — the route handlers and _clean_text helper need attention before production use.

Important Files Changed

| Filename | Overview |
| --- | --- |
| plugins/omi-hacker-news-app/main.py | Core app logic. Synchronous requests calls inside async handlers will block the event loop under concurrent load; HTML stripping is incomplete; minor comment_limit=0 coercion inconsistency. |
| plugins/omi-hacker-news-app/requirements.txt | Pinned dependency versions for FastAPI, uvicorn, requests, and pydantic, all recent stable releases. No httpx present, consistent with the synchronous-requests choice. |
| plugins/omi-hacker-news-app/Procfile | Standard uvicorn startup with ${PORT:-8080} fallback; correct for Heroku/Railway. |
| plugins/omi-hacker-news-app/railway.toml | Nixpacks build config with health-check path and 300s timeout; straightforward and correct. |
| plugins/omi-hacker-news-app/README.md | Clear setup guide with local dev, deployment notes, and example curl commands. |
| plugins/omi-hacker-news-app/runtime.txt | Pins Python 3.11.9 for the deployment environment; no issues. |

Sequence Diagram

sequenceDiagram
    participant Omi as Omi Client
    participant App as FastAPI App
    participant Algolia as HN Algolia API

    Omi->>App: GET /.well-known/omi-tools.json
    App-->>Omi: tools manifest (3 tools)

    Omi->>App: "POST /tools/get_front_page {limit}"
    App->>Algolia: "GET /search?tags=front_page&hitsPerPage=N"
    Algolia-->>App: "{hits: [...]}"
    App-->>Omi: "{result: "Current HN front page..."}"

    Omi->>App: "POST /tools/search_stories {query, sort_by, limit}"
    alt "sort_by == date"
        App->>Algolia: "GET /search_by_date?query=...&tags=story"
    else relevance (default)
        App->>Algolia: "GET /search?query=...&tags=story"
    end
    Algolia-->>App: "{hits: [...]}"
    App-->>Omi: "{result: "HN stories for query"}"

    Omi->>App: "POST /tools/get_discussion {item_id, comment_limit}"
    App->>Algolia: "GET /items/{item_id}"
    Algolia-->>App: "{title, author, children: [...]}"
    App-->>Omi: "{result: "title by author points comments"}"
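
The request mapping in the diagram can be sketched as plain URL builders. This is a minimal sketch, not the PR's code: `HN_API_BASE`, `build_search_url`, and `build_item_url` are hypothetical names, and sending `hitsPerPage` on story searches is an assumption (the diagram shows it only for the front-page call).

```python
from urllib.parse import urlencode

# Public base URL of the HN Algolia API.
HN_API_BASE = "https://hn.algolia.com/api/v1"


def build_search_url(query: str, sort_by: str = "relevance", limit: int = 5) -> str:
    """Pick the endpoint by sort order, as in the diagram's alt/else branch."""
    path = "/search_by_date" if sort_by == "date" else "/search"
    # hitsPerPage on story searches is an assumption for this sketch.
    params = {"query": query, "tags": "story", "hitsPerPage": limit}
    return f"{HN_API_BASE}{path}?{urlencode(params)}"


def build_item_url(item_id: int) -> str:
    """URL for a single item and its nested comments."""
    return f"{HN_API_BASE}/items/{item_id}"
```

Keeping URL construction separate from the network call makes the endpoint choice easy to check without touching the live API.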

Reviews (1): Last reviewed commit: "Improve Hacker News text cleanup"

Comment on lines +175 to +187
},
]
}


@app.post("/tools/get_front_page", tags=["chat_tools"], response_model=ChatToolResponse)
async def get_front_page(payload: dict[str, Any]):
    try:
        limit = _safe_limit(payload.get("limit"))
        data = _request_json("/search", {"tags": "front_page", "hitsPerPage": limit})
        hits = data.get("hits", [])[:limit]

        if not hits:

P1 Blocking synchronous I/O inside async route handlers

All three route handlers (get_front_page, search_stories, get_discussion) call _request_json(), which uses the synchronous requests.get(). FastAPI runs async def handlers on the asyncio event loop, so a blocking network call stalls the entire loop until the external HTTP response arrives. Under any concurrent load, every in-flight request queues behind the HN API call, eliminating the concurrency benefit of async. Swap to httpx.AsyncClient with await client.get(...), or at a minimum wrap the call with await asyncio.to_thread(_request_json, ...).
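
The asyncio.to_thread option mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `blocking_fetch` is a hypothetical stand-in that sleeps instead of calling requests.get(), and `fetch_in_thread` and `run_concurrently` are names invented for the example.

```python
import asyncio
import time


def blocking_fetch(path: str) -> dict:
    """Hypothetical stand-in for a synchronous HTTP call such as requests.get()."""
    time.sleep(0.2)  # simulates network latency
    return {"path": path, "hits": []}


async def fetch_in_thread(path: str) -> dict:
    # The blocking call runs in a worker thread, so the event loop stays free.
    return await asyncio.to_thread(blocking_fetch, path)


async def run_concurrently(paths: list[str]) -> tuple[list[dict], float]:
    """Fetch all paths at once and report the wall-clock time taken."""
    start = time.perf_counter()
    results = await asyncio.gather(*(fetch_in_thread(p) for p in paths))
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_concurrently(["/a", "/b", "/c"]))
    print(len(results), f"{elapsed:.2f}s")
```

Three 0.2 s calls overlap instead of running back to back. A native async client avoids the thread hop entirely, which is why httpx.AsyncClient is the preferred fix and to_thread the fallback.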

Comment on lines +41 to +52

    text = unescape(value)
    text = (
        text.replace("<p>", "\n")
        .replace("<pre>", "\n")
        .replace("<code>", "`")
        .replace("</code>", "`")
    )
    text = re.sub(r"<[^>]+>", "", text)
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()

P2 Incomplete HTML stripping leaves raw tags in tool output

The HN Algolia API returns comments with a range of HTML elements beyond the 8 replaced here — most commonly <a href="...">…</a>, <br>, <b>, <strong>, and <span>. These pass through _clean_text unchanged, so tool responses will contain literal HTML markup. A simple re.sub(r"<[^>]+>", "", text) after handling structural tags would cover the general case.

Suggested change
-    text = unescape(value)
-    text = (
-        text.replace("<p>", "\n")
-        .replace("<pre>", "\n")
-        .replace("<code>", "`")
-        .replace("</code>", "`")
-    )
-    text = re.sub(r"<[^>]+>", "", text)
-    text = re.sub(r"[ \t]+", " ", text)
-    text = re.sub(r"\n{3,}", "\n\n", text)
-    return text.strip()
+    import re
+    text = unescape(value)
+    text = (
+        text.replace("<p>", "\n")
+        .replace("</p>", "")
+        .replace("<br>", "\n")
+        .replace("<br/>", "\n")
+        .replace("<br />", "\n")
+        .replace("<pre>", "\n")
+        .replace("</pre>", "")
+        .replace("<code>", "`")
+        .replace("</code>", "`")
+        .replace("<i>", "")
+        .replace("</i>", "")
+        .replace("<b>", "")
+        .replace("</b>", "")
+        .replace("<strong>", "")
+        .replace("</strong>", "")
+        .replace("<em>", "")
+        .replace("</em>", "")
+    )
+    text = re.sub(r"<[^>]+>", "", text)
+    return text.strip()
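
A regex-based variant of the same cleanup, offered as a minimal sketch rather than the PR's implementation. `clean_text` is a hypothetical name, and one design choice differs from the code above: entities are decoded after tags are stripped, so an escaped `&lt;b&gt;` in a comment survives as literal text instead of being removed as a tag.

```python
import re
from html import unescape


def clean_text(value):
    """Convert an HN HTML fragment to plain text (hypothetical helper)."""
    if not value:
        return ""
    # Structural tags become line breaks so paragraphs stay separated.
    text = re.sub(r"<(?:p|br\s*/?|pre|li)>", "\n", value, flags=re.IGNORECASE)
    # Inline code stays visible as backticks.
    text = re.sub(r"</?code>", "`", text, flags=re.IGNORECASE)
    # Every remaining tag (<a href=...>, <b>, <span>, closing tags) is dropped.
    text = re.sub(r"<[^>]+>", "", text)
    # Decode entities last so escaped angle brackets are kept as text.
    text = unescape(text)
    # Collapse runs of spaces/tabs and cap consecutive blank lines.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

The catch-all `<[^>]+>` pass covers tags that are not listed explicitly, so the list of special cases only needs the tags that should turn into a line break or a backtick.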

Comment on lines +228 to +231
title = item.get("title") or "(untitled)"
author = item.get("author") or "unknown"
points = item.get("points") or 0
url = item.get("url") or f"https://news.ycombinator.com/item?id={item_id}"

P2 comment_limit: 0 is silently coerced to 5 instead of 1

The expression payload.get("comment_limit") or 5 treats a caller-supplied 0 as falsy and substitutes 5, which is inconsistent with get_front_page and search_stories where _safe_limit(0) would correctly clamp to 1. Using payload.get("comment_limit") alone and letting _safe_limit handle the None case would be consistent.
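
The clamping behavior described here can be sketched as below. This is an assumption-laden illustration, not the PR's `_safe_limit`: `safe_limit` is a hypothetical name, and the default of 5 and ceiling of 20 are placeholder values.

```python
def safe_limit(value, default=5, minimum=1, maximum=20):
    """Coerce a caller-supplied limit to an int within [minimum, maximum].

    The default and maximum here are illustrative assumptions.
    """
    # Missing values and booleans fall back to the default.
    if value is None or isinstance(value, bool):
        return default
    try:
        number = int(value)  # accepts ints and numeric strings such as "2"
    except (TypeError, ValueError):
        return default  # empty or non-numeric input degrades gracefully
    # An explicit 0 is a real value and clamps to the minimum, not the default.
    return max(minimum, min(number, maximum))
```

Because None is handled inside the helper, a call site can pass `payload.get("comment_limit")` directly and does not need the `or 5` fallback that swallowed 0.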

@absalonCRC
Author

Addressed the Greptile feedback in commit 0e67b8d:

  • switched HN API calls from synchronous requests to httpx.AsyncClient so FastAPI handlers do not block the event loop
  • expanded comment/text cleanup for common HN HTML tags including links, br, strong/bold wrappers, lists, pre/code
  • fixed comment_limit=0 handling to clamp consistently through _safe_limit

Validation run locally with Python 3.11 runtime deps:

  • python -m py_compile plugins/omi-hacker-news-app/main.py
  • mocked async endpoint checks for front page/search/discussion
  • live get_front_page smoke check against the HN Algolia API

@absalonCRC
Author

Added one more hardening commit (edfabee):

  • made _safe_limit coerce string values like "2" instead of raising TypeError
  • kept invalid/empty limits graceful by falling back to the default
  • added type: object to all tool manifest parameter schemas

Validation rerun:

  • python -m py_compile plugins/omi-hacker-news-app/main.py
  • git diff --check
  • FastAPI TestClient manifest checks
  • explicit _safe_limit checks for string, zero, oversized, invalid, and empty inputs
  • live get_front_page smoke check with limit passed as a string
