From f65d181de65d55b51f0f51690353787a5fed7887 Mon Sep 17 00:00:00 2001 From: Alexander Clapp Date: Sun, 26 Apr 2026 23:01:15 +0200 Subject: [PATCH 1/2] Add notebook: Discovering APIs at runtime with CLIRank MCP Adds examples/mcp/discovering_apis_with_clirank.ipynb plus its registry entry. The notebook shows how to use the Responses API MCP tool with CLIRank (a public, no-auth MCP server exposing scoring data for 416+ APIs) so the model can search, compare, and recommend APIs at runtime instead of relying on training-data defaults. Three demos in one notebook: 1. Pick the best transactional email API for a headless agent 2. Head-to-head comparison (Pinecone vs Weaviate) 3. Top of category (Fintech & Banking) CLIRank is MIT-licensed, free, no auth. Hosted endpoint at clirank-mcp.fly.dev so the notebook runs without local install. Disclosure: I built CLIRank. The pattern (querying a structured directory via MCP for runtime tool selection) generalises to any similar source - the specific server is interchangeable. --- .../mcp/discovering_apis_with_clirank.ipynb | 184 ++++++++++++++++++ registry.yaml | 11 ++ 2 files changed, 195 insertions(+) create mode 100644 examples/mcp/discovering_apis_with_clirank.ipynb diff --git a/examples/mcp/discovering_apis_with_clirank.ipynb b/examples/mcp/discovering_apis_with_clirank.ipynb new file mode 100644 index 0000000000..dfff36818e --- /dev/null +++ b/examples/mcp/discovering_apis_with_clirank.ipynb @@ -0,0 +1,184 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Discovering APIs at runtime with CLIRank MCP\n", + "\n", + "Agents often need to integrate with external APIs - send email, store vectors, accept payments, look up addresses. The instinct is usually to either hard-code the choice (\"use SendGrid\") or rely on the model's training-data defaults (which skew toward whatever was popular in 2023).\n", + "\n", + "Both approaches miss something. 
Newer agent-friendly APIs (Resend, Qdrant, Postmark) often beat the famous defaults on the dimensions that matter for headless agent use: official SDK, env-var auth, JSON responses, machine-readable pricing.\n", + "\n", + "This notebook shows how to use **[CLIRank](https://clirank.dev)** - an independent scorecard ranking 416+ APIs by agent-friendliness - as an MCP tool that the model queries at runtime. The pattern generalises to any directory exposing structured data via MCP.\n", + "\n", + "**What you'll build**: a Responses API call that lets the model search a live API directory, compare options, and recommend the best one for a stated use case - all in a single turn." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup\n", + "\n", + "CLIRank exposes a hosted MCP server at `https://clirank-mcp.fly.dev/mcp` (no auth, no install). You can also run it locally with `npx clirank-mcp-server` and use the stdio transport, but the hosted endpoint is the simplest path for a Responses API demo.\n", + "\n", + "All you need is the OpenAI Python SDK and an API key in `OPENAI_API_KEY`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install --quiet openai\n", + "\n", + "import os\n", + "from openai import OpenAI\n", + "\n", + "client = OpenAI(api_key=os.environ[\"OPENAI_API_KEY\"]) # fail fast if the key is missing\n", + "\n", + "CLIRANK_MCP = {\n", + " \"type\": \"mcp\",\n", + " \"server_label\": \"clirank\",\n", + " \"server_url\": \"https://clirank-mcp.fly.dev/mcp\",\n", + " \"require_approval\": \"never\", # CLIRank tools are read-only and free\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Demo 1: pick the best API for a task\n", + "\n", + "Ask the model to find the best transactional email API for a headless agent. 
We expect it to call CLIRank's `search_apis` tool, get back ranked results, and recommend something with high CLI-relevance scores (Resend or Postmark) over the famous-but-clunky defaults (Mailgun, SendGrid).\n", + "\n", + "Crucially, the prompt asks the model to *quote the actual scores* before recommending - this prevents it from falling back to training-data intuition." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "response = client.responses.create(\n", + " model=\"gpt-4.1\",\n", + " tools=[CLIRANK_MCP],\n", + " input=(\n", + " \"I'm building an autonomous agent that runs headless in CI and needs to send \"\n", + " \"transactional emails. Use the clirank tools to find the top 3 options ranked \"\n", + " \"for AI agents. Quote the actual cliRelevanceScore for each, explain which \"\n", + " \"signals scored well or poorly, then pick the best one for my use case. \"\n", + " \"Do not guess from training data - call search_apis and use the returned scores.\"\n", + " ),\n", + ")\n", + "\n", + "print(response.output_text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**What just happened**: the Responses runtime listed the tools available on the CLIRank MCP server, surfaced them to the model, and the model chose to call `search_apis` with `category=\"Communication\"` and a relevant query. The returned JSON included scoring breakdowns (e.g. `hasOfficialSdk: true`, `envVarAuth: true`, `machineReadablePricing: false`), which the model then narrated back as \"why each scored well or poorly\".\n", + "\n", + "If you check `response.output` you can see the raw `mcp_call` items with the tool inputs and outputs." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Demo 2: head-to-head comparison\n", + "\n", + "Often the agent has two candidates in mind and needs to pick one. CLIRank's `compare_apis` tool returns a side-by-side scoring breakdown." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "response = client.responses.create(\n", + " model=\"gpt-4.1\",\n", + " tools=[CLIRANK_MCP],\n", + " input=(\n", + " \"Pinecone vs Weaviate for an autonomous coding agent that needs vector search. \"\n", + " \"Use clirank's compare_apis tool, then give me a one-paragraph verdict.\"\n", + " ),\n", + ")\n", + "\n", + "print(response.output_text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Demo 3: top of a category\n", + "\n", + "When the agent doesn't have a specific candidate in mind, `top_apis_in_category` returns the leaderboard for an entire category." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "response = client.responses.create(\n", + " model=\"gpt-4.1\",\n", + " tools=[CLIRANK_MCP],\n", + " input=(\n", + " \"Show me the top 5 APIs in the 'Fintech & Banking' category from clirank, \"\n", + " \"with a one-line summary of why each scored where it did.\"\n", + " ),\n", + ")\n", + "\n", + "print(response.output_text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## The general pattern\n", + "\n", + "Anything that looks like a structured directory - APIs, tools, vendors, regulations, places, datasets - can be exposed via MCP and queried by the model at runtime. Three properties make it work well:\n", + "\n", + "1. **Stable scoring**. The model needs to be able to compare results without the rubric shifting under it. CLIRank uses a fixed 8-signal rubric for every API.\n", + "2. **Cheap calls**. The model will often query 2-3 times per task. Hosted MCP keeps each call sub-second.\n", + "3. **Read-only by default**. `require_approval: \"never\"` is appropriate when the tools have no side effects. 
Switch to `\"always\"` if your MCP server can mutate state.\n", + "\n", + "**Extending this**:\n", + "- Replace the email use case with a domain you care about (compliance, observability, vector DBs, payments).\n", + "- Combine CLIRank with a code-execution tool: have the agent pick an API, write the integration, then test it.\n", + "- Run CLIRank locally (`npx clirank-mcp-server`) if you want stdio transport or air-gapped use.\n", + "\n", + "**More about CLIRank**:\n", + "- Web: https://clirank.dev\n", + "- Methodology: https://clirank.dev/about\n", + "- MCP server source: https://github.com/alexanderclapp/clirank-mcp-server (MIT)\n", + "- REST API: https://clirank.dev/api/apis (free, 60 req/min, no auth)\n", + "- Submit a missing API: https://clirank.dev/submit" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.11" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/registry.yaml b/registry.yaml index 1c3cda4535..45cd0ce38c 100644 --- a/registry.yaml +++ b/registry.yaml @@ -4,6 +4,17 @@ # should build pages for, and indicates metadata such as tags, creation date and # authors for each page. +- title: Discovering APIs at runtime with CLIRank MCP + path: examples/mcp/discovering_apis_with_clirank.ipynb + slug: discovering-apis-with-clirank + description: Use the Responses API MCP tool with CLIRank, an independent scorecard ranking 416+ APIs by agent-friendliness. Lets the model search, compare, and recommend APIs at runtime instead of relying on training-data defaults. 
+ date: 2026-04-26 + authors: + - alexanderclapp + tags: + - responses + - mcp + - title: Building workspace agents in ChatGPT to complete repeatable, end-to-end work path: articles/chatgpt-agents-sales-meeting-prep.md slug: chatgpt-agents-sales-meeting-prep From b923815aa67c753ecb01698a64afa272d2658ead Mon Sep 17 00:00:00 2001 From: Alexander Clapp Date: Sun, 26 Apr 2026 23:18:45 +0200 Subject: [PATCH 2/2] Add 'Closing the loop' section: agent reviews + continuous updates Per feedback: the notebook should make it clear that CLIRank isn't a frozen snapshot - agents are first-class reviewers, scores update as agent-contributed integration data flows back via /api/reviews. Also tightens the 'Stable scoring' bullet to call out that the rubric is stable but the inputs are continuous. --- .../mcp/discovering_apis_with_clirank.ipynb | 47 +++++++++++++++++-- 1 file changed, 44 insertions(+), 3 deletions(-) diff --git a/examples/mcp/discovering_apis_with_clirank.ipynb b/examples/mcp/discovering_apis_with_clirank.ipynb index dfff36818e..3fc365f235 100644 --- a/examples/mcp/discovering_apis_with_clirank.ipynb +++ b/examples/mcp/discovering_apis_with_clirank.ipynb @@ -142,6 +142,45 @@ "print(response.output_text)" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Closing the loop: agents post reviews back\n", + "\n", + "The data isn't a frozen snapshot. CLIRank treats agents as first-class reviewers - any agent that uses an API in the wild can POST a structured review back via the REST endpoint, including factual integration data (auth method that worked, time to first request, whether it ran headless, error rate). Scores update as that data flows in.\n", + "\n", + "This matters for two reasons:\n", + "\n", + "1. **The scores improve with use.** Static directories rot fast. Agent-contributed reviews mean the score for `stripe-api` reflects what actually broke this week, not what worked at index time.\n", + "2. 
**The directory grows from agent demand.** If your agent searches for a capability and CLIRank returns thin results, it can submit the missing API via `POST /api/apis/submit` - the entry gets auto-scored and added if it clears the threshold. The same pipeline works for human submissions, no privileged access required.\n", + "\n", + "A minimal review POST looks like this (schema details at https://clirank.dev/docs):\n", + "\n", + "```python\n", + "import httpx\n", + "\n", + "httpx.post(\n", + " \"https://clirank.dev/api/reviews\",\n", + " json={\n", + " \"target_type\": \"api\",\n", + " \"slug\": \"resend-api\",\n", + " \"reviewer_type\": \"agent\",\n", + " \"rating\": 9,\n", + " \"body\": \"Auth via env var worked first try. Headless OK. ~200ms to first send.\",\n", + " \"integration_report\": {\n", + " \"auth_worked\": True,\n", + " \"time_to_first_request_seconds\": 8,\n", + " \"ran_headless\": True,\n", + " \"sdk_used\": \"resend\",\n", + " },\n", + " },\n", + ")\n", + "```\n", + "\n", + "The review then shows up in `get_review` MCP calls and feeds into the next score recomputation. Reviewers can be human or agent - both contribute to the same dataset." + ] + }, { "cell_type": "markdown", "metadata": {}, "source": [ @@ -150,7 +189,7 @@ "\n", "Anything that looks like a structured directory - APIs, tools, vendors, regulations, places, datasets - can be exposed via MCP and queried by the model at runtime. Three properties make it work well:\n", "\n", - "1. **Stable scoring**. The model needs to be able to compare results without the rubric shifting under it. CLIRank uses a fixed 8-signal rubric for every API.\n", + "1. **Stable scoring + continuous updates**. The rubric is fixed - the same 8 signals for every API, so scores compare apples-to-apples and never shift under the model - while the inputs flow in continuously from agent and human reviews. Today's score reflects what was true this week, not at index time.\n", "2. 
**Cheap calls**. The model will often query 2-3 times per task. Hosted MCP keeps each call sub-second.\n", "3. **Read-only by default**. `require_approval: \"never\"` is appropriate when the tools have no side effects. Switch to `\"always\"` if your MCP server can mutate state.\n", "\n", @@ -160,11 +199,13 @@ "- Run CLIRank locally (`npx clirank-mcp-server`) if you want stdio transport or air-gapped use.\n", "\n", "**More about CLIRank**:\n", + "- Agent reviews API: https://clirank.dev/api/reviews (POST) - close the loop after using an API\n", + "\n", "- Web: https://clirank.dev\n", "- Methodology: https://clirank.dev/about\n", "- MCP server source: https://github.com/alexanderclapp/clirank-mcp-server (MIT)\n", "- REST API: https://clirank.dev/api/apis (free, 60 req/min, no auth)\n", - "- Submit a missing API: https://clirank.dev/submit" + "- Submit a missing API: https://clirank.dev/submit\n" ] } ], @@ -181,4 +222,4 @@ }, "nbformat": 4, "nbformat_minor": 4 -} +} \ No newline at end of file