Commit 895cc33

test: add full auth ops canary coverage
1 parent 19da65c commit 895cc33

9 files changed

+524
-22
lines changed

.github/workflows/live-canary.yml

Lines changed: 16 additions & 0 deletions
@@ -178,6 +178,22 @@ jobs:
       AUTH_LIVE_NOTION_ACCESS_TOKEN: ${{ secrets.AUTH_LIVE_NOTION_ACCESS_TOKEN }}
       AUTH_LIVE_NOTION_REFRESH_TOKEN: ${{ secrets.AUTH_LIVE_NOTION_REFRESH_TOKEN }}
       AUTH_LIVE_NOTION_QUERY: ${{ secrets.AUTH_LIVE_NOTION_QUERY }}
+      AUTH_LIVE_LINEAR_ACCESS_TOKEN: ${{ secrets.AUTH_LIVE_LINEAR_ACCESS_TOKEN }}
+      AUTH_LIVE_LINEAR_REFRESH_TOKEN: ${{ secrets.AUTH_LIVE_LINEAR_REFRESH_TOKEN }}
+      AUTH_LIVE_LINEAR_QUERY: ${{ secrets.AUTH_LIVE_LINEAR_QUERY }}
+      AUTH_LIVE_LINEAR_TOOL_NAME: ${{ vars.AUTH_LIVE_LINEAR_TOOL_NAME || 'linear_search_issues' }}
+      AUTH_LIVE_LINEAR_TOOL_ARGS_JSON: ${{ secrets.AUTH_LIVE_LINEAR_TOOL_ARGS_JSON }}
+      AUTH_LIVE_BRAVE_API_KEY: ${{ secrets.AUTH_LIVE_BRAVE_API_KEY }}
+      AUTH_LIVE_SLACK_BOT_TOKEN: ${{ secrets.AUTH_LIVE_SLACK_BOT_TOKEN }}
+      AUTH_LIVE_COMPOSIO_API_KEY: ${{ secrets.AUTH_LIVE_COMPOSIO_API_KEY }}
+      AUTH_LIVE_TELEGRAM_API_ID: ${{ secrets.AUTH_LIVE_TELEGRAM_API_ID }}
+      AUTH_LIVE_TELEGRAM_API_HASH: ${{ secrets.AUTH_LIVE_TELEGRAM_API_HASH }}
+      AUTH_LIVE_TELEGRAM_SESSION_JSON: ${{ secrets.AUTH_LIVE_TELEGRAM_SESSION_JSON }}
+      AUTH_LIVE_GOOGLE_DRIVE_QUERY: ${{ vars.AUTH_LIVE_GOOGLE_DRIVE_QUERY || 'trashed = false' }}
+      AUTH_LIVE_GOOGLE_DOC_ID: ${{ secrets.AUTH_LIVE_GOOGLE_DOC_ID }}
+      AUTH_LIVE_GOOGLE_SHEET_ID: ${{ secrets.AUTH_LIVE_GOOGLE_SHEET_ID }}
+      AUTH_LIVE_GOOGLE_SHEET_RANGE: ${{ vars.AUTH_LIVE_GOOGLE_SHEET_RANGE || 'A1:Z10' }}
+      AUTH_LIVE_GOOGLE_SLIDES_ID: ${{ secrets.AUTH_LIVE_GOOGLE_SLIDES_ID }}
     steps:
       - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
         with:
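
Three of the new workflow variables use `vars.X || 'default'`, which falls back to a literal when the repository variable is unset or empty. A minimal Python sketch of the same fallback on the consuming side is given here for illustration. `env_or_default` and `DEFAULTS` are hypothetical names, not helpers from this repository; only the three default values are taken from the workflow.

```python
import os

# Defaults copied from the workflow hunk; the table itself is hypothetical.
DEFAULTS = {
    "AUTH_LIVE_LINEAR_TOOL_NAME": "linear_search_issues",
    "AUTH_LIVE_GOOGLE_DRIVE_QUERY": "trashed = false",
    "AUTH_LIVE_GOOGLE_SHEET_RANGE": "A1:Z10",
}


def env_or_default(name, environ=None):
    """Return the variable's value, or its default when it is unset or empty."""
    env = os.environ if environ is None else environ
    # `or` treats both a missing key and an empty string as "use the default".
    return env.get(name) or DEFAULTS.get(name, "")
```

Passing an explicit mapping instead of reading `os.environ` keeps the helper easy to exercise in tests.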

scripts/auth_live_canary/README.md

Lines changed: 8 additions & 0 deletions
@@ -37,6 +37,13 @@ not model behavior.
 - `notion`
   Uses `mcp_notion_access_token`
   Runs through Responses API
+- `linear`
+  Uses `mcp_linear_access_token`
+  Runs through Responses API
+- `ops_workflow`
+  Installs Gmail, Calendar, Drive, Docs, Sheets, Slides, GitHub, Web Search,
+  LLM Context, Slack, Telegram, Composio, Notion, and Linear. It dispatches one
+  deterministic multi-tool ops brief probe through `/v1/responses`.
 
 ## Setup
 
@@ -76,6 +83,7 @@ Run only selected providers:
 
 ```bash
 python3 scripts/auth_live_canary/run_live_canary.py --case gmail --case github
+python3 scripts/auth_live_canary/run_live_canary.py --case ops_workflow
 ```
 
 CI-style fresh-machine install:
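
The `ops_workflow` case added to this README relies on one probe calling many tools, and the commit changes the success check in `run_live_canary.py` to match. A rough, self-contained sketch of that check is given here. The `Probe` dataclass and the `probe_succeeded` function are hypothetical stand-ins; only the individual conditions mirror the diff.

```python
from dataclasses import dataclass, field

# Substrings that mark a tool output as a failure, as listed in the diff.
ERROR_MARKERS = ("error", "authentication required", "unauthorized", "forbidden")


@dataclass
class Probe:
    """Hypothetical stand-in for the probe object used by the runner."""

    required_tool_names: list = field(default_factory=list)
    expected_text: str = ""


def probe_succeeded(probe, status, tool_names, tool_outputs, response_text, fetched_status):
    """Return True only when every required tool ran and nothing looks like an auth failure."""
    return (
        status == "completed"
        # Every required tool must appear, not just a single expected one.
        and all(name in tool_names for name in probe.required_tool_names)
        and bool(tool_outputs)
        and not any(
            marker in text.lower() for text in tool_outputs for marker in ERROR_MARKERS
        )
        # An empty expected_text disables the response-text check.
        and (not probe.expected_text or probe.expected_text in response_text)
        and fetched_status == 200
    )
```

Making `expected_text` optional lets a multi-tool probe pass on tool coverage alone, without pinning the model's final wording.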

scripts/auth_live_canary/config.example.env

Lines changed: 28 additions & 2 deletions
@@ -1,11 +1,16 @@
-# Google / Gmail / Calendar
+# Google / Gmail / Calendar / Drive / Docs / Sheets / Slides
 GOOGLE_OAUTH_CLIENT_ID=
 GOOGLE_OAUTH_CLIENT_SECRET=
 AUTH_LIVE_GOOGLE_ACCESS_TOKEN=
 AUTH_LIVE_GOOGLE_REFRESH_TOKEN=
-AUTH_LIVE_GOOGLE_SCOPES=gmail.modify gmail.compose calendar.events
+AUTH_LIVE_GOOGLE_SCOPES="https://www.googleapis.com/auth/gmail.modify https://www.googleapis.com/auth/gmail.compose https://www.googleapis.com/auth/calendar.events https://www.googleapis.com/auth/drive https://www.googleapis.com/auth/documents https://www.googleapis.com/auth/spreadsheets https://www.googleapis.com/auth/presentations"
 # Set to 0 to skip forced refresh on first probe.
 AUTH_LIVE_FORCE_GOOGLE_REFRESH=1
+AUTH_LIVE_GOOGLE_DRIVE_QUERY="trashed = false"
+AUTH_LIVE_GOOGLE_DOC_ID=
+AUTH_LIVE_GOOGLE_SHEET_ID=
+AUTH_LIVE_GOOGLE_SHEET_RANGE=A1:Z10
+AUTH_LIVE_GOOGLE_SLIDES_ID=
 
 # GitHub
 AUTH_LIVE_GITHUB_TOKEN=
@@ -17,3 +22,24 @@ AUTH_LIVE_GITHUB_ISSUE_NUMBER=
 AUTH_LIVE_NOTION_ACCESS_TOKEN=
 AUTH_LIVE_NOTION_REFRESH_TOKEN=
 AUTH_LIVE_NOTION_QUERY=canary
+
+# Linear MCP
+AUTH_LIVE_LINEAR_ACCESS_TOKEN=
+AUTH_LIVE_LINEAR_REFRESH_TOKEN=
+AUTH_LIVE_LINEAR_QUERY=canary
+AUTH_LIVE_LINEAR_TOOL_NAME=linear_search_issues
+AUTH_LIVE_LINEAR_TOOL_ARGS_JSON=
+
+# Brave-backed tools
+AUTH_LIVE_BRAVE_API_KEY=
+
+# Slack
+AUTH_LIVE_SLACK_BOT_TOKEN=
+
+# Composio
+AUTH_LIVE_COMPOSIO_API_KEY=
+
+# Telegram user-mode tool
+AUTH_LIVE_TELEGRAM_API_ID=
+AUTH_LIVE_TELEGRAM_API_HASH=
+AUTH_LIVE_TELEGRAM_SESSION_JSON=
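
This file switches `AUTH_LIVE_GOOGLE_SCOPES` from short names such as `gmail.modify` to full scope URLs. A small sketch of how a consumer might parse the value and reject leftover short names is shown here. `parse_google_scopes` is a hypothetical helper, and the prefix check is an assumption that only holds for the seven scopes listed in this file; some Google scopes, such as `openid`, do not use that prefix.

```python
# Prefix shared by every scope listed in config.example.env (assumption: no
# other scope styles are needed by the canary).
GOOGLE_SCOPE_PREFIX = "https://www.googleapis.com/auth/"


def parse_google_scopes(raw):
    """Split a space-separated scope string and reject short-form names."""
    scopes = raw.split()
    short = [scope for scope in scopes if not scope.startswith(GOOGLE_SCOPE_PREFIX)]
    if short:
        raise ValueError(f"scopes must be full URLs, got: {', '.join(short)}")
    return scopes
```

Failing fast on a short name surfaces a stale `.env` file before any token refresh is attempted.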

scripts/auth_live_canary/run_live_canary.py

Lines changed: 107 additions & 19 deletions
@@ -29,7 +29,12 @@
 sys.path.insert(0, str(ROOT))
 
 from scripts.live_canary.auth_registry import SeededProviderCase, configured_seeded_cases
-from scripts.live_canary.auth_runtime import activate_extension, install_extension, put_secret
+from scripts.live_canary.auth_runtime import (
+    activate_extension,
+    install_extension,
+    put_secret,
+    write_memory,
+)
 from scripts.live_canary.common import (
     DEFAULT_SECRETS_MASTER_KEY,
     DEFAULT_VENV,
@@ -49,7 +54,17 @@
 
 DEFAULT_OUTPUT_DIR = ROOT / "artifacts" / "auth-live-canary"
 OWNER_USER_ID = "auth-live-owner"
-GOOGLE_SCOPE_DEFAULT = "gmail.modify gmail.compose calendar.events"
+GOOGLE_SCOPE_DEFAULT = " ".join(
+    [
+        "https://www.googleapis.com/auth/gmail.modify",
+        "https://www.googleapis.com/auth/gmail.compose",
+        "https://www.googleapis.com/auth/calendar.events",
+        "https://www.googleapis.com/auth/drive",
+        "https://www.googleapis.com/auth/documents",
+        "https://www.googleapis.com/auth/spreadsheets",
+        "https://www.googleapis.com/auth/presentations",
+    ]
+)
 
 
 def expire_secret_in_db(db_path: Path, user_id: str, secret_name: str) -> None:
@@ -95,6 +110,7 @@ async def create_response_probe(
     response_id = body.get("id")
     output = body.get("output", [])
     tool_names = [item.get("name") for item in output if item.get("type") == "function_call"]
+    expected_tool_names = probe.required_tool_names
     tool_outputs = [
         item.get("output", "")
         for item in output
@@ -120,14 +136,14 @@ async def create_response_probe(
 
     success = (
         body.get("status") == "completed"
-        and probe.expected_tool_name in tool_names
+        and all(tool_name in tool_names for tool_name in expected_tool_names)
         and bool(tool_outputs)
         and not any(
             marker in output_text.lower()
             for output_text in tool_outputs
             for marker in ("error", "authentication required", "unauthorized", "forbidden")
         )
-        and probe.expected_text in response_text
+        and (not probe.expected_text or probe.expected_text in response_text)
         and fetched_status == 200
     )
 
@@ -140,6 +156,7 @@ async def create_response_probe(
         "response_id": response_id,
         "status": body.get("status"),
         "tool_names": tool_names,
+        "expected_tool_names": expected_tool_names,
         "tool_outputs": tool_outputs,
         "response_text": response_text,
         "get_status_code": fetched_status,
@@ -291,6 +308,64 @@ async def seed_live_credentials(base_url: str, token: str, db_path: Path) -> Non
             provider="mcp:notion",
         )
 
+    linear_access = env_str("AUTH_LIVE_LINEAR_ACCESS_TOKEN")
+    linear_refresh = env_str("AUTH_LIVE_LINEAR_REFRESH_TOKEN")
+    if linear_refresh and not linear_access:
+        raise CanaryError(
+            "AUTH_LIVE_LINEAR_ACCESS_TOKEN is required when AUTH_LIVE_LINEAR_REFRESH_TOKEN is set"
+        )
+    if linear_access:
+        await put_secret(
+            base_url,
+            token,
+            user_id=OWNER_USER_ID,
+            name="mcp_linear_access_token",
+            value=linear_access,
+            provider="mcp:linear",
+        )
+    if linear_refresh:
+        await put_secret(
+            base_url,
+            token,
+            user_id=OWNER_USER_ID,
+            name="mcp_linear_access_token_refresh_token",
+            value=linear_refresh,
+            provider="mcp:linear",
+        )
+
+    for env_name, secret_name, provider in (
+        ("AUTH_LIVE_BRAVE_API_KEY", "brave_api_key", "brave"),
+        ("AUTH_LIVE_SLACK_BOT_TOKEN", "slack_bot_token", "slack"),
+        ("AUTH_LIVE_COMPOSIO_API_KEY", "composio_api_key", "composio"),
+        ("AUTH_LIVE_TELEGRAM_API_ID", "telegram_api_id", "telegram"),
+        ("AUTH_LIVE_TELEGRAM_API_HASH", "telegram_api_hash", "telegram"),
+    ):
+        value = env_str(env_name)
+        if value:
+            await put_secret(
+                base_url,
+                token,
+                user_id=OWNER_USER_ID,
+                name=secret_name,
+                value=value,
+                provider=provider,
+            )
+
+    telegram_api_id = env_str("AUTH_LIVE_TELEGRAM_API_ID")
+    telegram_api_hash = env_str("AUTH_LIVE_TELEGRAM_API_HASH")
+    telegram_session = env_str("AUTH_LIVE_TELEGRAM_SESSION_JSON")
+    if telegram_api_id:
+        await write_memory(base_url, token, path="telegram/api_id", content=telegram_api_id)
+    if telegram_api_hash:
+        await write_memory(base_url, token, path="telegram/api_hash", content=telegram_api_hash)
+    if telegram_session:
+        await write_memory(
+            base_url,
+            token,
+            path="telegram/session.json",
+            content=telegram_session,
+        )
+
 
 def parse_args() -> argparse.Namespace:
     parser = argparse.ArgumentParser(description=__doc__)
@@ -325,7 +400,14 @@ def parse_args() -> argparse.Namespace:
     parser.add_argument(
         "--case",
         action="append",
-        choices=("gmail", "google_calendar", "github", "notion"),
+        choices=(
+            "gmail",
+            "google_calendar",
+            "github",
+            "notion",
+            "linear",
+            "ops_workflow",
+        ),
         help="Limit the run to specific providers. Repeat for multiple values.",
     )
     parser.add_argument(
@@ -374,21 +456,27 @@ async def async_main(args: argparse.Namespace) -> int:
     try:
         await seed_live_credentials(stack.base_url, stack.gateway_token, stack.db_path)
 
+        installed: dict[str, dict[str, Any]] = {}
         for probe in probes:
-            ext = await install_extension(
-                stack.base_url,
-                stack.gateway_token,
-                name=probe.extension_install_name,
-                expected_display_name=probe.expected_display_name,
-                install_kind=probe.install_kind,
-                install_url=probe.install_url,
-            )
-            await activate_extension(
-                stack.base_url,
-                stack.gateway_token,
-                extension_name=ext["name"],
-                expected_display_name=ext.get("display_name") or probe.expected_display_name,
-            )
+            for installation in probe.installations:
+                if installation.name in installed:
+                    continue
+                ext = await install_extension(
+                    stack.base_url,
+                    stack.gateway_token,
+                    name=installation.name,
+                    expected_display_name=installation.expected_display_name,
+                    install_kind=installation.install_kind,
+                    install_url=installation.install_url,
+                )
+                await activate_extension(
+                    stack.base_url,
+                    stack.gateway_token,
+                    extension_name=ext["name"],
+                    expected_display_name=ext.get("display_name")
+                    or installation.expected_display_name,
+                )
+                installed[installation.name] = ext
 
         results: list[ProbeResult] = []
         for probe in probes:
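
The last hunk of `run_live_canary.py` replaces the one-extension-per-probe install with a loop over `probe.installations`, keyed by name so that an extension shared by several cases is installed only once. A minimal runnable sketch of that de-duplication is given here. `Installation`, `fake_install`, and `install_once` are hypothetical stand-ins; the real `install_extension` and `activate_extension` calls are replaced by a stub that records names.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Installation:
    """Hypothetical stand-in carrying only the extension name."""

    name: str


async def fake_install(name, log):
    """Stub for the install + activate round trip; records each call."""
    log.append(name)
    return {"name": name}


async def install_once(probes, log):
    """Install each distinct extension once, even if several probes need it."""
    installed = {}
    for installations in probes:
        for installation in installations:
            if installation.name in installed:
                continue  # already installed for an earlier probe
            installed[installation.name] = await fake_install(installation.name, log)
    return installed


log = []
probes = [
    [Installation("gmail")],
    [Installation("gmail"), Installation("linear")],
]
installed = asyncio.run(install_once(probes, log))
```

In this sketch `gmail` is requested by both probes but reaches the stub once, which is the property that keeps a combined run of single-provider cases and `ops_workflow` from re-installing shared extensions.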

scripts/live-canary/ACCOUNTS.md

Lines changed: 90 additions & 0 deletions
@@ -70,8 +70,17 @@ Every provider should have one stable, low-risk probe target.
 
 - Gmail: one inbox with at least one readable message or draft
 - Google Calendar: one calendar with at least one upcoming event
+- Google Drive: one accessible stable fixture query or file set
+- Google Docs: one readable fixture document
+- Google Sheets: one readable fixture spreadsheet/range
+- Google Slides: one readable fixture presentation
 - GitHub: one dedicated repository with one stable issue
+- Brave Search: one low-volume API key shared by Web Search and LLM Context
+- Slack: one workspace with a bot token that can list channels
+- Telegram: one logged-in user-mode MTProto session
+- Composio: one API key with at least one readable connected-account state
 - Notion: one test workspace with one searchable page or database row
+- Linear: one workspace with one searchable issue
 
 ## Seeded Lane Secrets
 
@@ -100,6 +109,21 @@ Recommended scopes:
 - `https://www.googleapis.com/auth/gmail.modify`
 - `https://www.googleapis.com/auth/gmail.compose`
 - `https://www.googleapis.com/auth/calendar.events`
+- `https://www.googleapis.com/auth/drive`
+- `https://www.googleapis.com/auth/documents`
+- `https://www.googleapis.com/auth/spreadsheets`
+- `https://www.googleapis.com/auth/presentations`
+
+Required only for the combined `ops_workflow` case:
+
+- `AUTH_LIVE_GOOGLE_DOC_ID`
+- `AUTH_LIVE_GOOGLE_SHEET_ID`
+- `AUTH_LIVE_GOOGLE_SLIDES_ID`
+
+Optional:
+
+- `AUTH_LIVE_GOOGLE_DRIVE_QUERY` (defaults to `trashed = false`)
+- `AUTH_LIVE_GOOGLE_SHEET_RANGE` (defaults to `A1:Z10`)
 
 ### GitHub
 
@@ -125,6 +149,72 @@ Optional:
 
 The probe should match a stable test page or database entry.
 
+### Linear
+
+Required:
+
+- `AUTH_LIVE_LINEAR_ACCESS_TOKEN`
+- `AUTH_LIVE_LINEAR_QUERY`
+
+Optional:
+
+- `AUTH_LIVE_LINEAR_REFRESH_TOKEN`
+- `AUTH_LIVE_LINEAR_TOOL_NAME`
+- `AUTH_LIVE_LINEAR_TOOL_ARGS_JSON`
+
+Use `AUTH_LIVE_LINEAR_TOOL_NAME` and `AUTH_LIVE_LINEAR_TOOL_ARGS_JSON` if the
+Linear MCP server's tool name or argument schema changes. The default tool name
+is `linear_search_issues`, with arguments `{"query": "<AUTH_LIVE_LINEAR_QUERY>"}`.
+
+### Brave Search
+
+Required for Web Search and LLM Context probes:
+
+- `AUTH_LIVE_BRAVE_API_KEY`
+
+### Slack
+
+Required:
+
+- `AUTH_LIVE_SLACK_BOT_TOKEN`
+
+The combined workflow uses `list_channels` to avoid posting on every scheduled
+run.
+
+### Telegram
+
+Required:
+
+- `AUTH_LIVE_TELEGRAM_API_ID`
+- `AUTH_LIVE_TELEGRAM_API_HASH`
+- `AUTH_LIVE_TELEGRAM_SESSION_JSON`
+
+The seeded runner writes these to `telegram/api_id`, `telegram/api_hash`, and
+`telegram/session.json` in the fresh workspace before activating the tool. The
+combined workflow uses `get_me` to avoid sending messages on every scheduled
+run.
+
+### Composio
+
+Required:
+
+- `AUTH_LIVE_COMPOSIO_API_KEY`
+
+The combined workflow uses `connected_accounts`, which is read-only.
+
+### Combined Ops Workflow
+
+Run this after provisioning every fixture above:
+
+```bash
+LANE=auth-live-seeded CASES=ops_workflow scripts/live-canary/run.sh
+```
+
+It installs and activates Gmail, Google Calendar, Google Drive, Google Docs,
+Google Sheets, Google Slides, GitHub, Web Search, LLM Context, Slack, Telegram,
+Composio, Notion, and Linear, then dispatches one deterministic `/v1/responses`
+turn that calls every tool.
+
 ## Browser-Consent Lane Secrets
 
 These are read by `scripts/auth_browser_canary/run_browser_canary.py`.
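
The Linear section added to `ACCOUNTS.md` gives a default tool name and default arguments, plus two override variables. One plausible way to resolve them is sketched here. `linear_tool_call` is a hypothetical helper, and treating `AUTH_LIVE_LINEAR_TOOL_ARGS_JSON` as a full replacement for the default arguments is an assumption; the commit does not show how the registry combines them.

```python
import json

# Default tool name as documented in ACCOUNTS.md.
DEFAULT_LINEAR_TOOL_NAME = "linear_search_issues"


def linear_tool_call(environ):
    """Resolve the Linear probe's tool name and arguments from a mapping of env vars."""
    name = environ.get("AUTH_LIVE_LINEAR_TOOL_NAME") or DEFAULT_LINEAR_TOOL_NAME
    raw_args = environ.get("AUTH_LIVE_LINEAR_TOOL_ARGS_JSON")
    if raw_args:
        # Assumption: an explicit JSON object replaces the default arguments.
        args = json.loads(raw_args)
        if not isinstance(args, dict):
            raise ValueError("AUTH_LIVE_LINEAR_TOOL_ARGS_JSON must be a JSON object")
    else:
        args = {"query": environ.get("AUTH_LIVE_LINEAR_QUERY", "")}
    return name, args
```

With only `AUTH_LIVE_LINEAR_QUERY=canary` set, the sketch yields the documented default call; setting both override variables lets the probe follow a renamed tool or a changed argument schema without a code change.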

scripts/live-canary/README.md

Lines changed: 1 addition & 0 deletions
@@ -83,6 +83,7 @@ Run selected auth provider cases:
 
 ```bash
 LANE=auth-live-seeded CASES=gmail,github scripts/live-canary/run.sh
+LANE=auth-live-seeded CASES=ops_workflow scripts/live-canary/run.sh
 LANE=auth-browser-consent CASES=google,github scripts/live-canary/run.sh
 ```
 