Skip to content

Commit 036b077

Browse files
committed
review: drip-182 batch 2 — BerriAI/litellm#26793, google-gemini/gemini-cli#26208+26207
- BerriAI/litellm#26793 feat(proxy): durable agent workflow run tracking (needs-discussion — sequence-number race + missing tenancy scoping + Prisma error leak) - google-gemini/gemini-cli#26208 fix: suppress duplicate extension warnings (merge-after-nits) - google-gemini/gemini-cli#26207 Add @ mention the gemini robot (request-changes — critique-semantics flip + comment-step if guard regression + PAT scope concern)
1 parent 12d2e47 commit 036b077

3 files changed

Lines changed: 202 additions & 0 deletions

File tree

Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
---
2+
pr: BerriAI/litellm#26793
3+
sha: 9d9efc1afba6fd1d6421f775a9a150524ad27a5c
4+
verdict: needs-discussion
5+
reviewed_at: 2026-04-29T18:31:00Z
6+
---
7+
8+
# feat(proxy): durable agent workflow run tracking via /v1/workflows/runs
9+
10+
## Context
11+
12+
Adds three new Prisma models (`LiteLLM_WorkflowRun`,
13+
`LiteLLM_WorkflowEvent`, `LiteLLM_WorkflowMessage`) and 8 REST endpoints
14+
under `/v1/workflows/runs` in
15+
`litellm/proxy/management_endpoints/workflow_management_endpoints.py`.
16+
The motivating use case: long-running agents (the PR mentions
17+
"shin-builder") whose conversation state and step history were
18+
in-memory only, so a process restart wiped everything.
19+
20+
## What's good
21+
22+
- Generic schema design — `workflow_type`, `step_name`, and `data` are
23+
caller-defined strings/JSON. No hardcoded stage names baked into
24+
the proxy. The status auto-update map (`_EVENT_STATUS_MAP` at the top
25+
of the file: `step.started→running`, `hook.waiting→paused`, etc.)
26+
is small and overridable via `PATCH /v1/workflows/runs/{run_id}`.
27+
- The `session_id` bridge to `LiteLLM_SpendLogs.session_id` is a clean
28+
reuse: existing `/ui/spend_logs/view_session_spend_logs?session_id=`
29+
works for free. No new cost-tracking machinery needed.
30+
- Comma-separated status filter (`?status=running,paused`) handled
31+
cleanly in `list_workflow_runs`: splits and emits `{"in": statuses}`
32+
only when there's more than one. That's the correct Prisma idiom.
33+
34+
## Concerns
35+
36+
1. **Sequence number race.** `_get_next_sequence_number` does
37+
`MAX(sequence_number) + 1` then the caller does an insert. Two
38+
concurrent appenders racing on the same `run_id` will both compute
39+
the same next value and one will conflict on the unique index (or
40+
silently overwrite, depending on schema constraints not shown in
41+
the diff). For an agent that may have multiple step emitters or
42+
hook receivers, this is a real failure mode. Either:
43+
(a) wrap the read+insert in a Prisma transaction with
44+
`SERIALIZABLE` isolation, or
45+
(b) use a Postgres sequence / `RETURNING sequence_number` with a
46+
partitioned-by-run sequence.
47+
2. **No authorization scoping.** All endpoints depend on
48+
`user_api_key_auth` but don't filter `where` by the calling key's
49+
tenant/team. Any authenticated key can `GET /v1/workflows/runs/{any_id}`
50+
and read another team's conversation. The conversation-message table
51+
stores full content (the PR explicitly notes spend logs truncate at
52+
`MAX_STRING_LENGTH_PROMPT_IN_DB` while this doesn't), so the blast
53+
radius of a missing scope check is high.
54+
3. **`raise HTTPException(status_code=500, detail=str(e))`** leaks
55+
raw Prisma error messages (table names, column names, sometimes
56+
query fragments) to the caller. Standard advice: log with
57+
`verbose_proxy_logger.exception(...)` (already done) and return
58+
a generic 500 message.
59+
60+
## Verdict
61+
62+
`needs-discussion` — the data model and endpoint shape are good, but
63+
the concurrency model and tenancy scoping need to be settled before
64+
this goes live. The error-leak issue is a small follow-up.
Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
---
2+
pr: google-gemini/gemini-cli#26207
3+
sha: 16ff1e342a67de573d2301ebe22d27e9ed96daf5
4+
verdict: request-changes
5+
reviewed_at: 2026-04-29T18:31:00Z
6+
---
7+
8+
# Add the ability to @ mention the gemini robot
9+
10+
## Context
11+
12+
Extends `.github/workflows/gemini-cli-bot-brain.yml` to fire on
13+
`issue_comment` events when a maintainer `@gemini-cli-robot` mentions
14+
the bot. The bot then loads `tools/gemini-cli-bot/brain/interactive.md`,
15+
runs in `ENABLE_PRS=true` mode, and is permitted to post comments and
16+
push to `bot/`-prefixed branches via the
17+
`GEMINI_CLI_ROBOT_GITHUB_PAT` secret.
18+
19+
## What's good
20+
21+
- The trigger guard at the job level is the right shape — uses
22+
`contains(fromJSON('["COLLABORATOR", "MEMBER", "OWNER"]'), github.event.comment.author_association)`
23+
to gate on association rather than username, which is harder to
24+
spoof. The reasoning job is also `permissions: contents: read`,
25+
so the LLM call itself can't push.
26+
- `concurrency.group` now includes `github.event.issue.number || …`
27+
so two concurrent maintainer mentions on different issues don't
28+
cancel each other.
29+
- The `<untrusted_context>...</untrusted_context>` wrapper around
30+
the comment body and `gh issue view` output in `trigger_context.md`
31+
is the correct prompt-injection mitigation idiom, and concatenating
32+
it *before* the system prompt keeps untrusted text from clobbering
33+
later instructions.
34+
- The push-time branch guard (`if [[ ! "$BRANCH_NAME" =~ ^bot/ ]]; then exit 1; fi`)
35+
and the comment-time author guard (`PR_AUTHOR=$(gh pr view "$PR_NUM" --json author --jq '.author.login'); if [ "$PR_AUTHOR" != "gemini-cli-robot" ]; then exit 1; fi`)
36+
are the right safety belt for the publish phase.
37+
38+
## Concerns
39+
40+
1. **Critique semantics inverted.** The previous code rejected on
41+
`[REJECTED]` *or* non-zero exit, defaulted to approve. The new
42+
code rejects on anything that isn't *explicitly* `[APPROVED]` and
43+
also lacks `[REJECTED]`. That's a stronger gate (good), but the
44+
comment "Critique failed, rejected, or did not explicitly approve
45+
changes" hides the fact that this is now a fail-closed contract.
46+
Maintainers who relied on the old "neutral = approve" semantics
47+
for their own dispatched runs will see silent skips. Worth a
48+
release note.
49+
2. **`Post PR/Issue Comment` step lost its `if:` guard.** Old code
50+
had `if: "${{ github.event.inputs.enable_prs == 'true' }}"`. The
51+
new step runs unconditionally — it's safe because the inner `if [ -s ... ]`
52+
tests check for content, but on a scheduled run with no comment
53+
artifact you now do extra work and the step shows green-with-noop
54+
in the UI. Minor but a regression in clarity.
55+
3. **`gh issue view ... 2>/dev/null || gh pr view "$TRIGGER_ISSUE_NUMBER"`**
56+
— if both fail (deleted issue, permissions hiccup), the script
57+
keeps going with empty trigger context. The bot then operates
58+
blind on whatever `interactive.md` says by default. Should
59+
`set -e` or explicitly check `[ -s trigger_context.md ]` before
60+
proceeding.
61+
4. **PAT scope.** `GEMINI_CLI_ROBOT_GITHUB_PAT` now ships into a
62+
workflow that any maintainer comment can trigger. If a
63+
compromised maintainer account exists, this is a remote command
64+
execution into the bot's branch space. The `bot/` prefix limits
65+
the blast radius but doesn't eliminate it. Consider narrowing
66+
the PAT to `contents: write` on `refs/heads/bot/*` only via a
67+
GitHub App installation token rather than a long-lived PAT.
68+
69+
## Verdict
70+
71+
`request-changes` — the security model is mostly sound, but the
72+
critique-result semantics flip needs an explicit comment in the
73+
diff, the comment-step `if:` guard should be restored, and the
74+
unconditional fall-through when both `gh issue view` and `gh pr view`
75+
fail needs to abort. The PAT-vs-App-token discussion is a separate
76+
follow-up but worth raising before this lands.
Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
---
2+
pr: google-gemini/gemini-cli#26208
3+
sha: baeccee504acbf3d57aac0836ce3d2720f83c107
4+
verdict: merge-after-nits
5+
reviewed_at: 2026-04-29T18:31:00Z
6+
---
7+
8+
# fix: suppress duplicate extension warnings during startup
9+
10+
## Context
11+
12+
`gemini.tsx` calls `loadCliConfig` twice during startup — once for
13+
`partialConfig` (auth/sandbox bootstrap) and once for the full
14+
interactive session. Both calls were unconditionally instantiating
15+
`new ExtensionManager(...)` and calling `loadExtensions()`, which
16+
emits warnings (missing settings, MCP deprecations) through
17+
`coreEvents`. Result: every warning bubble showed twice. Fix in
18+
`packages/cli/src/config/config.ts` adds a `skipExtensions?: boolean`
19+
to `LoadCliConfigOptions` and the bootstrap call passes `true`.
20+
21+
## What's good
22+
23+
- Targeted, minimal change. Only the bootstrap call is opted out
24+
(`packages/cli/src/gemini.tsx` line 410: `skipExtensions: true`),
25+
and the second/interactive call still loads extensions normally.
26+
- The `SimpleExtensionLoader([])` fallback at config.ts line 681
27+
(`const finalExtensionLoader = extensionManager ?? new SimpleExtensionLoader([])`)
28+
preserves the type contract for downstream consumers
29+
(`loadServerHierarchicalMemory`, the final `Config` object's
30+
`extensionLoader` field). Good defensive choice — avoids forcing
31+
every callsite to handle `undefined`.
32+
- Optional chaining on `extensionManager?.getExtensions()?.find(...)`
33+
for `extensionPlanSettings` is correct: if extensions are skipped,
34+
there are no plan settings to find.
35+
36+
## Concerns / nits
37+
38+
1. **`pr_body.md` checked in.** The diff includes a top-level
39+
`pr_body.md` file that's just the PR description duplicated.
40+
That's almost certainly an accident from a `gh pr create
41+
--body-file pr_body.md` workflow. Should be removed before merge
42+
(or `.gitignore`d).
43+
2. **`skipExtensions` semantics aren't tested in this diff.** The PR
44+
description references a manual test file path
45+
(`packages/cli/src/config/skipExtensions.test.ts`) but the diff
46+
shown doesn't contain it. Either add the test or update the PR
47+
body — saying "added/updated tests" is checked in the checklist
48+
but the test isn't visible.
49+
3. **Subtle behavior change.** When `skipExtensions: true`,
50+
`extensionPlanSettings` is `undefined`, so any setting that
51+
*only* comes from an extension plan (e.g. plan-injected
52+
`includeDirectories`) is silently absent during the bootstrap
53+
pass. If anything in the auth/sandbox bootstrap actually consults
54+
that, this becomes a regression. Worth a quick audit of what
55+
`partialConfig` is used for between those two `loadCliConfig`
56+
calls.
57+
58+
## Verdict
59+
60+
`merge-after-nits` — remove the stray `pr_body.md`, add or point to
61+
the actual test, and confirm bootstrap-phase consumers don't depend
62+
on extension plan settings.

0 commit comments

Comments
 (0)