review: drip-159 litellm + gemini-cli + goose (4 PRs)

Bojun-Vvibe · Bojun-Vvibe · commit 9e53a1ec19dd · 2026-04-29T09:52:57.000+08:00
- BerriAI/litellm#26730: aiohttp SO_KEEPALIVE for NAT idle timeouts (merge-after-nits) - BerriAI/litellm#26714: gate skills auto-execution and guardrail exec() behind env vars (merge-after-nits) - google-gemini/gemini-cli#26139: FooterConfigDialog stale-closure reset bug (merge-after-nits) - aaif-goose/goose#8877: inline MCP apps in goose2 with local axum proxy (request-changes)
diff --git a/reviews/2026-W18/drip-159/BerriAI-litellm-pr-26714.md b/reviews/2026-W18/drip-159/BerriAI-litellm-pr-26714.md
@@ -0,0 +1,36 @@
+# Review: BerriAI/litellm#26714 — Security: gate skills auto-execution and guardrail exec() behind env vars
+
+- **Author**: johnpippett
+- **Head SHA**: `67fb5adbfd28cf33f7f00d3421a1e93f10ba08a7`
+- **Diff**: +33 / −1 across 5 files
+
+## What it does
+
+Closes two security-class default-on behaviors in the proxy:
+
+1. **Skills auto-execution** — `litellm/proxy/hooks/__init__.py:23-37` removes `"litellm_skills": SkillsInjectionHook` from the unconditional `PROXY_HOOKS` dict, then re-adds it inside an `if os.getenv("LITELLM_ENABLE_SKILLS", "false").lower() == "true":` block alongside the existing `LEGACY_MULTI_INSTANCE_RATE_LIMITING` env-gated entry. Pre-PR every proxy startup loaded `SkillsInjectionHook`, which (per its own contract) injects skill content into request flows automatically; post-PR an operator must opt in.
+
+2. **Guardrail `exec()` payload** — `litellm/proxy/guardrails/guardrail_hooks/custom_code/custom_code_guardrail.py:147-156` adds a static method `_require_custom_code_enabled()` that raises `CustomCodeCompilationError("Custom code guardrails are disabled by default. Set LITELLM_ENABLE_CUSTOM_CODE_GUARDRAILS=true to enable.")` when the env var isn't `true`, and calls it as the first line of `_do_compile`. The endpoint side (`guardrail_endpoints.py:2074-2080`) mirrors the gate at the `test_custom_code_guardrail` admin endpoint, returning a `TestCustomCodeGuardrailResponse(success=False, error=..., error_type="compilation")` instead of executing.
+
+3. **Test fixtures** — `tests/test_litellm/proxy/guardrails/guardrail_hooks/test_response_rejection_guardrail_code.py:9-13` and `tests/test_litellm/proxy/guardrails/test_custom_code_security.py:11-15` both add an `@pytest.fixture(autouse=True) def enable_custom_code_guardrails(monkeypatch): monkeypatch.setenv("LITELLM_ENABLE_CUSTOM_CODE_GUARDRAILS", "true")` so existing tests still exercise the compile path.
+
+## Concerns
+
+1. **Default-disable for `SkillsInjectionHook` is a behavior change for existing operators.** Anyone relying on skills today will silently lose them on next deploy with no migration log. The PR doesn't emit a startup `verbose_proxy_logger.warning("skills hook now requires LITELLM_ENABLE_SKILLS=true...")` and there's no CHANGELOG entry in the diff. For a security-class default-flip this is the right direction, but operators need a loud signal — at minimum a one-time WARN at proxy startup if a config file references skills features but the env var is unset.
+
+2. **`_require_custom_code_enabled()` raises `CustomCodeCompilationError`, which is the same error class used for actual code-compilation failures.** That collapses two very different operational signals (legitimate compile error vs. policy disable) into the same exception type. Catchers downstream now can't distinguish "user wrote bad code" from "admin disabled the feature." A subclass like `CustomCodeDisabledError(CustomCodeCompilationError)` or a separate `CustomCodeDisabledByPolicy` exception would let observability differentiate without breaking existing handlers.
+
+3. **The endpoint-side gate at `guardrail_endpoints.py:2074-2080` returns 200 with `success=False`, but the `_do_compile` path raises an exception.** Two paths to the same disable signal returning two different shapes: the endpoint says "200 OK, success=False, error_type=compilation"; the hook raises. If a CLI/SDK consumer of the endpoint uses HTTP status to discriminate, it'll never see this disable as a non-200. That's actually consistent with how compilation errors are reported by this endpoint today (200 with `success=False`), so it's defensible — but the symmetry should be called out in a comment so future maintainers don't "fix" it.
+
+4. **Env var naming inconsistency.** `LEGACY_MULTI_INSTANCE_RATE_LIMITING` (no `LITELLM_` prefix), `LITELLM_ENABLE_SKILLS`, `LITELLM_ENABLE_CUSTOM_CODE_GUARDRAILS`. The new vars are correctly prefixed; the legacy one should probably get an alias. Out of scope for this PR but worth a follow-up issue.
+
+5. **No CI claim or testing checklist filled in the PR body.** The body says "Related tests updated and passing" but doesn't link to a CI run. The two changed test files do appear to set the autouse fixture correctly, but there's no test for the *disabled* path (e.g., `test_custom_code_guardrail_disabled_by_default_returns_error`). Without that, a future revert to default-on would only break in production.
+
+6. **`litellm/proxy/hooks/__init__.py` still imports `SkillsInjectionHook` unconditionally** at module load time (the import is at the top of the file, not gated). Cheap, but if `SkillsInjectionHook` ever has expensive import-time side effects (it shouldn't, but), the gate is incomplete. Better: `if os.getenv(...) == "true": from .skills import SkillsInjectionHook; PROXY_HOOKS["litellm_skills"] = SkillsInjectionHook`.
+
+7. **No test for the skills hook gate itself.** The new `if` block in `hooks/__init__.py` has no assertion that, with the env var set, `PROXY_HOOKS["litellm_skills"] is SkillsInjectionHook`, and with it unset, the key is absent. Should be a 6-line `monkeypatch`-driven import-reload test.
+
+## Verdict
+
+**merge-after-nits** — correct security-default-flip for both classes (skills auto-injection and arbitrary-code `exec()` in guardrails), with the right env-var allowlist pattern. Land after (a) CI is green and linked, (b) the disabled-by-default test path is covered for both gates, (c) `CustomCodeCompilationError`-vs-`CustomCodeDisabledError` distinction is decided, and (d) a CHANGELOG / startup-warning is added so operators using `SkillsInjectionHook` today learn about the flip before they ship.
+
diff --git a/reviews/2026-W18/drip-159/BerriAI-litellm-pr-26730.md b/reviews/2026-W18/drip-159/BerriAI-litellm-pr-26730.md
@@ -0,0 +1,40 @@
+# Review: BerriAI/litellm#26730 — fix: added keepalive args for aiohttp tcpconnector
+
+- **Author**: yassinkortam
+- **Head SHA**: `0e1ba362ebf021f44f75d579937737f98ba64c9c`
+- **Diff**: +182 / −0 across 4 files
+- **Linear**: LIT-2592
+
+## What it does
+
+Adds opt-in TCP `SO_KEEPALIVE` (with platform-aware `TCP_KEEPIDLE` / `TCP_KEEPINTVL` / `TCP_KEEPCNT` tuning) to the aiohttp `TCPConnector` used by both the in-process httpx-on-aiohttp transport and the proxy's shared aiohttp session. The motivating failure is the well-known mismatch between AWS NAT Gateway's 350-second idle timeout and OpenAI/Azure's up-to-600-second slow-response window: the kernel sends nothing on the wire while the provider thinks; the NAT reaps the flow; the response never arrives.
+
+The implementation:
+
+- **`litellm/constants.py:227-235`** adds four env-driven constants: `AIOHTTP_SO_KEEPALIVE` (default `False`, opt-in), `AIOHTTP_TCP_KEEPIDLE` (60s), `AIOHTTP_TCP_KEEPINTVL` (30s), `AIOHTTP_TCP_KEEPCNT` (5). Defaults are conservative (60+30*5=210s before connection drop) and well below the 350s NAT idle.
+- **`litellm/llms/custom_httpx/http_handler.py:63-100`** introduces a one-time module-level capability probe `_AIOHTTP_SUPPORTS_SOCKET_FACTORY = "socket_factory" in inspect.signature(TCPConnector.__init__).parameters` (aiohttp ≥3.10 only) and a `_build_aiohttp_keepalive_socket_factory()` helper that returns either `None` (when disabled or unsupported) or a callable matching aiohttp's `socket_factory` contract: takes `addr_info: Tuple[Any, ...]`, returns a non-blocking `socket.socket(family, type, proto)` with `SO_KEEPALIVE=1` plus the three platform-conditional TCP knobs (`TCP_KEEPIDLE` on Linux, `TCP_KEEPALIVE` on Darwin/macOS as documented in the inline comment, `TCP_KEEPINTVL`/`TCP_KEEPCNT` where available).
+- **`http_handler.py:982-987`** in `_create_aiohttp_transport` and **`proxy_server.py:672-691`** in `_initialize_shared_aiohttp_session` both consult `_build_aiohttp_keepalive_socket_factory()` and only attach `socket_factory` to `transport_connector_kwargs` / `connector_kwargs` when the helper returns non-None. Clean opt-in with no behavior change for users who don't set `AIOHTTP_SO_KEEPALIVE=true`.
+- **`tests/test_litellm/llms/custom_httpx/test_aiohttp_so_keepalive.py`** (+115) is a four-test surface using `monkeypatch` to flip the constants and `unittest.mock.patch` to spy on `TCPConnector` kwargs:
+  - `test_socket_factory_omitted_when_disabled` — flag off → kwarg absent
+  - `test_socket_factory_attached_when_enabled` — flag on, supports kwarg → kwarg present and callable
+  - `test_socket_factory_skipped_on_old_aiohttp` — flag on, doesn't support kwarg → kwarg absent (no crash)
+  - `test_socket_factory_sets_keepalive_options` — exercises the factory itself with a `MagicMock(spec=socket.socket)`, asserts `setblocking(False)` and the exact `setsockopt` triple-tuples, with `if hasattr(socket, "TCP_KEEPIDLE") ... elif hasattr(socket, "TCP_KEEPALIVE")` branches matching the implementation's platform fork.
+
+## Concerns
+
+1. **PR's pre-submission checklist is fully unchecked** — no Greptile review, no CI links, no claim that `make test-unit` passes. Given the change touches both transport paths *and* the proxy's shared session bootstrap, both should have green CI before merge. Test surface in-PR is decent but doesn't cover the actual integration (no aiohttp server stood up).
+
+2. **`socket.setblocking(False)` is set on the bare socket before aiohttp gets it.** aiohttp normally creates non-blocking sockets internally; setting it ourselves is fine for standard `AF_INET`/`AF_INET6` SOCK_STREAM, but worth confirming aiohttp's `socket_factory` contract doesn't expect it to be left blocking and let the loop handle it. The aiohttp 3.10 release notes for `socket_factory` should clarify; the PR doesn't cite them. If aiohttp expects blocking, this is fine on Linux/Darwin but could subtly break on Windows (which doesn't use this code path heavily, but still).
+
+3. **`AIOHTTP_TCP_KEEPIDLE=60` defaults to one probe per minute, which is aggressive.** If a deployment has tens of thousands of long-lived connections (provider-side keepalive pools), that's tens of thousands of unnecessary keepalive probes per minute. The `TCPConnector(keepalive_timeout=AIOHTTP_KEEPALIVE_TIMEOUT=120)` default already reaps idle conns at 2 minutes, so KEEPIDLE=60 fires *during* the still-warm window. A default of `300` (closer to NAT idle / 2) would be less probe-spammy and still well under the 350s NAT timeout. Document in the constants block what the math is for the recommended values and what the user's tradeoff is when overriding.
+
+4. **Two call sites that build aiohttp connectors duplicate the `if socket_factory is not None: kwargs["socket_factory"] = socket_factory` pattern.** Worth hoisting both kwargs-builders into `_build_aiohttp_connector_kwargs(base_kwargs: Dict)` in `http_handler.py` so the next person adding a connector option doesn't have to remember to update both.
+
+5. **`socket.SO_KEEPALIVE=1` reset behavior on connection reuse.** aiohttp's connector reuses sockets across requests. The factory only fires on socket *creation*, so reused sockets keep their keepalive — good. But sockets in the pool that were created *before* the user flipped the env var won't have it. This is fine for typical deployment (env var set at boot), but it should be documented that the flag is read once at module import via `AIOHTTP_SO_KEEPALIVE = os.getenv(...)` and runtime toggles via dynamic config won't take effect.
+
+6. **No test for the proxy `_initialize_shared_aiohttp_session` path.** The four tests cover `_create_aiohttp_transport` only. A fifth test patching the proxy bootstrap function and asserting `connector_kwargs["socket_factory"]` is set under the same conditions would close the symmetry.
+
+## Verdict
+
+**merge-after-nits** — correct opt-in fix for a real production failure mode (NAT-gateway-vs-LLM-latency mismatch), with an `inspect.signature`-based capability probe that gracefully degrades on old aiohttp. Land after the PR template is filled in (CI links, unit tests passing), the `_initialize_shared_aiohttp_session` path picks up the same test coverage, and the default `AIOHTTP_TCP_KEEPIDLE=60` is reconsidered or its tradeoff documented in the constants block.
+
diff --git a/reviews/2026-W18/drip-159/block-goose-pr-8877.md b/reviews/2026-W18/drip-159/block-goose-pr-8877.md
@@ -0,0 +1,45 @@
+# Review: block/goose#8877 — render mcp apps inline in goose2
+
+- **Author**: aharvard
+- **Head SHA**: `5d37fdd91653b6446ba62c3eda2d801f8d7be1a5`
+- **Diff**: +2120 / −52 across 39 files
+- **Tracking**: #8591
+- **Draft**: yes
+
+## What it does
+
+Turns goose2 into a real inline MCP-Apps host. Pre-PR (after #8632) the durable MCP-app payload was carried through live updates and replay, but the UI only displayed it as data. This PR adds:
+
+1. **A local authenticated proxy + sandboxed iframe for app HTML.** `crates/goose/src/acp/mcp_app_proxy.rs` (+229) is a brand-new axum router with three endpoints: a `GET` proxy serving `templates/mcp_app_proxy.html` (a 141-line CSP-shaped wrapper) gated by a `secret` query param plus configurable CSP allowlists (`connect_domains`, `resource_domains`, `frame_domains`, `base_uri_domains`, `script_domains`); a `GET` for guest HTML keyed by `(secret, nonce)`; and a `POST` to store guest HTML keyed by `secret` with TTL=300s and max-entries=64 (`GUEST_HTML_TTL_SECS`, `GUEST_HTML_MAX_ENTRIES`).
+2. **A new ACP method `_goose/tool/call`** (`crates/goose-sdk/src/custom_requests.rs:77-99`) with `GooseToolCallRequest { session_id, name, arguments }` and `GooseToolCallResponse { content, structured_content, is_error, _meta }`. Mirrored in `acp-meta.json` and `acp-schema.json` (+59). This is the channel for app→Goose nested tool calls without going through chat.
+3. **CLI plumbing for the proxy secret**: `crates/goose-cli/src/cli.rs:1086-1091` reads `GOOSE_SERVER__SECRET_KEY` (default `"goose-acp-local"`) and threads it into `create_router(server, secret_key)`.
+4. **Goose2 frontend host bridge** (`ui/goose2/src/features/chat/ui/McpAppView.tsx` +412 / −10): mounts the app inside an `AppRenderer` sandbox, fetches `httpBaseUrl + secretKey` via `get_goose_serve_host_info()` (new Tauri command, `ui/goose2/src-tauri/src/services/acp/goose_serve.rs:14`), provides hostContext (theme, locale, display mode, timezone, container dimensions), routes app→host requests to:
+   - `ui/message` → `onSendMessage(text)` → normal chat send path
+   - `ui/get-context` → return current `hostContext`
+   - `tools/call` → `_goose/tool/call(sessionId, extension__toolName, args)`
+   - `resources/read` → ACP resource-read for the same session/extension
+   - `ui/open-link` → native opener
+   - `ui/notifications/size-changed` → resize inline frame + sticky autoscroll
+5. **Replay / MessageBubble / MessageTimeline integration** (+57/+176/+64) — preserve MCP app payloads through both live updates and replay; attach to the correct assistant message; respect app's `border` preference.
+6. **Tests**: `McpAppView.test.tsx` (+238), `ChatView.mcpApp.test.tsx` (+122), `MessageBubble.mcpApp.test.tsx` (+28/-6), `acpNotificationHandler.test.ts` (+54), `MessageBubble.test.tsx` (+20). Covers the renderer, the chat-view integration, the bubble path, and the ACP notification handler.
+
+## Concerns (DRAFT — author lists own follow-ups)
+
+1. **HTTP (not HTTPS) local proxy with a long-lived shared secret.** `GOOSE_SERVER__SECRET_KEY` defaults to the literal string `"goose-acp-local"` (`cli.rs:1088`). If the user never sets it, every goose2 install on every machine has the same secret. The `GET /proxy?secret=goose-acp-local` endpoint is bound to localhost (presumably — not shown in the new file's address-bind, which inherits from `handle_serve_command`'s socket), but if the host is later reused on a multi-user box (shared dev VM, codespace), any unprivileged process can fetch app HTML and exfiltrate it. The PR's own scope-statement acknowledges "explore whether the local proxy should move to HTTPS and/or a stronger origin model for long-term hardening" — that should be in scope before this leaves DRAFT, not after, because the path is structurally vulnerable.
+
+2. **Guest HTML store is unbounded-growth in pathological flows.** `GUEST_HTML_MAX_ENTRIES=64` is the cap, but eviction policy isn't shown in the visible diff (need to check the full `mcp_app_proxy.rs`). If it's FIFO, a malicious or buggy app can flush legitimate guest HTML out of the store within one tool call. If it's LRU, the same attack is harder but still possible. A test that fills the store past the cap and asserts the eviction shape would pin it.
+
+3. **`extension__toolName` namespacing in the `_goose/tool/call` payload.** The frontend constructs `extension__toolName` (double-underscore concat) and sends it to `_goose/tool/call` (`McpAppView.tsx`, per the README). The server side dispatches by name. If two extensions register tools with double-underscore-bearing names (`weather__forecast` and a hypothetical `weather__forecast` from a second extension), the routing collides silently. The right fix is structured `{ extension, tool }` fields, not a flat string. The added `GooseToolCallRequest` shape has only `name`, so this is a wire-protocol decision that's hard to walk back.
+
+4. **i18n updated for `en` and `es` only** (`ui/goose2/src/shared/i18n/locales/{en,es}/chat.json` +6/−2 each). `de`, `fr`, `ja`, `pt-BR`, `zh-CN` (the rest of goose2's i18n contract per recent drips) are not updated. This is consistent with #8881 and #8886 reviewed earlier this week (drip-157, drip-156) — locale parity drift is becoming a pattern in goose2 PRs that should be addressed at process level (e.g., a CI gate on translation-key parity).
+
+5. **No CSRF protection on `POST` store-guest-html.** The proxy uses a shared secret in the *query string* for both the GET and POST endpoints. Query strings end up in browser history, server logs (depending on logging config), and `Referer` headers. Even bound to localhost, a CSRF attack from a malicious page that the user's browser is currently rendering can issue a cross-origin POST against `http://localhost:<port>/store-guest-html?secret=goose-acp-local` and inject arbitrary HTML into the next iframe load. Standard mitigations (move secret to a header, add CSRF token, strict CORS) are missing.
+
+6. **Container dimensions in hostContext are user-controlled and reflected back to the app.** `ui/notifications/size-changed` triggers `Resize inline frame + request sticky autoscroll`. If the app reports a 100000px height, what's the ceiling? A misbehaving app could push the page into a scroll-storm. A bound on the iframe `height` would be defensive.
+
+7. **Generated SDK files (`ui/sdk/src/generated/{client,index,types,zod}.gen.ts`) are committed in the PR.** That's correct for goose2's pattern, but means the PR's logical surface is smaller than the +2120 line count suggests. Worth flagging in the description so reviewers don't get sticker-shock.
+
+## Verdict
+
+**request-changes** — the core integration shape (separate inline-app proxy, distinct ACP method for app-originated tool calls, host-context surface, replay-preservation) is the right architecture. But the security shape of the local proxy (default-shared secret, query-string auth, no HTTPS, no CSRF protection on the POST endpoint, no clear eviction policy on the guest-HTML store) is not yet ready to ship even in DRAFT-merge form, because the security model has to be the *first* thing this code locks in — UI iteration is cheap, retrofitting authn/authz onto a deployed iframe-host bridge is not. The wire-protocol choice of flat-string `extension__toolName` is also harder to walk back than to fix now. Locale parity (Concern 4) is a process issue but should not block this PR alone. Move to merge-after-nits once Concerns 1, 3, and 5 are addressed structurally (not merely commented away as "future hardening").
+
diff --git a/reviews/2026-W18/drip-159/google-gemini-gemini-cli-pr-26139.md b/reviews/2026-W18/drip-159/google-gemini-gemini-cli-pr-26139.md