perf(proxy): cut request-body memory amplification + admit-control large uploads#594
Conversation
…rge uploads Addresses OOM under load. Root cause was the request/response body being copied and re-serialized many times per request, plus unbounded concurrent large uploads (e.g. /v1/images/edits, tens of MB) all resident at once. Changes: - repo Update no longer re-marshals the big body fields on every status transition. Split toModel -> toModelMeta; Update never encodes request_info (immutable after Create, gorm Omit) and only encodes response_info when non-nil. Kills ~7 of ~8 multi-MB json.Marshal calls per request lifecycle (proxy_request.go, proxy_upstream_attempt.go) — the dominant allocator. - drop the redundant bytes.Clone of the request body in proxy.go ingress (converter/normalize never mutate in place; converting_writer re-clones). - domain.RequestBodySnapshot: for multipart/image/audio/video/octet-stream uploads, store a "<type, N bytes, body omitted>" placeholder in RequestInfo.Body instead of stringifying tens of MB of binary into SQLite (and a redundant attempt copy). dev_mode keeps the full body. Applied at all three RequestInfo construction sites. - large-upload admission control (handler/upload_limit.go): bodies over a threshold are gated by a Content-Length-aware concurrency limiter so peak memory is bounded to ~maxConcurrency x size; small text/JSON requests are never gated. Optional hard cap (413). All tunable via MAXX_* env, safe defaults. Keeps the []byte pipeline untouched (no adapter changes). Deliberately deferred (per design discussion): full streaming / spool-to-disk refactor of the body pipeline — high blast radius across 7 provider adapters for a marginal win now that the above is done. Revisit if production still OOMs. Tests: new coverage for repo Update invariants (request_info preserved, response_info persisted), RequestBodySnapshot, and the upload limiter (gating, concurrency bound, wait+timeout, ctx-cancel, hard cap, disabled). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
📝 WalkthroughWalkthrough此 PR 实现了请求体快照、大上传并发准入控制与数据库持久化优化三个独立功能:按 content-type 与 devMode 的可配置快照截断、基于阈值/并发/硬上限的上传准入、以及 Update 路径的选择性大字段写入以优化数据库性能。 Changes请求体快照
上传大小与并发准入
数据库持久化优化
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
internal/repository/sqlite/proxy_request_test.go (1)
217-228: ⚡ Quick win测试未真正覆盖注释(190-193 行)声明的“nil ResponseInfo 不覆盖已有 response_info”不变量。
状态类 Update 执行时,库中
response_info还是 Create 写入的"null"占位(从未写入过真实响应体)。因此第 226 行的断言只能证明“没写入新值”,无法证明“已有的真实response_info在 nil Update 下被保留”。建议补一步:先写入真实response_info,再用ResponseInfo == nil的 Update 校验其仍被保留,才能锁定该不变量。💚 建议补充的用例(示意)
// 1.5) 先写入真实 response_info,再做 nil ResponseInfo 的状态类 Update, // 验证已有 response_info 不被 Omit 之外的逻辑清空。 req.ResponseInfo = &domain.ResponseInfo{Status: 200, Body: "the-response-body"} if err := repo.Update(req); err != nil { t.Fatalf("seed response_info: %v", err) } req.ResponseInfo = nil // 后续状态类 Update 不携带 ResponseInfo if err := repo.Update(req); err != nil { t.Fatalf("update status after response set: %v", err) } if !strings.Contains(string(reload().ResponseInfo), "the-response-body") { t.Fatalf("existing response_info wiped by nil-ResponseInfo Update") }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/repository/sqlite/proxy_request_test.go` around lines 217 - 228, The test currently only checks that a nil ResponseInfo in a status-only Update doesn't write a new value, but doesn't verify that an existing real response_info is preserved; to fix, before the status-only Update seed a real response payload by setting req.ResponseInfo to a non-nil domain.ResponseInfo and calling repo.Update(req), then set req.ResponseInfo = nil and call repo.Update(req) again, and finally call reload() and assert that reload().ResponseInfo still contains the original "the-response-body"; modify the test around the existing RequestInfo/ResponseInfo assertions (using repo.Update, req.ResponseInfo and reload()) to include this additional seed-and-verify step.internal/handler/proxy.go (1)
114-146: ⚖️ Poor tradeoff上传限流拒绝(413/429)未落库,admin 侧不可见。
与令牌并发限流路径(Line 217-219 会调用
h.executor.RecordRejectedProxyRequest)不同,这里的tooLarge413 与acquire失败 429 直接writeError/writeRateLimitError后 abort,不会生成 REJECTED 记录。被准入控制挡掉的大上传在管理后台与统计中将完全不可见,难以观测限流是否生效或被误触发。可考虑在这两个分支也记录一次拒绝(注意此时apiToken尚未解析,需评估可用上下文)。🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/handler/proxy.go` around lines 114 - 146, The 413 (tooLarge) and 429 (acquire failure) branches in the upload limiter path currently call writeError/writeRateLimitError and abort without emitting a REJECTED record; update both branches to call h.executor.RecordRejectedProxyRequest (the same recorder used in the token-concurrency path) before writeError/writeRateLimitError so the admin UI and metrics see the rejection, populating it with available context (e.g. r.Context(), r.Method, r.URL.Path, r.ContentLength and client IP if extractable) because apiToken is not yet parsed; ensure you still call c.Abort() and return after recording the rejection.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/domain/request_snapshot.go`:
- Around line 20-28: RequestBodySnapshot currently returns the full body for
non-binary content, allowing large snapshots to be stored; change it so that
(unless devMode is true) non-binary bodies are truncated to a safe byte cap and
replaced with a placeholder indicating the content type and original size.
Update RequestBodySnapshot to check len(body) against a configurable cap (use
MAXX_MAX_UPLOAD_BYTES or an explicit safe default when that config <= 0), and if
over the cap return a placeholder like the existing binary branch (refer to
RequestBodySnapshot, isBinaryUploadContentType, contentTypeToken and the
MAXX_MAX_UPLOAD_BYTES config) instead of string(body). Ensure devMode still
returns full body and keep binary branch behavior unchanged.
---
Nitpick comments:
In `@internal/handler/proxy.go`:
- Around line 114-146: The 413 (tooLarge) and 429 (acquire failure) branches in
the upload limiter path currently call writeError/writeRateLimitError and abort
without emitting a REJECTED record; update both branches to call
h.executor.RecordRejectedProxyRequest (the same recorder used in the
token-concurrency path) before writeError/writeRateLimitError so the admin UI
and metrics see the rejection, populating it with available context (e.g.
r.Context(), r.Method, r.URL.Path, r.ContentLength and client IP if extractable)
because apiToken is not yet parsed; ensure you still call c.Abort() and return
after recording the rejection.
In `@internal/repository/sqlite/proxy_request_test.go`:
- Around line 217-228: The test currently only checks that a nil ResponseInfo in
a status-only Update doesn't write a new value, but doesn't verify that an
existing real response_info is preserved; to fix, before the status-only Update
seed a real response payload by setting req.ResponseInfo to a non-nil
domain.ResponseInfo and calling repo.Update(req), then set req.ResponseInfo =
nil and call repo.Update(req) again, and finally call reload() and assert that
reload().ResponseInfo still contains the original "the-response-body"; modify
the test around the existing RequestInfo/ResponseInfo assertions (using
repo.Update, req.ResponseInfo and reload()) to include this additional
seed-and-verify step.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e374b7ad-01a3-4ac1-b64f-18cfd368ed6f
📒 Files selected for processing (11)
internal/domain/request_snapshot.gointernal/domain/request_snapshot_test.gointernal/executor/executor.gointernal/executor/middleware_ingress.gointernal/handler/provider_proxy.gointernal/handler/proxy.gointernal/handler/upload_limit.gointernal/handler/upload_limit_test.gointernal/repository/sqlite/proxy_request.gointernal/repository/sqlite/proxy_request_test.gointernal/repository/sqlite/proxy_upstream_attempt.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Backend Checks
- GitHub Check: playwright
- GitHub Check: e2e
- GitHub Check: multiinstance
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ymkiux
Repo: awsl-project/maxx PR: 0
File: :0-0
Timestamp: 2026-03-30T09:23:52.570Z
Learning: In awsl-project/maxx PR `#466`, all 5 concerns raised in the initial review were incorrect. The `APITokenConcurrencySection` useEffect guard was present, `AcquireConcurrency` correctly returns `ErrInvalidToken` for zero-ID+empty-token, `ResolveToken` is fully tested, `IsStreamRequest` has no dependency on clientType, and passing `""` as ClientType in `ExtractToken` is intentional to scan all auth headers. Be more careful to read the actual code before flagging concerns.
🔇 Additional comments (10)
internal/domain/request_snapshot.go (1)
31-56: LGTM!internal/domain/request_snapshot_test.go (1)
8-49: LGTM!internal/executor/executor.go (1)
217-218: LGTM!Also applies to: 232-232
internal/executor/middleware_ingress.go (1)
121-121: LGTM!internal/handler/provider_proxy.go (1)
263-263: LGTM!internal/repository/sqlite/proxy_request.go (2)
63-73: LGTM!
635-685: LGTM!internal/repository/sqlite/proxy_upstream_attempt.go (1)
37-47: LGTM!Also applies to: 369-410
internal/handler/upload_limit.go (1)
42-140: LGTM!internal/handler/upload_limit_test.go (1)
9-109: LGTM!
…flow guard, tests Addresses CodeRabbit + local Codex review findings: - request_snapshot: cap non-binary (text/JSON) snapshots too. A client-controlled Content-Type could disguise a huge body as JSON and bloat the DB TEXT column; now bodies over MAXX_REQUEST_SNAPSHOT_MAX_BYTES (default 256KiB) store a truncation placeholder. dev_mode still keeps full body. Exported IsBinaryUploadContentType for reuse. - uploadLimiter: gate unknown-length (chunked, Content-Length<0) binary/multipart uploads by content-type. Previously they skipped the concurrency gate entirely, leaving an OOM path for chunked large uploads (Codex). Plain unknown-length text still ungated to avoid false positives. - proxy.go: guard the LimitReader maxBytes+1 overflow at math.MaxInt64; log the 413 over-limit rejection; document the deliberate pre-auth/pre-stream-detection placement of admission control (shed cheaply before doing work). - tests: strengthen ProxyRequest Update test to prove a nil-ResponseInfo Update does NOT wipe an already-persisted response_info (CodeRabbit nit); add the parallel ProxyUpstreamAttempt Update invariant test; add snapshot over-cap + unknown-length-binary-gating coverage. Not changed (with rationale): 413/429 rejections are not written as REJECTED proxy_request rows — admission control runs before auth/token resolution by design (to shed load before any work), so there's no tenant/token context to attribute the rejection to; they are logged instead. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The doc said text/JSON is never size-truncated, now stale after the round-1 snapshot cap. Comment-only change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/handler/proxy.go (1)
119-147:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift未知长度的 JSON/文本请求仍能绕过大上传门控。
Line 127 的
acquire只会把Content-Length < 0的二进制 / multipart 请求纳入门控,而 Line 147 之后仍然会对其余请求执行io.ReadAll。这意味着客户端只要改用 chunkedapplication/json/text/*,在MAXX_MAX_UPLOAD_BYTES=0的默认配置下就能绕过并发闸门,再次把任意大的 body 全量堆进内存,PR 想封住的 OOM 路径仍然存在。建议把所有未知长度请求默认纳入门控,或者先做一个小前缀探测/强制默认硬上限后再决定是否放行。🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/handler/proxy.go` around lines 119 - 147, The current logic lets chunked/unknown-length requests (Content-Length < 0) bypass the concurrency limiter because h.uploadLimiter.acquire(r.Context(), r.ContentLength, contentType) is only invoked for some types; update the flow so that any request with unknown length (r.ContentLength < 0) is treated the same as large uploads: call h.uploadLimiter.acquire for such requests (or pre-check by reading a small fixed prefix up to a safe probe size) before doing io.ReadAll, and keep the existing io.LimitReader/bodyReader guard to enforce the hard max; adjust checks around tooLarge and acquire (and the defer release()) to reference r.ContentLength < 0 and contentType when deciding to acquire a slot so chunked application/json and text/* cannot escape the limiter.
🧹 Nitpick comments (2)
internal/domain/request_snapshot_test.go (1)
73-73: 💤 Low value
contains是strings.Contains的冗余封装。同文件其余断言已直接调用
strings.Contains,此处可统一改为直接调用,移除该辅助函数。可选清理。♻️ 可选:直接使用 strings.Contains
- if !contains(got, "body omitted") { + if !strings.Contains(got, "body omitted") { t.Fatalf("expected truncation placeholder, got %q", got) }并删除文件末尾的
func contains(s, sub string) bool { return strings.Contains(s, sub) }。🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/domain/request_snapshot_test.go` at line 73, The helper function contains(s, sub string) is redundant; remove the function definition and replace its call sites in request_snapshot_test.go with direct calls to strings.Contains, ensuring you import "strings" if not already referenced in the file and update any test assertions that currently call contains to use strings.Contains instead.internal/domain/request_snapshot.go (1)
44-51: 💤 Low value非二进制超限时整体丢弃 body,可考虑保留前缀以保留部分审计价值。
当前实现对超过
requestSnapshotMaxBytes的文本/JSON body 直接替换为占位串,即使仅超出 1 字节也会丢失全部内容。可考虑截断为前 N 字节再附加占位说明,既挡住内存/DB 膨胀,又保留可审计的开头部分。属可选改进,不阻塞合并。♻️ 可选:保留前缀的截断方案
if requestSnapshotMaxBytes > 0 && len(body) > requestSnapshotMaxBytes { label := contentTypeToken(contentType) if label == "binary" { // 非二进制路径下空 content-type 标成 text 更贴切 label = "text" } - return fmt.Sprintf("<%s, %d bytes, body omitted (over snapshot cap %d)>", label, len(body), requestSnapshotMaxBytes) + return fmt.Sprintf("%s\n<%s, %d bytes, truncated (over snapshot cap %d)>", + string(body[:requestSnapshotMaxBytes]), label, len(body), requestSnapshotMaxBytes) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/domain/request_snapshot.go` around lines 44 - 51, The current branch in request_snapshot.go drops the entire body when len(body) > requestSnapshotMaxBytes; instead, truncate and preserve a leading prefix for auditability: in the block that uses requestSnapshotMaxBytes, contentTypeToken and label, compute a prefix length (e.g., min( maxPrefixBytes, requestSnapshotMaxBytes ) or a fixed cap like 256 bytes), take the first prefix bytes of body, and return a string that includes the label, the preserved prefix (escaped/cleaned as appropriate), the total original byte count, and an explicit "(body truncated to N bytes of M, omitted)" note; keep the existing special-case that maps binary to text for labeling and ensure you do not exceed requestSnapshotMaxBytes when deciding how many bytes to retain.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@internal/handler/proxy.go`:
- Around line 119-147: The current logic lets chunked/unknown-length requests
(Content-Length < 0) bypass the concurrency limiter because
h.uploadLimiter.acquire(r.Context(), r.ContentLength, contentType) is only
invoked for some types; update the flow so that any request with unknown length
(r.ContentLength < 0) is treated the same as large uploads: call
h.uploadLimiter.acquire for such requests (or pre-check by reading a small fixed
prefix up to a safe probe size) before doing io.ReadAll, and keep the existing
io.LimitReader/bodyReader guard to enforce the hard max; adjust checks around
tooLarge and acquire (and the defer release()) to reference r.ContentLength < 0
and contentType when deciding to acquire a slot so chunked application/json and
text/* cannot escape the limiter.
---
Nitpick comments:
In `@internal/domain/request_snapshot_test.go`:
- Line 73: The helper function contains(s, sub string) is redundant; remove the
function definition and replace its call sites in request_snapshot_test.go with
direct calls to strings.Contains, ensuring you import "strings" if not already
referenced in the file and update any test assertions that currently call
contains to use strings.Contains instead.
In `@internal/domain/request_snapshot.go`:
- Around line 44-51: The current branch in request_snapshot.go drops the entire
body when len(body) > requestSnapshotMaxBytes; instead, truncate and preserve a
leading prefix for auditability: in the block that uses requestSnapshotMaxBytes,
contentTypeToken and label, compute a prefix length (e.g., min( maxPrefixBytes,
requestSnapshotMaxBytes ) or a fixed cap like 256 bytes), take the first prefix
bytes of body, and return a string that includes the label, the preserved prefix
(escaped/cleaned as appropriate), the total original byte count, and an explicit
"(body truncated to N bytes of M, omitted)" note; keep the existing special-case
that maps binary to text for labeling and ensure you do not exceed
requestSnapshotMaxBytes when deciding how many bytes to retain.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 26a6f944-7d47-4f6e-b294-f1c2553417de
📒 Files selected for processing (7)
internal/domain/request_snapshot.gointernal/domain/request_snapshot_test.gointernal/handler/proxy.gointernal/handler/upload_limit.gointernal/handler/upload_limit_test.gointernal/repository/sqlite/proxy_request_test.gointernal/repository/sqlite/proxy_upstream_attempt_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/repository/sqlite/proxy_request_test.go
- internal/handler/upload_limit.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: playwright
- GitHub Check: e2e
- GitHub Check: multiinstance
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ymkiux
Repo: awsl-project/maxx PR: 0
File: :0-0
Timestamp: 2026-03-30T09:23:52.570Z
Learning: In awsl-project/maxx PR `#466`, all 5 concerns raised in the initial review were incorrect. The `APITokenConcurrencySection` useEffect guard was present, `AcquireConcurrency` correctly returns `ErrInvalidToken` for zero-ID+empty-token, `ResolveToken` is fully tested, `IsStreamRequest` has no dependency on clientType, and passing `""` as ClientType in `ExtractToken` is intentional to scan all auth headers. Be more careful to read the actual code before flagging concerns.
📚 Learning: 2026-03-30T09:23:52.570Z
Learnt from: ymkiux
Repo: awsl-project/maxx PR: 0
File: :0-0
Timestamp: 2026-03-30T09:23:52.570Z
Learning: In awsl-project/maxx PR `#466`, all 5 concerns raised in the initial review were incorrect. The `APITokenConcurrencySection` useEffect guard was present, `AcquireConcurrency` correctly returns `ErrInvalidToken` for zero-ID+empty-token, `ResolveToken` is fully tested, `IsStreamRequest` has no dependency on clientType, and passing `""` as ClientType in `ExtractToken` is intentional to scan all auth headers. Be more careful to read the actual code before flagging concerns.
Applied to files:
internal/domain/request_snapshot.go
🔇 Additional comments (5)
internal/domain/request_snapshot.go (2)
16-23: LGTM!
55-82: LGTM!internal/domain/request_snapshot_test.go (1)
8-70: LGTM!internal/repository/sqlite/proxy_upstream_attempt_test.go (2)
4-4: LGTM!
11-68: LGTM!
…efix; cleanups Addresses CodeRabbit incremental review: - uploadLimiter now gates ALL unknown-length (Content-Length<0) requests, not just binary/multipart ones. Round-1's content-type-based gating still let unknown-length JSON/text bypass the concurrency gate (CodeRabbit major). Conservative posture: unknown length => can't predict size => treat as large. Empty body (==0) and known small requests stay ungated. Removed the now- unneeded content-type plumbing (acquire/isLarge) and reverted IsBinaryUploadContentType to unexported. - RequestBodySnapshot: oversized non-binary bodies now keep a 256-byte prefix before the truncation marker, preserving partial audit value (model, first message) instead of dropping the whole body (CodeRabbit nit). - test: drop the redundant `contains` wrapper, use strings.Contains directly (CodeRabbit nit). Updated unknown-length gating test for the new semantics. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@coderabbitai 已在 128e4d3 处理本轮反馈:
CI Backend/Frontend Checks 已通过,相关包测试全绿。 |
|
Tip For best results, initiate chat on the files or code changes.
CI 全绿,实现与意图一致,本轮反馈均已落地。✅ (´。• ᵕ •。`) ♡ 🧠 Learnings usedIf you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit! |
…rmetic Both local reviewers (Claude + Codex) approved and independently raised the same optional robustness nit: - clamp the 256-byte preview to min(snapshotPreviewBytes, requestSnapshotMaxBytes) so an operator setting MAXX_REQUEST_SNAPSHOT_MAX_BYTES below 256 can't make the placeholder exceed the configured cap. - snapshot tests now override requestSnapshotMaxBytes via withSnapshotCap() helper instead of depending on the env-derived package var, so they pass regardless of MAXX_REQUEST_SNAPSHOT_MAX_BYTES in the environment (verified with =0). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This reverts commit db004d2.
|
@coderabbitai review Please re-check the current PR head and focus only on must-fix before merge issues. This PR touches high-risk request-path areas: proxy ingress body handling, upload admission control, request body snapshots, and SQLite persistence of large request/response fields. Please specifically look for:
If you find anything that must be fixed before merge, please list each issue with:
If there are no must-fix issues, please explicitly say so, and separate any optional nits from merge blockers. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
echo "=== upload_limit.go ==="
cat internal/handler/upload_limit.go
echo ""
echo "=== proxy.go ingress ==="
cat internal/handler/proxy.go
echo ""
echo "=== request_snapshot.go ==="
cat internal/domain/request_snapshot.go
echo ""
echo "=== sqlite proxy_request.go ==="
cat internal/repository/sqlite/proxy_request.go
echo ""
echo "=== sqlite proxy_upstream_attempt.go ==="
cat internal/repository/sqlite/proxy_upstream_attempt.go
echo ""
echo "=== middleware_ingress.go ==="
cat internal/executor/middleware_ingress.go
echo ""
echo "=== executor.go (RecordRejectedProxyRequest) ==="
cat internal/executor/executor.goLength of output: 82563
✅ Area 1 — Request contract (chunked/unknown, empty, small/large)
The ✅ Area 2 — Slot-release leaks in
|
CodeRabbit's focused re-review (requested by @awsl233777) found no must-fix issues / safe to merge; applying 2 of the 3 optional nits: - proxy.go: close r.Body on the io.ReadAll error path too (parity with the success path; Go would close on teardown anyway, but it's cleaner). - request_snapshot.go: sanitize the 256-byte truncation preview with strings.ToValidUTF8 so a non-UTF-8 text body can't put garbled bytes in the placeholder. Declined (with rationale): re-introducing bytes.Clone for the aliased body. Round-1 review verified no code path mutates the request-body slice in place (converter/normalize/InjectCodexUserAgent all return new slices; the only KeyOriginalRequestBody reader uses read-only gjson), and converting_writer already clones independently. Re-cloning would reintroduce exactly the per- request multi-MB copy optimization B removed, for a purely hypothetical future mutator — kept as a documented contract instead. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@awsl233777 谢谢 review。针对你点名的 5 个高风险面,CodeRabbit 的聚焦复审结论是 No must-fix / Safe to merge,本地 Claude + Codex 三轮也均 APPROVE。三条 optional nits 处理如下(c1afe00):
对照你列的 5 点:① chunked/未知/空/大小请求契约 → 未知长度一律门控、空 body(==0)与已知小请求不门控,有测试;② acquire/ingress 生命周期 slot 泄漏 → 已 rebase 到你刚推的版本号 commit 之上。 |
awsl233777
left a comment
There was a problem hiding this comment.
Approved after re-checking the current head: CI is green, CodeRabbit found no must-fix issues, and the remaining body alias tradeoff is documented and acceptable for preserving the memory optimization.
|
@coderabbitai review Please review the current head Focus narrowly on whether the latest changes introduced any actionable or must-fix issues:
If there are no actionable / must-fix issues on the current head, please explicitly say so. If there are issues, please list severity, exact file/function/scenario, and minimal fix guidance. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
echo "=== internal/handler/proxy.go (ingress function) ==="
cat internal/handler/proxy.go
echo ""
echo "=== internal/domain/request_snapshot.go ==="
cat internal/domain/request_snapshot.goLength of output: 16718
Targeted change verdicts1. Both paths are now clean: body, err := io.ReadAll(bodyReader)
if err != nil {
_ = r.Body.Close() // ← added, error path
writeError(...)
c.Abort()
return
}
_ = r.Body.Close() // success path (was already present)
2. safePreview := strings.ToValidUTF8(string(preview), "�")
return fmt.Sprintf("%s…<%s, %d bytes total, body truncated ...>", safePreview, ...)
3. Intentionally unchanged. The documented no-in-place-mutation contract is accurate against the current code — all normalizers/converters return new slices, and the only downstream reader of Conclusion: Current head is clean. Safe to merge. [approve] 🧠 Learnings used |
背景
线上 OOM 排查。根因有两类:请求/响应 body 在一次请求里被反复拷贝与序列化;以及大上传(如
/v1/images/edits,几十 MB)并发无上限,同时挤在堆上把进程撑爆。profile 关键信号:
ProxyRequestRepository.toModel经toJSON累计 2.58 GB,ProxyUpstreamAttemptRepository1.19 GB,2 分钟累计分配 3.7 GB。改动
1. 仓储
Update不再重复 marshal 大 body(主犯) —proxy_request.go/proxy_upstream_attempt.go一次请求生命周期里
Update被调用 5~10 次,旧代码每次都把多 MB 的request_info/response_info重新json.Marshal。拆toModel → toModelMeta:Update永不编码request_info(Create 后不可变,gormOmit),response_info仅在非 nil 时编码一次。请求体 marshal 从 ~8 次降到 0 次,响应体降到 1 次。2. 去掉冗余 body 拷贝 —
proxy.go删掉 ingress 里多余的
bytes.Clone(converter/normalize 都返回新切片、从不就地改;converting_writer自带 clone)。3. 二进制 body 不进 DB 快照 —
domain/request_snapshot.gomultipart/image/audio/video/octet-stream 的 body 在
RequestInfo.Body只存<type, N bytes, body omitted>占位,不把几十 MB 二进制 stringify 进 SQLite(还会被 attempt 冗余存一份)。dev_mode保留完整 body。三处构造点统一。4. 大上传准入控制 —
handler/upload_limit.go超过阈值的 body 经按
Content-Length的并发闸门控,峰值内存被钉死在≈ maxConcurrency × 体积;小于阈值的普通文本/对话请求零开销、不门控。可选硬上限(413)。全部经MAXX_*env 可调,默认值生产安全。不动[]byte主链路、不改任何 adapter。MAXX_LARGE_UPLOAD_THRESHOLD_BYTESMAXX_LARGE_UPLOAD_MAX_CONCURRENCYMAXX_MAX_UPLOAD_BYTESMAXX_LARGE_UPLOAD_WAIT_TIMEOUT_MS刻意推迟
全链路流式 / 落盘重构:会碰 7 个 provider adapter(都假设
flow.GetRequestBody() []byte)+ multipart 重写 + 重试,回归面巨大;在以上改动落地后边际收益有限。等线上内存曲线说话,真还 OOM 再上"仅 multipart 重写走 temp file + 转发边界ReadSeeker"的精修版。测试
Update不变量:状态类 Update 不抹掉request_info、不误写响应体;终态 Update 正确落库response_info。RequestBodySnapshot:multipart/image 省略、JSON 保留、dev_mode 保留、无 content-type 保留。go vet干净;handler/executor/domain/sqlite全部测试通过。🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes