Skip to content

feat(limits): add per-subgraph and per-connector response size limits#9160

Open
carodewig wants to merge 20 commits intodevfrom
caroline/subgraph-limits
Open

feat(limits): add per-subgraph and per-connector response size limits#9160
carodewig wants to merge 20 commits intodevfrom
caroline/subgraph-limits

Conversation

@carodewig
Copy link
Copy Markdown
Contributor

Summary

  • Adds limits.subgraph.all.http_max_response_bytes and limits.subgraph.subgraphs.<name>.http_max_response_bytes config to cap the number of bytes read from subgraph HTTP response bodies, protecting against OOM from unexpectedly large payloads
  • Adds limits.connector.all.http_max_response_bytes and limits.connector.sources.<source>.http_max_response_bytes config for the same protection on connector responses
  • Limits are enforced streaming (per-chunk via http_body_util::Limited) — the router stops reading as soon as the limit is exceeded without buffering the full body
  • Config restructured: existing limits.* fields moved under limits.router.*; a config migration entry handles the breaking change transparently

New metrics and span attributes

  • apollo.router.limits.subgraph_response_size.exceeded counter (attribute: subgraph.name)
  • apollo.router.limits.connector_response_size.exceeded counter (attribute: connector.source)
  • apollo.subgraph.response.aborted span attribute set to "response_size_limit" when a subgraph response is aborted
  • apollo.connector.response.aborted span attribute set to "response_size_limit" when a connector response is aborted

Test plan

  • Unit tests in plugins::limits — all config resolution and service injection behavior
  • Unit tests in services::subgraph_service — limit enforced and error returned; no counter increment for non-limit errors
  • Unit tests in plugins::connectors::handle_responses — limit enforced for connector responses
  • Unit tests in services::router::bodyinto_bytes_limited behavior
  • Schema generation snapshot updated
  • Docs updated: routing/security/request-limits.mdx, routing/observability/router-telemetry-otel/enabling-telemetry/standard-instruments.mdx

🤖 Generated with Claude Code

carodewig and others added 14 commits April 9, 2026 09:56
Adds `limits.subgraph.all.http_max_response_bytes` and per-subgraph overrides under `limits.subgraph.subgraphs.<name>` to cap how many bytes are read from a subgraph response body, preventing OOM when a subgraph sends an unexpectedly large payload.
Enforcement happens in `do_fetch()` via `http_body_util::Limited`, which rejects the body mid-stream as each chunk arrives rather than buffering everything first. The limit is signaled through the request `Context` via a new `SubgraphResponseSizeLimit` extension type set by `LimitsPlugin::subgraph_service()`.
This is a breaking config change: all existing `limits.*` fields move under `limits.router.*`. A new migration (2045) handles automatic upgrade of old configs. The `limits.subgraph` section is additive.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@carodewig carodewig requested review from a team as code owners April 9, 2026 15:24
@apollo-librarian
Copy link
Copy Markdown
Contributor

apollo-librarian bot commented Apr 9, 2026

✅ Docs preview ready

The preview is ready to be viewed. View the preview

File Changes

0 new, 3 changed, 0 removed
* graphos/routing/(latest)/configuration/yaml.mdx
* graphos/routing/(latest)/observability/router-telemetry-otel/enabling-telemetry/standard-instruments.mdx
* graphos/routing/(latest)/security/request-limits.mdx

Build ID: 728911f536b13046ca6ce7d5
Build Logs: View logs

URL: https://www.apollographql.com/docs/deploy-preview/728911f536b13046ca6ce7d5


⚠️ AI Style Review — 35 Issues Found

Summary

The documentation has been updated to align with the style guide across several key areas: framing has been improved by using imperative verbs, second-person perspective ('your'), and simpler language like 'used' instead of 'enabled'. Product naming conventions now treat plural names like 'Apollo Connectors' as plural subjects and omit articles before standalone names like 'Router'. Verb usage was shifted to the present tense and active voice to clarify actors and improve immediacy. Structural and formatting changes include removing punctuation from list fragments, ensuring consistency with colons, and removing quotes around code-formatted strings. Finally, word usage was refined by replacing semicolons and em-dashes with periods, using contractions, and ensuring all configuration-related numbers use numerals in code font.

Duration: 2715ms
Review Log: View detailed log

This review is AI-generated. Please use common sense when accepting these suggestions, as they may not always be accurate or appropriate for your specific context.

@github-actions

This comment has been minimized.

carodewig and others added 6 commits April 9, 2026 11:36
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Applied suggestions from AI review 20260409-911566dd-9160-9cb68e9860742752:
- docs/source/routing/security/request-limits.mdx:445: Strings within code font should not be surrounded by quotes.; Use the present te...
- docs/source/routing/security/request-limits.mdx:402: Use imperative verbs for instructions.; Use an authoritative voice to guide the ...
- docs/source/routing/security/request-limits.mdx:400: Do not surround strings in code font with quotes.; Use active voice to describe ...
- docs/source/routing/security/request-limits.mdx:397: Avoid using "When" if it can be replaced by "After" to indicate sequence.
- docs/source/routing/security/request-limits.mdx:383: Prescribe the specific tool and method for measuring response sizes to provide a...
- docs/source/routing/observability/router-telemetry-otel/enabling-telemetry/standard-instruments.mdx:284: Introduce the list item with a colon instead of a hyphen to separate the term fr...
- docs/source/routing/security/request-limits.mdx:414: Use active voice for the identification step.
- docs/source/routing/security/request-limits.mdx:430: Unordered lists should be introduced with a sentence or fragment that ends in a ...

Review: #9160
Triggered by: caroline.rodewig@apollographql.com
Copy link
Copy Markdown
Contributor

@aaronArinder aaronArinder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be super nice to use ByteSize so folks don't have to do the math for bytes, but otherwise lgtm!

limits:
subgraph:
all:
http_max_response_bytes: 10485760 # 10 MB for all subgraphs
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use the friendly ByteSize to get something like 10mb?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! I just cribbed from router.http_max_request_bytes but bytesize would be much friendlier.

Comment on lines +1097 to +1122
Some(SubgraphResponseSizeLimit(limit)) => {
router::body::into_bytes_limited(body, limit)
.instrument(tracing::debug_span!("aggregate_response_data"))
.await
.map_err(|err| {
tracing::error!(fetch_error = ?err);
let reason = if err.downcast_ref::<LengthLimitError>().is_some() {
u64_counter!(
"apollo.router.limits.subgraph_response_size.exceeded",
"Number of subgraph responses aborted because they exceeded the configured response size limit",
1,
subgraph.name = service_name.to_string()
);
tracing::Span::current()
.record("apollo.subgraph.response.aborted", "response_size_limit");
format!("subgraph response body exceeded limit of {limit} bytes")
} else {
err.to_string()
};
FetchError::SubrequestHttpError {
status_code: Some(parts.status.as_u16()),
service: service_name.to_string(),
reason,
}
})
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take this (and logic for into_bytes_limited) to be the meat of this pr and it looks solid to me! flagging because I only glanced over all the config shuffling

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that's accurate! It had to happen to get the config nested but it definitely clouds the real changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants