feat(connectors): support circular/recursive types#9166
feat(connectors): support circular/recursive types#9166lennyburdette wants to merge 18 commits intodevfrom
Conversation
✅ Docs preview readyThe preview is ready to be viewed. View the preview File Changes 0 new, 1 changed, 0 removedBuild ID: 2962e6128aacbb8fa91ba980 URL: https://www.apollographql.com/docs/deploy-preview/2962e6128aacbb8fa91ba980 ✅ AI Style Review — No Changes DetectedNo MDX files were changed in this pull request. Review Log: View detailed log
|
|
@lennyburdette, please consider creating a changeset entry in |
1103759 to
5781ca7
Compare
|
/retry test-macos_test |
lennyburdette
left a comment
There was a problem hiding this comment.
Overall
The approach is architecturally sound — reusing the existing @provides copy-node mechanism to model restricted type availability is elegant and avoids inventing new planning logic. The PoC test (recursive_type_across_subgraphs) proving the planner already handles A→B→A entity resolution for split types is a great foundation.
Test coverage is solid: depth-1, depth-2, single-fetch optimization, indirect cycles, and the full E2E pipeline. 2,290 existing tests passing is reassuring.
Two main areas of concern below: (1) the extract_shape_field_names producing only top-level field names may produce an invalid/non-terminating FieldSet, and (2) the changes to graph_path.rs broaden behavior for all copy nodes (including @provides) rather than scoping to connector-originated restricted copies. The need to create cyclical graph structures out of non-cyclical REST operations is very specific to connectors — we should ensure these changes can't introduce a performance regression in path exploration for non-connector schemas.
| if names.is_empty() { | ||
| None | ||
| } else { | ||
| Some(names.join(" ")) |
There was a problem hiding this comment.
extract_shape_field_names only extracts top-level names — should this produce a full (terminating) FieldSet?
This function extracts only the top-level field names from the connector's Shape. For a connector with selection: "id name friends { id name }", this produces connectedSelection: "id name friends" — but friends is of type [User], which is a composite type. A FieldSet like "id name friends" with a bare composite field doesn't terminate at scalar leaves.
What happens downstream
When handle_connected_selection() in build_query_graph.rs processes connectedSelection: "id name friends", it calls parse_field_set() which parses this into a SelectionSet. Then add_restricted_field_edges() iterates the selections:
id→ leaf field, points to same tail as original ✓name→ leaf field, points to same tail as original ✓friends→ has no nested SelectionSet (it was just the bare name), so it's treated as a leaf and the edge points to the original User node (with all edges includingfriendsitself)
This means the restricted copy's friends edge leads to the unrestricted original User node. The planner can then resolve friends.name directly without entity resolution, and friends.friends.name with only one entity resolution hop instead of two.
Is this actually what we want?
If the connector's entity resolver HTTP endpoint returns {id, name} (no nested friends), then the restricted copy should model exactly that — id and name only, no friends edge. The current code accidentally gives you a friends edge on the restricted copy that promises more data than the HTTP response actually contains.
Consider instead extracting the full nested shape:
// selection: "id name friends { id name }"
// Should produce: connectedSelection: "id name friends { id name }"
// NOT: connectedSelection: "id name friends"
This way the restricted copy would have a friends edge pointing to another restricted copy (with only id and name), which accurately models what each entity resolution fetch returns.
Or maybe we don't want friends at all?
The entity resolver's selection is "id name" — it doesn't select friends. The query-level connector (Query.user) has selection: "id name friends { id name }" which does include friends, but that's the root fetch, not the entity resolver. So the connectedSelection should arguably be "id name" (just the entity resolver's fields), not "id name friends" or "id name friends { id name }".
Looking at the test expectations, connectedSelection: "id name" is exactly what appears in the supergraph snapshots and query planner tests — so the tests are correct. But this function would produce "id name friends" for the root connector's shape if it were ever called on it. The filtering to entity-resolver-only connectors (line 202) saves this from being a bug today, but it's fragile.
I'd like more eyes on this. The interplay between which connector's shape gets extracted, what FieldSet syntax is valid for composite fields, and how add_restricted_field_edges interprets nested vs. bare fields is subtle enough that I think we should validate the invariant more explicitly — either by always producing a fully-qualified FieldSet that terminates at scalars, or by adding a validation/assertion that the produced FieldSet contains only scalar field names.
There was a problem hiding this comment.
Update: verified with a test — connectedSelection DOES need the full nested field tree
I wrote a test comparing connectedSelection: "id name" vs connectedSelection: "id name friends { id name }" for an endpoint that returns friends one level deep. Results:
| Query | "id name" (current) |
"id name friends { id name }" (nested) |
|---|---|---|
friends { name } |
1 fetch | 1 fetch |
friends { name friends { name } } |
2 fetches | 1 fetch |
friends { friends { friends { name } } } |
3 fetches | 2 fetches |
The nested connectedSelection saves one entity resolution round-trip per depth level that the endpoint already returns.
How it works
With connectedSelection: "id name friends { id name }", add_restricted_field_edges processes friends as a nested selection (not a leaf). It calls create_restricted_copy recursively to create a second restricted copy for the friends' tail, with only {id, name}. The result is:
User(restricted-copy-1) --friends--> User(restricted-copy-2) --id--> ...
--name--> ...
--key--> User(original)
User(restricted-copy-1) --id--> ...
--name--> ...
--friends--> User(restricted-copy-2) [as above]
--key--> User(original)
So friends.name resolves within the copy chain. Only friends.friends.friends (depth 3) hits the boundary and needs entity resolution.
Test code (for reference)
#[test]
fn connected_selection_nested_fields_avoids_extra_fetch() {
// Hand-crafted supergraph where connectedSelection includes nested friends.
// Entity resolver endpoint returns { id, name, friends: [{ id, name }] }
let supergraph_sdl = r#"
schema
@link(url: "https://specs.apollo.dev/link/v1.0")
@link(url: "https://specs.apollo.dev/join/v0.6", for: EXECUTION)
{ query: Query }
directive @join__field(graph: join__Graph, requires: join__FieldSet, provides: join__FieldSet, type: String, external: Boolean, override: String, usedOverridden: Boolean, overrideLabel: String, contextArguments: [join__ContextArgument!], connectedSelection: join__FieldSet) repeatable on FIELD_DEFINITION | INPUT_FIELD_DEFINITION
directive @join__graph(name: String!, url: String!) on ENUM_VALUE
directive @join__type(graph: join__Graph!, key: join__FieldSet, extension: Boolean! = false, resolvable: Boolean! = true, isInterfaceObject: Boolean! = false) repeatable on OBJECT | INTERFACE | UNION | ENUM | INPUT_OBJECT | SCALAR
directive @join__enumValue(graph: join__Graph!) repeatable on ENUM_VALUE
directive @join__implements(graph: join__Graph!, interface: String!) repeatable on OBJECT | INTERFACE
directive @join__unionMember(graph: join__Graph!, member: String!) repeatable on UNION
directive @join__directive(graphs: [join__Graph!], name: String!, args: join__DirectiveArguments!) repeatable on SCHEMA | OBJECT | INTERFACE | FIELD_DEFINITION
directive @link(url: String, as: String, for: link__Purpose, import: [link__Import]) repeatable on SCHEMA
scalar join__DirectiveArguments
scalar join__FieldSet
scalar link__Import
input join__ContextArgument { name: String! type: String! context: String! selection: join__FieldValue! }
scalar join__FieldValue
enum link__Purpose { SECURITY EXECUTION }
enum join__Graph { CONNECTOR @join__graph(name: "connector", url: "") }
type Query @join__type(graph: CONNECTOR) {
user(id: ID!): User @join__field(graph: CONNECTOR)
}
type User @join__type(graph: CONNECTOR, key: "id") {
id: ID! @join__field(graph: CONNECTOR)
name: String @join__field(graph: CONNECTOR)
friends: [User] @join__field(graph: CONNECTOR, connectedSelection: "id name friends { id name }")
}
"#;
let supergraph = Supergraph::new(supergraph_sdl).unwrap();
let planner = QueryPlanner::new(&supergraph, Default::default()).unwrap();
// Depth 2: friends.friends.name — single fetch (covered by nested restricted copies)
let doc = ExecutableDocument::parse_and_validate(
planner.api_schema().schema(),
r#"{ user(id: "1") { friends { name friends { name } } } }"#,
"op.graphql",
).unwrap();
let plan = planner.build_query_plan(&doc, None, Default::default()).unwrap();
// With nested connectedSelection: single fetch, no entity resolution
assert!(!plan.to_string().contains("Flatten"),
"depth-2 should be single fetch, got:\n{}", plan);
// Depth 3: friends.friends.friends.name — needs entity resolution
let doc2 = ExecutableDocument::parse_and_validate(
planner.api_schema().schema(),
r#"{ user(id: "1") { friends { friends { friends { name } } } } }"#,
"op.graphql",
).unwrap();
let plan2 = planner.build_query_plan(&doc2, None, Default::default()).unwrap();
assert!(plan2.to_string().contains("Flatten"),
"depth-3 should need entity resolution, got:\n{}", plan2);
}What needs to change
extract_shape_field_names should be replaced with a recursive function that produces a full FieldSet string including nested selections for composite types, something like:
fn extract_shape_as_field_set(shape: &Shape) -> Option<String> {
match shape.case() {
ShapeCase::Object { fields, .. } => {
let parts: Vec<String> = fields.iter()
.filter(|(k, _)| *k != "__typename")
.map(|(k, v)| {
match extract_shape_as_field_set(v) {
Some(nested) => format!("{k} {{ {nested} }}"),
None => k.to_string(), // scalar leaf
}
})
.collect();
if parts.is_empty() { None } else { Some(parts.join(" ")) }
}
ShapeCase::Array { tail, .. } => extract_shape_as_field_set(tail),
ShapeCase::One(shapes) => shapes.iter().find_map(extract_shape_as_field_set),
_ => None, // scalar — no sub-fields
}
}This correctly produces "id name friends { id name }" for a shape representing { id, name, friends: [{ id, name }] }, and the restricted copy chain accurately models what the HTTP endpoint returns at each recursion depth.
| // not on the copy. For @provides copies (which have all original | ||
| // edges), this is a longer path that cost optimization prunes. | ||
| let advance_tail_weight = self.graph.node_weight(to_advance.tail)?; | ||
| let tail_is_copy = advance_tail_weight.provide_id.is_some(); |
There was a problem hiding this comment.
provide_id.is_some() is too broad — this changes behavior for @provides copy nodes too
This predicate allows same-subgraph re-entry for any copy node, but restricted connector copies and @provides copies have fundamentally different characteristics:
@providescopies have all original edges plus extras. Re-entering the subgraph from a provides copy is strictly a longer path — you can always do better by staying on the original node.- Restricted copies have fewer edges. Re-entry via entity resolution is the only way to access missing fields.
The comment says cost optimization prunes the longer path for provides copies, but that's an assumption about planner behavior rather than a guaranteed invariant. If cost calculation changes, or if there's an edge case where the longer path isn't pruned, this could cause @provides copy nodes to trigger unnecessary re-entry attempts — increasing path exploration in non-connector schemas.
Scoping concern: connector-specific graph cycles shouldn't affect non-connector planning
The need to create cyclical graph structures from non-cyclical REST operations is very specific to connectors. Is there a way to encode in the type system or data model an invariant that ensures these changes only take effect when a connector is at play?
For example, rather than overloading provide_id.is_some(), consider a CopyKind enum on QueryGraphNode:
pub(crate) enum CopyKind {
Provides, // @provides: all original edges + provided extras
RestrictedConnector, // connectedSelection: subset of original edges + key resolution
}
// In QueryGraphNode:
pub(crate) copy_kind: Option<CopyKind>,Then this check becomes:
let tail_is_restricted_copy = matches!(
advance_tail_weight.copy_kind,
Some(CopyKind::RestrictedConnector)
);This ensures the re-entry path is only explored for connector-originated restricted copies. Non-connector schemas with @provides would have zero behavioral change.
Would a CopyKind distinction be sufficient here, or is there more we can do to guarantee isolation? For instance, could the connectedSelection handling be gated behind a feature flag or a schema-level check that connectors are present, so the graph builder doesn't even create restricted copy nodes for non-connector supergraphs?
| // node. Copy nodes have fewer fields than the original, so the | ||
| // "detour" (re-entering the subgraph) is actually necessary to | ||
| // access fields not on the copy. | ||
| if !tail_is_copy |
There was a problem hiding this comment.
Same concern as the re-entry comment above — this disables the detour optimization for all copy nodes, including @provides copies.
The detour optimization is a major pruning heuristic (the comment above describes how it "drastically reduces state explosion"). For @provides copies (which have all original edges), skipping it means the planner explores paths it would otherwise prune. The correct plan is still found via cost optimization, but exploration time increases.
With a CopyKind enum this becomes if !matches!(tail_copy_kind, Some(CopyKind::RestrictedConnector)), keeping the detour optimization intact for provides-only schemas.
| self.base.query_graph.current_source = current_source; | ||
|
|
||
| let new_node_weight = self.base.query_graph.node_weight_mut(new_node)?; | ||
| new_node_weight.provide_id = Some(provide_id); |
There was a problem hiding this comment.
This reuses provide_id which was designed for @provides copy tracking (see the doc comment on QueryGraphNode.provide_id: "this mostly exists for debugging visualization"). Restricted connector copies are semantically quite different from provides copies — they have fewer edges rather than more.
Consider replacing provide_id: Option<u32> with something like:
pub(crate) copy_info: Option<CopyInfo>,
pub(crate) struct CopyInfo {
pub(crate) id: u32,
pub(crate) kind: CopyKind,
}This would:
- Make
graph_path.rschecks precise (only restricted copies allow re-entry / skip detour) - Improve debugging — you can tell why a copy exists when inspecting the graph
- Prevent future features from accidentally broadening behavior for the wrong copy type
- Document the invariant that restricted copies are connector-specific
This test demonstrates that the query planner already handles recursive types when they're split across subgraphs (A has name, B has friends). This is the 'north star' plan shape for the connector circular reference feature — restricted copy nodes will reproduce this pattern within a single connector subgraph. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove both circular reference checks: 1. Direct field-level check in connect.rs (User.friends: [User] with @connect) 2. Selection-level check in selection.rs (type appearing twice in selection path) Circular references in connector selections are now allowed. The JSONSelection is inherently finite, and recursive types will be handled by restricted copy nodes in the query graph.
Verifies that walk_type_with_shape correctly handles recursive types. The Shape tree naturally terminates expansion even though User.friends returns [User].
This argument encodes which fields are available on a type when reached through a specific field in a connector subgraph. Used to create restricted copy nodes in the query graph for recursive types. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implements handle_connected_selection() in the query graph builder, which creates restricted copy nodes for fields with connectedSelection on @join__field. Restricted copies have only the specified field edges plus key resolution edges. This enables entity resolution for recursive types in connectors. Also allows self-key re-entry from copy nodes (provide_id.is_some()) in graph_path.rs, and skips the detour optimization for copy nodes. For @provides copies (which have all original edges), this is a no-op -- cost optimization prunes the longer path. Only restricted copies benefit since they have fewer edges than the original. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
connectedSelection on @join__field requires a join spec version bump. Add join v0.6 with the new argument, and update all composition snapshots to reflect the new version.
After merging expanded subgraphs, analyzes connectors on self-referential fields (where return type == parent type) and adds connectedSelection to the corresponding @join__field directives. This tells the query graph builder which fields are available at each recursion depth. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Depth-2 recursion query plan test (friends.friends.friends) verifying each hop triggers entity resolution via connectedSelection copy nodes - Composition/satisfiability test for circular connector schema - Full regression check passes (1574+704+8 tests, 0 failures) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove PathPart enum, path tracking, and check_for_circular_reference method from SelectionValidator since circular references are now allowed.
Emit connectedSelection for all field-level entity resolver connectors, not just direct self-references. This handles indirect cycles like Track → Module → Track where Module.track is an entity resolver. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verifies the full pipeline: connector expansion produces a supergraph with connectedSelection, and the query planner generates entity resolution fetches at recursion boundaries. Also fixes expand_connectors to add `connectedSelection: join__FieldSet` to the @join__field directive definition when connectedSelection is used in field annotations — without this, the expanded supergraph fails GraphQL validation since the argument is absent from the directive definition. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Verifies full HTTP execution with mock endpoints: query connector fetches user data, entity resolver fetches friends, response correctly assembled from multiple fetches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
extract_shape_field_names only extracted top-level field names, producing
connectedSelection: "id name" even when the connector's selection included
nested composite fields like "id name friends { id name }". This caused
unnecessary entity resolution fetches — e.g. friends.friends.name required
2 fetches instead of 1 when the endpoint already returns nested friends.
Replace with extract_shape_as_field_set which recurses into Shape field
values, producing "id name friends { id name }" for nested objects. The
restricted copy chain then accurately models what the HTTP endpoint
returns at each depth level.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ef0172a to
6949cac
Compare
Introduce ProvidesCopy::More / ProvidesCopy::Fewer to distinguish @provides copy nodes from connector restricted copy nodes. The re-entry and detour-skip logic in graph_path.rs now only activates for Fewer (connector) copies, ensuring @provides behavior is unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6949cac to
62ed67d
Compare
…tion During connector expansion, verify that the selection shape includes key fields for any entity type it returns. Without them the query planner cannot perform entity resolution, producing an opaque internal error at runtime. The check now fails fast with a clear message at expansion time. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
User.friends: [User]connectedSelectionargument on@join__field(join v0.6) to encode which fields are available at nested recursion levelsHow it works
Given a connector with
User.friends: [User]and selection"id name":friends: [User] @join__field(connectedSelection: "id name")User(restricted)with only{id, name}edges + key→User(original){ user { friends { name friends { name } } } }, the planner generates:{ user { friends { __typename id name } } }(name is on restricted copy){ ... on User { friends { name } } }(friends requires entity resolution)Each recursion level is a separate entity resolution fetch, bounded by the query depth.
Test plan
🤖 Generated with Claude Code