feat: Subscriptions #620
Summary of Changes

Hello @enisdenjo, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive support for GraphQL subscriptions within the router. It fundamentally changes how the router handles long-lived connections and streaming data, moving from a single-response model to one that can manage continuous data flows. This enables real-time data updates for clients and subgraphs, complete with dynamic protocol negotiation and on-the-fly entity resolution for each event in a subscription stream.
Code Review
This pull request introduces significant new functionality for GraphQL subscriptions, including support for SSE and multipart protocols, entity resolution for subscription events, and updates to the query planner. The overall architecture is well-thought-out, especially the separation of concerns with QueryPlanExecutionResult and the use of an owned context for long-lived subscriptions.
My review has identified a critical safety issue regarding an unsafe block that could lead to a use-after-free bug. I've also found several areas where manual JSON construction is used, which is brittle and should be replaced with safer serialization. Additionally, there are opportunities to improve performance in the stream handling logic by reducing string allocations, and a potential panic in the benchmark code. Please see the detailed comments for suggestions on how to address these points.
```rust
) -> Result<PlanExecutionOutput, PlanExecutionError> {
    // Clone initial_data to make it 'static for ExecutionContext
    // SAFETY: We're creating a new owned value that will be used within this function
    let owned_data: Value<'exec> = unsafe { std::mem::transmute(initial_data.clone()) };
```
The use of `unsafe` and `mem::transmute` here to extend the lifetime of `initial_data` is dangerous and likely unsound. The `initial_data` is a `Value<'_>` that borrows from `response_body`, which is only valid for the current loop iteration. Because `execute_plan_with_initial_data` is async, the future it returns can be suspended and resumed after `response_body` is no longer valid, leading to a use-after-free.

To fix this, you should create a fully owned `Value<'static>`. You can do this by implementing a helper function that recursively clones the `Value` data, converting any borrowed `Cow`s to owned ones. This avoids the need for `unsafe` code.

For example, you could create a function `fn to_owned_value<'a>(v: &Value<'a>) -> Value<'static>` and use it like `let owned_data: Value<'exec> = to_owned_value(&initial_data);`.
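A minimal sketch of such a helper, assuming a hypothetical `Value<'a>` enum built over `Cow` (the router's real `Value` type likely has different variants):

```rust
use std::borrow::Cow;
use std::collections::BTreeMap;

// Hypothetical shape of the borrowed JSON value type; the router's real
// `Value<'a>` likely differs in its variants.
#[derive(Debug, Clone, PartialEq)]
enum Value<'a> {
    Null,
    Bool(bool),
    Number(f64),
    String(Cow<'a, str>),
    Array(Vec<Value<'a>>),
    Object(BTreeMap<Cow<'a, str>, Value<'a>>),
}

// Recursively clone a borrowed value into a fully owned `Value<'static>`,
// turning every borrowed `Cow` into its owned variant. No `unsafe` needed.
fn to_owned_value(v: &Value<'_>) -> Value<'static> {
    match v {
        Value::Null => Value::Null,
        Value::Bool(b) => Value::Bool(*b),
        Value::Number(n) => Value::Number(*n),
        Value::String(s) => Value::String(Cow::Owned(s.clone().into_owned())),
        Value::Array(items) => Value::Array(items.iter().map(to_owned_value).collect()),
        Value::Object(map) => Value::Object(
            map.iter()
                .map(|(k, val)| (Cow::Owned(k.clone().into_owned()), to_owned_value(val)))
                .collect(),
        ),
    }
}
```

The deep clone costs an allocation per borrowed string, but it makes the `'static` bound honest instead of asserted via `transmute`.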
🐋 This PR was built and pushed to the following Docker images: `ghcr.io/graphql-hive/router:pr-620`, `ghcr.io/graphql-hive/router:sha-c80aa36`

Docker metadata:

```json
{
  "buildx.build.ref": "builder-d43cc119-50af-408e-8735-d33cf0064b10/builder-d43cc119-50af-408e-8735-d33cf0064b100/nb9ci3y07ou5g7xv5l5v1b0di",
  "containerimage.descriptor": {
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "digest": "sha256:3956b78c27374b42580546766756a51ee3a6b8709447eab32af28d1150ec2034",
    "size": 1609
  },
  "containerimage.digest": "sha256:3956b78c27374b42580546766756a51ee3a6b8709447eab32af28d1150ec2034",
  "image.name": "ghcr.io/graphql-hive/router:pr-620,ghcr.io/graphql-hive/router:sha-c80aa36"
}
```
Ref router-165
Implement the SSE, Incremental Delivery over HTTP, and Apollo Multipart HTTP specs for subscriptions, with entity resolution, when communicating with subgraphs or clients.
What changed?
Streaming execution result
The execution pipeline now returns `QueryPlanExecutionResult`, an enum that can be either a single response or a stream. This separation allows the query planner and executor to operate independently from transport concerns, making future improvements (like connection repair and silent retries) easier to implement.
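A minimal sketch of such a two-variant result enum, using a simplified `Response` and a plain iterator in place of the router's async stream (all names here are illustrative, not the router's actual definitions):

```rust
// Stand-in for the router's execution response type.
struct Response {
    data: String,
}

// Sketch of an execution result that is either one response or a stream.
enum QueryPlanExecutionResult {
    // A query or mutation: exactly one response.
    Single(Response),
    // A subscription: a stream of responses over time.
    Stream(Box<dyn Iterator<Item = Response>>),
}

// Transport code can branch on the variant without the planner/executor
// knowing anything about SSE or multipart framing.
fn describe(result: &QueryPlanExecutionResult) -> &'static str {
    match result {
        QueryPlanExecutionResult::Single(_) => "single",
        QueryPlanExecutionResult::Stream(_) => "stream",
    }
}
```

The key design point is that only the transport layer matches on the variant; everything upstream produces the enum and stays framing-agnostic.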
Owned context for streaming
Subscriptions require long-lived contexts that outlive request lifetimes. The implementation introduces an owned context that:
- holds `Arc`-wrapped shared data (executors, schema metadata, projection/header plans)

Subscription handlers
The router respects the client's `Accept` header to determine the response format:

- `text/event-stream` → SSE
- `multipart/mixed` → Incremental Delivery over HTTP
- `multipart/mixed;subscriptionSpec="1.0"` → Apollo multipart HTTP
- `406 Not Acceptable` if a subscription is requested over an unsupported transport

The same behavior applies when communicating with subgraphs.
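The `Accept`-header negotiation described above can be sketched roughly as follows; the `Protocol` enum and the substring-based matching are illustrative assumptions, not the router's actual implementation (real negotiation would parse media types and quality values properly):

```rust
// Illustrative protocol enum for the three streaming formats.
#[derive(Debug, PartialEq)]
enum Protocol {
    Sse,
    IncrementalDelivery,
    ApolloMultipart,
}

// Pick a streaming protocol from the Accept header, or 406 if none match.
fn negotiate(accept: &str) -> Result<Protocol, u16> {
    if accept.contains("text/event-stream") {
        Ok(Protocol::Sse)
    } else if accept.contains("multipart/mixed") {
        // Apollo's variant is distinguished by its subscriptionSpec parameter.
        if accept.contains("subscriptionSpec=\"1.0\"") {
            Ok(Protocol::ApolloMultipart)
        } else {
            Ok(Protocol::IncrementalDelivery)
        }
    } else {
        // No supported streaming format: 406 Not Acceptable.
        Err(406)
    }
}
```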
SSE
Implements the distinct connections mode of the GraphQL over SSE spec.
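As a rough illustration of the wire format, here is how one execution result could be framed as a GraphQL-over-SSE `next` event; the helper name is hypothetical and the payload is just a JSON string rather than the router's actual response type:

```rust
// Frame one execution result as an SSE "next" event: an `event:` line,
// a `data:` line, and a blank line terminating the message.
fn sse_next_event(payload_json: &str) -> String {
    format!("event: next\ndata: {}\n\n", payload_json)
}
```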
Multipart protocol
Implements Apollo's multipart HTTP spec and GraphQL's Incremental Delivery over HTTP RFC.
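For intuition, one subscription payload in an Apollo-style multipart response could be framed along these lines; the boundary value and exact envelope here are illustrative assumptions, not the router's actual wire format:

```rust
// Frame one subscription payload as a multipart/mixed part: boundary
// delimiter, part headers, blank line, then a JSON envelope.
fn multipart_part(boundary: &str, payload_json: &str) -> String {
    format!(
        "--{}\r\ncontent-type: application/json\r\n\r\n{{\"payload\":{}}}\r\n",
        boundary, payload_json
    )
}
```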
Entity resolution
When a subscription emits data that references entities from other subgraphs, the router resolves those entities on the fly for each event in the stream. This is handled in the async stream generator in `execute_query_plan()`.
`SubscriptionNode` is used and now wraps subscription fetch operations. The query planner detects subscription operations and wraps them appropriately, enabling plans with entity resolution.
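A hypothetical shape for such a plan node, with a primary subscription fetch plus follow-up nodes run per event; field and type names are illustrative, not the router's actual definitions:

```rust
// Stand-in for a fetch against one subgraph.
struct FetchNode {
    subgraph: String,
    operation: String,
}

// A subscription plan node: the fetch that opens the stream, plus
// follow-up nodes (e.g. entity fetches) executed for each emitted event.
struct SubscriptionNode {
    primary: FetchNode,
    rest: Vec<FetchNode>,
}
```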
Subgraph executor subscribe method
The HTTP executor gains a `subscribe()` method that returns a `BoxStream<HttpExecutionResponse>` for downstream processing.

Configure to only enable/disable subscriptions
The subscription protocols supported in this PR are inherently HTTP-based and do not need a "protocol" configuration option. Hive Router will send an `Accept` header listing all supported subscription protocols over HTTP, and the subgraph is free to choose whichever one it supports.
Whether we really want to limit specific protocols is open to discussion, but objectively there is no benefit, since they're all streamed over HTTP.
Hence, you can only enable or disable subscriptions.

P.S. If subscriptions are disabled, the router will respond with the following when receiving a subscription request:
What didn't change?
No HTTP Callback Protocol
Intentionally excluded to keep the PR focused. This would require webhook infrastructure in the router, which adds significant complexity. The SSE and multipart protocols cover the vast majority of use cases.
No WebSockets
At the moment, only HTTP protocols are supported; this keeps the PR smaller. WebSockets can be integrated easily later thanks to the separation of modules.
Silent Retries & Upstream Connection Repair
Most GraphQL subgraph implementations are stateless for subscriptions and have no concept of "continuing" from where they left off after a connection loss. Implementing retry logic on the router side would create false expectations: users would assume all events are delivered, but some would be lost when the subgraph creates a fresh subscription.
This is fundamentally why the EDFS (Event-Driven Federated Subscriptions) and callback protocols exist. To avoid misleading behavior and keep the PR focused on the core functionality, connection repair is not implemented.
TODO
- test client disconnect propagation
- configure accepted protocols