feat: add directSend feature for low-latency token streaming #700
Open
michaelraskansky wants to merge 2 commits into aws-samples:main
added 2 commits
November 19, 2025 14:01
This feature enables Lambda handlers to send tokens directly to clients via AppSync, bypassing the SNS/SQS path for improved latency.

Changes:
- Add directSend configuration option
- Add AppSync direct send utility
- Update websocket handler to support direct send
- Update model interfaces (langchain, idefics) to support direct send

feat: apply directSend to bedrock-agents interface

Extend directSend feature to the new bedrock-agents interface to enable direct token streaming via AppSync for Bedrock Agent responses.

docs: add directSend configuration documentation

fix: correct import and boolean logic in websocket.py
- Fix ChatbotAction import from genai_core.types instead of index
- Fix DIRECT_SEND environment variable check to properly evaluate boolean value
- Fix grammar in configuration prompt

test: add unit tests for directSend feature
- Test directSend routing logic for token vs non-token actions
- Test environment variable configuration
- Test AppSync mutation formatting
- Remove verbose request/response logging to reduce CloudWatch costs
- Only log errors when status != 200
- Keep X-Ray tracing for observability
Description
This PR introduces the `directSend` feature that enables Lambda handlers to send LLM tokens directly to clients via AppSync GraphQL mutations, bypassing the SNS/SQS messaging path for significantly reduced latency during streaming responses.
Motivation
The default token delivery path (Lambda → SNS → SQS → WebSocket) introduces unnecessary latency for streaming LLM responses. By sending tokens directly via AppSync, we achieve:
Changes
Core Implementation
- `directSend?: boolean` to `SystemConfig` (default: `false`)
- `appsync.py` module with SigV4-authenticated GraphQL mutations
- `websocket.py` to route `LLM_NEW_TOKEN` actions based on `DIRECT_SEND` env var
Files Changed (11)
- `cli/magic-config.ts` - Configuration wizard support
- `lib/shared/types.ts` - Type definition
- `lib/shared/layers/python-sdk/python/genai_core/utils/appsync.py` - NEW: Direct send implementation
- `lib/shared/layers/python-sdk/python/genai_core/utils/websocket.py` - Routing logic
- `lib/model-interfaces/{langchain,idefics,bedrock-agents}/index.ts` - Interface updates
- `lib/aws-genai-llm-chatbot-stack.ts` - Pass GraphQL API to interfaces
- `tests/shared/test_*.py` - Unit tests
Testing
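The routing rule described under Core Implementation, and the kinds of checks the commit messages list (token vs non-token routing, environment variable parsing, mutation formatting), can be illustrated with a minimal sketch. This is not the code from `websocket.py` or `appsync.py`: the action string `"llm_new_token"`, the mutation name `publishResponse`, its argument names, and the helper names below are all assumptions made for illustration. In a real handler the request would additionally be SigV4-signed before being posted to the AppSync endpoint.

```python
import json
from typing import Callable, Mapping

# Assumed wire value; the PR only states that LLM_NEW_TOKEN actions are routed.
LLM_NEW_TOKEN = "llm_new_token"


def direct_send_enabled(env: Mapping[str, str]) -> bool:
    """Parse DIRECT_SEND as a boolean.

    A bare truthiness check would treat the string "false" as enabled,
    so the value is compared explicitly.
    """
    return env.get("DIRECT_SEND", "false").strip().lower() == "true"


def build_mutation_payload(session_id: str, user_id: str, data: dict) -> str:
    """Serialize a GraphQL request body (mutation and field names are hypothetical)."""
    query = (
        "mutation Publish($sessionId: String!, $userId: String!, $data: String!) {"
        " publishResponse(sessionId: $sessionId, userId: $userId, data: $data)"
        " { sessionId } }"
    )
    variables = {
        "sessionId": session_id,
        "userId": user_id,
        "data": json.dumps(data),  # nested payload travels as a JSON string
    }
    return json.dumps({"query": query, "variables": variables})


def route_message(
    message: dict,
    env: Mapping[str, str],
    send_direct: Callable[[dict], None],
    send_queued: Callable[[dict], None],
) -> str:
    """Send token messages directly when enabled; everything else uses the queue path."""
    if message.get("action") == LLM_NEW_TOKEN and direct_send_enabled(env):
        send_direct(message)
        return "direct"
    send_queued(message)
    return "queued"
```

Passing the two senders in as callables keeps the rule testable without AWS access: a test can supply list `append` methods and assert which path each message took.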
Backward Compatibility
- Disabled by default (`directSend: false`)
Deployment Notes
When enabled, Lambda functions require:
- `APPSYNC_ENDPOINT` environment variable
- `DIRECT_SEND=true` environment variable
- `appsync:GraphQL` (query/mutation)
Checklist
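The environment requirements listed under Deployment Notes lend themselves to a quick preflight check. The following is a minimal sketch, not part of the PR: the function name `check_direct_send_env` is hypothetical, and treating any value other than `true` or `false` as a misconfiguration is an assumption. The `appsync:GraphQL` permission cannot be verified from environment variables, so it is out of scope here.

```python
from typing import List, Mapping
from urllib.parse import urlparse


def check_direct_send_env(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the env looks usable."""
    problems: List[str] = []

    flag = env.get("DIRECT_SEND", "false").strip().lower()
    # Assumption: only the literal strings "true" and "false" are expected.
    if flag not in ("true", "false"):
        problems.append(f"DIRECT_SEND should be 'true' or 'false', got {flag!r}")
    if flag != "true":
        return problems  # feature is off, so the endpoint is not required

    # With direct send on, the handler needs an HTTPS AppSync endpoint to post to.
    parsed = urlparse(env.get("APPSYNC_ENDPOINT", ""))
    if parsed.scheme != "https" or not parsed.netloc:
        problems.append("APPSYNC_ENDPOINT must be an https URL when DIRECT_SEND=true")
    return problems
```

Calling `check_direct_send_env(os.environ)` at cold start and logging any returned problems would surface a missing endpoint before the first token is streamed, rather than on the first failed mutation.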