Description
In the Llama Deploy Python Fullstack example, the final output of the workflow is not streamed: results are returned only after every token has been generated.
As a result, I had to create my own FastAPI service.
Question:
Is there a modification to the workflow or to llama-deploy that allows truly asynchronous, streamed output without having to write my own FastAPI service?
Objective:
My main goal is to deliver the results of the workflow to users as quickly as possible.
logan-markewich commented on Nov 14, 2024
You can definitely stream.
If you use ctx.write_event_to_stream in your workflow, you can access the streamed events with the client.
logan-markewich commented on Nov 14, 2024
Still working on proper docs, but if you are using the old client, the method to call is session.get_task_result_stream():
llama_deploy/llama_deploy/client/async_client.py, line 123 in 4449d9f
Finding the example for the newer client...
logan-markewich commented on Nov 14, 2024
Ah, the example for the newer client is here:
https://github.com/run-llama/llama_deploy/blob/main/docs/docs/module_guides/llama_deploy/40_python_sdk.md#a-more-complex-example
logan-markewich commented on Nov 14, 2024
Streaming outside of llama-deploy is slightly different.
You still use the same ctx.write_event_to_stream method, but the way you obtain the stream is different:
https://docs.llamaindex.ai/en/stable/module_guides/workflow/#streaming-events
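To make the pattern discussed in this thread concrete, here is a minimal sketch written in plain asyncio. It is deliberately not the llama-index or llama-deploy API; it only illustrates the idea behind ctx.write_event_to_stream, where a step publishes partial events while it is still running and the caller consumes them before the final result is ready. TokenEvent, run_step, stream_events, and main are hypothetical names invented for this illustration, and the queue stands in for whatever transport the real client uses.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TokenEvent:
    """Hypothetical event carrying one partial piece of output."""
    text: str


_DONE = object()  # sentinel that marks the end of the stream


async def run_step(stream: asyncio.Queue, tokens: list) -> str:
    """Stand-in for a workflow step: publish each token, then return the full text."""
    parts = []
    for tok in tokens:
        await asyncio.sleep(0)  # simulates waiting for the next token from an LLM
        parts.append(tok)
        # Assumption: this plays the role of ctx.write_event_to_stream(...)
        await stream.put(TokenEvent(tok))
    await stream.put(_DONE)
    return "".join(parts)


async def stream_events(stream: asyncio.Queue):
    """Yield events as they arrive, stopping when the sentinel shows up."""
    while True:
        ev = await stream.get()
        if ev is _DONE:
            return
        yield ev


async def main(tokens: list):
    stream = asyncio.Queue()
    # Start the step in the background so events can be consumed while it runs.
    task = asyncio.create_task(run_step(stream, tokens))
    seen = []
    async for ev in stream_events(stream):
        seen.append(ev.text)  # a web handler would forward this chunk to the user
    final = await task  # the complete result is still available at the end
    return seen, final


if __name__ == "__main__":
    print(asyncio.run(main(["Hel", "lo", "!"])))
```

The point of the design is that the consumer loop and the final result are separate: partial events reach the user as soon as they are produced, while the awaited return value still delivers the complete output once the step finishes.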