I appreciate the design of trustcall and how it uses JSON patching to incrementally update structured output.
However, there’s currently no support for streaming responses, which would greatly improve responsiveness and UX in interactive systems.
I propose adding streaming support, where patches (incremental changes) are emitted as they are generated - instead of collecting all operations first - so that if one operation fails, the system can immediately attempt to fix it via the LLM and then continue generation without waiting for the entire batch to complete.
I have a system that displays real-time changes, and I’d prefer not to wait until all operations are generated by the LLM, since my JSON structure is quite complex.
I appreciate the design of trustcall and how it uses JSON patching to incrementally update structured output.
However, there’s currently no support for streaming responses, which would greatly improve responsiveness and UX in interactive systems.
I propose adding streaming support, where patches (incremental changes) are emitted as they are generated - instead of collecting all operations first - so that if one operation fails, the system can immediately attempt to fix it via the LLM and then continue generation without waiting for the entire batch to complete.
I have a system that displays real-time changes, and I’d prefer not to wait until all operations are generated by the LLM, since my JSON structure is quite complex.