When the LLM is interrupted while it is generating, the extension behaves erratically: the model's partial output becomes part of the user's prompt and generation spirals out of control. The model then keeps producing large amounts of text and usually only stops after many generations.
At a glance, the problem looks like multiple streams being triggered and writing to the same conversation simultaneously.
May want to cancel the request when a new prompt comes in?
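If cancelling is the right direction, here is a minimal sketch of what that could look like, assuming the extension streams completions over `fetch`. The class name, endpoint, and request body shape are hypothetical and not taken from the actual codebase; the point is just that each new prompt aborts the previous in-flight stream before starting its own.

```ts
// Hypothetical sketch: not the extension's real API, just the cancellation pattern.
class ChatSession {
  private controller: AbortController | null = null;

  async send(prompt: string): Promise<string> {
    // Abort any generation still in flight so a second prompt never
    // interleaves with (or re-feeds) the previous stream's output.
    this.controller?.abort();
    this.controller = new AbortController();

    let output = "";
    try {
      const response = await fetch("https://example.invalid/v1/completions", {
        method: "POST",
        body: JSON.stringify({ prompt }),
        signal: this.controller.signal,
      });
      const reader = response.body!.getReader();
      const decoder = new TextDecoder();
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        output += decoder.decode(value, { stream: true });
      }
    } catch (err) {
      // An AbortError just means the user interrupted or sent a new prompt.
      if ((err as Error).name !== "AbortError") throw err;
    }
    return output;
  }
}
```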
To recreate:
- Enter a "complex" prompt ("Write me a Python implementation of Dijkstra's algorithm" here)
- Interrupt the model during generation ("stop" here)


