Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streaming chat endpoint #1527

Merged
merged 13 commits into from
Mar 3, 2025

Conversation

TamiTakamiya
Copy link
Contributor

@TamiTakamiya TamiTakamiya commented Feb 10, 2025

Jira Issue: https://issues.redhat.com/browse/AAP-39044

Description

The implementation of /streaming_chat endpoint.

Testing

Steps to test

  1. Pull down the PR
  2. Run unit tests

Scenarios tested

Production deployment

  • This code change is ready for production on its own
  • This code change requires the following considerations before going to production:
  1. The streaming API returns almost immediately and actual chat data are sent through the stream, which is returned with API. It is likely we need to manage and limit the number of concurrent streams. This PR does not contain that logic.
  2. An operational telemetry event (StreamingChatBotOperationalEvent) is implemented in this PR, but it is sent right after the API is returned, i.e. chat data are not received yet. We need to design more meaningful telemetry for the streaming API.

@TamiTakamiya TamiTakamiya changed the base branch from main to TamiTakamiya/AAP-39043/chatbot-streaming-ui February 10, 2025 21:12
Base automatically changed from TamiTakamiya/AAP-39043/chatbot-streaming-ui to main February 10, 2025 21:21
@TamiTakamiya TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch 6 times, most recently from dae29f3 to 91c8daa Compare February 14, 2025 21:35
@TamiTakamiya TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch 3 times, most recently from 2d9bdbf to eed802b Compare February 18, 2025 20:06
Copy link
Contributor

@manstis manstis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a peek @TamiTakamiya

I don't think we need the async_invoke function as you've now moved streaming support to a new type of pipeline (ModelPipelineStreamingChatBot).

@TamiTakamiya TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch from c661a8d to da5c188 Compare February 19, 2025 21:48
manstis
manstis previously approved these changes Feb 21, 2025
Copy link
Contributor

@manstis manstis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the selftest errors; LGTM 👍

I made a comment about where HTTP exception handling should possibly happen.

IDK if you're able to test both where you have exceptions handled (e.g. backend disconnects or something) vs where I've proposed it.

@TamiTakamiya TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch 2 times, most recently from fa30d9c to 41fb9e8 Compare February 23, 2025 22:43
@TamiTakamiya TamiTakamiya changed the title [DRAFT] /streaming_chat endpoint PoC Streaming chat endpoint Feb 24, 2025
@TamiTakamiya TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch from 41fb9e8 to 52b08b2 Compare February 24, 2025 13:33
@TamiTakamiya TamiTakamiya marked this pull request as ready for review February 24, 2025 15:41
@TamiTakamiya TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch from fc08a8a to e4b2e46 Compare February 24, 2025 17:39
Copy link
Contributor

@romartin romartin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TamiTakamiya that's an AWESOME work! congrats! 👏

I just added a few comments, but mostly about Q's I have, I think the code changes are looking good , also tests.

Just another Q - If I want to test it, I don't see any change in this PR on the chatbot UI app. I understand this is because we are just enabling the backend side of it, but for now not allowing end users to stream. Anyway, if at some point we want to enable streaming, at least in local env, it is just about changing the endpoint in the useChatbot.tsx, something else? Worth documenting this in a quick README?

Great great stuff, thanks!



@define
class StreamingChatBotOperationalEvent(ChatBotBaseEvent):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming we are just using the same chatbot event structure for when streaming, as when non-streaming, correct?

It is absolutely fine, but just asking, should I make sense to collect any other chatbot attribute of interest only when streaming? Just brainstorming, because as far as I remember, when playing locally with Ollama, for example, you get, for every stream data, also metadata associated for it, as well as some other metadata at the end of the stream response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The STREAM_DATA used in test_pipeline.py was taken from an actual log of ansible-chatbot-service. As you see in the first start data, conversation_id is sent with the start data. For the first API call, chatbot UI issues /chat or /streaming_chat API without specifying conversation_id expecting chatbot service will assing a new conversation_id. In such case, conversation_id is not found in the API response and therefore it won't appear in the operational event as well.

So I think at least we want to add two more operational event types that correspond to the start and end data.

On Ollama, I also saw some data is received from ollama server at the end of stream. There might be some missing parts in the current road-core/service streaming API implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is needed. It becomes the fallback implementation should one not be provided. For example, you configured the ChatBot to use wca as it's provider in the JSON configuration. We only have a concrete implementation for http.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manstis Thank you for correcting my misunderstanding and provide clarification.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK why my comment appeared here 🤣

It was meant for here.

@@ -93,6 +94,11 @@
"csp.middleware.CSPMiddleware",
]

if os.environ.get("CSRF_TRUSTED_ORIGINS"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just due to my ignorance... :) why do we need CSRF trusted origins? Why are they used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I do not understand 100% yet. It is not needed when the AI Connect server is started from source, which uses Daphne for all endpoints. When the server is loaded from the image with podman-compose, it is needed. I think it may be because we are using Daphne only for the /streaming_chat endpoint and others are processed with uwsgi, i.e. when the chatbot page is loaded, all HTML/Javascript/css, etc. are loaded through uwsgi, and the streaming_chat request goes to Daphne. I guess it breaks something on CSRF code used by Django, but I have not clearly figured it out yet.

@TamiTakamiya TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch from e4b2e46 to 093a589 Compare February 28, 2025 03:23
Copy link
Contributor

@manstis manstis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments on this iteration @TamiTakamiya but looking good.

I have not looked at the concerns reported by @romartin or @jameswnl. I'll leave those to them.

@TamiTakamiya TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch from 093a589 to 86203fa Compare March 1, 2025 21:26
manstis
manstis previously approved these changes Mar 1, 2025
Copy link
Contributor

@manstis manstis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

I didn't consider @jameswnl and @romartin comments.. you may want to get their approval too.

@TamiTakamiya
Copy link
Contributor Author

@jameswnl Thank you for your approval!! I opened https://github.com/ansible/ansible-wisdom-ops/pull/1255 to set the CSRF_TRUSTED_ORIGINS envvar, which was added with this PR, in stage/prod clusters and the ops PR has to be merged before this is merged so that streaming char functions correctly. Would you take a look? Thanks!

@TamiTakamiya TamiTakamiya merged commit 8e5ed60 into main Mar 3, 2025
11 checks passed
@TamiTakamiya TamiTakamiya deleted the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch March 3, 2025 01:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants