Add support for stream output in Gradio demo #630


Merged: 8 commits merged into ymcui:main on Jun 30, 2023

Conversation

sunyuhan19981208 (Contributor)

Description:

This pull request adds support for streaming output in the Gradio demo. With this enhancement, users can visualize their model's output as a real-time stream, enabling a more interactive and dynamic experience.
Demo video: https://www.bilibili.com/video/BV1zW4y1S7tS/?vd_source=87997328b39dd6fa2449ef2da981cfcc

Changes Made:

  • Implemented stream inference (a minimal illustrative sketch follows this list).
  • Added stream output support to the Gradio Chatbot component.
  • Added interactive options and sliders for top_k, do_sample, and repetition_penalty.
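
For readers who want to see how the streaming pieces fit together, here is a minimal hypothetical sketch, not the PR's actual code: the model id is taken from the recommendation later in this thread and the sampling values (top_k, repetition_penalty) are illustrative. Generation runs in a background thread, a transformers TextIteratorStreamer yields decoded chunks, and a generator function appends each chunk to the last Chatbot turn.

```python
# Hypothetical sketch of streaming generation into a gr.Chatbot (not the PR's code).
from threading import Thread

import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_name = "elinas/llama-7b-hf-transformers-4.29"  # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

def predict(user_input, history):
    # Add a new (user, bot) turn; the bot side starts empty and fills in as we stream.
    history = (history or []) + [[user_input, ""]]
    inputs = tokenizer(user_input, return_tensors="pt").to(model.device)
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    generation_kwargs = dict(
        **inputs,
        streamer=streamer,
        max_new_tokens=128,
        do_sample=True,
        top_p=0.9,
        top_k=40,                 # illustrative value
        repetition_penalty=1.1,   # illustrative value
    )
    # Run generate in a thread so we can consume the streamer in this generator.
    Thread(target=model.generate, kwargs=generation_kwargs).start()
    for new_text in streamer:
        history[-1][1] += new_text
        yield history

with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    user_input = gr.Textbox()
    user_input.submit(predict, [user_input, chatbot], chatbot)

demo.queue().launch()
```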

@ymcui ymcui self-requested a review June 19, 2023 04:12
@@ -143,24 +138,34 @@ def predict(
with gr.Row():
ymcui (Owner) commented on the diff:

line 135: The banner image seems too big; please consider resizing it to a smaller one if possible. Also, use an absolute path for this image, as someone may run gradio_web_demo.py from another path.

line 136: missing "。" at the end of the sentence.

ymcui (Owner) commented Jun 19, 2023

Hi,

Thanks again for your PR. I have checked that the script is functional on Colab.
Here are some comments on this PR:

  1. Please also consider updating gradio_web_demo.ipynb. Also note that the recommended third-party LLaMA has been changed to elinas/llama-7b-hf-transformers-4.29.
  2. Someone may run this script from a path other than the scripts/inference folder. As such, please replace the banner image path with an absolute one.

Comments from GPT-4 Code Interpreter (it is up to you whether to follow its advice):

The code seems well structured, but here are a few suggestions that might improve its overall quality:

  1. Code Organization: Separate the code into functions or classes based on functionality. For instance, the model loading and setup part can be a separate function. It's a good practice to keep your code modular. This way, you can reuse your code and it will be easier to read and maintain.

  2. Error Handling: There seems to be a lack of error handling in the code. Try to anticipate what might go wrong and add appropriate error handling code. For example, if the model or tokenizer fails to load, the code should handle these exceptions and give a meaningful error message to the user.

  3. Command-Line Arguments: Consider using the argparse module's functionality to add help messages and type checking for the command-line arguments. This can make it easier for users to understand how to use the script and prevent incorrect usage.

  4. Comments: Although the code is mostly clear, adding comments to explain what each part of the code does can be helpful, especially for complex sections. This can make the code easier to understand for others, and for you in the future.

  5. Code Formatting: The code formatting seems inconsistent in some places. For example, there are spaces around operators in some places but not in others. Adhere to a specific style guide (like PEP8 for Python) to ensure consistency. Tools like flake8 or black can help with this.

  6. Model Selection: The code seems to only support a specific model (LLaMA), and it's unclear what happens if the user tries to use a different model. Consider adding support for other models or clearly documenting that only this specific model is supported.

  7. Unused Variables: The variable load_type is defined but not used in the code. Make sure to remove any unused variables to clean up the code.

  8. Hard-Coded Values: There are some hard-coded values in the code (e.g., max_memory=256, server_port=19324). Consider making these configurable via command-line arguments or a configuration file.

  9. Logging: Add logging for key events and errors. This can help with troubleshooting and understanding what the program is doing. Python's built-in logging module can be very useful for this.

  10. Closing Resources: Ensure that all resources (like file handles, network connections, etc.) are properly closed after usage. This is typically not a problem in small scripts, but it's a good habit to get into.

  11. Code Duplication: There seems to be some code duplication (e.g., the submit and click handlers for user_input and submitBtn). If possible, try to combine these or extract common code into separate functions to avoid duplication.

Please note that these suggestions are based on general good practices for Python development. The specific needs and constraints of your project might mean that some of these suggestions are not applicable or need to be modified.
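
To make suggestions 2, 3, and 9 concrete, here is a minimal illustrative sketch; it is not the PR's code, and the argument names and defaults are assumptions, except server_port=19324, which is mentioned in suggestion 8 above.

```python
# Illustrative sketch only: argparse with help/type checking, basic error
# handling around model loading, and logging.
import argparse
import logging
import sys

from transformers import AutoModelForCausalLM, AutoTokenizer

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger(__name__)

def parse_args():
    parser = argparse.ArgumentParser(description="Gradio demo with stream output")
    parser.add_argument("--base_model", type=str, required=True,
                        help="Path or Hugging Face id of the base model")  # hypothetical flag
    parser.add_argument("--server_port", type=int, default=19324,
                        help="Port for the Gradio server (previously hard-coded)")
    return parser.parse_args()

def load_model(model_path):
    """Load tokenizer and model, failing with a clear message instead of a raw traceback."""
    try:
        tokenizer = AutoTokenizer.from_pretrained(model_path)
        model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
    except Exception as e:
        logger.error("Failed to load model from %s: %s", model_path, e)
        sys.exit(1)
    logger.info("Model loaded from %s", model_path)
    return tokenizer, model

if __name__ == "__main__":
    args = parse_args()
    tokenizer, model = load_model(args.base_model)
```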

sunyuhan19981208 (Contributor, Author)

Thanks for your valuable advice! I will modify the code following your suggestions; I am really happy that you gave me so much feedback on my code! 😃

@sunyuhan19981208 sunyuhan19981208 requested a review from ymcui June 26, 2023 13:47
ymcui (Owner) left a comment

I made some minor comments.
After resolving these issues, I'll merge this PR into the main branch.
Thanks.

with gr.Blocks() as demo:
    gr.HTML("""<h1 align="center">Chinese LLaMA & Alpaca LLM</h1>""")
    current_file_path = os.path.abspath(os.path.dirname(__file__))
    gr.Image(f'{current_file_path}/../../pics/banner.png', label='Chinese LLaMA & Alpaca LLM')
ymcui (Owner) commented on this line:

Should this be small_banner.png instead of banner.png?

                          value=128,
                          step=1.0,
                          label="Maximum New Token Length",
                          interactive=True)
    top_p = gr.Slider(0, 1, value=0.8, step=0.01,
ymcui (Owner) commented on this line:

It would be better to set top_p=0.9 as the default.

ymcui (Owner) commented Jun 28, 2023

It seems that there are no spaces between English words in stream mode.
[screenshot]
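
For context, a common cause of this symptom (a hedged guess, not taken from the PR diff): LLaMA's SentencePiece tokenizer encodes a word's leading space as the "▁" marker, and decoding each newly generated token in isolation can drop that space. A typical fix is to decode the whole generated prefix at each step and emit only the newly appended text, as in this illustrative sketch:

```python
# Illustrative sketch of the spacing pitfall, not the code from this PR.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("elinas/llama-7b-hf-transformers-4.29")  # placeholder id
ids = tokenizer("Hello world from Gradio", add_special_tokens=False)["input_ids"]

# Buggy pattern: decode each token separately; words typically run together without spaces.
per_token = "".join(tokenizer.decode([i]) for i in ids)

# Fixed pattern: keep decoding the full prefix and stream only the new suffix.
streamed, previous = "", ""
for end in range(1, len(ids) + 1):
    text = tokenizer.decode(ids[:end])
    new_text = text[len(previous):]   # newly appended characters only
    streamed += new_text
    previous = text

print(per_token)
print(streamed)  # matches the original sentence, spaces included
```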

sunyuhan19981208 (Contributor, Author)

> It seems that there are no spaces between English words in stream mode.

I will fix that.

sunyuhan19981208 (Contributor, Author)

@ymcui
[screenshot]
It is fixed now.

@sunyuhan19981208 sunyuhan19981208 requested a review from ymcui June 28, 2023 14:03
@ymcui ymcui merged commit 500547e into ymcui:main on Jun 30, 2023