Add support for stream output in Gradio demo #630


Merged: 8 commits merged into ymcui:main on Jun 30, 2023

Conversation

sunyuhan19981208 (Contributor)

Description:

This pull request adds support for streaming output in the Gradio demo. With this enhancement, users can visualize their model's output as a real-time stream, enabling a more interactive and dynamic experience.
Demo video: https://www.bilibili.com/video/BV1zW4y1S7tS/?vd_source=87997328b39dd6fa2449ef2da981cfcc

Changes Made:

  • Implemented stream inference (a minimal illustrative sketch follows this list).
  • Added stream output support to the Gradio Chatbot component.
  • Added interactive options and sliders for top_k, do_sample, and repetition_penalty.
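
For readers who want to see how the streaming pieces fit together, here is a minimal hypothetical sketch, not the PR's actual code: the model id is taken from the recommendation later in this thread and the sampling values (top_k, repetition_penalty) are illustrative. Generation runs in a background thread, a transformers TextIteratorStreamer yields decoded chunks, and a generator function appends each chunk to the last Chatbot turn.

```python
# Hypothetical sketch of streaming generation into a gr.Chatbot (not the PR's code).
from threading import Thread

import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_name = "elinas/llama-7b-hf-transformers-4.29"  # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

def predict(user_input, history):
    # Add a new (user, bot) turn; the bot side starts empty and fills in as we stream.
    history = (history or []) + [[user_input, ""]]
    inputs = tokenizer(user_input, return_tensors="pt").to(model.device)
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    generation_kwargs = dict(
        **inputs,
        streamer=streamer,
        max_new_tokens=128,
        do_sample=True,
        top_p=0.9,
        top_k=40,                 # illustrative value
        repetition_penalty=1.1,   # illustrative value
    )
    # Run generate in a thread so we can consume the streamer in this generator.
    Thread(target=model.generate, kwargs=generation_kwargs).start()
    for new_text in streamer:
        history[-1][1] += new_text
        yield history

with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    user_input = gr.Textbox()
    user_input.submit(predict, [user_input, chatbot], chatbot)

demo.queue().launch()
```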

@ymcui ymcui self-requested a review June 19, 2023 04:12
@@ -143,24 +138,34 @@ def predict(
with gr.Row():
ymcui (Owner) commented on the diff:

line 135: The banner image seems too big; please consider resizing it to a smaller one if possible. Also, use an absolute path for this image, as someone may run gradio_web_demo.py from another path.

line 136: missing "。" at the end of the sentence.

ymcui (Owner) commented Jun 19, 2023

Hi,

Thanks again for your PR. I have checked that the script is functional on Colab.
Here are some comments on this PR:

  1. Please also consider updating gradio_web_demo.ipynb. Also note that the recommended third-party LLaMA has been changed to elinas/llama-7b-hf-transformers-4.29.
  2. Someone may run this script from a path other than the scripts/inference folder. As such, please replace the banner image path with an absolute one.

Comments from GPT-4 Code Interpreter (it is up to you whether to follow its advice):

The code seems well structured, but here are a few suggestions that might improve its overall quality:

  1. Code Organization: Separate the code into functions or classes based on functionality. For instance, the model loading and setup part can be a separate function. It's a good practice to keep your code modular. This way, you can reuse your code and it will be easier to read and maintain.

  2. Error Handling: There seems to be a lack of error handling in the code. Try to anticipate what might go wrong and add appropriate error handling code. For example, if the model or tokenizer fails to load, the code should handle these exceptions and give a meaningful error message to the user.

  3. Command-Line Arguments: Consider using the argparse module's functionality to add help messages and type checking for the command-line arguments. This can make it easier for users to understand how to use the script and prevent incorrect usage.

  4. Comments: Although the code is mostly clear, adding comments to explain what each part of the code does can be helpful, especially for complex sections. This can make the code easier to understand for others, and for you in the future.

  5. Code Formatting: The code formatting seems inconsistent in some places. For example, there are spaces around operators in some places but not in others. Adhere to a specific style guide (like PEP8 for Python) to ensure consistency. Tools like flake8 or black can help with this.

  6. Model Selection: The code seems to only support a specific model (LLaMA), and it's unclear what happens if the user tries to use a different model. Consider adding support for other models or clearly documenting that only this specific model is supported.

  7. Unused Variables: The variable load_type is defined but not used in the code. Make sure to remove any unused variables to clean up the code.

  8. Hard-Coded Values: There are some hard-coded values in the code (e.g., max_memory=256, server_port=19324). Consider making these configurable via command-line arguments or a configuration file.

  9. Logging: Add logging for key events and errors. This can help with troubleshooting and understanding what the program is doing. Python's built-in logging module can be very useful for this.

  10. Closing Resources: Ensure that all resources (like file handles, network connections, etc.) are properly closed after usage. This is typically not a problem in small scripts, but it's a good habit to get into.

  11. Code Duplication: There seems to be some code duplication (e.g., the submit and click handlers for user_input and submitBtn). If possible, try to combine these or extract common code into separate functions to avoid duplication.

Please note that these suggestions are based on general good practices for Python development. The specific needs and constraints of your project might mean that some of these suggestions are not applicable or need to be modified.
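
To make suggestions 2, 3, and 9 concrete, here is a minimal illustrative sketch; it is not the PR's code, and the argument names and defaults are assumptions, except server_port=19324, which is mentioned in suggestion 8 above.

```python
# Illustrative sketch only: argparse with help/type checking, basic error
# handling around model loading, and logging.
import argparse
import logging
import sys

from transformers import AutoModelForCausalLM, AutoTokenizer

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger(__name__)

def parse_args():
    parser = argparse.ArgumentParser(description="Gradio demo with stream output")
    parser.add_argument("--base_model", type=str, required=True,
                        help="Path or Hugging Face id of the base model")  # hypothetical flag
    parser.add_argument("--server_port", type=int, default=19324,
                        help="Port for the Gradio server (previously hard-coded)")
    return parser.parse_args()

def load_model(model_path):
    """Load tokenizer and model, failing with a clear message instead of a raw traceback."""
    try:
        tokenizer = AutoTokenizer.from_pretrained(model_path)
        model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
    except Exception as e:
        logger.error("Failed to load model from %s: %s", model_path, e)
        sys.exit(1)
    logger.info("Model loaded from %s", model_path)
    return tokenizer, model

if __name__ == "__main__":
    args = parse_args()
    tokenizer, model = load_model(args.base_model)
```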

sunyuhan19981208 (Contributor, Author)

Thanks for your valuable advice! I will modify the code following your suggestions; I am really happy that you gave me so much feedback on my code! 😃

@sunyuhan19981208 sunyuhan19981208 requested a review from ymcui June 26, 2023 13:47
ymcui (Owner) left a comment

I made some minor comments.
After resolving these issues, I'll merge this PR into the main branch.
Thanks.

with gr.Blocks() as demo:
    gr.HTML("""<h1 align="center">Chinese LLaMA & Alpaca LLM</h1>""")
    current_file_path = os.path.abspath(os.path.dirname(__file__))
    gr.Image(f'{current_file_path}/../../pics/banner.png', label='Chinese LLaMA & Alpaca LLM')
ymcui (Owner) commented on this line:

Should this be small_banner.png instead of banner.png?

                          value=128,
                          step=1.0,
                          label="Maximum New Token Length",
                          interactive=True)
    top_p = gr.Slider(0, 1, value=0.8, step=0.01,
ymcui (Owner) commented on this line:

It would be better to set top_p=0.9 as the default.

ymcui (Owner) commented Jun 28, 2023

It seems that there are no spaces between English words in stream mode.
[screenshot]
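
For context, a common cause of this symptom (a hedged guess, not taken from the PR diff): LLaMA's SentencePiece tokenizer encodes a word's leading space as the "▁" marker, and decoding each newly generated token in isolation can drop that space. A typical fix is to decode the whole generated prefix at each step and emit only the newly appended text, as in this illustrative sketch:

```python
# Illustrative sketch of the spacing pitfall, not the code from this PR.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("elinas/llama-7b-hf-transformers-4.29")  # placeholder id
ids = tokenizer("Hello world from Gradio", add_special_tokens=False)["input_ids"]

# Buggy pattern: decode each token separately; words typically run together without spaces.
per_token = "".join(tokenizer.decode([i]) for i in ids)

# Fixed pattern: keep decoding the full prefix and stream only the new suffix.
streamed, previous = "", ""
for end in range(1, len(ids) + 1):
    text = tokenizer.decode(ids[:end])
    new_text = text[len(previous):]   # newly appended characters only
    streamed += new_text
    previous = text

print(per_token)
print(streamed)  # matches the original sentence, spaces included
```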

sunyuhan19981208 (Contributor, Author)

> It seems that there are no spaces between English words in stream mode.

I will fix that.

sunyuhan19981208 (Contributor, Author)

@ymcui
[screenshot]
It is fixed now.

@sunyuhan19981208 sunyuhan19981208 requested a review from ymcui June 28, 2023 14:03
@ymcui ymcui merged commit 500547e into ymcui:main on Jun 30, 2023