Support vision input for Planner #472
liqul commented on Mar 11, 2025
- Modified message formatting to support vision input for OpenAI API
- Added a role ImageReader to process input images so the Planner can get the URL/content of the image
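
As a rough illustration of the message shape involved: the OpenAI chat API accepts a `content` field that is either a plain string or a list of typed parts, where an image is passed as an `image_url` part. The helper below is a minimal sketch under that assumption; `build_vision_message` is a hypothetical name and is not the function added in this PR.

```python
from typing import Any, Dict, List, Optional


def build_vision_message(
    role: str,
    text: str,
    image_urls: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper, not TaskWeaver's own formatting function."""
    # Without images, keep the plain-string content form.
    if not image_urls:
        return {"role": role, "content": text}

    # With images, switch to the list-of-parts form: one text part
    # followed by one image_url part per image.
    parts: List[Dict[str, Any]] = [{"type": "text", "text": text}]
    for url in image_urls:
        parts.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": role, "content": parts}
```

The `url` value can be either a regular HTTP(S) URL or a Base64 data URL, which is why a local image has to be converted before it can be sent.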
PR Overview
This pull request adds support for vision input in the Planner by updating message formatting and introducing a new role for processing image inputs.
- Updated Chat message types and introduced a helper for constructing content messages with image URLs.
- Created a new ImageReader role with corresponding configuration to read images and convert local paths into data URLs.
- Modified Planner and attachment handling to incorporate image attachment processing.
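
The conversion of a local path into a data URL mentioned above could look roughly like the following. This is a sketch using only the standard library; the function name `local_image_to_data_url` and the fallback MIME type are assumptions for illustration, not necessarily what `image_reader.py` does.

```python
import base64
import mimetypes
from pathlib import Path


def local_image_to_data_url(image_path: str) -> str:
    """Hypothetical sketch: encode a local image file as a Base64 data URL."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"The file {image_path} does not exist.")

    # Guess the MIME type from the file extension; fall back to a
    # generic type when the extension is unknown (an assumption here).
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None:
        mime_type = "application/octet-stream"

    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Raising an exception for a missing file is a choice made for this sketch; a role implementation might instead report the problem back to the Planner as a message.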
Reviewed Changes
File | Description |
---|---|
taskweaver/llm/util.py | Updated ChatMessage and added format_chat_message_content for vision input support. |
taskweaver/ext_role/image_reader/image_reader.py | Added new ImageReader role to process image paths and generate data URLs. |
taskweaver/ext_role/image_reader/image_reader.role.yaml | Added YAML configuration for the ImageReader role. |
taskweaver/planner/planner.py | Modified conversation composition to include image attachments. |
taskweaver/memory/attachment.py | Extended AttachmentType enum to include image_url for vision input. |
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
…sers/liqun/vision
Pull Request Overview
This pull request adds support for vision input to the Planner role by introducing a new ImageReader role that processes image paths provided in messages and converts local images to Base64 data URLs for downstream consumption. Additional changes include updates to the message formatting functions to support image URLs, several Markdown documentation updates, and adjustments to attachment handling.
Reviewed Changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
taskweaver/ext_role/image_reader/image_reader.py | New ImageReader role implementation to process image inputs |
taskweaver/llm/util.py | Updated chat message formatting to include image URL support |
website/blog/authors.yml | Added author details |
taskweaver/ext_role/image_reader/image_reader.role.yaml | Defined role configuration for ImageReader |
taskweaver/planner/planner.py | Updated logic to process attachments with image URLs |
taskweaver/code_interpreter/code_generator.py | Adjusted attachment handling to use the content field |
website/blog/local_llm.md, evaluation.md, experience.md | Updated markdown front matter and content |
taskweaver/memory/attachment.py & post.py | Updated AttachmentType and get_attachment return type for image_url support |
taskweaver/code_interpreter/*, chat/console/chat.py | Minor content and messaging updates throughout the codebase |
README.md | Updated news section to reflect vision input support |
taskweaver/code_interpreter/code_interpreter_cli_only/code_interpreter_cli_only.py | Adjusted attachment handling for reply content |
Comments suppressed due to low confidence (1)
taskweaver/memory/post.py:90
- The get_attachment function now returns Attachment objects instead of their content strings; verify that calling code in other modules properly accesses the 'content' attribute where required.
def get_attachment(self, type: AttachmentType) -> List[Attachment]:
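
To make the impact on callers concrete, here is a simplified sketch of the changed return type. The `Attachment`, `AttachmentType`, and `Post` classes below are reduced stand-ins written for illustration (only `image_url` and the `get_attachment` signature come from this PR); the real classes in `taskweaver/memory` carry more fields.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class AttachmentType(Enum):
    # Reduced stand-in enum; "plan" is included only as a second example value.
    plan = "plan"
    image_url = "image_url"


@dataclass
class Attachment:
    type: AttachmentType
    content: str


@dataclass
class Post:
    attachment_list: List[Attachment] = field(default_factory=list)

    def get_attachment(self, type: AttachmentType) -> List[Attachment]:
        # Returns the matching Attachment objects, not their content strings.
        return [a for a in self.attachment_list if a.type == type]


post = Post(
    [
        Attachment(AttachmentType.plan, "1. read the image"),
        Attachment(AttachmentType.image_url, "data:image/png;base64,AAAA"),
    ],
)

# Callers that previously received strings now read `.content` explicitly.
image_urls = [a.content for a in post.get_attachment(AttachmentType.image_url)]
```

Any caller that still treats the result as a list of strings would need the same `.content` access, which is what the review comment asks to verify.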
Co-authored-by: Copilot <[email protected]>
Pull Request Overview
This PR adds vision input support for the Planner role by modifying chat message formatting and introducing a new ImageReader role to process image inputs. Key changes include updating message formatting in llm/util.py, adding a new image conversion and reader implementation in ext_role/image_reader, and propagating image attachment support throughout the codebase.
Reviewed Changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
taskweaver/llm/util.py | Updated chat message formatting to support image URLs via a new content structure |
taskweaver/ext_role/image_reader/image_reader.py | Added ImageReader role with local image-to-data URL conversion and image processing logic |
website/blog/authors.yml | Added author metadata for new contributors |
taskweaver/planner/planner.py | Updated conversation composition to handle image attachments |
website/blog/local_llm.md, evaluation.md, experience.md | Introduced/upgraded YAML front matter for blog posts |
taskweaver/code_interpreter/code_interpreter/code_generator.py | Fixed extraction of attachment content for code feedback |
taskweaver/memory/attachment.py | Added new attachment type “image_url” |
taskweaver/code_interpreter/code_interpreter_plugin_only/code_interpreter_plugin_only.py | Updated extraction of function attachment content |
README.md | Updated news section to announce vision input support |
taskweaver/code_interpreter/code_interpreter_cli_only/code_interpreter_cli_only.py | Adjusted extraction of reply content from attachments |
taskweaver/chat/console/chat.py | Minor update to the system message for new sessions |
taskweaver/memory/post.py | Updated get_attachment to return Attachment objects instead of their content |
Comments suppressed due to low confidence (1)
taskweaver/ext_role/image_reader/image_reader.py:31
- Consider replacing print statements with logger.error calls to ensure that error messages are correctly captured in production logs.
print(f"Error: The file {image_path} does not exist.")
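
One way to act on this suggestion is sketched below with the standard `logging` module. The function name `check_image_path` and the module-level logger are assumptions for illustration; TaskWeaver may route logging through its own logger object instead.

```python
import logging
import os

logger = logging.getLogger(__name__)


def check_image_path(image_path: str) -> bool:
    """Hypothetical sketch: report a missing file through the logger."""
    if not os.path.exists(image_path):
        # logger.error goes through the configured handlers, so the
        # message reaches production logs instead of only stdout.
        logger.error("The file %s does not exist.", image_path)
        return False
    return True
```

Passing `image_path` as a logging argument rather than formatting it with an f-string defers string interpolation until a handler actually emits the record.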
Co-authored-by: Copilot <[email protected]>