Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support vision input for Planner #472

Merged
merged 14 commits into from
Mar 14, 2025
Merged

Support vision input for Planner #472

merged 14 commits into from
Mar 14, 2025

Conversation

liqul
Copy link
Collaborator

@liqul liqul commented Mar 11, 2025

  • Modified message formatting to support vision input for OpenAI API
  • Added a role ImageReader to process input images so the Planner can get the URL/content of the image

@liqul liqul requested a review from Copilot March 11, 2025 09:30

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Overview

This pull request adds support for vision input in the Planner by updating message formatting and introducing a new role for processing image inputs.

  • Updated Chat message types and introduced a helper for constructing content messages with image URLs.
  • Created a new ImageReader role with corresponding configuration to read images and convert local paths into data URLs.
  • Modified Planner and attachment handling to incorporate image attachment processing.

Reviewed Changes

File Description
taskweaver/llm/util.py Updated ChatMessage and added format_chat_message_content for vision input support.
taskweaver/ext_role/image_reader/image_reader.py Added new ImageReader role to process image paths and generate data URLs.
taskweaver/ext_role/image_reader/image_reader.role.yaml Added YAML configuration for the ImageReader role.
taskweaver/planner/planner.py Modified conversation composition to include image attachments.
taskweaver/memory/attachment.py Extended AttachmentType enum to include image_url for vision input.

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

@liqul liqul requested review from Copilot, Jack-Q and ShilinHe March 14, 2025 08:28
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request adds support for vision input to the Planner role by introducing a new ImageReader role that can process image paths provided in messages and converts local images to Base64 data URLs for downstream consumption. Additional changes include updates to message formatting functions to support image URLs, various markdown documentation updates, and adjustments in attachment handling.

Reviewed Changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
taskweaver/ext_role/image_reader/image_reader.py New ImageReader role implementation to process image inputs
taskweaver/llm/util.py Updated chat message formatting to include image URL support
website/blog/authors.yml Added author details
taskweaver/ext_role/image_reader/image_reader.role.yaml Defined role configuration for ImageReader
taskweaver/planner/planner.py Updated logic to process attachments with image URLs
taskweaver/code_interpreter/code_generator.py Adjusted attachment handling to use the content field
website/blog/local_llm.md, evaluation.md, experience.md Updated markdown front matter and content
taskweaver/memory/attachment.py & post.py Updated AttachmentType and get_attachment return type for image_url support
taskweaver/code_interpreter/*, chat/console/chat.py Minor content and messaging updates throughout the codebase
README.md Updated news section to reflect vision input support
taskweaver/code_interpreter/code_interpreter_cli_only/code_interpreter_cli_only.py Adjusted attachment handling for reply content
Comments suppressed due to low confidence (1)

taskweaver/memory/post.py:90

  • The get_attachment function now returns Attachment objects instead of their content strings; verify that calling code in other modules properly accesses the 'content' attribute where required.
def get_attachment(self, type: AttachmentType) -> List[Attachment]:

@liqul liqul requested a review from Copilot March 14, 2025 08:35
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds vision input support for the Planner role by modifying chat message formatting and introducing a new ImageReader role to process image inputs. Key changes include updating message formatting in llm/util.py, adding a new image conversion and reader implementation in ext_role/image_reader, and propagating image attachments support throughout the codebase.

Reviewed Changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
taskweaver/llm/util.py Updated chat message formatting to support image URLs via a new content structure
taskweaver/ext_role/image_reader/image_reader.py Added ImageReader role with local image-to-data URL conversion and image processing logic
website/blog/authors.yml Added author metadata for new contributors
taskweaver/planner/planner.py Updated conversation composition to handle image attachments
website/blog/local_llm.md, evaluation.md, experience.md Introduced/upgraded YAML front matter for blog posts
taskweaver/code_interpreter/code_interpreter/code_generator.py Fixed extraction of attachment content for code feedback
taskweaver/memory/attachment.py Added new attachment type “image_url”
taskweaver/code_interpreter/code_interpreter_plugin_only/code_interpreter_plugin_only.py Updated extraction of function attachment content
README.md Updated news section to announce vision input support
taskweaver/code_interpreter/code_interpreter_cli_only/code_interpreter_cli_only.py Adjusted extraction of reply content from attachments
taskweaver/chat/console/chat.py Minor update to the system message for new sessions
taskweaver/memory/post.py Updated get_attachment to return Attachment objects instead of their content
Comments suppressed due to low confidence (1)

taskweaver/ext_role/image_reader/image_reader.py:31

  • Consider replacing print statements with logger.error calls to ensure that error messages are correctly captured in production logs.
print(f"Error: The file {image_path} does not exist.")

@liqul liqul merged commit f605b97 into main Mar 14, 2025
2 checks passed
@liqul liqul deleted the users/liqun/vision branch March 14, 2025 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants