Add image embedding support to ml/rag#37628

Merged
claudevdm merged 7 commits into apache:master from claudevdm:image-embeddings
Feb 18, 2026

Conversation

@claudevdm
Collaborator

Please add a meaningful description for your change here



@gemini-code-assist
Contributor

Summary of Changes

Hello @claudevdm, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds image embedding support to Apache Beam's ML RAG components. It introduces classes that embed image content with HuggingFace and Vertex AI models alongside the existing text embeddings, extends EmbeddableItem so that it can carry image content, and updates the vector database ingestion modules to store either text or an image URI.

Highlights

  • Image Embedding Support: Introduced new classes and functionalities to support image embeddings for both HuggingFace and Vertex AI models within the RAG pipeline. This allows users to generate embeddings from image content, expanding the multimodal capabilities of Apache Beam's ML RAG components.
  • Enhanced EmbeddableItem: The EmbeddableItem data structure has been updated to include an optional image field in its Content attribute. A new from_image factory method simplifies the creation of image-based EmbeddableItem instances, and a content_string property was added to provide a unified string representation for ingestion, prioritizing text over image URI.
  • Ingestion Pipeline Updates: BigQuery, MySQL, PostgreSQL, and Spanner ingestion modules were modified to leverage the new content_string property of EmbeddableItem. This ensures that these pipelines can correctly process and store both text and image URI content, adapting to the expanded EmbeddableItem definition.
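The data-model change described in the highlights can be sketched roughly as follows. This is a simplified stand-in, not the code from the PR: the class names `SketchContent` and `SketchEmbeddableItem`, the `from_image` signature, the `default_item_to_dict` helper, and the choice to return `None` for raw image bytes are all assumptions made for illustration. Only the `image: Optional[Union[bytes, str]]` field, the `from_image` and `content_string` names, and the "text first, then image URI" priority come from the summary.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Union


@dataclass
class SketchContent:
    """Stand-in for the Content dataclass (not Beam's actual class)."""
    text: Optional[str] = None
    # Either raw image bytes or a string path/URI pointing at the image.
    image: Optional[Union[bytes, str]] = None


@dataclass
class SketchEmbeddableItem:
    """Stand-in for EmbeddableItem; the field layout is an assumption."""
    id: str
    content: SketchContent
    metadata: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_image(cls, image: Union[bytes, str], id: str,
                   metadata: Optional[Dict[str, Any]] = None):
        # Hypothetical signature: build an item whose only content is an image.
        return cls(
            id=id,
            content=SketchContent(image=image),
            metadata=dict(metadata or {}))

    @property
    def content_string(self) -> Optional[str]:
        # Text takes priority over the image.
        if self.content.text is not None:
            return self.content.text
        # A string image is treated as a storable path or URI.
        if isinstance(self.content.image, str):
            return self.content.image
        # Assumption: raw bytes have no storable string form.
        return None


def default_item_to_dict(item: SketchEmbeddableItem) -> Dict[str, Any]:
    """Hypothetical ingestion mapping that relies on content_string."""
    value = item.content_string
    if value is None:
        raise ValueError(f"Item {item.id!r} has no text or image URI to store")
    return {"id": item.id, "content": value}
```

With this shape, an ingestion sink does not need to know whether an item holds text or an image reference; it only has to decide what to do when `content_string` is `None`.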

Changelog
  • sdks/python/apache_beam/ml/rag/embeddings/base_test.py
    • Imported EmbeddableItem for testing purposes.
    • Added ImageEmbeddableItemTest to validate the EmbeddableItem.from_image factory method.
    • Added ContentStringTest to verify the functionality of the new content_string property.
  • sdks/python/apache_beam/ml/rag/embeddings/huggingface.py
    • Added imports for io, Sequence, _add_embedding_fn, EmbeddingTypeAdapter, _ImageEmbeddingHandler, and PIL.Image.
    • Updated the docstring for HuggingfaceTextEmbeddings to provide more detailed argument descriptions.
    • Implemented _extract_images to convert EmbeddableItem image content into PIL Image objects.
    • Created _create_hf_image_adapter to generate an EmbeddingTypeAdapter specifically for HuggingFace image embeddings.
    • Introduced HuggingfaceImageEmbeddings class to manage image embedding using HuggingFace models.
  • sdks/python/apache_beam/ml/rag/embeddings/huggingface_test.py
    • Added imports for os, HuggingfaceImageEmbeddings, _create_hf_image_adapter, EmbeddableItem, and PIL.Image.
    • Included a PIL_AVAILABLE check to conditionally run tests based on Pillow library availability.
    • Added HuggingfaceImageAdapterTest to test the functionality of the HuggingFace image adapter.
    • Implemented HuggingfaceImageEmbeddingsTest to verify the end-to-end image embedding pipeline for HuggingFace models, including tests for both image bytes and file paths.
    • Added a helper function _create_png_bytes to generate sample PNG image data for tests.
  • sdks/python/apache_beam/ml/rag/embeddings/vertex_ai.py
    • Added imports for Sequence, _add_embedding_fn, EmbeddingTypeAdapter, _ImageEmbeddingHandler, Chunk, _VertexAIImageEmbeddingHandler, and vertexai.vision_models.Image.
    • Modified the get_ptransform_for_processing method in VertexAITextEmbeddings to return beam.PCollection[Chunk].
    • Implemented _extract_images to convert EmbeddableItem image content into Vertex AI Image objects.
    • Created _create_image_adapter to generate an EmbeddingTypeAdapter for Vertex AI image embeddings.
    • Introduced VertexAIImageEmbeddings class to manage image embedding using Vertex AI models.
  • sdks/python/apache_beam/ml/rag/embeddings/vertex_ai_test.py
    • Added imports for os, EmbeddableItem, VertexAIImageEmbeddings, and _create_image_adapter.
    • Implemented VertexAIImageAdapterTest to test the functionality of the Vertex AI image adapter.
    • Added VertexAIImageEmbeddingsTest to verify the end-to-end image embedding pipeline for Vertex AI models, including tests for both image bytes and file paths.
    • Included a static helper method _create_png_bytes to generate sample PNG image data for tests.
  • sdks/python/apache_beam/ml/rag/ingestion/bigquery.py
    • Updated example embeddable_to_dict_fn to use item instead of chunk for consistency.
    • Modified _default_embeddable_to_dict_fn to utilize item.content_string and adjusted the ValueError message.
    • Updated the lambda variable name in the expand method for improved readability.
  • sdks/python/apache_beam/ml/rag/ingestion/mysql_common.py
    • Modified the value_fn within with_content_spec to use chunk.content_string and removed the explicit ValueError check for content.text.
  • sdks/python/apache_beam/ml/rag/ingestion/postgres_common.py
    • Modified the value_fn within with_content_spec to use chunk.content_string and removed the explicit ValueError check for content.text.
  • sdks/python/apache_beam/ml/rag/ingestion/spanner.py
    • Modified the extract_fn within with_content_spec to use embeddable.content_string and updated its return type to Optional[str], removing the ValueError check.
  • sdks/python/apache_beam/ml/rag/types.py
    • Imported Union for type hinting flexibility.
    • Added an image: Optional[Union[bytes, str]] = None field to the Content dataclass to support image content.
    • Introduced a from_image class method to EmbeddableItem for convenient creation of items with image content.
    • Added a content_string property to EmbeddableItem that returns a storable string representation, prioritizing text content or image URI.
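The changelog mentions a `_create_png_bytes` test helper but does not show how it is implemented. One way to produce sample PNG data without any imaging library is sketched below; it uses only the standard library and is an illustrative alternative, not the helper from the PR. The function name `create_png_bytes` and its parameters are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _png_chunk(kind: bytes, data: bytes) -> bytes:
    # A PNG chunk is: 4-byte length, 4-byte type, data, CRC32 of type + data.
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def create_png_bytes(width: int = 1, height: int = 1,
                     rgb: tuple = (255, 0, 0)) -> bytes:
    """Build a solid-color, 8-bit RGB PNG entirely in memory."""
    # IHDR: width, height, bit depth 8, color type 2 (truecolor RGB),
    # then compression, filter, and interlace methods all set to 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    # Each scanline starts with a filter-type byte (0 = no filter).
    row = b"\x00" + bytes(rgb) * width
    idat = zlib.compress(row * height)
    return (
        PNG_SIGNATURE
        + _png_chunk(b"IHDR", ihdr)
        + _png_chunk(b"IDAT", idat)
        + _png_chunk(b"IEND", b""))
```

Bytes produced this way can stand in for image content in unit tests, or be written to a temporary file to exercise a file-path code path.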
Activity
  • The pull request was opened by claudevdm with the title 'image embeddings'.
  • The initial description is a placeholder, indicating that a more meaningful description is needed from the author.
  • No further activity, comments, or reviews have been recorded since the pull request's creation.

@claudevdm
Collaborator Author

/gemini review

@gemini-code-assist
Contributor

gemini-code-assist bot left a comment

Code Review

The pull request successfully introduces support for image embeddings in the RAG pipeline, covering both HuggingFace and Vertex AI implementations. It correctly updates the core types and ingestion logic to handle multimodal content. The main areas for improvement are ensuring robust handling of GCS URIs when loading images and minor cleanup of error messages and docstrings.
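The review does not spell out what robust handling of GCS URIs would look like. One possible shape is sketched below, assuming the image field holds either raw bytes or a string path/URI: the function takes an injectable opener so that remote schemes such as `gs://` are not passed to the built-in `open`. The names `load_image_bytes` and `open_fn` are hypothetical, and this is not the code that was merged.

```python
from typing import BinaryIO, Callable, Union


def _open_local(path: str) -> BinaryIO:
    # Default opener: only works for paths on the local filesystem.
    return open(path, "rb")


def load_image_bytes(
    image: Union[bytes, str],
    open_fn: Callable[[str], BinaryIO] = _open_local) -> bytes:
    """Return raw image bytes from either in-memory bytes or a path/URI."""
    if isinstance(image, (bytes, bytearray)):
        # Already in memory; nothing to read.
        return bytes(image)
    if isinstance(image, str):
        # Delegate to the opener so the caller decides how URIs are resolved.
        with open_fn(image) as handle:
            return handle.read()
    raise TypeError(f"Unsupported image type: {type(image).__name__}")
```

In a Beam pipeline, the opener could presumably be `apache_beam.io.filesystems.FileSystems.open`, which resolves the filesystem from the path scheme; that pairing is a suggestion and has not been checked against this PR.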

@claudevdm claudevdm marked this pull request as ready for review February 18, 2026 14:08
@claudevdm
Collaborator Author

R: @damccorm

@github-actions
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment assign set of reviewers

@claudevdm claudevdm changed the title from "image embeddings." to "Add image embedding support to ml/rag" Feb 18, 2026
@damccorm
Contributor

damccorm left a comment

This generally LGTM, just had one question

@damccorm
Contributor

damccorm left a comment

LGTM once checks complete

@claudevdm claudevdm merged commit cecc2a6 into apache:master Feb 18, 2026
108 of 109 checks passed
@Abacn
Contributor

Abacn commented Feb 20, 2026

It's likely breaking XVR tests #30601 #30602 #31418, given that the Beam 2.72.0 branch is good and this is the only commit at the first breakage: https://github.com/apache/beam/actions/workflows/beam_PostCommit_XVR_Flink.yml?query=

@claudevdm
Collaborator Author

> It's likely breaking XVR tests #30601 #30602 #31418, given that the Beam 2.72.0 branch is good and this is the only commit at the first breakage: https://github.com/apache/beam/actions/workflows/beam_PostCommit_XVR_Flink.yml?query=

Responded on #30602

'numpy>=1.14.3,<2.5.0', # Update pyproject.toml as well.
'objsize>=0.6.1,<0.8.0',
'packaging>=22.0',
'pillow',
Contributor

I would suggest using an upper bound unless we don't expect breaking changes that can affect us in the future, see:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=290982440#DependencymanagementguidelinesforBeamPythonSDKmaintainers-Howtoaddanewdependency?
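To make the suggestion concrete, a small check like the following could flag requirement strings that lack an upper bound. It is a rough sketch using string matching rather than a real requirement parser, and the helper name `has_upper_bound` is hypothetical. The requirement strings are the four lines quoted in the review context above; no specific bound for `pillow` is proposed here.

```python
import re

# The four dependency lines quoted in the review context.
REQUIREMENTS = [
    "numpy>=1.14.3,<2.5.0",
    "objsize>=0.6.1,<0.8.0",
    "packaging>=22.0",
    "pillow",
]


def has_upper_bound(requirement: str) -> bool:
    """Heuristically report whether a requirement caps the version."""
    # Ignore any environment marker after ';'.
    spec_part = requirement.split(";", 1)[0]
    # Find where the version specifiers start (first operator character).
    match = re.search(r"[<>=!~]", spec_part)
    if match is None:
        return False  # Bare package name: no constraints at all.
    clauses = [c.strip() for c in spec_part[match.start():].split(",")]
    # '<', '<=', '==' and '~=' all limit how new the version may be.
    return any(c.startswith(("<", "==", "~=")) for c in clauses)


unbounded = [r for r in REQUIREMENTS if not has_upper_bound(r)]
```

Here `unbounded` collects the entries that would accept any future release, which includes the newly added `pillow` line that the comment is about.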
