Image Generation tool for Presentation Generator (PoC)#197
Open
sairishi-exe wants to merge 3 commits into
buriihenry approved these changes on Mar 24, 2025
buriihenry (Contributor) reviewed on Mar 24, 2025 and left a comment:
Superb work. It would be good to write unit tests for the slide image generator.
sairishi-exe (Author) replied:
Yes, I'm working on some unit tests and will get them out by tonight.
Description
This PR adds a new Slide Image Generator tool (similar to the Outline and Slide Generator tools) that automatically creates relevant images for presentation slides. The tool analyzes slide content, generates appropriate image prompts, and creates visually appealing images using Vertex AI's Imagen model. Slides are processed concurrently. No additional endpoints have been added, because the Slide Image Generator is used as a tool.
Type of Change
Proposed Solution
Slide Processing Pipeline
The implementation follows a structured processing pipeline:
Content Analysis: Each slide is analyzed to determine if it needs an image.
Prompt Generation: For slides needing images, the tool creates safe, effective prompts.
Concurrent Processing: Slides are processed in parallel using ThreadPoolExecutor (needs to be evaluated).
Image Generation: Prompts are sent to Vertex AI's Imagen model imagen-3.0-fast-generate-001 for fast generation.
Storage: Generated images are stored in a public Cloud Storage bucket.
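The pipeline steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `needs_image`, `build_prompt`, and `generate_image` are hypothetical stand-ins, the "skip section headers" rule is an assumption, and the Imagen call and bucket upload are replaced by a stub that returns a placeholder URL.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def slide_text(slide: dict) -> str:
    """Flatten slide content (string, bullet list, or two-column dict) to one string."""
    content = slide.get("content", "")
    if isinstance(content, list):
        return " ".join(content)
    if isinstance(content, dict):
        return " ".join(str(value) for value in content.values())
    return str(content)


def needs_image(slide: dict) -> bool:
    # Assumed rule for illustration only: section headers get no image.
    return slide.get("template") != "sectionHeader"


def build_prompt(slide: dict) -> str:
    # Hypothetical prompt template; the real tool's prompt logic may differ.
    return f"Educational illustration for '{slide['title']}': {slide_text(slide)[:200]}"


def generate_image(prompt: str) -> str:
    # Stub standing in for the Imagen call and the Cloud Storage upload.
    digest = hashlib.sha1(prompt.encode("utf-8")).hexdigest()[:12]
    return f"https://storage.example/{digest}.png"


def process_slide(slide: dict) -> dict:
    """Return a copy of the slide with an image_url (None when no image is needed)."""
    result = dict(slide)
    result["image_url"] = generate_image(build_prompt(slide)) if needs_image(slide) else None
    return result


def process_slides(slides: list[dict], max_workers: int = 4) -> list[dict]:
    # Executor.map returns results in input order, so slide order is preserved.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_slide, slides))
```

Threads suit this workload because each slide mostly waits on network calls; the worker count is a tuning knob that would need to be checked against API rate limits.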
How to Setup and Test
I. Setup on Google Cloud
First, set up a Google Cloud service account and a public bucket on Google Cloud Storage by following the Loom video below.
Google Cloud Setup Instructions
After successfully running the test_service_account.py file in marvel-ai-backend/app/tools/presentation_generator_updated/slide_image_generator/tests, you will have completed your Google Cloud setup.
II. Verification before Testing
To make sure the setup and configuration are correct, verify that:
Your .env file is at app/.env and your marvel-ai-backend-credentials.json is in the root directory. Your .env file must contain the required environment variables.
The Vertex AI API is enabled in your project on Google Cloud.
You have the appropriate Python libraries installed for the Google Cloud SDK.
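The file-location part of this checklist could be automated with a small script along these lines. It is only a sketch: the function name `missing_setup_files` is hypothetical, and it checks file presence only, not the contents of .env or whether the Vertex AI API is enabled.

```python
from pathlib import Path


def missing_setup_files(repo_root: Path) -> list[str]:
    """Return the expected setup files that are absent under repo_root."""
    expected = [
        Path("app") / ".env",                         # environment variables
        Path("marvel-ai-backend-credentials.json"),   # service account key
    ]
    return [p.as_posix() for p in expected if not (repo_root / p).is_file()]


if __name__ == "__main__":
    # Run from the marvel-ai-backend/ root directory.
    missing = missing_setup_files(Path.cwd())
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected setup files are present.")
```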
III. Testing
cd to your marvel-ai-backend/ root directory and build your Docker image using the following command:
    docker build -t marvel-ai-backend .
Run the container with the .env file explicitly mentioned, like so:
    docker run -d -p 8000:8000 \
      --env-file ./app/.env \
      --name marvel-ai-backend marvel-ai-backend
Open localhost:8000/docs and use the /submit-tool endpoint with the following sample schema:
    {
      "user": {
        "id": "string",
        "fullName": "string",
        "email": "string"
      },
      "type": "tool",
      "tool_data": {
        "tool_id": "slide-image-generator",
        "inputs": [
          {
            "name": "slides",
            "value": [
              {
                "title": "Introduction: Linear Algebra's Role in Machine Learning",
                "template": "sectionHeader",
                "content": "Linear algebra is the foundational mathematics behind many machine learning algorithms. This presentation will cover key concepts and their applications."
              },
              {
                "title": "Vectors and Matrices: Fundamental Building Blocks",
                "template": "titleAndBody",
                "content": "Vectors represent data points, while matrices represent collections of data. Understanding vector operations (addition, scalar multiplication, dot product) and matrix operations (addition, multiplication, transpose) is crucial for manipulating data in machine learning. For example, images can be represented as matrices, and features in a dataset as vectors."
              },
              {
                "title": "Linear Transformations: Manipulating Data",
                "template": "titleAndBullets",
                "content": [
                  "Linear transformations are functions that map vectors to other vectors in a linear way.",
                  "They are represented by matrices.",
                  "Examples include rotation, scaling, and projection.",
                  "Crucial for tasks like dimensionality reduction and feature scaling in machine learning."
                ]
              },
              {
                "title": "Eigenvalues and Eigenvectors: Understanding Data Structure",
                "template": "titleAndBody",
                "content": "Eigenvalues and eigenvectors reveal fundamental properties of linear transformations and matrices. Eigenvectors remain unchanged in direction after a transformation, only scaled by the corresponding eigenvalue. In Principal Component Analysis (PCA), eigenvectors represent principal components, which capture the most variance in the data."
              },
              {
                "title": "Singular Value Decomposition (SVD): Dimensionality Reduction in Action",
                "template": "titleAndBody",
                "content": "SVD decomposes a matrix into three smaller matrices, revealing its underlying structure. This is useful for dimensionality reduction techniques like Latent Semantic Analysis (LSA) in natural language processing, reducing the number of dimensions while preserving essential information."
              },
              {
                "title": "Real-World Applications: Examples in Machine Learning Algorithms",
                "template": "twoColumn",
                "content": {
                  "leftColumn": "Recommendation Systems: Collaborative filtering uses matrix factorization techniques based on SVD to predict user preferences.",
                  "rightColumn": "Image Compression: SVD can be used to reduce the size of image files by approximating the original matrix with a lower-rank matrix."
                }
              },
              {
                "title": "Hands-on Exercise: Linear Algebra in Python (e.g., using NumPy)",
                "template": "titleAndBullets",
                "content": [
                  "Use NumPy to create vectors and matrices.",
                  "Perform basic vector and matrix operations.",
                  "Implement a simple linear transformation.",
                  "Analyze a small dataset using PCA."
                ]
              },
              {
                "title": "Summary and Conclusion: Key Takeaways and Further Exploration",
                "template": "titleAndBody",
                "content": "This presentation provided an overview of essential linear algebra concepts for machine learning. Further exploration of topics like matrix decompositions, vector spaces, and norms is recommended for a deeper understanding."
              }
            ]
          }
        ]
      }
    }
Functionality Showcase
Example Usage of the Slide Image Generator
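As an alternative to the /docs page, the /submit-tool request from the Testing section could also be sent from a script. The sketch below assumes the container is reachable at http://localhost:8000 and that no extra authentication headers are required, which this PR description does not confirm; `build_payload` and `submit_tool` are hypothetical helper names, and the response shape is not specified here.

```python
import json
import urllib.request


def build_payload(slides: list[dict]) -> dict:
    """Wrap slides in the request schema shown in the Testing section."""
    return {
        "user": {"id": "string", "fullName": "string", "email": "string"},
        "type": "tool",
        "tool_data": {
            "tool_id": "slide-image-generator",
            "inputs": [{"name": "slides", "value": slides}],
        },
    }


def submit_tool(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to /submit-tool and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}/submit-tool",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: only a JSON content type is needed; add auth headers if required.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Image generation can be slow, so allow a generous timeout.
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, a call such as `submit_tool(build_payload(slides))` would send the same body as the sample schema.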
Unit Tests
No unit tests have been implemented, but you can check out the test file at marvel-ai-backend/app/tools/presentation_generator_updated/slide_image_generator/tests/test_service_account.py to test basic Google Cloud service account usage.
Future Enhancements
Potential future improvements (not included in this PR):
Documentation Updates
Documentation needs to be updated to include:
Checklist
Additional Information
Detailed Notion Document WIP