Commit c4d4ed1

layolin authored and Copybara committed
feat(vto): Add platform links and improve documentation for VTO workflow
- Add 4 platform launch links (Colab, Colab Enterprise, Vertex AI, GitHub)
- Rename notebook files from LJ_GenMedia_Workflow to VTO_GenMedia_Workflow
- Rename images folder to dress for consistency
- Update all references from 'Dress' to lowercase 'dress' throughout
- Add prerequisites section explaining how to copy dress images to GCS
- Document OUTFITS_PREFIX configuration variable
- Remove emoji icons from documentation
- Streamline README by removing unnecessary sections
- Include 4 sample dress images (blue, multi, pink, red)
- Apply linter fixes for consistent formatting

Change-Id: I26a5d9dfe4208be991054203b96cc57f3084fa13
GitOrigin-RevId: 753408edd5c57ae61a1f201e34de5c772d4239d4
1 parent ea3a880 commit c4d4ed1

7 files changed: +104 -129 lines changed

projects/ai/gen-media/notebooks/vto_scale_workflow/README.md

Lines changed: 34 additions & 105 deletions
@@ -20,7 +20,7 @@ applications that:
 
 ## Key Features
 
-### 🎯 Multi-Model AI Pipeline
+### Multi-Model AI Pipeline
 
 - **Model Generation**: Creates photorealistic digital models with diverse
 demographics (race, body type, age)
@@ -32,7 +32,7 @@ applications that:
 - **Scalable Architecture**: Processes batch inputs from CSV with parallel
 execution
 
-### 🔧 Technologies Used
+### Technologies Used
 
 - **Google Vertex AI**: Primary platform for all AI operations
 - **Gemini 2.5 Flash**: Orchestration, prompt generation, and quality critique
@@ -45,7 +45,13 @@ applications that:
 
 ```text
 vto_scale_workflow/
-├── LJ_GenMedia_Workflow.ipynb # Main Jupyter notebook with complete pipeline
+├── VTO_GenMedia_Workflow.ipynb # Main Jupyter notebook with complete pipeline
+├── VTO_GenMedia_Workflow.nb.py # Python script version of the notebook
+├── dress/ # Sample dress images for VTO
+│ ├── blue-front.png
+│ ├── multi-front.png
+│ ├── pink-front.png
+│ └── red-front.png
 ├── requirements.txt # Python dependencies
 └── README.md # This file
 ```
@@ -104,6 +110,19 @@ The pipeline follows a sequential 5-step process:
 - Local: `gcloud auth application-default login`
 - Vertex AI Workbench: Automatic authentication
 
+### Preparing Input Images
+
+Before running the notebook end-to-end, copy the sample dress images from the
+`dress/` folder to your Google Cloud Storage bucket:
+
+```bash
+# Copy all sample dress images to your GCS bucket
+gsutil cp dress/*.png gs://YOUR_BUCKET_NAME/dress/
+```
+
+The notebook configuration uses `OUTFITS_PREFIX = "dress"` to specify where it
+looks for input dress images.
+
 ### Storage Setup
 
 Google Cloud Storage bucket with structure:
@@ -112,14 +131,16 @@ Google Cloud Storage bucket with structure:
 your-bucket/
 ├── Model_Creation.csv # Generated model definitions
 ├── models/ # Generated base model images
-├── Dress/ # Input garment images
-│ ├── dress1.png
-│ ├── dress2.png
+├── dress/ # Input garment images (copied from dress/ folder)
+│ ├── blue-front.png
+│ ├── multi-front.png
+│ ├── pink-front.png
+│ ├── red-front.png
 │ └── ...
-├── Dress/4tryon/ # VTO output images
-├── Dress/4tryon/final/ # Selected best VTO images
+├── dress/4tryon/ # VTO output images
+├── dress/4tryon/final/ # Selected best VTO images
 │ └── eval_summary.csv # Critique results
-└── Dress/4tryon/final_motion/ # Generated videos
+└── dress/4tryon/final_motion/ # Generated videos
 ```
 
 ## Installation
## Installation
@@ -165,7 +186,7 @@ MODEL_VIDEO = "veo-3.0-generate-001"
 
 1. **Open the Notebook**
 
-- Launch `LJ_GenMedia_Workflow.ipynb` in Jupyter or Vertex AI Workbench
+- Launch `VTO_GenMedia_Workflow.ipynb` in Jupyter or Vertex AI Workbench
 
 1. **Run Configuration Cell**
 
@@ -178,98 +199,6 @@ MODEL_VIDEO = "veo-3.0-generate-001"
 
 1. **Access Results**
 
-- Final VTO images: `gs://your-bucket/Dress/4tryon/final/`
-- Motion videos: `gs://your-bucket/Dress/4tryon/final_motion/`
-- Evaluation summary: `gs://your-bucket/Dress/4tryon/final/eval_summary.csv`
-
-## Performance Considerations
-
-- **Parallel Processing**: Utilizes ThreadPoolExecutor for concurrent operations
-- **Retry Mechanism**: Automatic retry for failed VTO attempts (3 attempts by
-default)
-- **Batch Processing**: Efficient handling of multiple model-outfit combinations
-- **Resource Management**: Configurable worker limits to control API usage
-
-## Output Examples
-
-### Generated Assets
-
-- **Model Images**: Diverse digital models in standardized outfit
-- **VTO Images**: High-quality garment transfers on each model
-- **Motion Videos**: 8-second runway walk showcasing garments
-- **Evaluation CSV**: Detailed critique results with selection reasoning
-
-### Quality Metrics
-
-The AI critique evaluates:
-
-- Garment transfer completeness
-- Fabric texture preservation
-- Fit accuracy and realism
-- Absence of visual artifacts
-- Body proportion maintenance
-
-## Troubleshooting
-
-### Common Issues
-
-1. **Authentication Errors**
-
-- Ensure proper GCP authentication
-- Verify project permissions
-
-1. **API Quotas**
-
-- Monitor Vertex AI quotas
-- Adjust `PARALLEL_JOBS_PER_MODEL` if needed
-
-1. **Storage Access**
-
-- Verify bucket exists and is accessible
-- Check file paths and prefixes
-
-1. **Model Availability**
-
-- Confirm model versions are available in your region
-- Update model IDs if using newer versions
-
-## Dependencies
-
-Core requirements (see `requirements.txt`):
-
-- `pandas==2.2.2` - Data manipulation
-- `Pillow==11.1.0` - Image processing
-- `google-genai==1.45.0` - Generative AI SDK
-- `google-cloud-storage==2.19.0` - GCS operations
-- `google-cloud-aiplatform==1.74.0` - Vertex AI integration
-
-## License
-
-This project is for demonstration purposes. Please ensure compliance with Google
-Cloud's terms of service and any applicable licensing requirements for
-production use.
-
-## Contributing
-
-This is a demonstration workflow. For production implementations, consider:
-
-- Error handling enhancements
-- Monitoring and logging integration
-- Cost optimization strategies
-- Custom quality assessment metrics
-- Extended diversity parameters
-
-## Support
-
-For issues related to:
-
-- Google Cloud setup: Consult
-[Google Cloud Documentation](https://cloud.google.com/docs)
-- Vertex AI models: See
-[Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs)
-- Code issues: Review the notebook comments and inline documentation
-
-## Acknowledgments
-
-Created on 11/12/2025 using Google's suite of Generative AI models on Vertex AI
-platform.
+- Final VTO images: `gs://your-bucket/dress/4tryon/final/`
+- Motion videos: `gs://your-bucket/dress/4tryon/final_motion/`
+- Evaluation summary: `gs://your-bucket/dress/4tryon/final/eval_summary.csv`
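
Note on the result locations in the README: they follow directly from the prefix values this commit sets in the configuration cell. A minimal sketch, assuming plain path joins, of how those `gs://` locations are composed; `gcs_uri` is a hypothetical helper for illustration (it is not part of the notebook), and `your-bucket` is the README's own placeholder.

```python
import posixpath

# Prefix values as set in the notebook's configuration cell after this commit.
OUTFITS_PREFIX = "dress"
VTO_OUTPUT_PREFIX = "dress/4tryon"
FINAL_PREFIX = "dress/4tryon/final"
MOTION_OUTPUT_PREFIX = "dress/4tryon/final_motion"


def gcs_uri(bucket: str, prefix: str, name: str = "") -> str:
    """Hypothetical helper: build a gs:// URI from a bucket, a prefix, and an object name.

    With an empty name the result ends in "/", i.e. it points at the prefix itself.
    """
    path = posixpath.join(prefix.strip("/"), name)
    return f"gs://{bucket}/{path}"


# The three locations listed under "Access Results".
print(gcs_uri("your-bucket", FINAL_PREFIX))
print(gcs_uri("your-bucket", MOTION_OUTPUT_PREFIX))
print(gcs_uri("your-bucket", FINAL_PREFIX, "eval_summary.csv"))
```

Because every output prefix starts with the same `dress` segment as `OUTFITS_PREFIX`, changing the input folder name means updating all four values together, as this commit does.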

projects/ai/gen-media/notebooks/vto_scale_workflow/LJ_GenMedia_Workflow.ipynb renamed to projects/ai/gen-media/notebooks/vto_scale_workflow/VTO_GenMedia_Workflow.ipynb

Lines changed: 35 additions & 12 deletions
@@ -33,7 +33,32 @@
 "source": [
 "# Gen Media end-to-end Workflow, Virtual Try-on usecase\n",
 "\n",
-"This Jupyter Notebook outlines a complete, scalable pipeline for generating diverse, photorealistic virtual try-on. The core objective is to use a suite of Google's Generative AI models—Gemini (for orchestration and critique), Gemini Image Generation (for creating diverse base models), Vertex AI Virtual Try-On (VTO) (for garment swapping), and Veo (for adding motion)—to produce a large volume of Virtual Try-On images and short motion videos featuring diverse digital models in various outfits. All these are creatd using one platform Vertex AI!\n",
+"[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/GoogleCloudPlatform/the-repo/blob/main/projects/ai/gen-media/notebooks/vto_scale_workflow/VTO_GenMedia_Workflow.ipynb)\n",
+"[![Open in Colab Enterprise](https://img.shields.io/badge/Open%20in%20Colab%20Enterprise-blue?style=flat-square)](https://console.cloud.google.com/colab/notebooks/github/GoogleCloudPlatform/the-repo/blob/main/projects/ai/gen-media/notebooks/vto_scale_workflow/VTO_GenMedia_Workflow.ipynb)\n",
+"[![Open in Vertex AI Workbench](https://img.shields.io/badge/Open%20in%20Vertex%20AI%20Workbench-orange?style=flat-square)](https://console.cloud.google.com/vertex-ai/workbench)\n",
+"[![View on GitHub](https://img.shields.io/badge/View%20on%20GitHub-black?style=flat-square&logo=github)](https://github.com/GoogleCloudPlatform/the-repo/blob/main/projects/ai/gen-media/notebooks/vto_scale_workflow/VTO_GenMedia_Workflow.ipynb)\n",
+"\n",
+"This Jupyter Notebook outlines a complete, scalable pipeline for generating diverse, photorealistic virtual try-on. The core objective is to use a suite of Google's Generative AI models—Gemini (for orchestration and critique), Gemini Image Generation (for creating diverse base models), Vertex AI Virtual Try-On (VTO) (for garment swapping), and Veo (for adding motion)—to produce a large volume of Virtual Try-On images and short motion videos featuring diverse digital models in various outfits. All these are created using one platform Vertex AI!\n",
+"\n",
+"## Prerequisites - Preparing Your GCS Bucket\n",
+"\n",
+"Before running this notebook end-to-end, you need to copy the sample dress images to your Google Cloud Storage bucket:\n",
+"\n",
+"1. **Copy Sample Dress Images**: The sample dress images are provided in the `dress/` folder. Copy these files to your GCS bucket under the path specified by `OUTFITS_PREFIX` (which is set to \"dress\" by default):\n",
+"\n",
+" ```bash\n",
+" # Example command to copy images from the local dress folder to your GCS bucket\n",
+" gsutil cp dress/*.png gs://YOUR_BUCKET_NAME/dress/\n",
+" ```\n",
+"\n",
+" The notebook expects dress images to be available at: `gs://YOUR_BUCKET_NAME/dress/`\n",
+"\n",
+" Note: The `OUTFITS_PREFIX = \"dress\"` variable in the configuration section defines where the notebook looks for input dress images.\n",
+"\n",
+"2. **Update Configuration**: In the Global Configuration section below, update:\n",
+" - `PROJECT_ID`: Your Google Cloud Project ID\n",
+" - `BUCKET_NAME`: Your Google Cloud Storage bucket name\n",
+" - `LOCATION`: Your preferred region (default: us-central1)\n",
 "\n",
 "Created on 11/12/2025"
 ]
@@ -140,9 +165,7 @@
 "# Global Configuration, UPDATE FOR ANY MODEL CHANGES\n",
 "# --- Project & Location Settings ---\n",
 "# Ensure these match your environment\n",
-"os.environ[\"GOOGLE_CLOUD_PROJECT\"] = (\n",
-" \"PROJECT_ID\" # update your project\n",
-")\n",
+"os.environ[\"GOOGLE_CLOUD_PROJECT\"] = \"PROJECT_ID\" # update your project\n",
 "os.environ[\"GOOGLE_CLOUD_LOCATION\"] = \"us-central1\" # update your location\n",
 "\n",
 "PROJECT_ID = os.environ.get(\"GOOGLE_CLOUD_PROJECT\")\n",
@@ -155,10 +178,10 @@
 "# GCS Paths/Prefixes -> The process will create the subsequent file and folder structure\n",
 "CSV_OBJECT_NAME = \"Model_Creation.csv\"\n",
 "MODELS_PREFIX = \"models\" # Base images of models\n",
-"OUTFITS_PREFIX = \"Dress\" # Input dress images\n",
-"VTO_OUTPUT_PREFIX = \"Dress/4tryon\"\n",
-"FINAL_PREFIX = \"Dress/4tryon/final\"\n",
-"MOTION_OUTPUT_PREFIX = \"Dress/4tryon/final_motion\"\n",
+"OUTFITS_PREFIX = \"dress\" # Input dress images\n",
+"VTO_OUTPUT_PREFIX = \"dress/4tryon\"\n",
+"FINAL_PREFIX = \"dress/4tryon/final\"\n",
+"MOTION_OUTPUT_PREFIX = \"dress/4tryon/final_motion\"\n",
 "\n",
 "# --- Model Versions ---\n",
 "# Text/Orchestration Model\n",
@@ -509,7 +532,7 @@
 "#Use Case 3 - Virtual Try-On (Vertex AI VTO)\n",
 "\n",
 "Description: Multi-try-on in one shot using the Vertex AI VTO API\n",
-"- The process involves pairing the generated model images (from Step 4) with input outfit images (from the GCS Dress prefix).\n",
+"- The process involves pairing the generated model images (from Step 4) with input outfit images (from the GCS dress prefix).\n",
 "\n",
 "- Use a ThreadPoolExecutor to orchestrate the VTO generation in parallel for multiple model/outfit pairs.\n",
 "\n",
@@ -680,7 +703,7 @@
 "\n",
 "- Execute the critique process in parallel using a ThreadPoolExecutor.\n",
 "\n",
-"- The winning image for each model/outfit combination is copied to the final GCS folder (Dress/4tryon/final)."
+"- The winning image for each model/outfit combination is copied to the final GCS folder (dress/4tryon/final)."
 ]
 },
 {
@@ -932,7 +955,7 @@
 "\n",
 "- The VTO image is used as the input image and a prompt (e.g., \"slowly walking on a white runway\") is provided to instruct the motion and environment.\n",
 "\n",
-"- The generated short video clips are uploaded to the final motion GCS prefix (Dress/4tryon/final_motion)."
+"- The generated short video clips are uploaded to the final motion GCS prefix (dress/4tryon/final_motion)."
 ]
 },
 {
@@ -1029,7 +1052,7 @@
 "QaVTCIINx8JZ",
 "C57vaI8syqHU"
 ],
-"name": "LJ_GenMedia_Workflow",
+"name": "VTO_GenMedia_Workflow",
 "provenance": []
 },
 "kernelspec": {

projects/ai/gen-media/notebooks/vto_scale_workflow/LJ_GenMedia_Workflow.nb.py renamed to projects/ai/gen-media/notebooks/vto_scale_workflow/VTO_GenMedia_Workflow.nb.py

Lines changed: 35 additions & 12 deletions
@@ -5,7 +5,7 @@
 # extension: .py
 # format_name: percent
 # format_version: '1.3'
-# jupytext_version: 1.18.1
+# jupytext_version: 1.20.0
 # kernelspec:
 # display_name: Python 3
 # language: python
@@ -32,7 +32,32 @@
 # %% [markdown] id="B9SgWpCruX5g"
 # # Gen Media end-to-end Workflow, Virtual Try-on usecase
 #
-# This Jupyter Notebook outlines a complete, scalable pipeline for generating diverse, photorealistic virtual try-on. The core objective is to use a suite of Google's Generative AI models—Gemini (for orchestration and critique), Gemini Image Generation (for creating diverse base models), Vertex AI Virtual Try-On (VTO) (for garment swapping), and Veo (for adding motion)—to produce a large volume of Virtual Try-On images and short motion videos featuring diverse digital models in various outfits. All these are creatd using one platform Vertex AI!
+# [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/GoogleCloudPlatform/the-repo/blob/main/projects/ai/gen-media/notebooks/vto_scale_workflow/VTO_GenMedia_Workflow.ipynb)
+# [![Open in Colab Enterprise](https://img.shields.io/badge/Open%20in%20Colab%20Enterprise-blue?style=flat-square)](https://console.cloud.google.com/colab/notebooks/github/GoogleCloudPlatform/the-repo/blob/main/projects/ai/gen-media/notebooks/vto_scale_workflow/VTO_GenMedia_Workflow.ipynb)
+# [![Open in Vertex AI Workbench](https://img.shields.io/badge/Open%20in%20Vertex%20AI%20Workbench-orange?style=flat-square)](https://console.cloud.google.com/vertex-ai/workbench)
+# [![View on GitHub](https://img.shields.io/badge/View%20on%20GitHub-black?style=flat-square&logo=github)](https://github.com/GoogleCloudPlatform/the-repo/blob/main/projects/ai/gen-media/notebooks/vto_scale_workflow/VTO_GenMedia_Workflow.ipynb)
+#
+# This Jupyter Notebook outlines a complete, scalable pipeline for generating diverse, photorealistic virtual try-on. The core objective is to use a suite of Google's Generative AI models—Gemini (for orchestration and critique), Gemini Image Generation (for creating diverse base models), Vertex AI Virtual Try-On (VTO) (for garment swapping), and Veo (for adding motion)—to produce a large volume of Virtual Try-On images and short motion videos featuring diverse digital models in various outfits. All these are created using one platform Vertex AI!
+#
+# ## Prerequisites - Preparing Your GCS Bucket
+#
+# Before running this notebook end-to-end, you need to copy the sample dress images to your Google Cloud Storage bucket:
+#
+# 1. **Copy Sample Dress Images**: The sample dress images are provided in the `dress/` folder. Copy these files to your GCS bucket under the path specified by `OUTFITS_PREFIX` (which is set to "dress" by default):
+#
+# ```bash
+# # Example command to copy images from the local dress folder to your GCS bucket
+# gsutil cp dress/*.png gs://YOUR_BUCKET_NAME/dress/
+# ```
+#
+# The notebook expects dress images to be available at: `gs://YOUR_BUCKET_NAME/dress/`
+#
+# Note: The `OUTFITS_PREFIX = "dress"` variable in the configuration section defines where the notebook looks for input dress images.
+#
+# 2. **Update Configuration**: In the Global Configuration section below, update:
+# - `PROJECT_ID`: Your Google Cloud Project ID
+# - `BUCKET_NAME`: Your Google Cloud Storage bucket name
+# - `LOCATION`: Your preferred region (default: us-central1)
 #
 # Created on 11/12/2025
 
@@ -108,9 +133,7 @@
 # Global Configuration, UPDATE FOR ANY MODEL CHANGES
 # --- Project & Location Settings ---
 # Ensure these match your environment
-os.environ["GOOGLE_CLOUD_PROJECT"] = (
-"PROJECT_ID" # update your project
-)
+os.environ["GOOGLE_CLOUD_PROJECT"] = "PROJECT_ID" # update your project
 os.environ["GOOGLE_CLOUD_LOCATION"] = "us-central1" # update your location
 
 PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT")
@@ -123,10 +146,10 @@
 # GCS Paths/Prefixes -> The process will create the subsequent file and folder structure
 CSV_OBJECT_NAME = "Model_Creation.csv"
 MODELS_PREFIX = "models" # Base images of models
-OUTFITS_PREFIX = "Dress" # Input dress images
-VTO_OUTPUT_PREFIX = "Dress/4tryon"
-FINAL_PREFIX = "Dress/4tryon/final"
-MOTION_OUTPUT_PREFIX = "Dress/4tryon/final_motion"
+OUTFITS_PREFIX = "dress" # Input dress images
+VTO_OUTPUT_PREFIX = "dress/4tryon"
+FINAL_PREFIX = "dress/4tryon/final"
+MOTION_OUTPUT_PREFIX = "dress/4tryon/final_motion"
 
 # --- Model Versions ---
 # Text/Orchestration Model
@@ -438,7 +461,7 @@ def display_images_in_row(
 # #Use Case 3 - Virtual Try-On (Vertex AI VTO)
 #
 # Description: Multi-try-on in one shot using the Vertex AI VTO API
-# - The process involves pairing the generated model images (from Step 4) with input outfit images (from the GCS Dress prefix).
+# - The process involves pairing the generated model images (from Step 4) with input outfit images (from the GCS dress prefix).
 #
 # - Use a ThreadPoolExecutor to orchestrate the VTO generation in parallel for multiple model/outfit pairs.
 #
@@ -593,7 +616,7 @@ def tryon_worker(
 #
 # - Execute the critique process in parallel using a ThreadPoolExecutor.
 #
-# - The winning image for each model/outfit combination is copied to the final GCS folder (Dress/4tryon/final).
+# - The winning image for each model/outfit combination is copied to the final GCS folder (dress/4tryon/final).
 
 # %% id="SehmdW6_x_q9"
 print("[START] AI Critique")
@@ -829,7 +852,7 @@ def process_critique_group(model_stamp, items_for_model):
 #
 # - The VTO image is used as the input image and a prompt (e.g., "slowly walking on a white runway") is provided to instruct the motion and environment.
 #
-# - The generated short video clips are uploaded to the final motion GCS prefix (Dress/4tryon/final_motion).
+# - The generated short video clips are uploaded to the final motion GCS prefix (dress/4tryon/final_motion).
 
 # %% id="CboQHX_JywYl"
 VIDEO_PROMPT = (
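
Note on the `Dress` to `dress` rename: Cloud Storage object names are case-sensitive, so objects written by an earlier run under `Dress/` are not found under the new `dress` prefixes. A minimal sketch, assuming you want to carry existing objects over, of the name mapping involved; `migrate_object_name` is a hypothetical helper (not part of the notebook), and the actual copy in the bucket is left to whichever tool you prefer.

```python
OLD_PREFIX = "Dress"  # prefix used before this commit
NEW_PREFIX = "dress"  # prefix used after this commit


def migrate_object_name(name: str) -> str:
    """Hypothetical helper: map an object name under the old prefix to the new prefix.

    Only names equal to "Dress" or starting with "Dress/" are rewritten;
    everything else (for example "models/...") is returned unchanged.
    """
    if name == OLD_PREFIX or name.startswith(OLD_PREFIX + "/"):
        return NEW_PREFIX + name[len(OLD_PREFIX):]
    return name


old_names = [
    "Dress/dress1.png",
    "Dress/4tryon/final/eval_summary.csv",
    "models/model_01.png",  # illustrative name, not taken from the commit
]
for old in old_names:
    print(old, "->", migrate_object_name(old))
```

Matching on `Dress/` rather than on the bare string `Dress` keeps unrelated names that merely begin with the same letters from being rewritten.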
4 binary files added (previews not shown): 320 KB, 331 KB, 374 KB, 374 KB
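
Note on the sample images: the commit includes four dress images (`blue-front.png`, `multi-front.png`, `pink-front.png`, `red-front.png`) that must be copied under the `dress` prefix before the notebook runs. A minimal sketch, assuming you already have a list of object names from your bucket, of a check for which samples are still missing; `missing_outfits` is a hypothetical helper and is not part of the notebook.

```python
from typing import Iterable

# File names of the sample images listed in the README's dress/ folder.
SAMPLE_OUTFITS = (
    "blue-front.png",
    "multi-front.png",
    "pink-front.png",
    "red-front.png",
)


def missing_outfits(object_names: Iterable[str], prefix: str = "dress") -> list[str]:
    """Hypothetical helper: return the expected sample objects absent from object_names.

    The comparison is exact and case-sensitive, matching how object names are stored.
    """
    present = set(object_names)
    expected = [f"{prefix}/{name}" for name in SAMPLE_OUTFITS]
    return [obj for obj in expected if obj not in present]


# Example listing: two images uploaded correctly, one under the old capitalized prefix.
listing = ["dress/blue-front.png", "dress/red-front.png", "Dress/pink-front.png"]
print(missing_outfits(listing))
```

An empty result means all four sample images are in place under the prefix the notebook reads from.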
