
Commit f4b1b27

vertex-mg-bot authored and copybara-github committed

Add HiDream-I1 serving notebook

PiperOrigin-RevId: 751521086

1 parent e4608c9

File tree

1 file changed: +366 −0 lines

@@ -0,0 +1,366 @@
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "7d9bbf86da5e"
   },
   "outputs": [],
   "source": [
    "# Copyright 2025 Google LLC\n",
    "#\n",
    "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "# you may not use this file except in compliance with the License.\n",
    "# You may obtain a copy of the License at\n",
    "#\n",
    "# https://www.apache.org/licenses/LICENSE-2.0\n",
    "#\n",
    "# Unless required by applicable law or agreed to in writing, software\n",
    "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "# See the License for the specific language governing permissions and\n",
    "# limitations under the License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "99c1c3fc2ca5"
   },
   "source": [
    "# Vertex AI Model Garden - HiDream-I1\n",
    "\n",
    "<table><tbody><tr>\n",
    " <td style=\"text-align: center\">\n",
    " <a href=\"https://console.cloud.google.com/vertex-ai/workbench/instances\">\n",
    " <img alt=\"Workbench logo\" src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" width=\"32px\"><br> Run in Workbench\n",
    " </a>\n",
    " </td>\n",
    " <td style=\"text-align: center\">\n",
    " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fcommunity%2Fmodel_garden%2Fmodel_garden_pytorch_hidream_i1.ipynb\">\n",
    " <img alt=\"Google Cloud Colab Enterprise logo\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" width=\"32px\"><br> Run in Colab Enterprise\n",
    " </a>\n",
    " </td>\n",
    " <td style=\"text-align: center\">\n",
    " <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_pytorch_hidream_i1.ipynb\">\n",
    " <img alt=\"GitHub logo\" src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" width=\"32px\"><br> View on GitHub\n",
    " </a>\n",
    " </td>\n",
    "</tr></tbody></table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3de7470326a2"
   },
   "source": [
    "## Overview\n",
    "\n",
    "This notebook demonstrates deploying the pre-trained [HiDream-I1](https://huggingface.co/collections/HiDream-ai/hidream-i1-67f3e90dd509fed088a158b3) models on Vertex AI for online prediction.\n",
    "\n",
    "### Objective\n",
    "\n",
    "- Upload the model to [Model Registry](https://cloud.google.com/vertex-ai/docs/model-registry/introduction).\n",
    "- Deploy the model to an [Endpoint](https://cloud.google.com/vertex-ai/docs/predictions/using-private-endpoints).\n",
    "- Run online predictions for text-to-image generation.\n",
    "\n",
    "### File a bug\n",
    "\n",
    "File a bug on [GitHub](https://github.com/GoogleCloudPlatform/vertex-ai-samples/issues/new) if you encounter any issues with this notebook.\n",
    "\n",
    "### Costs\n",
    "\n",
    "This tutorial uses billable components of Google Cloud:\n",
    "\n",
    "* Vertex AI\n",
    "* Cloud Storage\n",
    "\n",
    "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage pricing](https://cloud.google.com/storage/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "264c07757582"
   },
   "source": [
    "## Run the notebook"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "ioensNKM8ned"
   },
   "outputs": [],
   "source": [
    "# @title Setup Google Cloud project\n",
    "\n",
    "# @markdown 1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).\n",
    "\n",
    "# @markdown 2. **[Optional]** Set the region. If not set, the region is chosen automatically according to the Colab Enterprise environment.\n",
    "\n",
    "REGION = \"\" # @param {type:\"string\"}\n",
    "\n",
    "# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have the associated quota in the selected region. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request additional quota by following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
    "\n",
    "# @markdown > | Machine Type | Accelerator Type | Recommended Regions |\n",
    "# @markdown | ----------- | ----------- | ----------- |\n",
    "# @markdown | a2-ultragpu-1g | 1 NVIDIA_A100_80GB | us-central1, us-east4, europe-west4, asia-southeast1 |\n",
    "# @markdown | a3-highgpu-2g | 2 NVIDIA_H100_80GB | us-west1, asia-southeast1, europe-west4 |\n",
    "# @markdown | a3-highgpu-4g | 4 NVIDIA_H100_80GB | us-west1, asia-southeast1, europe-west4 |\n",
    "# @markdown | a3-highgpu-8g | 8 NVIDIA_H100_80GB | us-central1, europe-west4, us-west1, asia-southeast1 |\n",
    "\n",
    "# Upgrade the Vertex AI SDK.\n",
    "! pip3 install --upgrade --quiet 'google-cloud-aiplatform>=1.84.0'\n",
    "\n",
    "import importlib\n",
    "import os\n",
    "\n",
    "from google.cloud import aiplatform\n",
    "\n",
    "if os.environ.get(\"VERTEX_PRODUCT\") != \"COLAB_ENTERPRISE\":\n",
    "    ! pip install --upgrade tensorflow\n",
    "! git clone https://github.com/GoogleCloudPlatform/vertex-ai-samples.git\n",
    "\n",
    "common_util = importlib.import_module(\n",
    "    \"vertex-ai-samples.community-content.vertex_model_garden.model_oss.notebook_util.common_util\"\n",
    ")\n",
    "\n",
    "models, endpoints = {}, {}\n",
    "LABEL = \"text-to-image-hidream\"\n",
    "\n",
    "\n",
    "# Get the default cloud project id.\n",
    "PROJECT_ID = os.environ[\"GOOGLE_CLOUD_PROJECT\"]\n",
    "\n",
    "# Get the default region for launching jobs.\n",
    "if not REGION:\n",
    "    REGION = os.environ[\"GOOGLE_CLOUD_REGION\"]\n",
    "\n",
    "# Initialize Vertex AI API.\n",
    "print(\"Initializing Vertex AI API.\")\n",
    "aiplatform.init(project=PROJECT_ID, location=REGION)\n",
    "\n",
    "! gcloud config set project $PROJECT_ID\n",
    "import vertexai\n",
    "\n",
    "vertexai.init(\n",
    "    project=PROJECT_ID,\n",
    "    location=REGION,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "2707b02ef5df"
   },
   "outputs": [],
   "source": [
    "# @title Set the model parameters\n",
    "\n",
    "MODEL_ID = \"HiDream-ai/HiDream-I1-Full\" # @param [\"HiDream-ai/HiDream-I1-Full\", \"HiDream-ai/HiDream-I1-Dev\", \"HiDream-ai/HiDream-I1-Fast\"]\n",
    "TASK = \"text-to-image-hidream\"\n",
    "\n",
    "model_version = MODEL_ID.split(\"/\")[-1].lower()\n",
    "PUBLISHER_MODEL_NAME = f\"publishers/hidream-i1/models/hidream-i1-full@{model_version}\"\n",
    "\n",
    "ACCELERATOR_TYPE = \"NVIDIA_A100_80GB\" # @param [\"NVIDIA_A100_80GB\", \"NVIDIA_H100_80GB\"]\n",
    "\n",
    "if ACCELERATOR_TYPE == \"NVIDIA_A100_80GB\":\n",
    "    machine_type = \"a2-ultragpu-1g\"\n",
    "    accelerator_count = 1\n",
    "elif ACCELERATOR_TYPE == \"NVIDIA_H100_80GB\":\n",
    "    machine_type = \"a3-highgpu-2g\"\n",
    "    accelerator_count = 2\n",
    "else:\n",
    "    raise ValueError(f\"Unsupported accelerator type: {ACCELERATOR_TYPE}\")\n",
    "accelerator_type = ACCELERATOR_TYPE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "lSD2g1pYYamO"
   },
   "outputs": [],
   "source": [
    "# @title Deploy to Vertex AI\n",
    "\n",
    "# @markdown This section uploads the HiDream-I1 model to Model Registry and deploys it to an Endpoint with the selected accelerator type.\n",
    "\n",
    "# @markdown The deployment takes ~25 minutes to finish.\n",
    "\n",
    "# @markdown Set use_dedicated_endpoint to False if you don't want to use a [dedicated endpoint](https://cloud.google.com/vertex-ai/docs/general/deployment#create-dedicated-endpoint). Note that [dedicated endpoints do not support VPC Service Controls](https://cloud.google.com/vertex-ai/docs/predictions/choose-endpoint-type); uncheck the box if you are using VPC-SC.\n",
    "use_dedicated_endpoint = True # @param {type:\"boolean\"}\n",
    "\n",
    "# The pre-built serving docker image. It contains serving scripts and models.\n",
    "SERVE_DOCKER_URI = \"us-docker.pkg.dev/deeplearning-platform-release/vertex-model-garden/pytorch-inference.cu125.0-4.ubuntu2204.py310\"\n",
    "\n",
    "common_util.check_quota(\n",
    "    project_id=PROJECT_ID,\n",
    "    region=REGION,\n",
    "    accelerator_type=accelerator_type,\n",
    "    accelerator_count=accelerator_count,\n",
    "    is_for_training=False,\n",
    ")\n",
    "\n",
    "\n",
    "def deploy_model(\n",
    "    model_id,\n",
    "    task,\n",
    "    machine_type,\n",
    "    accelerator_type,\n",
    "    accelerator_count,\n",
    "    use_dedicated_endpoint,\n",
    "):\n",
    "    \"\"\"Create a Vertex AI Endpoint and deploy the specified model to the endpoint.\"\"\"\n",
    "\n",
    "    model_name = model_id\n",
    "\n",
    "    endpoint = aiplatform.Endpoint.create(\n",
    "        display_name=f\"{model_name}-endpoint\",\n",
    "        dedicated_endpoint_enabled=use_dedicated_endpoint,\n",
    "    )\n",
    "    serving_env = {\n",
    "        \"MODEL_ID\": model_id,\n",
    "        \"TASK\": task,\n",
    "        \"DEPLOY_SOURCE\": \"notebook\",\n",
    "    }\n",
    "\n",
    "    model = aiplatform.Model.upload(\n",
    "        display_name=model_name,\n",
    "        serving_container_image_uri=SERVE_DOCKER_URI,\n",
    "        serving_container_ports=[7080],\n",
    "        serving_container_predict_route=\"/predict\",\n",
    "        serving_container_health_route=\"/health\",\n",
    "        serving_container_environment_variables=serving_env,\n",
    "        model_garden_source_model_name=PUBLISHER_MODEL_NAME,\n",
    "    )\n",
    "\n",
    "    model.deploy(\n",
    "        endpoint=endpoint,\n",
    "        machine_type=machine_type,\n",
    "        accelerator_type=accelerator_type,\n",
    "        accelerator_count=accelerator_count,\n",
    "        deploy_request_timeout=1800,\n",
    "        system_labels={\n",
    "            \"NOTEBOOK_NAME\": \"model_garden_pytorch_hidream_i1.ipynb\",\n",
    "            \"NOTEBOOK_ENVIRONMENT\": common_util.get_deploy_source(),\n",
    "        },\n",
    "    )\n",
    "    return model, endpoint\n",
    "\n",
    "\n",
    "models[LABEL], endpoints[LABEL] = deploy_model(\n",
    "    model_id=MODEL_ID,\n",
    "    task=TASK,\n",
    "    machine_type=machine_type,\n",
    "    accelerator_type=accelerator_type,\n",
    "    accelerator_count=accelerator_count,\n",
    "    use_dedicated_endpoint=use_dedicated_endpoint,\n",
    ")\n",
    "\n",
    "print(\"endpoint_name:\", endpoints[LABEL].name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "bb7adab99e41"
   },
   "outputs": [],
   "source": [
    "# @title Predict\n",
    "\n",
    "# @markdown Once deployment succeeds, you can send requests to the endpoint with text prompts.\n",
    "\n",
    "# @markdown Example:\n",
    "\n",
    "# @markdown ```\n",
    "# @markdown text: \"A cat holding a sign that says hello world\"\n",
    "# @markdown ```\n",
    "\n",
    "# @markdown Recommended parameters:\n",
    "# @markdown - HiDream-I1-Full: num_inference_steps=50, guidance_scale=5.0\n",
    "# @markdown - HiDream-I1-Dev: num_inference_steps=28, guidance_scale=0.0\n",
    "# @markdown - HiDream-I1-Fast: num_inference_steps=16, guidance_scale=0.0\n",
    "\n",
    "# @markdown You may adjust the parameters below to achieve the best image quality.\n",
    "\n",
    "text = \"A cat holding a sign that says hello world\" # @param {type: \"string\"}\n",
    "height = 1024 # @param {type:\"number\"}\n",
    "width = 1024 # @param {type:\"number\"}\n",
    "num_inference_steps = 50 # @param {type:\"number\"}\n",
    "guidance_scale = 5.0 # @param {type:\"number\"}\n",
    "\n",
    "instances = [{\"text\": text}]\n",
    "parameters = {\n",
    "    \"height\": height,\n",
    "    \"width\": width,\n",
    "    \"num_inference_steps\": num_inference_steps,\n",
    "    \"guidance_scale\": guidance_scale,\n",
    "}\n",
    "\n",
    "# The default num_inference_steps in the serving container is 4, but you can\n",
    "# override it in the request to suit your image-quality preference.\n",
    "response = endpoints[LABEL].predict(\n",
    "    instances=instances,\n",
    "    parameters=parameters,\n",
    "    use_dedicated_endpoint=use_dedicated_endpoint,\n",
    ")\n",
    "images = [\n",
    "    common_util.base64_to_image(prediction.get(\"output\"))\n",
    "    for prediction in response.predictions\n",
    "]\n",
    "common_util.image_grid(images, rows=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "6c460088b873"
   },
   "outputs": [],
   "source": [
    "# @title Clean up resources\n",
    "# @markdown Delete the models and endpoints from this experiment to release\n",
    "# @markdown the resources and avoid unnecessary ongoing charges.\n",
    "\n",
    "# Undeploy models and delete endpoints.\n",
    "for endpoint in endpoints.values():\n",
    "    endpoint.delete(force=True)\n",
    "\n",
    "# Delete models.\n",
    "for model in models.values():\n",
    "    model.delete()"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "name": "model_garden_pytorch_hidream_i1.ipynb",
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
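For reference outside the notebook, the Predict cell's request reduces to a small JSON body posted to the endpoint's /predict route. Below is a minimal sketch, not the notebook's code: the helper names are illustrative, and it assumes the server returns base64-encoded image bytes in each prediction's "output" field, as the `common_util.base64_to_image` call in the Predict cell suggests.

```python
import base64
import json


def build_predict_payload(text, height=1024, width=1024,
                          num_inference_steps=50, guidance_scale=5.0):
    # Mirrors the instances/parameters structure built in the Predict cell.
    return {
        "instances": [{"text": text}],
        "parameters": {
            "height": height,
            "width": width,
            "num_inference_steps": num_inference_steps,
            "guidance_scale": guidance_scale,
        },
    }


def decode_prediction(prediction):
    # Assumed response shape: base64-encoded image bytes under "output".
    return base64.b64decode(prediction["output"])


payload = build_predict_payload("A cat holding a sign that says hello world")
print(json.dumps(payload, indent=2))
```

The same payload works with the REST `projects.locations.endpoints:predict` method if you prefer not to use the SDK; the per-model recommended `num_inference_steps` and `guidance_scale` values from the Predict cell apply either way.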
