
Commit f4b1b27

vertex-mg-bot authored and copybara-github committed

Add HiDream-I1 serving notebook

PiperOrigin-RevId: 751521086

1 parent e4608c9

File tree

1 file changed: +366 −0 lines

@@ -0,0 +1,366 @@
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "7d9bbf86da5e"
   },
   "outputs": [],
   "source": [
    "# Copyright 2025 Google LLC\n",
    "#\n",
    "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "# you may not use this file except in compliance with the License.\n",
    "# You may obtain a copy of the License at\n",
    "#\n",
    "# https://www.apache.org/licenses/LICENSE-2.0\n",
    "#\n",
    "# Unless required by applicable law or agreed to in writing, software\n",
    "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "# See the License for the specific language governing permissions and\n",
    "# limitations under the License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "99c1c3fc2ca5"
   },
   "source": [
    "# Vertex AI Model Garden - HiDream-I1\n",
    "\n",
    "<table><tbody><tr>\n",
    " <td style=\"text-align: center\">\n",
    " <a href=\"https://console.cloud.google.com/vertex-ai/workbench/instances\">\n",
    " <img alt=\"Workbench logo\" src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" width=\"32px\"><br> Run in Workbench\n",
    " </a>\n",
    " </td>\n",
    " <td style=\"text-align: center\">\n",
    " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fcommunity%2Fmodel_garden%2Fmodel_garden_pytorch_hidream_i1.ipynb\">\n",
    " <img alt=\"Google Cloud Colab Enterprise logo\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" width=\"32px\"><br> Run in Colab Enterprise\n",
    " </a>\n",
    " </td>\n",
    " <td style=\"text-align: center\">\n",
    " <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_pytorch_hidream_i1.ipynb\">\n",
    " <img alt=\"GitHub logo\" src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" width=\"32px\"><br> View on GitHub\n",
    " </a>\n",
    " </td>\n",
    "</tr></tbody></table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3de7470326a2"
   },
   "source": [
    "## Overview\n",
    "\n",
    "This notebook demonstrates deploying the pre-trained [HiDream-I1](https://huggingface.co/collections/HiDream-ai/hidream-i1-67f3e90dd509fed088a158b3) models on Vertex AI for online prediction.\n",
    "\n",
    "### Objective\n",
    "\n",
    "- Upload the model to [Model Registry](https://cloud.google.com/vertex-ai/docs/model-registry/introduction).\n",
    "- Deploy the model to an [Endpoint](https://cloud.google.com/vertex-ai/docs/predictions/using-private-endpoints).\n",
    "- Run online predictions for text-to-image generation.\n",
    "\n",
    "### File a bug\n",
    "\n",
    "File a bug on [GitHub](https://github.com/GoogleCloudPlatform/vertex-ai-samples/issues/new) if you encounter any issues with this notebook.\n",
    "\n",
    "### Costs\n",
    "\n",
    "This tutorial uses billable components of Google Cloud:\n",
    "\n",
    "* Vertex AI\n",
    "* Cloud Storage\n",
    "\n",
    "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage pricing](https://cloud.google.com/storage/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "264c07757582"
   },
   "source": [
    "## Run the notebook"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "ioensNKM8ned"
   },
   "outputs": [],
   "source": [
    "# @title Setup Google Cloud project\n",
    "\n",
    "# @markdown 1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).\n",
    "\n",
    "# @markdown 2. **[Optional]** Set the region. If not set, the region is chosen automatically according to the Colab Enterprise environment.\n",
    "\n",
    "REGION = \"\" # @param {type:\"string\"}\n",
    "\n",
    "# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have the associated quota in the selected region. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request additional quota by following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
    "\n",
    "# @markdown > | Machine Type | Accelerator Type | Recommended Regions |\n",
    "# @markdown | ----------- | ----------- | ----------- |\n",
    "# @markdown | a2-ultragpu-1g | 1 NVIDIA_A100_80GB | us-central1, us-east4, europe-west4, asia-southeast1 |\n",
    "# @markdown | a3-highgpu-2g | 2 NVIDIA_H100_80GB | us-west1, asia-southeast1, europe-west4 |\n",
    "# @markdown | a3-highgpu-4g | 4 NVIDIA_H100_80GB | us-west1, asia-southeast1, europe-west4 |\n",
    "# @markdown | a3-highgpu-8g | 8 NVIDIA_H100_80GB | us-central1, europe-west4, us-west1, asia-southeast1 |\n",
    "\n",
    "# Upgrade the Vertex AI SDK.\n",
    "! pip3 install --upgrade --quiet 'google-cloud-aiplatform>=1.84.0'\n",
    "\n",
    "import importlib\n",
    "import os\n",
    "\n",
    "from google.cloud import aiplatform\n",
    "\n",
    "if os.environ.get(\"VERTEX_PRODUCT\") != \"COLAB_ENTERPRISE\":\n",
    "    ! pip install --upgrade tensorflow\n",
    "! git clone https://github.com/GoogleCloudPlatform/vertex-ai-samples.git\n",
    "\n",
    "common_util = importlib.import_module(\n",
    "    \"vertex-ai-samples.community-content.vertex_model_garden.model_oss.notebook_util.common_util\"\n",
    ")\n",
    "\n",
    "models, endpoints = {}, {}\n",
    "LABEL = \"text-to-image-hidream\"\n",
    "\n",
    "\n",
    "# Get the default cloud project id.\n",
    "PROJECT_ID = os.environ[\"GOOGLE_CLOUD_PROJECT\"]\n",
    "\n",
    "# Get the default region for launching jobs.\n",
    "if not REGION:\n",
    "    REGION = os.environ[\"GOOGLE_CLOUD_REGION\"]\n",
    "\n",
    "# Initialize Vertex AI API.\n",
    "print(\"Initializing Vertex AI API.\")\n",
    "aiplatform.init(project=PROJECT_ID, location=REGION)\n",
    "\n",
    "! gcloud config set project $PROJECT_ID\n",
    "import vertexai\n",
    "\n",
    "vertexai.init(\n",
    "    project=PROJECT_ID,\n",
    "    location=REGION,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "2707b02ef5df"
   },
   "outputs": [],
   "source": [
    "# @title Set the model parameters\n",
    "\n",
    "MODEL_ID = \"HiDream-ai/HiDream-I1-Full\" # @param [\"HiDream-ai/HiDream-I1-Full\", \"HiDream-ai/HiDream-I1-Dev\", \"HiDream-ai/HiDream-I1-Fast\"]\n",
    "TASK = \"text-to-image-hidream\"\n",
    "\n",
    "model_version = MODEL_ID.split(\"/\")[-1].lower()\n",
    "PUBLISHER_MODEL_NAME = f\"publishers/hidream-i1/models/hidream-i1-full@{model_version}\"\n",
    "\n",
    "ACCELERATOR_TYPE = \"NVIDIA_A100_80GB\" # @param [\"NVIDIA_A100_80GB\", \"NVIDIA_H100_80GB\"]\n",
    "\n",
    "if ACCELERATOR_TYPE == \"NVIDIA_A100_80GB\":\n",
    "    machine_type = \"a2-ultragpu-1g\"\n",
    "    accelerator_count = 1\n",
    "elif ACCELERATOR_TYPE == \"NVIDIA_H100_80GB\":\n",
    "    machine_type = \"a3-highgpu-2g\"\n",
    "    accelerator_count = 2\n",
    "else:\n",
    "    raise ValueError(f\"Unsupported accelerator type: {ACCELERATOR_TYPE}\")\n",
    "accelerator_type = ACCELERATOR_TYPE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "lSD2g1pYYamO"
   },
   "outputs": [],
   "source": [
    "# @title Deploy to Vertex AI\n",
    "\n",
    "# @markdown This section uploads the HiDream-I1 model to Model Registry and deploys it to an Endpoint with the selected accelerator type.\n",
    "\n",
    "# @markdown The deployment takes ~25 minutes to finish.\n",
    "\n",
    "# @markdown Set use_dedicated_endpoint to False if you don't want to use a [dedicated endpoint](https://cloud.google.com/vertex-ai/docs/general/deployment#create-dedicated-endpoint). Note that [dedicated endpoints do not support VPC Service Controls](https://cloud.google.com/vertex-ai/docs/predictions/choose-endpoint-type); uncheck the box if you are using VPC-SC.\n",
    "use_dedicated_endpoint = True # @param {type:\"boolean\"}\n",
    "\n",
    "# The pre-built serving docker image. It contains serving scripts and models.\n",
    "SERVE_DOCKER_URI = \"us-docker.pkg.dev/deeplearning-platform-release/vertex-model-garden/pytorch-inference.cu125.0-4.ubuntu2204.py310\"\n",
    "\n",
    "common_util.check_quota(\n",
    "    project_id=PROJECT_ID,\n",
    "    region=REGION,\n",
    "    accelerator_type=accelerator_type,\n",
    "    accelerator_count=accelerator_count,\n",
    "    is_for_training=False,\n",
    ")\n",
    "\n",
    "\n",
    "def deploy_model(\n",
    "    model_id,\n",
    "    task,\n",
    "    machine_type,\n",
    "    accelerator_type,\n",
    "    accelerator_count,\n",
    "    use_dedicated_endpoint,\n",
    "):\n",
    "    \"\"\"Create a Vertex AI Endpoint and deploy the specified model to the endpoint.\"\"\"\n",
    "\n",
    "    model_name = model_id\n",
    "\n",
    "    endpoint = aiplatform.Endpoint.create(\n",
    "        display_name=f\"{model_name}-endpoint\",\n",
    "        dedicated_endpoint_enabled=use_dedicated_endpoint,\n",
    "    )\n",
    "    serving_env = {\n",
    "        \"MODEL_ID\": model_id,\n",
    "        \"TASK\": task,\n",
    "        \"DEPLOY_SOURCE\": \"notebook\",\n",
    "    }\n",
    "\n",
    "    model = aiplatform.Model.upload(\n",
    "        display_name=model_name,\n",
    "        serving_container_image_uri=SERVE_DOCKER_URI,\n",
    "        serving_container_ports=[7080],\n",
    "        serving_container_predict_route=\"/predict\",\n",
    "        serving_container_health_route=\"/health\",\n",
    "        serving_container_environment_variables=serving_env,\n",
    "        model_garden_source_model_name=PUBLISHER_MODEL_NAME,\n",
    "    )\n",
    "\n",
    "    model.deploy(\n",
    "        endpoint=endpoint,\n",
    "        machine_type=machine_type,\n",
    "        accelerator_type=accelerator_type,\n",
    "        accelerator_count=accelerator_count,\n",
    "        deploy_request_timeout=1800,\n",
    "        system_labels={\n",
    "            \"NOTEBOOK_NAME\": \"model_garden_pytorch_hidream_i1.ipynb\",\n",
    "            \"NOTEBOOK_ENVIRONMENT\": common_util.get_deploy_source(),\n",
    "        },\n",
    "    )\n",
    "    return model, endpoint\n",
    "\n",
    "\n",
    "models[LABEL], endpoints[LABEL] = deploy_model(\n",
    "    model_id=MODEL_ID,\n",
    "    task=TASK,\n",
    "    machine_type=machine_type,\n",
    "    accelerator_type=accelerator_type,\n",
    "    accelerator_count=accelerator_count,\n",
    "    use_dedicated_endpoint=use_dedicated_endpoint,\n",
    ")\n",
    "\n",
    "print(\"endpoint_name:\", endpoints[LABEL].name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "bb7adab99e41"
   },
   "outputs": [],
   "source": [
    "# @title Predict\n",
    "\n",
    "# @markdown Once deployment succeeds, you can send requests to the endpoint with text prompts.\n",
    "\n",
    "# @markdown Example:\n",
    "\n",
    "# @markdown ```\n",
    "# @markdown text: \"A cat holding a sign that says hello world\"\n",
    "# @markdown ```\n",
    "\n",
    "# @markdown Recommended parameters:\n",
    "# @markdown - HiDream-I1-Full: num_inference_steps=50, guidance_scale=5.0\n",
    "# @markdown - HiDream-I1-Dev: num_inference_steps=28, guidance_scale=0.0\n",
    "# @markdown - HiDream-I1-Fast: num_inference_steps=16, guidance_scale=0.0\n",
    "\n",
    "# @markdown You may adjust the parameters below to achieve the best image quality.\n",
    "\n",
    "text = \"A cat holding a sign that says hello world\" # @param {type: \"string\"}\n",
    "height = 1024 # @param {type:\"number\"}\n",
    "width = 1024 # @param {type:\"number\"}\n",
    "num_inference_steps = 50 # @param {type:\"number\"}\n",
    "guidance_scale = 5.0 # @param {type:\"number\"}\n",
    "\n",
    "instances = [{\"text\": text}]\n",
    "parameters = {\n",
    "    \"height\": height,\n",
    "    \"width\": width,\n",
    "    \"num_inference_steps\": num_inference_steps,\n",
    "    \"guidance_scale\": guidance_scale,\n",
    "}\n",
    "\n",
    "# The default num_inference_steps in the serving container is 4, but you can\n",
    "# override it in the request to suit your image-quality preference.\n",
    "response = endpoints[LABEL].predict(\n",
    "    instances=instances,\n",
    "    parameters=parameters,\n",
    "    use_dedicated_endpoint=use_dedicated_endpoint,\n",
    ")\n",
    "images = [\n",
    "    common_util.base64_to_image(prediction.get(\"output\"))\n",
    "    for prediction in response.predictions\n",
    "]\n",
    "common_util.image_grid(images, rows=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "6c460088b873"
   },
   "outputs": [],
   "source": [
    "# @title Clean up resources\n",
    "# @markdown Delete the models and endpoints from this experiment to release\n",
    "# @markdown the resources and avoid unnecessary ongoing charges.\n",
    "\n",
    "# Undeploy models and delete endpoints.\n",
    "for endpoint in endpoints.values():\n",
    "    endpoint.delete(force=True)\n",
    "\n",
    "# Delete models.\n",
    "for model in models.values():\n",
    "    model.delete()"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "name": "model_garden_pytorch_hidream_i1.ipynb",
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
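For reference outside the notebook, the Predict cell's request reduces to a small JSON body posted to the endpoint's /predict route. Below is a minimal sketch, not the notebook's code: the helper names are illustrative, and it assumes the server returns base64-encoded image bytes in each prediction's "output" field, as the `common_util.base64_to_image` call in the Predict cell suggests.

```python
import base64
import json


def build_predict_payload(text, height=1024, width=1024,
                          num_inference_steps=50, guidance_scale=5.0):
    # Mirrors the instances/parameters structure built in the Predict cell.
    return {
        "instances": [{"text": text}],
        "parameters": {
            "height": height,
            "width": width,
            "num_inference_steps": num_inference_steps,
            "guidance_scale": guidance_scale,
        },
    }


def decode_prediction(prediction):
    # Assumed response shape: base64-encoded image bytes under "output".
    return base64.b64decode(prediction["output"])


payload = build_predict_payload("A cat holding a sign that says hello world")
print(json.dumps(payload, indent=2))
```

The same payload works with the REST `projects.locations.endpoints:predict` method if you prefer not to use the SDK; the per-model recommended `num_inference_steps` and `guidance_scale` values from the Predict cell apply either way.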
