notebooks/community/model_garden/model_garden_xdit_cogvideox_2b.ipynb
@@ -0,0 +1,378 @@
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "1gcBBbBCW_CV"
   },
   "outputs": [],
   "source": [
    "# Copyright 2025 Google LLC\n",
    "#\n",
    "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "# you may not use this file except in compliance with the License.\n",
    "# You may obtain a copy of the License at\n",
    "#\n",
    "#     https://www.apache.org/licenses/LICENSE-2.0\n",
    "#\n",
    "# Unless required by applicable law or agreed to in writing, software\n",
    "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "# See the License for the specific language governing permissions and\n",
    "# limitations under the License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "wKzYxAA1W_CV"
   },
   "source": [
    "# Vertex AI Model Garden - CogVideoX-2b\n",
    "\n",
    "<table><tbody><tr>\n",
    "  <td style=\"text-align: center\">\n",
    "    <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fcommunity%2Fmodel_garden%2Fmodel_garden_xdit_cogvideox_2b.ipynb\">\n",
    "      <img alt=\"Google Cloud Colab Enterprise logo\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" width=\"32px\"><br> Run in Colab Enterprise\n",
    "    </a>\n",
    "  </td>\n",
    "  <td style=\"text-align: center\">\n",
    "    <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_xdit_cogvideox_2b.ipynb\">\n",
    "      <img alt=\"GitHub logo\" src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" width=\"32px\"><br> View on GitHub\n",
    "    </a>\n",
    "  </td>\n",
    "</tr></tbody></table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2WwEeH8BW_CV"
   },
   "source": [
    "## Overview\n",
    "\n",
    "This notebook demonstrates deploying the pre-trained [CogVideoX-2b](https://huggingface.co/THUDM/CogVideoX-2b) model on Vertex AI for online prediction.\n",
    "\n",
    "### Objective\n",
    "\n",
    "- Upload the model to [Model Registry](https://cloud.google.com/vertex-ai/docs/model-registry/introduction).\n",
    "- Deploy the model on [Endpoint](https://cloud.google.com/vertex-ai/docs/predictions/using-private-endpoints).\n",
    "- Run online predictions for text-to-video.\n",
    "\n",
    "### Costs\n",
    "\n",
    "This tutorial uses billable components of Google Cloud:\n",
    "\n",
    "* Vertex AI\n",
    "* Cloud Storage\n",
    "\n",
    "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage pricing](https://cloud.google.com/storage/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "TAKAyLQvW_CV"
   },
   "source": [
    "## Run the notebook"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "id": "sGzHHcL3W_CV"
   },
   "outputs": [],
   "source": [
    "# @title Setup Google Cloud project\n",
    "\n",
    "# @markdown 1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).\n",
    "\n",
"# @markdown 2. **[Optional]** [Create a Cloud Storage bucket](https://cloud.google.com/storage/docs/creating-buckets) for storing experiment outputs. Set the BUCKET_URI for the experiment environment. The specified Cloud Storage bucket (`BUCKET_URI`) should be located in the same region as where the notebook was launched. Note that a multi-region bucket (eg. \"us\") is not considered a match for a single region covered by the multi-region range (eg. \"us-central1\"). If not set, a unique GCS bucket will be created instead.\n", | ||
"\n", | ||
"BUCKET_URI = \"gs://\" # @param {type:\"string\"}\n", | ||
"\n", | ||
"# @markdown 3. **[Optional]** Set region. If not set, the region will be set automatically according to Colab Enterprise environment.\n", | ||
"\n", | ||
"REGION = \"\" # @param {type:\"string\"}\n", | ||
"\n", | ||
"# @markdown 4. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus).\n", | ||
"\n", | ||
"# @markdown > | Machine Type | Accelerator Type | Recommended Regions |\n", | ||
"# @markdown | ----------- | ----------- | ----------- |\n", | ||
"# @markdown | a2-ultragpu-1g | 1 NVIDIA_A100_80GB | us-central1, us-east4, europe-west4, asia-southeast1, us-east4 |\n", | ||
"# @markdown | a3-highgpu-2g | 2 NVIDIA_H100_80GB | us-west1, asia-southeast1, europe-west4 |\n", | ||
"# @markdown | a3-highgpu-4g | 4 NVIDIA_H100_80GB | us-west1, asia-southeast1, europe-west4 |\n", | ||
"# @markdown | a3-highgpu-8g | 8 NVIDIA_H100_80GB | us-central1, us-east5, europe-west4, us-west1, asia-southeast1 |\n", | ||
"\n", | ||
"import datetime\n", | ||
"import importlib\n", | ||
"import os\n", | ||
"import uuid\n", | ||
"\n", | ||
"from google.cloud import aiplatform\n", | ||
"from IPython.display import HTML\n", | ||
"\n", | ||
"# Get the default cloud project id.\n", | ||
"PROJECT_ID = os.environ[\"GOOGLE_CLOUD_PROJECT\"]\n", | ||
"\n", | ||
"# Get the default region for launching jobs.\n", | ||
"if not REGION:\n", | ||
" REGION = os.environ[\"GOOGLE_CLOUD_REGION\"]\n", | ||
"\n", | ||
"# Enable the Vertex AI API and Compute Engine API, if not already.\n", | ||
"print(\"Enabling Vertex AI API and Compute Engine API.\")\n", | ||
"! gcloud services enable aiplatform.googleapis.com compute.googleapis.com\n", | ||
"\n", | ||
"# Cloud Storage bucket for storing the experiment artifacts.\n", | ||
"# A unique GCS bucket will be created for the purpose of this notebook. If you\n", | ||
"# prefer using your own GCS bucket, change the value yourself below.\n", | ||
"now = datetime.datetime.now().strftime(\"%Y%m%d%H%M%S\")\n", | ||
"BUCKET_NAME = \"/\".join(BUCKET_URI.split(\"/\")[:3])\n", | ||
"\n", | ||
"if BUCKET_URI is None or BUCKET_URI.strip() == \"\" or BUCKET_URI == \"gs://\":\n", | ||
" BUCKET_URI = f\"gs://{PROJECT_ID}-tmp-{now}-{str(uuid.uuid4())[:4]}\"\n", | ||
" BUCKET_NAME = \"/\".join(BUCKET_URI.split(\"/\")[:3])\n", | ||
" ! gsutil mb -l {REGION} {BUCKET_URI}\n", | ||
"else:\n", | ||
" assert BUCKET_URI.startswith(\"gs://\"), \"BUCKET_URI must start with `gs://`.\"\n", | ||
" shell_output = ! gsutil ls -Lb {BUCKET_NAME} | grep \"Location constraint:\" | sed \"s/Location constraint://\"\n", | ||
" bucket_region = shell_output[0].strip().lower()\n", | ||
" if bucket_region != REGION:\n", | ||
" raise ValueError(\n", | ||
" \"Bucket region %s is different from notebook region %s\"\n", | ||
" % (bucket_region, REGION)\n", | ||
" )\n", | ||
"print(f\"Using this GCS Bucket: {BUCKET_URI}\")\n", | ||
"\n", | ||
"STAGING_BUCKET = os.path.join(BUCKET_URI, \"temporal\")\n", | ||
"MODEL_BUCKET = os.path.join(BUCKET_URI, \"cogvideox-2b\")\n", | ||
"\n", | ||
"\n", | ||
"# Initialize Vertex AI API.\n", | ||
"print(\"Initializing Vertex AI API.\")\n", | ||
"aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=STAGING_BUCKET)\n", | ||
"\n", | ||
"# Gets the default SERVICE_ACCOUNT.\n", | ||
"shell_output = ! gcloud projects describe $PROJECT_ID\n", | ||
"project_number = shell_output[-1].split(\":\")[1].strip().replace(\"'\", \"\")\n", | ||
"SERVICE_ACCOUNT = f\"{project_number}[email protected]\"\n", | ||
"print(\"Using this default Service Account:\", SERVICE_ACCOUNT)\n", | ||
"\n", | ||
"\n", | ||
"# Provision permissions to the SERVICE_ACCOUNT with the GCS bucket\n", | ||
"! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.admin $BUCKET_NAME\n", | ||
"\n", | ||
"! gcloud config set project $PROJECT_ID\n", | ||
"! gcloud projects add-iam-policy-binding --no-user-output-enabled {PROJECT_ID} --member=serviceAccount:{SERVICE_ACCOUNT} --role=\"roles/storage.admin\"\n", | ||
"! gcloud projects add-iam-policy-binding --no-user-output-enabled {PROJECT_ID} --member=serviceAccount:{SERVICE_ACCOUNT} --role=\"roles/aiplatform.user\"\n", | ||
"\n", | ||
"models, endpoints = {}, {}\n", | ||
"\n", | ||
"! git clone https://github.com/GoogleCloudPlatform/vertex-ai-samples.git\n", | ||
"\n", | ||
"common_util = importlib.import_module(\n", | ||
" \"vertex-ai-samples.community-content.vertex_model_garden.model_oss.notebook_util.common_util\"\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"cellView": "form", | ||
"id": "q36QziORW_CV" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# @title Deploy the model to Vertex for online predictions\n", | ||
"\n", | ||
"# @markdown This section uploads the [THUDM/CogVideoX-2b](https://huggingface.co/THUDM/CogVideoX-2b) model to Model Registry and deploys it on the Endpoint with the specified accelerator.\n", | ||
"\n", | ||
"# @markdown The deployment takes ~15-30 minutes to finish.\n", | ||
"\n", | ||
"model_id = \"THUDM/CogVideoX-2b\"\n", | ||
"task = \"text-to-video\"\n", | ||
"\n", | ||
"accelerator_type = \"NVIDIA_A100_80GB\" # @param [\"NVIDIA_A100_80GB\", \"NVIDIA_H100_80GB\", \"2 NVIDIA_H100_80GB\", \"2 NVIDIA_L4\"]\n", | ||
"\n", | ||
"machine_type_map = {\n", | ||
" \"NVIDIA_A100_80GB\": \"a2-ultragpu-1g\",\n", | ||
" \"NVIDIA_H100_80GB\": \"a3-highgpu-1g\",\n", | ||
" \"2 NVIDIA_H100_80GB\": \"a3-highgpu-2g\",\n", | ||
" \"2 NVIDIA_L4\": \"g2-standard-24\"\n", | ||
"}\n", | ||
"\n", | ||
"machine_type = machine_type_map.get(accelerator_type)\n", | ||
"accelerator_count = 1\n", | ||
"\n", | ||
"if machine_type is \"a3-highgpu-2g\":\n", | ||
" accelerator_type = \"NVIDIA_H100_80GB\"\n", | ||
" accelerator_count = 2\n", | ||
"elif machine_type is \"g2-standard-24\":\n", | ||
" accelerator_type = \"NVIDIA_L4\"\n", | ||
" accelerator_count = 2\n", | ||
"\n", | ||
"\n", | ||
"# The pre-built serving docker image. It contains serving scripts and models.\n", | ||
"SERVE_DOCKER_URI = \"us-docker.pkg.dev/deeplearning-platform-release/vertex-model-garden/xdit-serve.cu125.0-1.ubuntu2204.py310\"\n", | ||
"\n", | ||
"\n", | ||
"def deploy_model(model_id, task, machine_type, accelerator_type, accelerator_count):\n", | ||
" \"\"\"Create a Vertex AI Endpoint and deploy the specified model to the endpoint.\"\"\"\n", | ||
" common_util.check_quota(\n", | ||
" project_id=PROJECT_ID,\n", | ||
" region=REGION,\n", | ||
" accelerator_type=accelerator_type,\n", | ||
" accelerator_count=accelerator_count,\n", | ||
" is_for_training=False,\n", | ||
" )\n", | ||
"\n", | ||
" model_name = model_id\n", | ||
"\n", | ||
" endpoint = aiplatform.Endpoint.create(display_name=f\"{model_name}-endpoint\")\n", | ||
" serving_env = {\n", | ||
" \"MODEL_ID\": model_id,\n", | ||
" \"TASK\": task,\n", | ||
" \"DEPLOY_SOURCE\": \"notebook\",\n", | ||
" }\n", | ||
"\n", | ||
" # xDiT serving parameters\n", | ||
" serving_env[\"N_GPUS\"] = accelerator_count\n", | ||
" serving_env[\"ENABLE_SLICING\"] = \"true\"\n", | ||
" serving_env[\"ENABLE_TILING\"] = \"true\"\n", | ||
" if accelerator_count == 2:\n", | ||
" serving_env[\"USE_CFG_PARALLEL\"] = \"true\"\n", | ||
"\n", | ||
" model = aiplatform.Model.upload(\n", | ||
" display_name=model_name,\n", | ||
" serving_container_image_uri=SERVE_DOCKER_URI,\n", | ||
" serving_container_ports=[7080],\n", | ||
" serving_container_predict_route=\"/predict\",\n", | ||
" serving_container_health_route=\"/health\",\n", | ||
" serving_container_environment_variables=serving_env,\n", | ||
" model_garden_source_model_name=\"publishers/thudm/models/cogvideox-2b\"\n", | ||
" )\n", | ||
"\n", | ||
" model.deploy(\n", | ||
" endpoint=endpoint,\n", | ||
" machine_type=machine_type,\n", | ||
" accelerator_type=accelerator_type,\n", | ||
" accelerator_count=accelerator_count,\n", | ||
" deploy_request_timeout=1800,\n", | ||
" service_account=SERVICE_ACCOUNT,\n", | ||
" system_labels={\n", | ||
" \"NOTEBOOK_NAME\": \"model_garden_xdit_cogvideox_2b.ipynb\"\n", | ||
" )\n", | ||
" return model, endpoint\n", | ||
"\n", | ||
"\n", | ||
"models[\"model\"], endpoints[\"endpoint\"] = deploy_model(\n", | ||
" model_id=model_id,\n", | ||
" task=task,\n", | ||
" machine_type=machine_type,\n", | ||
" accelerator_type=accelerator_type,\n", | ||
" accelerator_count=accelerator_count,\n", | ||
")\n", | ||
"\n", | ||
"print(\"endpoint_name:\", endpoints[\"endpoint\"].name)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"cellView": "form", | ||
"id": "TKJsEJoeW_CV" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# @title Predict\n", | ||
"\n", | ||
"# @markdown Once deployment succeeds, you can send requests to the endpoint with text prompts.\n", | ||
"\n", | ||
"# @markdown The inference takes ~70s with 1 A100 GPU.\n", | ||
"\n", | ||
"# @markdown The inference takes ~40s with 1 H100 GPU.\n", | ||
"\n", | ||
"# @markdown The inference takes ~18s with 2 H100 GPU\n", | ||
"\n", | ||
"# @markdown The inference takes ~110s with 2 L4 GPU.\n", | ||
"\n", | ||
"# @markdown Example:\n", | ||
"\n", | ||
"# @markdown ```\n", | ||
"# @markdown text: A cat waving a sign that says hello world\n", | ||
"# @markdown ```\n", | ||
"\n", | ||
"# @markdown You may adjust the parameters below to achieve best video quality.\n", | ||
"\n", | ||
"text = \"A cat waving a sign that says hello world\" # @param {type: \"string\"}\n", | ||
"num_inference_steps = 50 # @param {type:\"number\"}\n", | ||
"\n", | ||
"instances = [{\"text\": text}]\n", | ||
"parameters = {\n", | ||
" \"num_inference_steps\": num_inference_steps,\n", | ||
"}\n", | ||
"\n", | ||
"\n", | ||
"response = endpoints[\"endpoint\"].predict(instances=instances, parameters=parameters)\n", | ||
"\n", | ||
"video_bytes = response.predictions[0][\"output\"]\n", | ||
"\n", | ||
"video_html = f\"\"\"\n", | ||
"<video width=\"720\" height=\"480\" controls>\n", | ||
"<source src=\"data:video/mp4;base64,{video_bytes}\" type=\"video/mp4\">\n", | ||
"Your browser does not support the video tag.\n", | ||
"</video>\n", | ||
"\"\"\" # Assumes MP4. Change type if needed (e.g., video/webm)\n", | ||
"\n", | ||
"display(HTML(video_html))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"cellView": "form", | ||
"id": "42leJGJFW_CV" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# @title Clean up resources\n", | ||
"# @markdown Delete the experiment models and endpoints to recycle the resources\n", | ||
"# @markdown and avoid unnecessary continuous charges that may incur.\n", | ||
"\n", | ||
"# Undeploy model and delete endpoint.\n", | ||
"for endpoint in endpoints.values():\n", | ||
" endpoint.delete(force=True)\n", | ||
"\n", | ||
"# Delete models.\n", | ||
"for model in models.values():\n", | ||
" model.delete()\n", | ||
"\n", | ||
"delete_bucket = False # @param {type:\"boolean\"}\n", | ||
"if delete_bucket:\n", | ||
" ! gsutil -m rm -r $BUCKET_NAME" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"colab": { | ||
"name": "model_garden_xdit_cogvideox_2b.ipynb", | ||
"toc_visible": true | ||
}, | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"name": "python3" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |