Commit d933c29

vertex-mg-bot authored and copybara-github committed
Update link to Cloud Quotas page to correct location
PiperOrigin-RevId: 840478912
1 parent 85c649d · commit d933c29

File tree

60 files changed: +624 −847 lines


notebooks/community/model_garden/model_garden_advanced_features.ipynb

Lines changed: 3 additions & 32 deletions
@@ -3,7 +3,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "DZ1j6RRg-Td6",
  "metadata": {
   "cellView": "form",
   "id": "f705f4be70e9"
@@ -27,7 +26,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "99c1c3fc2ca5",
  "metadata": {
   "id": "71a642b5575a"
  },
@@ -50,7 +48,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "f9-tJ6RfDLIs",
  "metadata": {
   "id": "0779b48f654e"
  },
@@ -88,7 +85,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "47GcOrZjosOx",
  "metadata": {
   "id": "69453bf7230e"
  },
@@ -98,7 +94,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "1D_pWejJPHP3",
  "metadata": {
   "id": "bf3706e69f61"
  },
@@ -109,15 +104,14 @@
  "- Custom model serving TPU v5e cores per region\n",
  "- Custom model serving Nvidia A100 80GB GPUs per region\n",
  "\n",
- "By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4, which is sufficient for serving the Llama 3.1 8B model. The Llama 3.1 70B model requires 16 TPU v5e cores. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
+ "By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4, which is sufficient for serving the Llama 3.1 8B model. The Llama 3.1 70B model requires 16 TPU v5e cores. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
  "\n",
- "The quota for A100_80GB deployment `Custom model serving Nvidia A100 80GB GPUs per region` is 0. You need to request at least 4 for 70B model and 1 for 8B model following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota)."
+ "The quota for A100_80GB deployment `Custom model serving Nvidia A100 80GB GPUs per region` is 0. You need to request at least 4 for 70B model and 1 for 8B model following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota)."
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "L3dqbxovo5t6",
  "metadata": {
   "cellView": "form",
   "id": "50047cc80bb9"
@@ -136,7 +130,7 @@
  "\n",
  "REGION = \"\" # @param {type:\"string\"}\n",
  "\n",
- "# @markdown 4. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
+ "# @markdown 4. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
  "\n",
  "# @markdown | Machine Type | Accelerator Type | Recommended Regions |\n",
  "# @markdown | ----------- | ----------- | ----------- |\n",
@@ -223,7 +217,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "SeGqxuMfRBS5",
  "metadata": {
   "id": "4782dd003acb"
  },
@@ -234,7 +227,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "BxlzWU2KQqmw",
  "metadata": {
   "cellView": "form",
   "id": "798068fc0355"
@@ -274,7 +266,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "JpNBJJgjWL7j",
  "metadata": {
   "id": "10ed490e28e5"
  },
@@ -302,7 +293,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "9gZJ8cB27e1m",
  "metadata": {
   "id": "30ddb93fdd7b"
  },
@@ -317,7 +307,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "RpmoA2nXjdCd",
  "metadata": {
   "cellView": "form",
   "id": "b56d82c1aa6f"
@@ -506,7 +495,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "5QoK8c0R9U3B",
  "metadata": {
   "cellView": "form",
   "id": "96c5afed49b4"
@@ -572,7 +560,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "29rn5ATmB2YC",
  "metadata": {
   "cellView": "form",
   "id": "9a95c9f90358"
@@ -646,7 +633,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "KjbM8E9DGuuR",
  "metadata": {
   "id": "12ad6d1ff725"
  },
@@ -657,7 +643,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "JpLU7GRQGuuR",
  "metadata": {
   "cellView": "form",
   "id": "1ab4e3bb74b4"
@@ -684,7 +669,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "XZ33HhYmOxCS",
  "metadata": {
   "id": "7a8a9a1b2ddf"
  },
@@ -706,7 +690,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "E8OiHHNNE_wj",
  "metadata": {
   "cellView": "form",
   "id": "4425cc0bdedc"
@@ -910,7 +893,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "zex1oXl36A70",
  "metadata": {
   "cellView": "form",
   "id": "bcbafec839cd"
@@ -973,7 +955,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "gDOC_nfsJeUR",
  "metadata": {
   "cellView": "form",
   "id": "e984f43422d5"
@@ -1047,7 +1028,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "GdGxaTirJeUR",
  "metadata": {
   "id": "dff0d10dcc20"
  },
@@ -1058,7 +1038,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "OgoqXE-VJeUR",
  "metadata": {
   "cellView": "form",
   "id": "5b8751773e7f"
@@ -1085,7 +1064,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "w4Guijaw_NEs",
  "metadata": {
   "id": "863775857a46"
  },
@@ -1100,7 +1078,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "ml8fgoIQWSbY",
  "metadata": {
   "id": "565cbdc3a06b"
  },
@@ -1142,7 +1119,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "NmWRro8Q-Td6",
  "metadata": {
   "id": "94eaa9050abb"
  },
@@ -1153,7 +1129,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "72d1GlrYifKU",
  "metadata": {
   "cellView": "form",
   "id": "5f358cc230a6"
@@ -1470,7 +1445,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "CNiItf5hdVFU",
  "metadata": {
   "cellView": "form",
   "id": "be3170e0e05a"
@@ -1546,7 +1520,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "WahYGAZyq6Gl",
  "metadata": {
   "id": "30c5d2535df3"
  },
@@ -1556,7 +1529,6 @@
 },
 {
  "cell_type": "markdown",
- "id": "bV5Yjkgav9BZ",
  "metadata": {
   "id": "63c10917ff95"
  },
@@ -1567,7 +1539,6 @@
 {
  "cell_type": "code",
  "execution_count": null,
- "id": "qsks36cOH9rb",
  "metadata": {
   "cellView": "form",
   "id": "92892e1b1730"

notebooks/community/model_garden/model_garden_codegemma_deployment_on_vertex.ipynb

Lines changed: 2 additions & 3 deletions
@@ -95,7 +95,7 @@
  "\n",
  "# @markdown 1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).\n",
  "\n",
- "# @markdown 2. By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
+ "# @markdown 2. By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
  "\n",
  "# @markdown 3. **[Optional]** [Create a Cloud Storage bucket](https://cloud.google.com/storage/docs/creating-buckets) for storing experiment outputs. Set the BUCKET_URI for the experiment environment. The specified Cloud Storage bucket (`BUCKET_URI`) should be located in the same region as where the notebook was launched. Note that a multi-region bucket (eg. \"us\") is not considered a match for a single region covered by the multi-region range (eg. \"us-central1\"). If not set, a unique GCS bucket will be created instead.\n",
  "\n",
@@ -434,7 +434,6 @@
  " )\n",
  " return model, endpoint\n",
  "\n",
- "\n",
  "models[\"hexllm_tpu\"], endpoints[\"hexllm_tpu\"] = deploy_model_hexllm(\n",
  " model_name=common_util.get_job_name_with_datetime(prefix=MODEL_ID),\n",
  " model_id=model_id,\n",
@@ -659,6 +658,7 @@
  " vllm_args.append(\"--enable-auto-tool-choice\")\n",
  " vllm_args.append(\"--tool-call-parser=vertex-llama-3\")\n",
  "\n",
+ "\n",
  " env_vars = {\n",
  " \"MODEL_ID\": base_model_id,\n",
  " \"DEPLOY_SOURCE\": \"notebook\",\n",
@@ -704,7 +704,6 @@
  "\n",
  " return model, endpoint\n",
  "\n",
- "\n",
  "models[\"vllm_gpu\"], endpoints[\"vllm_gpu\"] = deploy_model_vllm(\n",
  " model_name=common_util.get_job_name_with_datetime(prefix=\"codegemma-serve-vllm\"),\n",
  " model_id=model_id,\n",

notebooks/community/model_garden/model_garden_deployment_tutorial.ipynb

Lines changed: 4 additions & 6 deletions
@@ -97,7 +97,7 @@
  "\n",
  "REGION = \"\" # @param {type:\"string\"}\n",
  "\n",
- "# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
+ "# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
  "\n",
  "# @markdown | Machine Type | Accelerator Type | Recommended Regions |\n",
  "# @markdown | ----------- | ----------- | ----------- |\n",
@@ -176,9 +176,7 @@
  "# @markdown You can also filter by model name.\n",
  "model_filter = \"gemma\" # @param {type:\"string\"}\n",
  "\n",
- "model_garden.list_deployable_models(\n",
- " list_hf_models=list_hf_models, model_filter=model_filter\n",
- ")"
+ "model_garden.list_deployable_models(list_hf_models=list_hf_models, model_filter=model_filter)"
  ]
 },
 {
@@ -228,10 +226,10 @@
  "endpoints[LABEL] = model.deploy(\n",
  " hugging_face_access_token=HF_TOKEN,\n",
  " use_dedicated_endpoint=use_dedicated_endpoint,\n",
- " accept_eula=True, # Accept the End User License Agreement (EULA) on the model card before deploy. Otherwise, the deployment will be forbidden.\n",
+ " accept_eula = True, # Accept the End User License Agreement (EULA) on the model card before deploy. Otherwise, the deployment will be forbidden.\n",
  ")\n",
  "\n",
- "endpoint = endpoints[LABEL]"
+ "endpoint=endpoints[LABEL]"
  ]
 },
 {
