feat: Add GKE Inference Gateway support #4699

SinaChavoshi · 2025-09-25T23:35:00Z

This PR introduces support for the GKE Inference Gateway by adding a new feature flag to the gke-cluster module.
Key changes:

- Added a new boolean variable, enable_inference_gateway, to the gke-cluster module. When set to true, this flag:
  - Enables the HttpLoadBalancing add-on in the GKE cluster, a prerequisite for the Gateway API.
  - Deploys the necessary Inference Gateway Custom Resource Definitions (CRDs) directly from the official Kubernetes SIGs repository.
- Created a new example blueprint, gke-a3-highgpu-inference-gateway.yaml, to demonstrate how to enable this feature. This blueprint also includes the required REGIONAL_MANAGED_PROXY subnet.
- Updated the examples/README.md to include documentation for the new blueprint, guiding users on how to deploy a sample workload after the cluster is provisioned.
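
For illustration, a minimal sketch of how a blueprint might turn the flag on, assuming the usual Cluster Toolkit blueprint layout. Only enable_inference_gateway and the module path modules/scheduler/gke-cluster come from this PR. The blueprint name, the module ids and the network wiring are placeholders. The proxy-only subnet definition and the usual deployment variables are left out because their exact settings are not shown in this conversation.

```yaml
# Illustrative fragment only; names other than enable_inference_gateway are placeholders.
blueprint_name: inference-gateway-sketch  # hypothetical name

deployment_groups:
- group: primary
  modules:
  # The VPC must also contain a subnet with purpose REGIONAL_MANAGED_PROXY
  # (not shown here; see the example blueprint added by this PR).
  - id: network  # hypothetical module id
    source: modules/network/vpc

  - id: gke_cluster  # hypothetical module id
    source: modules/scheduler/gke-cluster
    use: [network]
    settings:
      # Per the PR description: enables the HttpLoadBalancing add-on
      # and deploys the Inference Gateway CRDs.
      enable_inference_gateway: true
```

The gke-a3-highgpu-inference-gateway.yaml example added by this PR remains the reference for a complete, deployable configuration.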

Submission Checklist

Please take the following actions before submitting this pull request.

- Fork your PR branch from the Toolkit "develop" branch (not main)
- Test all changes with pre-commit in a local branch
- Confirm that "make tests" passes all tests
- Add or modify unit tests to cover code changes
- Ensure that unit test coverage remains above 80%
- Update all applicable documentation
- Follow Cluster Toolkit Contribution guidelines

samskillman · 2025-09-25T23:37:11Z

Hi @SinaChavoshi - would you mind rebasing your changes on top of the current upstream develop branch, and retargeting this PR at the develop branch as well? That follows our development pattern. Thanks!

SinaChavoshi · 2025-09-29T22:49:57Z

> Hi @SinaChavoshi - would you mind rebasing your changes on top of the current upstream develop branch, and retargeting this PR at the develop branch as well? That follows our development pattern. Thanks!

Done.

samskillman

Mostly it looks good, thank you for this contribution! I've added a few suggestions that we should discuss/fix up before merging.

examples/gke-a3-highgpu-inference-gateway.yaml

cboneti

Hi, sorry for the delay reviewing this.

The PR appears to be well-implemented and achieves its goal. The module changes are correct, the example blueprint is functional, and the documentation is clear.

You are however missing an entry in the examples/README.md TOC (lines 18-72). Please add that.

Nit: Consider adding a note in the modules/scheduler/gke-cluster/README.md about the new enable_inference_gateway variable. This note should mention the requirement of having a subnet with purpose: "REGIONAL_MANAGED_PROXY" in the VPC for this feature to work (or point to the relevant networking documentation).

SinaChavoshi · 2025-10-15T19:08:23Z

> ... You are however missing an entry in the examples/README.md TOC (lines 18-72). Please add that.
>
> Nit: Consider adding a note in the modules/scheduler/gke-cluster/README.md about the new enable_inference_gateway variable. This note should mention the requirement of having a subnet with purpose: "REGIONAL_MANAGED_PROXY" in the VPC for this feature to work (or point to the relevant networking documentation).

Good catch! Thank you for the feedback. I updated the PR to address both issues raised.

cboneti

lgtm, thanks

samskillman

Approving - let's make sure we run the relevant tests on it.

Commits:

- add read me
- fix for loop
- fix http load balancing
- install crd from http
- change logic to only set value when flag is set
- test pre-commit
- verify pre-commit
- remove extra white space
- remove hard copy of the manifest
- fix pre-commit
- fix secondary ip range
- fix sub network details
- fix the subnet mask overlap issue
- remove secondary range from proxy only subnetwork this is not needed here.
- fix the subnetwork config
- add subnet_ip
- enable inference gateway in the cluster
- fix gateway installation
- fix gateway_api_config setting
- move gateway_api_config under networking config
- fix pre-commit fails.
- remove extra line in readme
- add a default cpu nodepool
- switch to use 192.168.0.0/16
- update based on comments (use auto scaling and remove jobset)
- fix reservation type
- add explicit values for reservation and set spot to false.
- add comments to show how to use spot vm and reservations
- update read me based on review feedback
- remove extra variable introduced by accident during merge.
- remove extra bracket from read me file.

samskillman · 2025-10-16T16:00:02Z

/gcbrun

kadupoornima · 2025-10-22T04:31:36Z

/gcbrun

cboneti

Approved, pending passing all relevant tests.

SinaChavoshi · 2025-11-05T22:24:45Z

Thank you so much for the reviews. I noticed that the failing check PR-test-gke-a3-highgpu (hpc-toolkit-dev) seems to have been failing on every execution since mid-August. Is that understanding correct? Is there a recommended way for me to proceed to unblock this PR?

cboneti · 2025-11-06T09:21:00Z

> Thank you so much for the reviews. I noticed that the failing check PR-test-gke-a3-highgpu (hpc-toolkit-dev) seems to have been failing on every execution since mid-August. Is that understanding correct? Is there a recommended way for me to proceed to unblock this PR?

Yes, I think we will ignore that and merge this shortly.

examples/README.md

SinaChavoshi requested review from a team and samskillman as code owners September 25, 2025 23:35

SinaChavoshi changed the base branch from main to develop September 25, 2025 23:35

SinaChavoshi force-pushed the gke-cluster-inference-gateway branch from abde98d to 0455bdf Compare September 26, 2025 21:21

samskillman added the release-key-new-features label (added to release notes under the "Key New Features" heading) Sep 30, 2025

samskillman requested changes Sep 30, 2025

samskillman assigned bytetwin and cboneti and unassigned bytetwin Sep 30, 2025

SinaChavoshi mentioned this pull request Oct 1, 2025

feat(blueprint): Parameterize machine_type for inference gateway blueprint #4717

Closed

SinaChavoshi requested a review from samskillman October 1, 2025 21:59

cboneti reviewed Oct 14, 2025

cboneti assigned SinaChavoshi and unassigned cboneti Oct 14, 2025

SinaChavoshi requested a review from cboneti October 15, 2025 19:08

cboneti previously approved these changes Oct 15, 2025

cboneti assigned samskillman and unassigned SinaChavoshi Oct 15, 2025

samskillman previously approved these changes Oct 15, 2025

SinaChavoshi dismissed stale reviews from samskillman and cboneti via 6776d01 October 15, 2025 21:14

SinaChavoshi requested review from cboneti and samskillman October 15, 2025 21:31

SinaChavoshi force-pushed the gke-cluster-inference-gateway branch from 94c7f32 to 47fe953 Compare October 15, 2025 22:18

samskillman previously approved these changes Oct 15, 2025

cboneti previously approved these changes Oct 16, 2025

Fix NCCL test failure.

835bb97

SinaChavoshi dismissed stale reviews from cboneti and samskillman via 835bb97 October 21, 2025 22:10

SinaChavoshi requested review from cboneti and samskillman October 21, 2025 22:11

cboneti approved these changes Nov 5, 2025

cboneti enabled auto-merge November 6, 2025 09:20

samskillman approved these changes Nov 11, 2025

cboneti merged commit 5b7b8fa into GoogleCloudPlatform:develop Nov 11, 2025
23 of 67 checks passed

raushan2016 reviewed Nov 11, 2025

examples/README.md
