chore: expand machine spec to contain number of gpus #1854
Conversation
Caution: Review failed. The pull request is closed.

Walkthrough

Added optional NVIDIA GPU support across MachineSpec (API, proto, CRD, autoscaler); relaxed cpuCount/memory validations and removed them from CRD required lists; aggregated provider GPU info (AWS/GCP); autoscaler capacity now includes GPUs and respects manifest overrides; docs/CRD text updated; deterministic test sorting and image tag bumps in manifests.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant NP as NodeManager
    participant DNP as DynamicNodePool
    participant TI as typeInfo
    participant RL as ResourceList
    rect rgba(230,240,255,0.20)
        note over NP: GetCapacity(np)
        NP->>DNP: read DynamicNodePool
        alt dnp is nil
            NP-->>RL: return nil
        else
            NP->>TI: lookup typeInfo (provider + instance)
            alt typeInfo missing
                NP-->>RL: return nil
            else
                note over TI: cpu, memory, disk, nvidiaGpus
                NP->>RL: build cpu/memory/storage resources
                alt TI.nvidiaGpus > 0
                    NP->>RL: add `nvidia.com/gpu` = TI.nvidiaGpus
                end
                alt DNP.MachineSpec.NvidiaGpu > 0
                    NP->>RL: override `nvidia.com/gpu` = DNP.MachineSpec.NvidiaGpu
                end
                NP-->>RL: return ResourceList
            end
        end
    end
```
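The GetCapacity flow in the diagram can be sketched in Go. This is a simplified illustration, not Claudie's actual implementation: it uses a plain map in place of Kubernetes' `corev1.ResourceList`, and the type and field names are assumptions.

```go
package main

import "fmt"

// typeInfo mirrors the cached per-instance-type data (illustrative).
type typeInfo struct {
	cpu, memory, disk, nvidiaGpus int64
}

// machineSpec mirrors the manifest-level override (illustrative).
type machineSpec struct {
	nvidiaGpu int64
}

// getCapacity follows the diagram: nil checks first, then the base
// cpu/memory/storage resources, then the GPU entry, with the manifest
// override taking precedence over the cached typeInfo value.
func getCapacity(ti *typeInfo, ms *machineSpec) map[string]int64 {
	if ti == nil {
		return nil // typeInfo missing -> return nil
	}
	rl := map[string]int64{
		"cpu":     ti.cpu,
		"memory":  ti.memory,
		"storage": ti.disk,
	}
	if ti.nvidiaGpus > 0 {
		rl["nvidia.com/gpu"] = ti.nvidiaGpus
	}
	if ms != nil && ms.nvidiaGpu > 0 {
		rl["nvidia.com/gpu"] = ms.nvidiaGpu // manifest override wins
	}
	return rl
}

func main() {
	ti := &typeInfo{cpu: 4, memory: 16, disk: 50, nvidiaGpus: 1}
	fmt.Println(getCapacity(ti, &machineSpec{nvidiaGpu: 2})["nvidia.com/gpu"]) // prints 2
}
```

In the real code the map is a `corev1.ResourceList` keyed by resource names such as `nvidia.com/gpu`.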
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (2)
Actionable comments posted: 0
🧹 Nitpick comments (7)
docs/input-manifest/api-reference.md (1)
262-265: Tighten wording and capitalization; note NVIDIA plugin requirement

Recommend minor edits for consistency and clarity.
```diff
-  - `cpuCount`: specifies the number of cpus used by the `serverType`
-  - `memory`: specifies the memory in GBs used by the `serverType`
-  - `nvidiaGpu`: specifies the number of nvidia GPUs used by the `serverType`
+  - `cpuCount`: number of CPUs for the `serverType`
+  - `memory`: memory in GB for the `serverType`
+  - `nvidiaGpu`: number of NVIDIA GPUs for the `serverType` (requires the NVIDIA device plugin on nodes)
```

internal/api/manifest/utils.go (1)
293-296: Mapping MachineSpec is correct; consider guarding against int→int32 overflow

The casts are fine, but if manifest values ever exceed int32 (unlikely for CPUs/GPUs, but possible for memory if unit semantics change), this would truncate.
Optionally add a bounds check before casting, or constrain/validate the upper bound where you already validate the spec.
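A bounds-checked conversion along these lines would avoid the silent truncation. This is a standalone sketch; the helper name `toInt32` is made up and is not Claudie code.

```go
package main

import (
	"fmt"
	"math"
)

// toInt32 converts an int to int32, reporting overflow instead of
// silently truncating (hypothetical helper for illustration).
func toInt32(v int) (int32, error) {
	if v < math.MinInt32 || v > math.MaxInt32 {
		return 0, fmt.Errorf("value %d overflows int32", v)
	}
	return int32(v), nil
}

func main() {
	if n, err := toInt32(8); err == nil {
		fmt.Println(n) // prints 8
	}
	if _, err := toInt32(1 << 40); err != nil {
		fmt.Println("overflow detected")
	}
}
```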
services/autoscaler-adapter/node_manager/node_manager.go (3)
71-74: Add a nil-check for dnp.Provider to prevent rare NPEs

Unlikely, but if Provider is ever missing, `dnp.Provider.CloudProviderName` will panic.

```diff
-	typeInfo := nm.getTypeInfo(dnp.Provider.CloudProviderName, dnp)
+	if dnp.Provider == nil {
+		return nil
+	}
+	typeInfo := nm.getTypeInfo(dnp.Provider.CloudProviderName, dnp)
```
76-83: Use local dnp variable and fix comment typo

Minor clarity/readability.

```diff
-	// Check if disk is define for the instance.
+	// Check if disk is defined for the instance.
 ...
-		disk = int64(np.GetDynamicNodePool().StorageDiskSize) * 1024 * 1024 * 1024 // Convert to bytes
+		disk = int64(dnp.StorageDiskSize) * 1024 * 1024 * 1024 // Convert to bytes
```
84-89: Verify resource key: prefer EphemeralStorage over Storage for node capacity

Kubernetes node capacity commonly reports ephemeral storage via `corev1.ResourceEphemeralStorage`. Using `ResourceStorage` may not have the intended effect for scheduling/models.

```diff
-	rl[k8sV1.ResourceStorage] = *resource.NewQuantity(disk, resource.DecimalSI)
+	rl[k8sV1.ResourceEphemeralStorage] = *resource.NewQuantity(disk, resource.DecimalSI)
```

If other components rely on `ResourceStorage` here, ignore this change, but please confirm.
128-137: Validation semantics: decide whether zero values should be allowed when provided
`required_with=Memory|CpuCount, gte=0` allows `0` for CPU/Memory if present. If the intent is "only positive when specified," switch to `gt=0`. If `0` is a sentinel for "use provider default," consider pointer fields to disambiguate presence.

Examples:

- Keep int, require positive when present: `validate:"required_with=Memory,gt=0"`
- Use pointers for presence:

```go
CpuCount  *int `validate:"omitempty,gt=0"`
Memory    *int `validate:"omitempty,gt=0"`
NvidiaGpu *int `validate:"omitempty,gte=0"`
```

manifests/claudie/crd/claudie.io_inputmanifests.yaml (1)
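A small standalone sketch (illustrative structs, not Claudie's actual types) shows why pointers disambiguate presence: with plain ints, `0` and "field omitted" look identical, while a nil pointer provably means "not specified".

```go
package main

import "fmt"

// specWithInts cannot tell "cpuCount: 0" apart from an omitted field.
type specWithInts struct {
	CpuCount int
}

// specWithPtrs can: nil means "not specified", &0 means "explicitly 0".
type specWithPtrs struct {
	CpuCount *int
}

func main() {
	var a specWithInts // field omitted -> zero value 0
	fmt.Println(a.CpuCount == 0) // true, indistinguishable from an explicit 0

	var b specWithPtrs // field omitted -> nil
	fmt.Println(b.CpuCount == nil) // true: provably unset

	zero := 0
	c := specWithPtrs{CpuCount: &zero} // explicitly set to 0
	fmt.Println(c.CpuCount != nil && *c.CpuCount == 0) // true
}
```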
304-316: Align CRD validation with code: add `minimum: 0` and fix NVIDIA capitalization

Manifest validation uses `gte=0`. Mirror that in the CRD for consistency.

```diff
 properties:
   cpuCount:
     description: CpuCount specifies the number of CPU cores
       the provided instance type will have.
     type: integer
+    minimum: 0
   memory:
     description: Memory specifies the memory the provided
       instance type will have.
     type: integer
+    minimum: 0
   nvidiaGpu:
-    description: Nvidia specifies the number of NVIDIA GPUs
+    description: NVIDIA specifies the number of NVIDIA GPUs
       the provided instance type will have.
     type: integer
+    minimum: 0
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`proto/pb/spec/nodepool.pb.go` is excluded by `!**/*.pb.go`
📒 Files selected for processing (8)
- `docs/input-manifest/api-reference.md` (1 hunks)
- `internal/api/manifest/manifest.go` (1 hunks)
- `internal/api/manifest/utils.go` (1 hunks)
- `manifests/claudie/crd/claudie.io_inputmanifests.yaml` (1 hunks)
- `manifests/claudie/crd/claudie.io_settings.yaml` (0 hunks)
- `manifests/claudie/crd/kustomization.yaml` (0 hunks)
- `proto/spec/nodepool.proto` (2 hunks)
- `services/autoscaler-adapter/node_manager/node_manager.go` (1 hunks)
💤 Files with no reviewable changes (2)
- manifests/claudie/crd/kustomization.yaml
- manifests/claudie/crd/claudie.io_settings.yaml
🧰 Additional context used
🧬 Code graph analysis (2)
services/autoscaler-adapter/node_manager/node_manager.go (2)
internal/api/manifest/manifest.go (1)
`MachineSpec` (127-137)

proto/pb/spec/nodepool.pb.go (3)

`MachineSpec` (557-564), `MachineSpec` (577-577), `MachineSpec` (592-594)
internal/api/manifest/utils.go (2)
internal/api/manifest/manifest.go (1)
`MachineSpec` (127-137)

proto/pb/spec/nodepool.pb.go (3)

`MachineSpec` (557-564), `MachineSpec` (577-577), `MachineSpec` (592-594)
🔇 Additional comments (4)
proto/spec/nodepool.proto (2)
94-95: Proto field addition LGTM; confirm presence semantics are acceptable

Adding `int32 nvidiaGpu = 3;` is wire-compatible and aligns with usages gating on `> 0`. Proto3 can't distinguish "unset" from "0"; if you ever need that, consider a wrapper type. Otherwise good as-is.

Would you like me to verify all generated bindings are updated and committed (e.g., nodepool.pb.go includes `NvidiaGpu`)? I can provide a quick grep script.
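For reference, the wrapper-type option mentioned above would look roughly like this in the proto. This is a sketch: the message shape and field numbers are illustrative, not the actual Claudie schema.

```proto
syntax = "proto3";

import "google/protobuf/wrappers.proto";

message MachineSpec {
  int32 cpuCount = 1;
  int32 memory = 2;
  // A wrapper type distinguishes "unset" (null) from an explicit 0,
  // which plain int32 cannot do in proto3.
  google.protobuf.Int32Value nvidiaGpu = 3;
}
```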
116-116: No action needed

Formatting-only change.
services/autoscaler-adapter/node_manager/node_manager.go (2)
66-69: Good defensive nil-check for DynamicNodePool

Early return on nil dnp prevents panics when called on static pools.
90-95: GPU capacity entry looks good; call out deployment prerequisite

Adding `nvidia.com/gpu` when `NvidiaGpu > 0` matches extended resource conventions.

Note: Pods will schedule only if the NVIDIA device plugin advertises this resource on the node. Confirm this is documented for users of GPU pools.
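For context, a workload consumes the advertised extended resource by requesting it in its resource limits. A generic Kubernetes example follows (names and image are illustrative; it assumes the NVIDIA device plugin is running on the node):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-example   # illustrative name
spec:
  containers:
    - name: cuda-workload
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      resources:
        limits:
          nvidia.com/gpu: 1   # schedules only on nodes advertising this resource
```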
jakubhlavacka
left a comment
I don't understand why we need to update the InputManifest and remove Settings CRD.
Actionable comments posted: 0
🧹 Nitpick comments (3)
services/autoscaler-adapter/node_manager/utils.go (3)
221-221: Fix typo in function comment.

The comment has a typo in the type name.

```diff
-// getTypeInfoGcp converts []*computepb.MachineTypeto typeInfo map of instances, where keys are instance types.
+// getTypeInfoGcp converts []*computepb.MachineType to typeInfo map of instances, where keys are instance types.
```
263-271: Consider documenting the OCI GPU limitation in the codebase.

While the inline comment explains the lack of a vendor metadata API, consider adding documentation in the user-facing docs or README about this limitation for OCI instances.

Would you like me to help create documentation that explains the GPU limitations for OCI and how users can work around them using the `MachineSpec.NvidiaGpu` field?
280-288: Consider documenting the Azure GPU limitation.

Similar to OCI, the Azure provider lacks a vendor metadata API. This limitation should be documented for users.

Would you like me to help create documentation that explains the GPU limitations for Azure and how users can work around them using the `MachineSpec.NvidiaGpu` field?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- `manifests/claudie/crd/claudie.io_settings.yaml` (1 hunks)
- `manifests/claudie/kustomization.yaml` (1 hunks)
- `manifests/testing-framework/kustomization.yaml` (1 hunks)
- `services/autoscaler-adapter/node_manager/node_manager.go` (2 hunks)
- `services/autoscaler-adapter/node_manager/utils.go` (4 hunks)
✅ Files skipped from review due to trivial changes (1)
- manifests/claudie/kustomization.yaml
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-10T08:04:36.596Z
Learnt from: Despire
PR: berops/claudie#1735
File: manifests/claudie/kustomization.yaml:59-61
Timestamp: 2025-06-10T08:04:36.596Z
Learning: In Claudie's deployment process, different images in the same kustomization.yaml file may have different tags intentionally. This is part of their e2e process where only relevant images are updated when changes propagate, rather than updating all images to the same tag. This selective updating approach is deliberate and not an inconsistency.
Applied to files:
manifests/testing-framework/kustomization.yaml
📚 Learning: 2025-06-10T08:26:54.132Z
Learnt from: Despire
PR: berops/claudie#1735
File: services/ansibler/server/domain/usecases/setup_loadbalancers.go:231-238
Timestamp: 2025-06-10T08:26:54.132Z
Learning: In Claudie's codebase, Role.Settings is always initialized with default values during the desired state creation process in services/manager/internal/service/create_desired_state.go. This means that accessing tg.Role.Settings.EnvoyCds, EnvoyLds, and EnvoyAdminPort will never cause a nil pointer panic since the Settings struct is guaranteed to exist even when no CRD settings are provided.
Applied to files:
manifests/claudie/crd/claudie.io_settings.yaml
🧬 Code graph analysis (1)
services/autoscaler-adapter/node_manager/node_manager.go (2)
internal/api/manifest/manifest.go (1)
`MachineSpec` (127-137)

proto/pb/spec/nodepool.pb.go (3)

`MachineSpec` (557-564), `MachineSpec` (577-577), `MachineSpec` (592-594)
🔇 Additional comments (10)
manifests/testing-framework/kustomization.yaml (2)
93-93: Image tag bump looks good and matches the e2e flow.

Acking the selective image tag update; this aligns with the team's intentional process for kustomizations where only relevant images are bumped.
93-93: Optional: pin testing-framework image by digest. The GHCR API returned a null digest; confirm and add the digest if available.

The GH API query for ghcr.io/berops/claudie/testing-framework (tag 6c3c266-3581) returned "digest": null for all versions, so the manifest digest couldn't be retrieved via that endpoint. If you want reproducible images, obtain the digest and add it to kustomization.yaml:

```diff
 images:
   - name: ghcr.io/berops/claudie/testing-framework
     newTag: 6c3c266-3581
+    # Optional: pin for reproducibility (digest takes precedence over tag)
+    # Replace <sha256> with the actual manifest digest
+    digest: sha256:<sha256>
```

To fetch the digest (authenticate as needed), use one of:
- crane digest ghcr.io/berops/claudie/testing-framework:6c3c266-3581
- docker pull ghcr.io/berops/claudie/testing-framework:6c3c266-3581 && docker image inspect --format='{{index .RepoDigests 0}}' ghcr.io/berops/claudie/testing-framework:6c3c266-3581
services/autoscaler-adapter/node_manager/utils.go (3)
179-183: LGTM! Clear implementation for Hetzner's lack of GPU support.

The addition of the `nvidiaGpus` field with an explicit value of 0 and a helpful comment makes it clear that Hetzner doesn't currently support GPU instances.
192-217: Robust GPU counting implementation for AWS.

Good implementation with proper null checks and manufacturer validation. The logic correctly accumulates GPU counts only for NVIDIA GPUs.
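The counting pattern described can be sketched with plain structs standing in for the AWS SDK's GPU info types. The field names here only loosely mirror the SDK and are assumptions; this is an illustration of the null-check-and-accumulate approach, not Claudie's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// gpuDeviceInfo loosely mirrors the shape of an EC2 GPU device entry
// (illustrative; real SDK fields are optional pointers, hence the nil checks).
type gpuDeviceInfo struct {
	Manufacturer *string
	Count        *int32
}

// countNvidiaGpus accumulates GPU counts only for NVIDIA devices,
// skipping entries with missing manufacturer or count.
func countNvidiaGpus(devices []gpuDeviceInfo) int64 {
	var total int64
	for _, d := range devices {
		if d.Manufacturer == nil || d.Count == nil {
			continue
		}
		if strings.EqualFold(*d.Manufacturer, "NVIDIA") {
			total += int64(*d.Count)
		}
	}
	return total
}

func main() {
	nvidia, amd := "NVIDIA", "AMD"
	one, four := int32(1), int32(4)
	devices := []gpuDeviceInfo{
		{Manufacturer: &nvidia, Count: &four},
		{Manufacturer: &amd, Count: &one},   // ignored: not NVIDIA
		{Manufacturer: &nvidia, Count: nil}, // ignored: missing count
	}
	fmt.Println(countNvidiaGpus(devices)) // prints 4
}
```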
225-254: Well-documented GCP GPU handling with API limitations.

Excellent documentation explaining the API limitations for GPU information retrieval. The fallback mechanism to user-provided values in `MachineSpec.NvidiaGpus` is a good solution. The implementation correctly handles the accelerator data when available.

services/autoscaler-adapter/node_manager/node_manager.go (4)
39-41: LGTM! Well-documented struct field addition.

The addition of the `nvidiaGpus` field with clear documentation is appropriate.
68-77: Good defensive programming with nil checks.

The early return pattern with nil checks prevents potential panics.
92-94: Proper GPU resource allocation.

The implementation correctly adds NVIDIA GPU resources to the Kubernetes ResourceList using the standard `nvidia.com/gpu` resource name.
96-100: Good override mechanism for user-specified GPU counts.

The logic properly prioritizes user-specified GPU counts from `MachineSpec.NvidiaGpu` over cached values, providing flexibility for cloud providers without proper GPU metadata APIs.
1-86: Formatting changes only; no functional impact.

The file contains only formatting and indentation changes to the CRD OpenAPI schema, with no modifications to the actual fields or API surface. The content remains semantically identical.
For autoscaling to work properly, when we pass the template of a node from a node pool we need to tell the autoscaler that it has GPU capacity available, which we currently do not do. It also seems that for most cloud providers supported by Claudie there is no way to programmatically resolve whether an instance has GPUs and whether they are from NVIDIA. For AWS this is not an issue, but for the others it is; for example, Google treats GPUs as a kind of "extension" that you can attach in counts of 1, 2, 4, or 8, and it seems you cannot determine this programmatically just by fetching the instance type, at least not for now. That's why a new field was added.

claudie/services/autoscaler-adapter/node_manager/node_manager.go, lines 92 to 100 in f3b9b6e
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/autoscaling/autoscaling.md (1)
38-54: Add YAML language to fenced block; fix apostrophe and small copy nits.
- Specify language for syntax highlighting and to satisfy MD040.
- Use “GPUs” (no apostrophe).
- Make closing fence unindented to avoid rendering quirks.
- Clarify comment that the instance type example is AWS-specific.
````diff
-```
+```yaml
 - name: autoscaled
   providerSpec:
     name: aws
     region: eu-central-1
     zone: eu-central-1a
   autoscaler:
     min: 0
     max: 20
-  # GPU machine type name.
+  # GPU-capable instance type (AWS example).
   serverType: g4dn.xlarge
   machineSpec:
-    # explicitly specify how many GPU's the instance type provides.
+    # Explicitly specify how many GPUs the instance type provides.
     nvidiaGpu: 1
   # OS image name
   image: ami-07eef52105e8a2059
- ```
+```
````
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- `docs/autoscaling/autoscaling.md` (1 hunks)
- `services/autoscaler-adapter/node_manager/node_manager.go` (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- services/autoscaler-adapter/node_manager/node_manager.go
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/autoscaling/autoscaling.md
38-38: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: check-changes
🔇 Additional comments (1)
docs/autoscaling/autoscaling.md (1)
36-37: Clarify GPU auto-detection behavior and align terminology.

File: docs/autoscaling/autoscaling.md (around lines 36-37). Current wording implies GPUs are never auto-detected; the code attempts detection for some providers but not all (e.g., GCP attachable accelerators or providers without vendor GPU metadata). Use "NVIDIA GPUs", "node pool", and the YAML field `nvidiaGpu` consistently.

```diff
-The custom Claudie-Provider for the [Cluster-Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) does not automatically determine whether the provided instance types have GPU capabilities. If you want autoscaling for a nodepool with GPUs, you must explicitly specify how many GPUs each node in the nodepool has.
+The Claudie provider cannot always auto-detect NVIDIA GPU counts from instance types across all providers. When the provider cannot determine the GPU count (e.g., GCP attachable accelerators or providers without GPU metadata), explicitly set the per-node count via `machineSpec.nvidiaGpu`. On providers where detection is supported (e.g., some AWS GPU families), this field is optional and will override auto-detection if set. If you want autoscaling for a node pool with GPUs, ensure each node's NVIDIA GPU count is known to Claudie.
```
Expands the machine spec to describe the number of GPUs the given instance type has. This will then be used in the Node Template for the cluster-autoscaler.
Partially also implements #1853 where applicable. However, it seems that not all of the providers supported by Claudie let us programmatically determine the number of NVIDIA GPUs attached.

That's why the change in the InputManifest for a dynamic nodepool:
Summary by CodeRabbit
New Features
Documentation
Bug Fixes
API Changes
Tests
Chores