Add nodepool commands and align cluster payload with ClusterSpec CRD#86
Add nodepool commands and align cluster payload with ClusterSpec CRD#86typeid wants to merge 4 commits into
Conversation
The CRD expects []string but the CLI was sending a single string (first subnet only). Send the full comma-separated list as an array. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The hyperfleet ClusterSpec expects IAM role ARNs nested under platform.aws.roles with specific JSON keys (e.g. ingressARN, storageARN). The CLI was merging them as flat camelCase keys in the spec map, which were silently dropped by json.Unmarshal, resulting in clusters with empty role ARNs. Also adds comments documenting which spec fields are silently dropped by the API (provider, multi_az, etc.) since they have no matching ClusterSpec JSON tag. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The platform API now returns the OIDC issuer URL as spec.oidcIssuerURL (matching the CRD field name) instead of spec.cloudUrl. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds nodepool CRUD subcommands that talk to the platform-api. The create command auto-discovers subnet, instance profile, and security groups from the cluster spec when flags are omitted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: typeid The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
WalkthroughAdds ChangesNodepool CLI commands
Cluster field and spec fixes
Sequence Diagram(s)sequenceDiagram
participant CLI as nodepool create
participant fetchClusterSpec
participant PlatformAPI as Platform API (execute-api)
CLI->>fetchClusterSpec: auto-discover missing infra fields
fetchClusterSpec->>PlatformAPI: signedGet /api/v0/clusters/{id}
PlatformAPI-->>fetchClusterSpec: clusterResponse (subnet, profile, sg)
fetchClusterSpec-->>CLI: populated infra fields
CLI->>PlatformAPI: signedPost /api/v0/nodepools (payload)
PlatformAPI-->>CLI: 201 + nodepool body
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (8 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/commands/cluster/create.go`:
- Around line 126-129: The success summary in createCluster is using the wrong
label for the OIDC issuer value, which makes the printed result misleading.
Update the summary text in the create.go flow that reads from response["spec"]
and getStringField(spec, "oidcIssuerURL") so the displayed field name is “OIDC
Issuer URL” instead of “Cloud URL”, keeping the value source unchanged.
In `@internal/commands/nodepool/create.go`:
- Around line 155-163: The JSON output handling in createNodePool currently
swallows unmarshal failures and still returns nil, which makes malformed API
responses look successful. In the opts.output == "json" branch of
createNodePool, update the json.Unmarshal error path to return a non-nil error
instead of just printing body and exiting successfully, and ensure the caller
sees the failure so --output json does not report success when parsing breaks.
- Around line 190-192: The cluster lookup URL in fetchClusterSpec builds the
/api/v0/clusters/... path with a raw clusterID, which can break routing if it
contains path separators or traversal-like characters. Update fetchClusterSpec
to escape or path-encode clusterID before passing it to fmt.Sprintf, and keep
the request targeting the same cluster endpoint by using the encoded value in
the signedGet call.
In `@internal/commands/nodepool/delete.go`:
- Around line 51-52: The DELETE endpoint in deleteNodepool currently
interpolates nodepoolID directly into the URL path, which can alter the target
route if it contains reserved path characters. Update the endpoint construction
in deleteNodepool to path-escape nodepoolID before appending it, using the
existing URL-building flow with baseURL and signedDelete so the request always
targets the intended nodepool resource.
In `@internal/commands/nodepool/list.go`:
- Around line 56-67: The nodepool list command currently forwards unvalidated
pagination inputs from the CLI, so `runList` can send invalid `limit` and
`offset` values to the API. Add input checks in `NewListCmd`/its `RunE` path
before calling `runList`, enforcing `limit` to be within 1-100 and `offset` to
be non-negative, and return a clear error if either flag is out of range.
- Around line 104-112: In the json branch of list handling in the nodepool
command, the current json.Unmarshal failure path prints the raw body and still
returns nil, which makes malformed output look successful. Update the logic in
the list command’s JSON output handling to return the decode error instead of
nil, and keep the existing body print or replace it with an error message so
callers can detect failures from opts.output == "json".
In `@internal/services/cluster/service.go`:
- Around line 96-108: The cluster spec assembly in the service that builds
`ClusterSpec` should fail fast instead of silently accepting missing or blank
stack outputs. Add validation around the `iamOutputs[...]` lookups and
`PrivateSubnetIds` handling so required values are checked before populating
`spec`, and return an error if any are empty or missing. Use the existing
cluster service/spec builder path and the `workerInstanceProfileName` /
`privateSubnetIds` fields as the key points to verify and guard.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift-online/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: cefe0ad5-474b-4966-a26c-b2bacf3970f1
📒 Files selected for processing (9)
internal/commands/cluster/create.gointernal/commands/cluster/list.gointernal/commands/nodepool/api.gointernal/commands/nodepool/create.gointernal/commands/nodepool/delete.gointernal/commands/nodepool/list.gointernal/commands/nodepool/nodepool.gointernal/commands/root.gointernal/services/cluster/service.go
| // Get OIDC issuer URL from spec | ||
| cloudURL := "" | ||
| if spec, ok := response["spec"].(map[string]interface{}); ok { | ||
| cloudURL = getStringField(spec, "cloudUrl") | ||
| cloudURL = getStringField(spec, "oidcIssuerURL") |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Rename the summary field to OIDC Issuer URL.
Line 129 now reads spec.oidcIssuerURL, but this function still prints that value as Cloud URL, so the success summary is misleading.
Suggested fix
- cloudURL := ""
+ oidcIssuerURL := ""
if spec, ok := response["spec"].(map[string]interface{}); ok {
- cloudURL = getStringField(spec, "oidcIssuerURL")
+ oidcIssuerURL = getStringField(spec, "oidcIssuerURL")
}
@@
- if cloudURL != "" {
- fmt.Printf(" Cloud URL: %s\n", cloudURL)
+ if oidcIssuerURL != "" {
+ fmt.Printf(" OIDC Issuer URL: %s\n", oidcIssuerURL)
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/commands/cluster/create.go` around lines 126 - 129, The success
summary in createCluster is using the wrong label for the OIDC issuer value,
which makes the printed result misleading. Update the summary text in the
create.go flow that reads from response["spec"] and getStringField(spec,
"oidcIssuerURL") so the displayed field name is “OIDC Issuer URL” instead of
“Cloud URL”, keeping the value source unchanged.
| if opts.output == "json" { | ||
| var result map[string]interface{} | ||
| if err := json.Unmarshal(body, &result); err != nil { | ||
| fmt.Println(string(body)) | ||
| return nil | ||
| } | ||
| prettyJSON, _ := json.MarshalIndent(result, "", " ") | ||
| fmt.Println(string(prettyJSON)) | ||
| return nil |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Don't report success when JSON output parsing fails.
Line 157 enters the error path, but Line 159 still returns nil. With --output json, that turns a malformed API response into exit code 0, so scripts cannot distinguish bad output from a successful create.
Suggested fix
if opts.output == "json" {
var result map[string]interface{}
if err := json.Unmarshal(body, &result); err != nil {
- fmt.Println(string(body))
- return nil
+ return fmt.Errorf("failed to parse JSON response: %w", err)
}
- prettyJSON, _ := json.MarshalIndent(result, "", " ")
+ prettyJSON, err := json.MarshalIndent(result, "", " ")
+ if err != nil {
+ return fmt.Errorf("failed to format JSON response: %w", err)
+ }
fmt.Println(string(prettyJSON))
return nil
}🧰 Tools
🪛 golangci-lint (2.12.2)
[error] 159-159: error is not nil (line 157) but it returns nil
(nilerr)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/commands/nodepool/create.go` around lines 155 - 163, The JSON output
handling in createNodePool currently swallows unmarshal failures and still
returns nil, which makes malformed API responses look successful. In the
opts.output == "json" branch of createNodePool, update the json.Unmarshal error
path to return a non-nil error instead of just printing body and exiting
successfully, and ensure the caller sees the failure so --output json does not
report success when parsing breaks.
Source: Linters/SAST tools
| func fetchClusterSpec(ctx context.Context, baseURL, clusterID string, creds awssdk.Credentials, region string) (*clusterResponse, error) { | ||
| endpoint := fmt.Sprintf("%s/api/v0/clusters/%s", baseURL, clusterID) | ||
| body, statusCode, err := signedGet(ctx, endpoint, creds, region) |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Escape the cluster ID before embedding it in the request path.
Line 191 concatenates a raw CLI value into the URL path. If clusterID contains /, ?, or .., the auto-discovery GET no longer targets the intended cluster endpoint. As per path instructions, "Validate at trust boundaries with allow-lists, not deny-lists".
Suggested fix
import (
"context"
"encoding/json"
"fmt"
+ "net/url"
"os"
"strings"
@@
func fetchClusterSpec(ctx context.Context, baseURL, clusterID string, creds awssdk.Credentials, region string) (*clusterResponse, error) {
- endpoint := fmt.Sprintf("%s/api/v0/clusters/%s", baseURL, clusterID)
+ endpoint := fmt.Sprintf("%s/api/v0/clusters/%s", baseURL, url.PathEscape(clusterID))
body, statusCode, err := signedGet(ctx, endpoint, creds, region)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func fetchClusterSpec(ctx context.Context, baseURL, clusterID string, creds awssdk.Credentials, region string) (*clusterResponse, error) { | |
| endpoint := fmt.Sprintf("%s/api/v0/clusters/%s", baseURL, clusterID) | |
| body, statusCode, err := signedGet(ctx, endpoint, creds, region) | |
| func fetchClusterSpec(ctx context.Context, baseURL, clusterID string, creds awssdk.Credentials, region string) (*clusterResponse, error) { | |
| endpoint := fmt.Sprintf("%s/api/v0/clusters/%s", baseURL, url.PathEscape(clusterID)) | |
| body, statusCode, err := signedGet(ctx, endpoint, creds, region) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/commands/nodepool/create.go` around lines 190 - 192, The cluster
lookup URL in fetchClusterSpec builds the /api/v0/clusters/... path with a raw
clusterID, which can break routing if it contains path separators or
traversal-like characters. Update fetchClusterSpec to escape or path-encode
clusterID before passing it to fmt.Sprintf, and keep the request targeting the
same cluster endpoint by using the encoded value in the signedGet call.
Source: Path instructions
| endpoint := fmt.Sprintf("%s/api/v0/nodepools/%s", baseURL, nodepoolID) | ||
| body, statusCode, err := signedDelete(ctx, endpoint, creds, region) |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Path-escape the nodepool ID in the DELETE endpoint.
Line 51 inserts a raw argument into the URL path. An ID containing /, ?, or .. changes the request target, so this command can hit the wrong route or resource. As per path instructions, "Validate at trust boundaries with allow-lists, not deny-lists".
Suggested fix
import (
"context"
"fmt"
+ "net/url"
"os"
@@
- endpoint := fmt.Sprintf("%s/api/v0/nodepools/%s", baseURL, nodepoolID)
+ endpoint := fmt.Sprintf("%s/api/v0/nodepools/%s", baseURL, url.PathEscape(nodepoolID))
body, statusCode, err := signedDelete(ctx, endpoint, creds, region)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| endpoint := fmt.Sprintf("%s/api/v0/nodepools/%s", baseURL, nodepoolID) | |
| body, statusCode, err := signedDelete(ctx, endpoint, creds, region) | |
| endpoint := fmt.Sprintf("%s/api/v0/nodepools/%s", baseURL, url.PathEscape(nodepoolID)) | |
| body, statusCode, err := signedDelete(ctx, endpoint, creds, region) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/commands/nodepool/delete.go` around lines 51 - 52, The DELETE
endpoint in deleteNodepool currently interpolates nodepoolID directly into the
URL path, which can alter the target route if it contains reserved path
characters. Update the endpoint construction in deleteNodepool to path-escape
nodepoolID before appending it, using the existing URL-building flow with
baseURL and signedDelete so the request always targets the intended nodepool
resource.
Source: Path instructions
| RunE: func(cmd *cobra.Command, args []string) error { | ||
| if opts.clusterID == "" { | ||
| return fmt.Errorf("--cluster-id is required") | ||
| } | ||
| return runList(cmd.Context(), opts) | ||
| }, | ||
| } | ||
|
|
||
| cmd.Flags().StringVar(&opts.clusterID, "cluster-id", "", "Cluster ID (required)") | ||
| cmd.Flags().IntVar(&opts.limit, "limit", opts.limit, "Maximum number of nodepools to return (1-100)") | ||
| cmd.Flags().IntVar(&opts.offset, "offset", opts.offset, "Number of nodepools to skip") | ||
| cmd.Flags().StringVarP(&opts.output, "output", "o", "table", "Output format: table or json") |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Validate --limit and --offset before sending the request.
The help text says --limit is 1-100, but Lines 93-94 forward any integer, including 0, negatives, or values above 100. That turns simple CLI misuse into avoidable API errors. As per path instructions, "Validate at trust boundaries with allow-lists, not deny-lists".
Suggested fix
RunE: func(cmd *cobra.Command, args []string) error {
if opts.clusterID == "" {
return fmt.Errorf("--cluster-id is required")
}
+ if opts.limit < 1 || opts.limit > 100 {
+ return fmt.Errorf("--limit must be between 1 and 100")
+ }
+ if opts.offset < 0 {
+ return fmt.Errorf("--offset must be greater than or equal to 0")
+ }
return runList(cmd.Context(), opts)
},
}Also applies to: 93-94
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/commands/nodepool/list.go` around lines 56 - 67, The nodepool list
command currently forwards unvalidated pagination inputs from the CLI, so
`runList` can send invalid `limit` and `offset` values to the API. Add input
checks in `NewListCmd`/its `RunE` path before calling `runList`, enforcing
`limit` to be within 1-100 and `offset` to be non-negative, and return a clear
error if either flag is out of range.
Source: Path instructions
| if opts.output == "json" { | ||
| var result map[string]interface{} | ||
| if err := json.Unmarshal(body, &result); err != nil { | ||
| fmt.Println(string(body)) | ||
| return nil | ||
| } | ||
| prettyJSON, _ := json.MarshalIndent(result, "", " ") | ||
| fmt.Println(string(prettyJSON)) | ||
| return nil |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Return an error when --output json cannot be decoded.
Line 106 already knows the body is not valid JSON, but Line 108 still returns nil. That makes automation treat malformed list output as a successful command.
Suggested fix
if opts.output == "json" {
var result map[string]interface{}
if err := json.Unmarshal(body, &result); err != nil {
- fmt.Println(string(body))
- return nil
+ return fmt.Errorf("failed to parse JSON response: %w", err)
}
- prettyJSON, _ := json.MarshalIndent(result, "", " ")
+ prettyJSON, err := json.MarshalIndent(result, "", " ")
+ if err != nil {
+ return fmt.Errorf("failed to format JSON response: %w", err)
+ }
fmt.Println(string(prettyJSON))
return nil
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if opts.output == "json" { | |
| var result map[string]interface{} | |
| if err := json.Unmarshal(body, &result); err != nil { | |
| fmt.Println(string(body)) | |
| return nil | |
| } | |
| prettyJSON, _ := json.MarshalIndent(result, "", " ") | |
| fmt.Println(string(prettyJSON)) | |
| return nil | |
| if opts.output == "json" { | |
| var result map[string]interface{} | |
| if err := json.Unmarshal(body, &result); err != nil { | |
| return fmt.Errorf("failed to parse JSON response: %w", err) | |
| } | |
| prettyJSON, err := json.MarshalIndent(result, "", " ") | |
| if err != nil { | |
| return fmt.Errorf("failed to format JSON response: %w", err) | |
| } | |
| fmt.Println(string(prettyJSON)) | |
| return nil |
🧰 Tools
🪛 golangci-lint (2.12.2)
[error] 108-108: error is not nil (line 106) but it returns nil
(nilerr)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/commands/nodepool/list.go` around lines 104 - 112, In the json
branch of list handling in the nodepool command, the current json.Unmarshal
failure path prints the raw body and still returns nil, which makes malformed
output look successful. Update the logic in the list command’s JSON output
handling to return the decode error instead of nil, and keep the existing body
print or replace it with an error message so callers can detect failures from
opts.output == "json".
Source: Linters/SAST tools
| "roles": map[string]interface{}{ | ||
| "ingressARN": iamOutputs["IngressRoleArn"], | ||
| "imageRegistryARN": iamOutputs["ImageRegistryRoleArn"], | ||
| "storageARN": iamOutputs["EBSCSIRoleArn"], | ||
| "networkARN": iamOutputs["NetworkConfigRoleArn"], | ||
| "kubeCloudControllerARN": iamOutputs["CloudControllerManagerRoleArn"], | ||
| "nodePoolManagementARN": iamOutputs["NodePoolManagementRoleArn"], | ||
| "controlPlaneOperatorARN": iamOutputs["ControlPlaneOperatorRoleArn"], | ||
| }, | ||
| }, | ||
| } | ||
| // WorkerInstanceProfileName and WorkerRoleArn are top-level spec fields. | ||
| spec["workerInstanceProfileName"] = iamOutputs["WorkerInstanceProfileName"] |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Fail fast when required stack outputs are missing or blank.
Direct iamOutputs[...] lookups silently emit empty strings, and blank PrivateSubnetIds becomes an empty slice. That can produce an invalid ClusterSpec and break downstream auto-discovery of workerInstanceProfileName / privateSubnetIds.
Proposed validation
+ for _, key := range []string{
+ "IngressRoleArn",
+ "ImageRegistryRoleArn",
+ "EBSCSIRoleArn",
+ "NetworkConfigRoleArn",
+ "CloudControllerManagerRoleArn",
+ "NodePoolManagementRoleArn",
+ "ControlPlaneOperatorRoleArn",
+ "WorkerInstanceProfileName",
+ } {
+ if strings.TrimSpace(iamOutputs[key]) == "" {
+ return nil, fmt.Errorf("missing required IAM stack output %q", key)
+ }
+ }
+
// Map IAM role ARNs into the nested platform.aws.roles structure
// expected by the hyperfleet ClusterSpec CRD.
spec["platform"] = map[string]interface{}{ if key == "PrivateSubnetIds" {
parts := strings.Split(value, ",")
trimmed := make([]string, 0, len(parts))
for _, s := range parts {
if t := strings.TrimSpace(s); t != "" {
trimmed = append(trimmed, t)
}
}
+ if len(trimmed) == 0 {
+ return nil, fmt.Errorf("VPC stack output %q must contain at least one private subnet ID", key)
+ }
spec[camelKey] = trimmed
continue
}Also applies to: 118-126
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/services/cluster/service.go` around lines 96 - 108, The cluster spec
assembly in the service that builds `ClusterSpec` should fail fast instead of
silently accepting missing or blank stack outputs. Add validation around the
`iamOutputs[...]` lookups and `PrivateSubnetIds` handling so required values are
checked before populating `spec`, and return an error if any are empty or
missing. Use the existing cluster service/spec builder path and the
`workerInstanceProfileName` / `privateSubnetIds` fields as the key points to
verify and guard.
Summary
Continuation of #84 on a new branch.
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
nodepoolmanagement commands to create, list, and delete nodepools from the CLI.Bug Fixes