Skip to content

feat: Add REST API models/endpoints for Rack creation (POST) and update (PATCH)#247

Draft
kunzhao-nv wants to merge 7 commits intomainfrom
feat/rla/new-rack
Draft

feat: Add REST API models/endpoints for Rack creation (POST) and update (PATCH)#247
kunzhao-nv wants to merge 7 commits intomainfrom
feat/rla/new-rack

Conversation

@kunzhao-nv
Copy link
Contributor

@kunzhao-nv kunzhao-nv commented Mar 12, 2026

Description

Add two new REST API endpoints for rack operations:

  • POST /v2/org/{org}/carbide/rack — Create an expected rack definition.
  • PATCH /v2/org/{org}/carbide/rack/{id} — Partially update an existing rack.

Type of Change

  • Feature - New feature or functionality (feat:)
  • Fix - Bug fixes (fix:)
  • Chore - Modification or removal of existing functionality (chore:)
  • Refactor - Refactoring of existing functionality (refactor:)
  • Docs - Changes in documentation or OpenAPI schema (docs:)
  • CI - Changes in Github workflows. Requires additional scrutiny (ci:)
  • Version - Issuing a new release version (version:)

Services Affected

  • API - API models or endpoints updated
  • Workflow - Workflow service updated
  • DB - DB DAOs or migrations updated
  • Site Manager - Site Manager updated
  • Cert Manager - Cert Manager updated
  • Site Agent - Site Agent updated
  • RLA - RLA service updated
  • Powershelf Manager - Powershelf Manager updated
  • NVSwitch Manager - NVSwitch Manager updated

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

Copilot AI review requested due to automatic review settings March 12, 2026 19:49
@copy-pr-bot
Copy link

copy-pr-bot bot commented Mar 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions
Copy link

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-03-12 19:52:56 UTC | Commit: 0ae4a72

@github-actions
Copy link

🛡️ Vulnerability Scan

🚨 Found 64 vulnerability(ies)
📊 vs main: 64 (no change)

Severity Breakdown:

  • 🔴 Critical/High: 64
  • 🟡 Medium: 0
  • 🔵 Low/Info: 0

🔗 View full details in Security tab

🕐 Last updated: 2026-03-12 19:53:06 UTC | Commit: 0ae4a72

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for creating and partially updating Rack definitions via new REST endpoints, wiring the API layer through Temporal workflows/activities into the RLA gRPC client, and updating the OpenAPI spec + generated SDK to expose the new operations.

Changes:

  • Added Temporal workflows + activities for CreateExpectedRack and PatchRack, with unit tests.
  • Added API handlers/routes and request/response models for POST /v2/org/{org}/carbide/rack and PATCH /v2/org/{org}/carbide/rack/{id}, with handler/model tests.
  • Updated OpenAPI spec and generated SDK client/models for the new endpoints.

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
site-workflow/pkg/workflow/rack.go Adds CreateExpectedRack and PatchRack workflows that invoke the corresponding activities.
site-workflow/pkg/workflow/rack_test.go Adds workflow unit tests for create + patch rack workflows.
site-workflow/pkg/activity/rack.go Adds ManageRack.CreateExpectedRack and ManageRack.PatchRack activities calling RLA gRPC.
site-workflow/pkg/activity/rack_test.go Adds activity unit tests for create + patch rack request validation and basic success paths.
site-agent/pkg/components/managers/rla/subscriber.go Registers the new rack workflows/activities with the Temporal worker.
api/pkg/api/routes.go Adds REST routes for rack POST and rack PATCH.
api/pkg/api/model/rack.go Adds REST request/response models and proto conversion helpers for rack create/patch.
api/pkg/api/model/rack_test.go Adds validation/conversion tests for the new rack API models.
api/pkg/api/handler/rack.go Implements CreateRackHandler and PatchRackHandler that execute Temporal workflows.
api/pkg/api/handler/rack_test.go Adds handler tests covering success and common failure cases (auth, RLA disabled, validation).
openapi/spec.yaml Adds POST and PATCH operations + schemas for create/patch rack.
sdk/standard/api_rack.go Generated SDK methods for CreateRack and PatchRack endpoints.
sdk/standard/model_create_rack_request.go Generated SDK model for create rack request payload.
sdk/standard/model_create_rack_response.go Generated SDK model for create rack response payload.
sdk/standard/model_patch_rack_request.go Generated SDK model for patch rack request payload.
sdk/standard/model_patch_rack_response.go Generated SDK model for patch rack response payload.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link

github-actions bot commented Mar 12, 2026

Test Results

8 178 tests  +54   8 178 ✅ +54   8m 11s ⏱️ +10s
  139 suites ± 0       0 💤 ± 0 
   14 files   ± 0       0 ❌ ± 0 

Results for commit 5719838. ± Comparison against base commit 55c3a85.

♻️ This comment has been updated with latest results.

@thossain-nv thossain-nv changed the title feat: Add REST APIs for Rack POST (Create) and PATCH (Update) feat: Add REST API models and endpoints for Rack creation (POST) and update (PATCH) Mar 13, 2026
@thossain-nv thossain-nv changed the title feat: Add REST API models and endpoints for Rack creation (POST) and update (PATCH) feat: Add REST API models/endpoints for Rack creation (POST) and update (PATCH) Mar 13, 2026
@thossain-nv
Copy link
Contributor

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Mar 13, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link

coderabbitai bot commented Mar 13, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bbab5405-d2e2-49c5-ad63-e4674b75d854

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds Create and Update Rack APIs: new request/response models, validation and proto conversions, HTTP handlers, routes, SDK methods, OpenAPI specs, Temporal workflows and activities, registration, and comprehensive unit tests across layers.

Changes

Cohort / File(s) Summary
API Handler Layer
api/pkg/api/handler/rack.go, api/pkg/api/handler/rack_test.go
Adds CreateRackHandler and UpdateRackHandler implementations and tests; handler logic covers request parsing, org/provider validation, site resolution, Temporal workflow execution, and response handling.
API Model Layer
api/pkg/api/model/rack.go, api/pkg/api/model/rack_test.go
Introduces API types for create/update racks (requests, responses, components, BMCs, location), validations, ToProto conversions, and unit tests for validation and mapping.
REST API Routes
api/pkg/api/routes.go, api/pkg/api/routes_test.go
Registers POST /org/:org/.../rack and PATCH /org/:org/.../rack/:id routes and updates route test expectations.
OpenAPI Spec
openapi/spec.yaml
Adds CreateRack and PatchRack schemas and operations (POST/PATCH) with documented responses including 201/200 and error cases (400, 403, 404, 412).
SDK Service & Models
sdk/standard/api_rack.go, sdk/standard/model_create_rack_*.go, sdk/standard/model_patch_rack_*.go
Generates SDK request/response models and Rack API service methods for CreateRack and UpdateRack, including JSON (de)serialization, validation, nullable wrappers, and Execute methods.
Temporal Workflows
site-workflow/pkg/workflow/rack.go, site-workflow/pkg/workflow/rack_test.go
Adds CreateExpectedRack and UpdateRack workflows with retry/timeouts, activity invocation, and tests for success and failure paths.
Temporal Activities
site-workflow/pkg/activity/rack.go, site-workflow/pkg/activity/rack_test.go
Adds ManageRack activities CreateExpectedRack and UpdateRack that validate requests, call RLA client RPCs, and return structured responses; includes unit tests.
Agent Registration
site-agent/pkg/components/managers/rla/subscriber.go
Registers new rack workflows and activities for the RLA manager.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Handler as CreateRackHandler
    participant Validator as Validation Layer
    participant Temporal as Temporal Client
    participant Activity as CreateExpectedRack Activity
    participant RLA as RLA Client

    Client->>Handler: POST /rack (CreateRackRequest)
    Handler->>Validator: Check org membership & provider admin role
    Validator-->>Handler: OK
    Handler->>Validator: Validate request body (siteId,name,manufacturer,serialNumber,...)
    Validator-->>Handler: Validated
    Handler->>Temporal: Start CreateExpectedRack workflow
    Temporal->>Activity: Execute activity
    Activity->>Activity: Validate request
    Activity->>RLA: Call CreateExpectedRack RPC
    RLA-->>Activity: Return RackID
    Activity-->>Temporal: Return CreateExpectedRackResponse
    Temporal-->>Handler: Workflow result
    Handler-->>Client: 201 Created + { id }
Loading
sequenceDiagram
    actor Client
    participant Handler as UpdateRackHandler
    participant Validator as Validation Layer
    participant Temporal as Temporal Client
    participant Activity as UpdateRack Activity
    participant RLA as RLA Client

    Client->>Handler: PATCH /rack/:id (PatchRackRequest)
    Handler->>Validator: Check org membership & provider admin role
    Validator-->>Handler: OK
    Handler->>Handler: Extract rack ID from URL
    Handler->>Validator: Validate site & association with provider, validate body (siteId required)
    Validator-->>Handler: Validated
    Handler->>Temporal: Start UpdateRack workflow
    Temporal->>Activity: Execute activity
    Activity->>Activity: Validate patch request & rack id
    Activity->>RLA: Call PatchRack RPC
    RLA-->>Activity: Return PatchRackResponse (report)
    Activity-->>Temporal: Return response
    Temporal-->>Handler: Workflow result
    Handler-->>Client: 200 OK + { report }
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through specs and Temporal streams,

I stitched new handlers and model dreams,
From API call to rack's new birth,
I patched and created — oh what mirth! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 45.45% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately and concisely describes the main change: adding REST API models/endpoints for Rack creation and update operations.
Description check ✅ Passed The PR description clearly explains the two new REST API endpoints being added (POST and PATCH for rack operations) and relates directly to the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/rla/new-rack
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
sdk/standard/api_rack.go (1)

10-10: ⚠️ Potential issue | 🟠 Major

Pipeline failure: SDK regeneration required.

The CI pipeline reports that sdk/standard/ was not updated. Since this file is auto-generated by OpenAPI Generator (per the header), the proper workflow is:

  1. Update openapi/spec.yaml with the new endpoints
  2. Run make generate-sdk to regenerate the SDK files

Please ensure the regeneration step was executed and the generated output matches these changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/standard/api_rack.go` at line 10, The SDK in sdk/standard was not
regenerated after API changes; update the OpenAPI spec (openapi/spec.yaml) with
the new/changed endpoints, then run the SDK generation command (make
generate-sdk) to regenerate the files so sdk/standard/api_rack.go and other
generated sources reflect those changes; commit the regenerated outputs
alongside the spec changes to satisfy the CI check.
openapi/spec.yaml (1)

8796-8974: ⚠️ Potential issue | 🟠 Major

CI gate is currently blocked by missing generated artifacts updates.

This spec change is valid in scope, but the pipeline failure shows required companion outputs were not updated after editing openapi/spec.yaml (SDK/docs generation). Please update sdk/standard/ and docs/index.html in this PR to unblock merge.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openapi/spec.yaml` around lines 8796 - 8974, The CI failed because changes to
openapi/spec.yaml (new/edited endpoints with operationId create-rack, get-rack,
patch-rack and associated schemas/examples) were not accompanied by regenerated
artifacts; regenerate the SDK and docs from the updated spec: run the
OpenAPI/SDK generation commands your repo uses to update sdk/standard/ (re-run
the generator that produces the clients for the modified endpoints) and
regenerate the static documentation that produces docs/index.html (or run the
docs build script), verify the generated files include the new/changed
operationIds and examples (create-rack, get-rack, patch-rack), run the repo's
build/test scripts, then commit the updated sdk/standard/ and docs/index.html
files so the pipeline will pass.
🧹 Nitpick comments (3)
openapi/spec.yaml (2)

15795-15802: Mark response payload fields as required for stronger SDK typing.

CreateRackResponse.id and PatchRackResponse.report are modeled as optional right now because required is missing. If handlers always return these fields, declare them required to avoid nullable/optional client types.

Proposed schema hardening
     CreateRackResponse:
       title: CreateRackResponse
       type: object
       description: Response for creating a Rack containing the new rack's ID
+      required:
+        - id
       properties:
         id:
           type: string
           format: uuid
           description: ID of the newly created Rack
@@
     PatchRackResponse:
       title: PatchRackResponse
       type: object
       description: Response for patching a Rack
+      required:
+        - report
       properties:
         report:
           type: string
           description: Summary report of changes applied

Also applies to: 15831-15838

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openapi/spec.yaml` around lines 15795 - 15802, Add required arrays to the
relevant OpenAPI schemas so these response fields are non-optional in generated
SDKs: update the CreateRackResponse schema to include "id" in a required list
(so CreateRackResponse.id is mandatory) and update the PatchRackResponse schema
to include "report" in its required list (so PatchRackResponse.report is
mandatory); ensure you place the required: [ ... ] entries at the same object
level as title/type/description to harden typing for generated clients.

15806-15813: Clarify PATCH semantics vs required siteId.

Line 15808 requires siteId, but the operation text (Line 8934) says only provided fields are updated. This reads as contradictory for clients. Either make siteId optional or explicitly document it as required context (not a mutable patch field).

Proposed wording fix (if `siteId` must remain required)
-      description: Request body for patching an existing Rack. Only provided fields are updated.
+      description: Request body for patching an existing Rack. `siteId` is required context; other provided fields are updated.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openapi/spec.yaml` around lines 15806 - 15813, The PATCH Rack request body
currently lists siteId in the required array while the operation text says only
provided fields are updated; either make siteId optional by removing it from the
required list in the Rack PATCH request body (properties.siteId remains present
but not required), or explicitly document that siteId is not a mutable patch
field by removing it from the request body and moving it to an explicit
path/query/header parameter (or add a description to properties.siteId stating
it is required as context/immutable and will be ignored if supplied), and update
the Rack PATCH operation description to state which approach was chosen so
clients are not confused.
api/pkg/api/model/rack.go (1)

669-674: Consider rejecting no-op PATCH payloads

Validation currently allows requests with only siteId and no mutable fields, which turns PATCH into a no-op call.

Proposed fix
 func (r *APIPatchRackRequest) Validate() error {
 	if r.SiteID == "" {
 		return fmt.Errorf("siteId is required")
 	}
+	if r.Name == nil &&
+		r.Manufacturer == nil &&
+		r.Model == nil &&
+		r.SerialNumber == nil &&
+		r.Description == nil &&
+		r.Location == nil {
+		return fmt.Errorf("at least one field must be provided")
+	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/model/rack.go` around lines 669 - 674,
APIPatchRackRequest.Validate currently only enforces SiteID but permits payloads
that don't set any mutable fields (making PATCH a no-op); update Validate
(method APIPatchRackRequest.Validate) to detect whether at least one updatable
field is provided (e.g., check the request's mutable fields such as Name,
Description, Tags, Status, Capacity, etc. as defined on APIPatchRackRequest) and
if none are set return a validation error like "no fields to update"; keep the
existing SiteID check and add this no-op check after it so empty/unchanged PATCH
payloads are rejected.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/pkg/api/handler/rack.go`:
- Around line 711-723: The workflow StartWorkflowOptions currently sets ID to
fmt.Sprintf("rack-patch-%s", rackStrID) and uses
WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING which causes concurrent PatchRack
requests to reuse an existing workflow and lose updates; change the workflow ID
to include the request-specific hash (use common.RequestHash(rlaRequest) or
equivalent) so each distinct request body yields a unique ID, and switch the
conflict policy to allow duplicate executions (or remove USE_EXISTING) where you
call stc.ExecuteWorkflow for "PatchRack" with rlaRequest (also update the
similar occurrence around the other ExecuteWorkflow call at the referenced
second location).
- Around line 552-563: The workflow ID currently built from
fmt.Sprintf("rack-create-%s-%s", apiRequest.Name, apiRequest.SerialNumber)
causes distinct payloads to collide; replace that ID with the request-specific
hash used elsewhere by calling RequestHash(apiRequest) (optionally with a
"rack-create-" prefix) when constructing StartWorkflowOptions.ID so each
CreateExpectedRack invocation is unique per apiRequest payload; keep the
existing WorkflowIDConflictPolicy/WorkflowIDReusePolicy and ensure
ExecuteWorkflow still passes the same workflow name "CreateExpectedRack" and
rlaRequest payload.

In `@api/pkg/api/model/rack_test.go`:
- Around line 334-350: APIPatchRackRequest currently uses APIRackLocation (value
fields) for Location, so converting via ToProtoRack will overwrite sibling
location fields when a partial patch only sets Region; change the model to avoid
destructive partial updates by introducing a patch-specific location type with
pointer/optional fields (e.g., APIPatchRackLocation with *string members for
Region, Datacenter, Room, Position) and update APIPatchRackRequest.Location to
use that type, then modify ToProtoRack to merge only non-nil
APIPatchRackLocation fields into the existing proto location rather than
replacing it wholesale (or alternatively require callers to provide the full
APIRackLocation object if you prefer not to add a patch type).
- Around line 307-310: The test currently treats an APIPatchRackRequest with
only SiteID set ("valid - with siteId") as a valid, but no-op, PATCH; instead
update the validation so PATCH payloads must include at least one mutable field
besides SiteID and then update the test accordingly. Locate the
APIPatchRackRequest validation logic (e.g., the APIPatchRackRequest.Validate
method or the PatchRack handler that processes APIPatchRackRequest) and add a
check that returns an error when the request has no mutable fields set other
than SiteID; then change the test case "valid - with siteId" to either include
another mutable field (so it remains valid) or mark wantErr: true to assert the
SiteID-only payload is rejected.

In `@api/pkg/api/model/rack.go`:
- Around line 512-515: The loops over r.Components and BMCs call
comp.Validate(), comp.ToProto() and bmc.ToProto() without guarding for nil
entries; update the validation and conversion loops (the code around
r.Components iteration and the similar blocks at the other locations noted) to
check for nil items first and return a clear validation error like
"components[%d]: nil" or "bmcs[%d]: nil" if encountered, and in conversion
functions guard against nil before calling ToProto (or skip/return an error) so
comp.Validate(), comp.ToProto(), and bmc.ToProto() are never invoked on a nil
pointer.
- Around line 554-556: The SlotID/TrayIdx/HostID fields are currently plain
int32 so omitted JSON becomes zero in ToProto(); change these fields to pointer
types (e.g., *int32) in the struct so omission is preserved, update any JSON
unmarshalling/usage accordingly, and modify ToProto() to only build and attach a
Position message when at least one of SlotID/TrayIdx/HostID is non-nil (set the
corresponding proto fields from the pointer values). Apply the same
pointer-and-conditional-Position change for the other occurrence of these fields
(the block currently around the second set of SlotID/TrayIdx/HostID) so omitted
values remain omitted rather than becoming explicit zeroes.
- Around line 561-572: The Validate methods (e.g.,
APICreateRackComponent.Validate) currently only check for non-empty strings and
allow invalid enum names to be coerced into UNKNOWN; update these validators to
explicitly reject invalid enum values by attempting to parse/lookup the enum
(using the generated enum helpers for the corresponding proto/enum type) and
returning an error when parsing fails (for fields like Type and any other enum
string fields), and apply the same change to the other Validate functions noted
in the review (the other Create/Update rack/component Validate methods).

In `@openapi/spec.yaml`:
- Around line 8956-8973: The PATCH operation for /v2/org/{org}/carbide/rack/{id}
is missing a 404 response; update the operation responses to include a '404'
entry that references the existing NotFoundError response (e.g., $ref:
'#/components/responses/NotFoundError') so consumers can handle missing-rack
cases; modify the same responses block that currently lists '200', '400', '403',
and '412' and add the '404' reference alongside them, keeping the success schema
reference to PatchRackResponse unchanged.

In `@sdk/standard/model_create_rack_response.go`:
- Around line 1-10: The file model_create_rack_response.go is a generated SDK
model that is out of sync with the OpenAPI spec; run the SDK generator and
commit the result by executing "make generate-sdk", verify the generated
CreateRackResponse model (and other files under the SDK) have the proper
generated header and updated types/fields, resolve any generator errors, and
commit the regenerated files so the CI warning is cleared.

---

Outside diff comments:
In `@openapi/spec.yaml`:
- Around line 8796-8974: The CI failed because changes to openapi/spec.yaml
(new/edited endpoints with operationId create-rack, get-rack, patch-rack and
associated schemas/examples) were not accompanied by regenerated artifacts;
regenerate the SDK and docs from the updated spec: run the OpenAPI/SDK
generation commands your repo uses to update sdk/standard/ (re-run the generator
that produces the clients for the modified endpoints) and regenerate the static
documentation that produces docs/index.html (or run the docs build script),
verify the generated files include the new/changed operationIds and examples
(create-rack, get-rack, patch-rack), run the repo's build/test scripts, then
commit the updated sdk/standard/ and docs/index.html files so the pipeline will
pass.

In `@sdk/standard/api_rack.go`:
- Line 10: The SDK in sdk/standard was not regenerated after API changes; update
the OpenAPI spec (openapi/spec.yaml) with the new/changed endpoints, then run
the SDK generation command (make generate-sdk) to regenerate the files so
sdk/standard/api_rack.go and other generated sources reflect those changes;
commit the regenerated outputs alongside the spec changes to satisfy the CI
check.

---

Nitpick comments:
In `@api/pkg/api/model/rack.go`:
- Around line 669-674: APIPatchRackRequest.Validate currently only enforces
SiteID but permits payloads that don't set any mutable fields (making PATCH a
no-op); update Validate (method APIPatchRackRequest.Validate) to detect whether
at least one updatable field is provided (e.g., check the request's mutable
fields such as Name, Description, Tags, Status, Capacity, etc. as defined on
APIPatchRackRequest) and if none are set return a validation error like "no
fields to update"; keep the existing SiteID check and add this no-op check after
it so empty/unchanged PATCH payloads are rejected.

In `@openapi/spec.yaml`:
- Around line 15795-15802: Add required arrays to the relevant OpenAPI schemas
so these response fields are non-optional in generated SDKs: update the
CreateRackResponse schema to include "id" in a required list (so
CreateRackResponse.id is mandatory) and update the PatchRackResponse schema to
include "report" in its required list (so PatchRackResponse.report is
mandatory); ensure you place the required: [ ... ] entries at the same object
level as title/type/description to harden typing for generated clients.
- Around line 15806-15813: The PATCH Rack request body currently lists siteId in
the required array while the operation text says only provided fields are
updated; either make siteId optional by removing it from the required list in
the Rack PATCH request body (properties.siteId remains present but not
required), or explicitly document that siteId is not a mutable patch field by
removing it from the request body and moving it to an explicit path/query/header
parameter (or add a description to properties.siteId stating it is required as
context/immutable and will be ignored if supplied), and update the Rack PATCH
operation description to state which approach was chosen so clients are not
confused.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8fdcf364-389a-4b43-866a-6a13cd20b593

📥 Commits

Reviewing files that changed from the base of the PR and between 5bbdcf7 and 0ae4a72.

📒 Files selected for processing (18)
  • api/pkg/api/handler/rack.go
  • api/pkg/api/handler/rack_test.go
  • api/pkg/api/model/rack.go
  • api/pkg/api/model/rack_test.go
  • api/pkg/api/routes.go
  • api/pkg/api/routes_test.go
  • docs/index.html
  • openapi/spec.yaml
  • sdk/standard/api_rack.go
  • sdk/standard/model_create_rack_request.go
  • sdk/standard/model_create_rack_response.go
  • sdk/standard/model_patch_rack_request.go
  • sdk/standard/model_patch_rack_response.go
  • site-agent/pkg/components/managers/rla/subscriber.go
  • site-workflow/pkg/activity/rack.go
  • site-workflow/pkg/activity/rack_test.go
  • site-workflow/pkg/workflow/rack.go
  • site-workflow/pkg/workflow/rack_test.go

Copy link
Contributor

@jw-nvidia jw-nvidia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we do the following currently

  1. define REST API requests
  2. REST API handlers convert the requests to RLA protobuf requests and start Temporal workflow.
  3. Site-Agent workers calls RLA grpc APIs

Should we consider to do the following instead?

  1. define REST API requests with RLA domain types
  2. REST API handlers start Temporal workflow with the necessary information from REST API requests.
  3. Site-Agent works use the RLA client package

Copy link
Contributor

@thossain-nv thossain-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few areas of concern that need addressing.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
api/pkg/api/model/rack.go (2)

585-587: ⚠️ Potential issue | 🟠 Major

Guard conversion loops against nil entries to avoid panics.

ToProtoRack() and component ToProto() still call comp.ToProto() / bmc.ToProto() without nil checks. Validation catches this on the main handler path, but conversion should remain defensive.

Suggested fix
 func (r *APICreateRackRequest) ToProtoRack() *rlav1.Rack {
@@
-	for _, comp := range r.Components {
-		rack.Components = append(rack.Components, comp.ToProto())
+	for _, comp := range r.Components {
+		if comp == nil {
+			continue
+		}
+		rack.Components = append(rack.Components, comp.ToProto())
 	}
 	return rack
 }
@@
 func (c *APICreateRackComponentRequest) ToProto() *rlav1.Component {
@@
-	for _, bmc := range c.BMCs {
-		comp.Bmcs = append(comp.Bmcs, bmc.ToProto())
+	for _, bmc := range c.BMCs {
+		if bmc == nil {
+			continue
+		}
+		comp.Bmcs = append(comp.Bmcs, bmc.ToProto())
 	}
 	return comp
 }

Also applies to: 673-675

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/model/rack.go` around lines 585 - 587, The conversion loops in
ToProtoRack are not defensive against nil entries; update the loops that iterate
r.Components and the similar loop around BMCs (and any other component slices)
to skip nil elements before calling comp.ToProto() or bmc.ToProto() (e.g., if
comp == nil { continue }), ensuring you reference the ToProtoRack method and the
Component/BMC ToProto calls so no nil ToProto() is invoked.

620-622: ⚠️ Potential issue | 🟠 Major

Keep validation errors structurally consistent.

These branches return plain fmt.Errorf while surrounding validation uses validation.Errors. This creates inconsistent error payload shapes for clients.

Suggested fix
 	if _, ok := rlav1.ComponentType_value[c.Type]; !ok {
-		return fmt.Errorf("invalid component type: %q", c.Type)
+		return validation.Errors{
+			"type": fmt.Errorf("invalid component type: %q", c.Type),
+		}
 	}
@@
 	if _, ok := rlav1.BMCType_value[b.Type]; !ok {
-		return fmt.Errorf("invalid bmc type: %q", b.Type)
+		return validation.Errors{
+			"type": fmt.Errorf("invalid bmc type: %q", b.Type),
+		}
 	}
@@
 	if r.Name == nil && r.Manufacturer == nil && r.Model == nil &&
 		r.SerialNumber == nil && r.Description == nil && r.Location == nil {
-		return fmt.Errorf("at least one field to update must be provided")
+		return validation.Errors{
+			"request": fmt.Errorf("at least one field to update must be provided"),
+		}
 	}

Also applies to: 699-701, 797-800

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/model/rack.go` around lines 620 - 622, Validation branches that
currently return fmt.Errorf (e.g., the check using rlav1.ComponentType_value and
variable c.Type, and the similar checks at the other noted spots) should return
a structured validation.Errors instead of plain errors; locate the validations
in model/rack.go around the component type checks (references:
rlav1.ComponentType_value and variable c) and replace the fmt.Errorf returns
with validation.Errors that map the offending field (e.g., the component's
"type" or the appropriate indexed component path) to a validation.Error
containing the same message so the API always returns a consistent validation
payload shape.
api/pkg/api/handler/rack.go (1)

557-561: ⚠️ Potential issue | 🔴 Critical

USE_EXISTING on create can return the wrong result for concurrent payloads with the same serial.

With Line 560 set to WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING, a second POST with the same serialNumber but different body can attach to the first in-flight workflow and return its result. If the intent is to reject parallel creates for a serial, fail fast and return conflict.

Suggested fix
+	workflowID := fmt.Sprintf("rack-create-%s", apiRequest.SerialNumber)
 	workflowOptions := tClient.StartWorkflowOptions{
-		ID:                       fmt.Sprintf("rack-create-%s", apiRequest.SerialNumber),
+		ID:                       workflowID,
 		WorkflowIDReusePolicy:    temporalEnums.WORKFLOW_ID_REUSE_POLICY_ALLOW_DUPLICATE,
-		WorkflowIDConflictPolicy: temporalEnums.WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING,
+		WorkflowIDConflictPolicy: temporalEnums.WORKFLOW_ID_CONFLICT_POLICY_FAIL,
 		WorkflowExecutionTimeout: cutil.WorkflowExecutionTimeout,
 		TaskQueue:                queue.SiteTaskQueue,
 	}
@@
 	we, err := stc.ExecuteWorkflow(ctx, workflowOptions, "CreateExpectedRack", rlaRequest)
 	if err != nil {
+		var alreadyStarted *serviceerror.WorkflowExecutionAlreadyStarted
+		if errors.As(err, &alreadyStarted) {
+			return cutil.NewAPIErrorResponse(c, http.StatusConflict, "Rack create already in progress for this serialNumber", nil)
+		}
 		logger.Error().Err(err).Msg("failed to execute CreateExpectedRack workflow")
 		return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to create Rack", nil)
 	}
@@
-			return common.TerminateWorkflowOnTimeOut(c, logger, stc, fmt.Sprintf("rack-create-%s", apiRequest.SerialNumber), err, "Rack", "CreateExpectedRack")
+			return common.TerminateWorkflowOnTimeOut(c, logger, stc, workflowID, err, "Rack", "CreateExpectedRack")
#!/bin/bash
# Verify whether workflow conflict errors are handled explicitly in rack handlers.
rg -n --type go -C2 'WorkflowExecutionAlreadyStarted|WORKFLOW_ID_CONFLICT_POLICY_(USE_EXISTING|FAIL)|ExecuteWorkflow\(' api/pkg/api/handler/rack.go

Also applies to: 568-572, 579-581

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/rack.go` around lines 557 - 561, The current
workflowOptions uses temporalEnums.WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING
which lets a concurrent POST with the same serial attach to an existing
in-flight workflow; change the policy to
temporalEnums.WORKFLOW_ID_CONFLICT_POLICY_FAIL_ON_CONFLICT in the
workflowOptions (symbol: workflowOptions / tClient.StartWorkflowOptions) and
then explicitly handle the workflow start error returned by ExecuteWorkflow:
detect the WorkflowExecutionAlreadyStarted error (symbol:
WorkflowExecutionAlreadyStarted) and return an HTTP 409 Conflict (or otherwise
fail fast) instead of treating it as a successful attach; ensure this change is
applied for all Create flow Start/ExecuteWorkflow calls that currently use
USE_EXISTING.
🧹 Nitpick comments (2)
openapi/spec.yaml (1)

16116-16119: Clarify Patch request description vs required siteId.

The text says only provided fields are updated, but siteId is required. Consider explicitly calling out that siteId is mandatory context while other fields are optional patch fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openapi/spec.yaml` around lines 16116 - 16119, Update the Patch Rack request
schema description to explicitly state that siteId is a mandatory contextual
identifier while all other properties are optional and will be patched only if
provided; modify the description string that currently precedes "required: -
siteId" (the Patch Rack request description in the OpenAPI schema) to read
something like: "siteId is required as the context/target of the patch; all
other properties are optional and only provided fields will be updated." Ensure
the schema still lists "required: - siteId" and leave other properties as
optional.
api/pkg/api/handler/rack_test.go (1)

339-398: Add explicit tests for invalid rack ID format and workflow-conflict handling.

The update handler now validates UUID path format and should also distinguish conflict outcomes from generic failures. These high-signal cases are not covered in the new test matrix.

Suggested additions
 		{
+			name:           "failure - invalid rack ID format",
+			reqOrg:         org,
+			user:           providerUser,
+			rackID:         "not-a-uuid",
+			body:           fmt.Sprintf(`{"siteId":"%s","name":"Updated-Rack"}`, site.ID.String()),
+			expectedStatus: http.StatusBadRequest,
+		},
+		{
 			name:           "failure - invalid siteId",
 			reqOrg:         org,
 			user:           providerUser,
 			rackID:         rackID,
 			body:           fmt.Sprintf(`{"siteId":"%s","name":"Updated-Rack"}`, uuid.NewString()),
 			expectedStatus: http.StatusBadRequest,
 		},

Also applies to: 400-443

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/rack_test.go` around lines 339 - 398, Add two test cases
to the existing tests slice in rack_test.go: one named "failure - invalid rack
id format" that sets rackID to a non-UUID string (e.g. "not-a-uuid"), provides a
valid body with siteId and expects http.StatusBadRequest; and one named "failure
- conflict - workflow in progress" that uses the existing rackID, sets
mockReport to something like "Workflow conflict" and expects http.StatusConflict
to verify the handler distinguishes workflow-conflict responses from generic
failures. Ensure the new entries follow the same fields (name, reqOrg, user,
rackID, body, mockReport, expectedStatus) so the test harness and existing mocks
that inspect mockReport behave consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/pkg/api/handler/rack.go`:
- Around line 731-735: The current handler treats any error from
stc.ExecuteWorkflow (called as ExecuteWorkflow) as a 500, but when
WORKFLOW_ID_CONFLICT_POLICY_FAIL is set ExecuteWorkflow can return
serviceerror.WorkflowExecutionAlreadyStarted; update the error handling after
stc.ExecuteWorkflow to check for serviceerror.WorkflowExecutionAlreadyStarted
(import "go.temporal.io/api/serviceerror") and return cutil.NewAPIErrorResponse
with http.StatusConflict (409) in that case, otherwise keep logging via
logger.Error().Err(err).Msg("failed to execute UpdateRack workflow") and
returning the existing 500 response.

In `@openapi/spec.yaml`:
- Around line 16104-16112: The response schemas leave success fields optional
which produces weak client types; update the OpenAPI schemas for
CreateRackResponse and PatchRackResponse to mark the success fields as required
(add a required: ["id"] to CreateRackResponse and required: ["report"] to
PatchRackResponse), ensuring CreateRackResponse.id and PatchRackResponse.report
are non-optional in generated clients; locate the CreateRackResponse and
PatchRackResponse object definitions in the spec and add the required arrays
accordingly.

---

Duplicate comments:
In `@api/pkg/api/handler/rack.go`:
- Around line 557-561: The current workflowOptions uses
temporalEnums.WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING which lets a concurrent
POST with the same serial attach to an existing in-flight workflow; change the
policy to temporalEnums.WORKFLOW_ID_CONFLICT_POLICY_FAIL_ON_CONFLICT in the
workflowOptions (symbol: workflowOptions / tClient.StartWorkflowOptions) and
then explicitly handle the workflow start error returned by ExecuteWorkflow:
detect the WorkflowExecutionAlreadyStarted error (symbol:
WorkflowExecutionAlreadyStarted) and return an HTTP 409 Conflict (or otherwise
fail fast) instead of treating it as a successful attach; ensure this change is
applied for all Create flow Start/ExecuteWorkflow calls that currently use
USE_EXISTING.

In `@api/pkg/api/model/rack.go`:
- Around line 585-587: The conversion loops in ToProtoRack are not defensive
against nil entries; update the loops that iterate r.Components and the similar
loop around BMCs (and any other component slices) to skip nil elements before
calling comp.ToProto() or bmc.ToProto() (e.g., if comp == nil { continue }),
ensuring you reference the ToProtoRack method and the Component/BMC ToProto
calls so no nil ToProto() is invoked.
- Around line 620-622: Validation branches that currently return fmt.Errorf
(e.g., the check using rlav1.ComponentType_value and variable c.Type, and the
similar checks at the other noted spots) should return a structured
validation.Errors instead of plain errors; locate the validations in
model/rack.go around the component type checks (references:
rlav1.ComponentType_value and variable c) and replace the fmt.Errorf returns
with validation.Errors that map the offending field (e.g., the component's
"type" or the appropriate indexed component path) to a validation.Error
containing the same message so the API always returns a consistent validation
payload shape.

---

Nitpick comments:
In `@api/pkg/api/handler/rack_test.go`:
- Around line 339-398: Add two test cases to the existing tests slice in
rack_test.go: one named "failure - invalid rack id format" that sets rackID to a
non-UUID string (e.g. "not-a-uuid"), provides a valid body with siteId and
expects http.StatusBadRequest; and one named "failure - conflict - workflow in
progress" that uses the existing rackID, sets mockReport to something like
"Workflow conflict" and expects http.StatusConflict to verify the handler
distinguishes workflow-conflict responses from generic failures. Ensure the new
entries follow the same fields (name, reqOrg, user, rackID, body, mockReport,
expectedStatus) so the test harness and existing mocks that inspect mockReport
behave consistently.

In `@openapi/spec.yaml`:
- Around line 16116-16119: Update the Patch Rack request schema description to
explicitly state that siteId is a mandatory contextual identifier while all
other properties are optional and will be patched only if provided; modify the
description string that currently precedes "required: - siteId" (the Patch Rack
request description in the OpenAPI schema) to read something like: "siteId is
required as the context/target of the patch; all other properties are optional
and only provided fields will be updated." Ensure the schema still lists
"required: - siteId" and leave other properties as optional.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6cb7f27c-8d92-408f-8de0-13ef54d02bfe

📥 Commits

Reviewing files that changed from the base of the PR and between 0ae4a72 and 222524f.

📒 Files selected for processing (14)
  • api/pkg/api/handler/rack.go
  • api/pkg/api/handler/rack_test.go
  • api/pkg/api/model/rack.go
  • api/pkg/api/model/rack_test.go
  • api/pkg/api/routes.go
  • api/pkg/api/routes_test.go
  • docs/index.html
  • openapi/spec.yaml
  • sdk/standard/api_rack.go
  • site-agent/pkg/components/managers/rla/subscriber.go
  • site-workflow/pkg/activity/rack.go
  • site-workflow/pkg/activity/rack_test.go
  • site-workflow/pkg/workflow/rack.go
  • site-workflow/pkg/workflow/rack_test.go
✅ Files skipped from review due to trivial changes (1)
  • api/pkg/api/model/rack_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • site-agent/pkg/components/managers/rla/subscriber.go

@kunzhao-nv kunzhao-nv marked this pull request as draft March 13, 2026 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants