Skip to content

docs: add CloudFormation timing analysis and deferred-IAM plan#79

Open
cdoan1 wants to merge 1 commit into
openshift-online:mainfrom
cdoan1:refactor-oidc
Open

docs: add CloudFormation timing analysis and deferred-IAM plan#79
cdoan1 wants to merge 1 commit into
openshift-online:mainfrom
cdoan1:refactor-oidc

Conversation

@cdoan1

@cdoan1 cdoan1 commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Add comprehensive documentation analyzing stack creation timing and proposing the deferred-IAM optimization to reduce provisioning time from ~16-24 min to ~5-8 min.

  • docs/cloudformation-timing.md: Timing analysis of VPC/IAM/OIDC stack creation, wait points, and proposed timing observability features
  • docs/deferred-iam-implementation.md: Complete implementation plan for deferred-IAM flow including code examples, testing strategy, and 4-week rollout plan

Key insight: Creating IAM stack AFTER OIDC provider (instead of before) eliminates the 10-15 min IAM eventual consistency delay caused by trust policy updates.

Description

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Refactoring (no functional changes)
  • CI/CD or tooling change

Testing

  • Unit tests pass (make test)
  • Integration tests pass (if applicable)
  • Manual verification completed

Checklist

  • My code follows the project's coding conventions
  • I have updated documentation as needed
  • I have added tests that prove my fix/feature works
  • All new and existing tests pass

Summary by CodeRabbit

  • Documentation
    • Added comprehensive documentation for ROSA cluster infrastructure provisioning, including detailed timing analysis of CloudFormation stack creation sequences and durations.
    • Added documentation for an optimized cluster provisioning approach designed to reduce provisioning delays through reordered stack creation.
    • Covers timing characteristics, planned CLI enhancements for timing visibility, and optimization opportunities.

…ion plan

Add comprehensive documentation analyzing stack creation timing and proposing
the deferred-IAM optimization to reduce provisioning time from ~16-24 min to ~5-8 min.

- docs/cloudformation-timing.md: Timing analysis of VPC/IAM/OIDC stack creation,
  wait points, and proposed timing observability features
- docs/deferred-iam-implementation.md: Complete implementation plan for deferred-IAM
  flow including code examples, testing strategy, and 4-week rollout plan

Key insight: Creating IAM stack AFTER OIDC provider (instead of before) eliminates
the 10-15 min IAM eventual consistency delay caused by trust policy updates.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@openshift-ci

openshift-ci Bot commented Jun 18, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cdoan1

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 18, 2026
@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown

Walkthrough

Two new documentation files are added. docs/cloudformation-timing.md analyzes CloudFormation stack wait times for ROSA cluster provisioning and proposes timing observability enhancements. docs/deferred-iam-implementation.md specifies a deferred-IAM provisioning design with an orchestration service, OIDC SkipIAMUpdate flag, timing instrumentation, two new CLI commands, and a rollout plan.

Changes

Deferred-IAM and CloudFormation Timing Documentation

Layer / File(s) Summary
CloudFormation timing analysis and observability proposals
docs/cloudformation-timing.md
Documents the VPC→IAM→OIDC stack sequence, per-stack timeouts and durations, NAT gateway and eventual-consistency wait characteristics, a wait-points summary table, proposed CLI timing output formats (stdout and JSON), and optimization opportunities with references to implementation files.
Deferred-IAM concept, flow comparison, and phased strategy
docs/deferred-iam-implementation.md
Compares the traditional IAM trust-policy update flow with the optimized VPC→OIDC→IAM order, states the key insight for eliminating the trust-policy update step, and outlines a three-phase implementation plan.
Orchestration service, OIDC refactor, timing instrumentation, and CLI commands
docs/deferred-iam-implementation.md
Specifies the ProvisionInfrastructure orchestration service with request/response structures; documents the SkipIAMUpdate flag on the OIDC request; extends CloudFormation stack output with Duration, StartedAt, and CompletedAt; defines rosactl cluster provision and rosactl cluster timings CLI commands with flags, output formatting, and command wiring.
Migration guidance, testing plan, and rollout schedule
docs/deferred-iam-implementation.md
Covers migration path for new vs. existing clusters, unit and LocalStack integration test strategy, a week-by-week rollout schedule, success metrics, and a list of future enhancements including parallelism, async mode, and retry logic.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (2 errors, 1 warning)

Check name Status Explanation Resolution
No-Weak-Crypto ❌ Error PR introduces SHA-1 cryptographic hashing in new file internal/crypto/thumbprint.go via crypto/sha1 package, which is a weak/deprecated hash algorithm. Replace crypto/sha1 with crypto/sha256 for certificate thumbprint calculation. Update GetOIDCThumbprint() to use SHA-256 instead: change "crypto/sha1" import to "crypto/sha256" and use sha256.Sum256() instead of sha1.Sum().
Container-Privileges ❌ Error PR adds docker-compose.localstack.yaml with privileged: true and user: "0" (root). While a comment mentions "Run as root to access Podman socket," justification is minimal and lacks explanation... Add detailed documentation justifying privileged mode requirement, explain why socket mounting with proper permissions/group membership is insufficient, and note this is development-only (not production).
Ai-Attribution ⚠️ Warning Commit 0338a7e uses Co-Authored-By trailer for Claude Sonnet 4.5 (AI tool), violating requirement to use Assisted-by/Generated-by for Red Hat AI attribution. Replace Co-Authored-By trailer with Assisted-by or Generated-by trailer per Red Hat attribution requirements for AI-assisted content.
✅ Passed checks (8 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main changes: adding two documentation files about CloudFormation timing analysis and the deferred-IAM optimization plan.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
No-Sensitive-Data-In-Logs ✅ Passed Documentation does not expose passwords, tokens, API keys, PII, session IDs, internal hostnames, or customer data in logs. Only safe metadata is logged: timings, cluster names, regions, ARNs, and C...
No-Hardcoded-Secrets ✅ Passed No hardcoded secrets found. Documentation uses appropriate placeholder values (example.com URLs, generic cluster names, ellipsis for ARNs) without exposing API keys, tokens, credentials, or sensiti...
No-Injection-Vectors ✅ Passed No injection vectors detected. Documentation contains safe code examples with proper parameterization of stack names, AWS SDK usage, and no eval/exec/shell/SQL operations on user input.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@cdoan1 cdoan1 added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 18, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
docs/deferred-iam-implementation.md (1)

286-333: ⚡ Quick win

Clarify timing instrumentation scope for all CloudFormation stack operations.

Line 332 states "Similar changes for UpdateStack and DeleteStack," but the document only shows implementation details for CreateStack (lines 289–329). For consistency and completeness:

  1. UpdateStack should capture Duration, StartedAt, CompletedAt the same way
  2. DeleteStack should also capture timing if the StackOutput struct is returned
  3. Verify that all three operations return the extended StackOutput struct or if timing is optional for some

This is important because the deferred-IAM flow will call UpdateStack on the IAM stack (line 259), and timing instrumentation should be consistent across all stack operations.

Consider adding explicit examples or a code block for UpdateStack timing in the same manner as CreateStack, rather than deferring to a generic "similar changes" comment.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deferred-iam-implementation.md` around lines 286 - 333, The document
currently shows detailed implementation for the CreateStack method with timing
instrumentation but only references "similar changes" for UpdateStack and
DeleteStack without concrete examples. Add explicit code blocks for the
UpdateStack and DeleteStack methods in the diff that mirror the timing pattern
shown in CreateStack, capturing startTime before the operation begins and
endTime after the waiter completes, then assigning Duration, StartedAt, and
CompletedAt to the returned StackOutput struct. This ensures clarity about how
timing instrumentation is consistently applied across all three CloudFormation
stack operations and confirms that all three methods return the extended
StackOutput struct with timing information, which is critical for the
deferred-IAM flow that calls UpdateStack on the IAM stack.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/deferred-iam-implementation.md`:
- Around line 214-217: The createOIDCWithoutIAMUpdate function is currently a
placeholder that calls clusteroidc.CreateOIDC without actually using the
SkipIAMUpdate flag, so the IAM update will still run. Modify the function to set
the SkipIAMUpdate field to true on the CreateOIDCRequest before passing it to
clusteroidc.CreateOIDC. This ensures that when the request reaches CreateOIDC,
the flag will be checked and the IAM update step will be skipped as intended for
the deferred-IAM flow.
- Around line 688-724: The test function TestProvisionInfrastructure_DeferredIAM
is missing critical imports and helper function definitions required for
compilation. Add the necessary imports at the top of the test file including the
testing package, context package, testify/require, testify/assert, and any AWS
SDK packages for cloudformation and config. For the getLocalStackConfig helper
function that is called within TestProvisionInfrastructure_DeferredIAM, either
define it directly in the test file or import it from the appropriate test
utility package (such as test/localstack/setup.go) and ensure it is properly
exported with a capital letter if it needs to be imported from another package.
- Around line 226-269: The documentation for the `SkipIAMUpdate` flag in the
CreateOIDCRequest struct needs clarification on its purpose and behavior.
Currently, the code shows the flag allows callers to skip the IAM trust policy
update, but the documentation should explicitly explain whether this flag is
meant to optimize the deferred-IAM flow by avoiding unnecessary CloudFormation
API calls when the IAM stack doesn't exist, or whether it simply exposes the
existing implicit error-handling behavior as an explicit control. Add a section
that clearly distinguishes between the current behavior (always attempting the
update with internal error handling) and the proposed behavior (explicit
skipping via the flag), including the backward compatibility implications for
callers that depend on each approach.
- Around line 131-139: The documentation references function names that do not
exist in the actual codebase. Replace the function call
`createOIDCWithoutIAMUpdate()` with the correct exported function
`clusteroidc.CreateOIDC()` to match the actual service API. Additionally, the
function `extractOIDCIssuerDomain()` shown in the documentation has no
implementation in the codebase and should either be removed from the pseudocode
or replaced with the actual implementation if one exists, ensuring all code
examples in the documentation accurately reflect the real API signatures and
available functions.

---

Nitpick comments:
In `@docs/deferred-iam-implementation.md`:
- Around line 286-333: The document currently shows detailed implementation for
the CreateStack method with timing instrumentation but only references "similar
changes" for UpdateStack and DeleteStack without concrete examples. Add explicit
code blocks for the UpdateStack and DeleteStack methods in the diff that mirror
the timing pattern shown in CreateStack, capturing startTime before the
operation begins and endTime after the waiter completes, then assigning
Duration, StartedAt, and CompletedAt to the returned StackOutput struct. This
ensures clarity about how timing instrumentation is consistently applied across
all three CloudFormation stack operations and confirms that all three methods
return the extended StackOutput struct with timing information, which is
critical for the deferred-IAM flow that calls UpdateStack on the IAM stack.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift-online/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1c0cd5ee-954c-4579-a9f5-c8d4bfffddd0

📥 Commits

Reviewing files that changed from the base of the PR and between 80e0a95 and 0338a7e.

📒 Files selected for processing (2)
  • docs/cloudformation-timing.md
  • docs/deferred-iam-implementation.md

Comment on lines +131 to +139
vpcResp, err := clustervpc.CreateVPC(ctx, &clustervpc.CreateVPCRequest{
ClusterName: req.ClusterName,
VpcCidr: req.VpcCidr,
PublicSubnetCidrs: req.PublicSubnetCidrs,
PrivateSubnetCidrs: req.PrivateSubnetCidrs,
AvailabilityZones: req.AvailabilityZones,
SingleNatGateway: req.SingleNatGateway,
AWSConfig: req.AWSConfig,
})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify function signatures
echo "=== Checking CreateVPC signature ==="
rg 'type CreateVPCRequest struct' internal/services/clustervpc/

echo -e "\n=== Checking CreateOIDC signature ==="
rg 'type CreateOIDCRequest struct' internal/services/clusteroidc/

echo -e "\n=== Checking CreateIAM signature ==="
rg 'type CreateIAMRequest struct' internal/services/clusteriam/

echo -e "\n=== Checking return types ==="
rg 'type CreateVPCResponse struct' internal/services/clustervpc/
rg 'type CreateOIDCResponse struct' internal/services/clusteroidc/

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 593


🏁 Script executed:

#!/bin/bash
# Get full struct definitions
echo "=== CreateVPCRequest struct ==="
rg -A 15 'type CreateVPCRequest struct' internal/services/clustervpc/service.go

echo -e "\n=== CreateOIDCRequest struct ==="
rg -A 15 'type CreateOIDCRequest struct' internal/services/clusteroidc/service.go

echo -e "\n=== CreateIAMRequest struct ==="
rg -A 15 'type CreateIAMRequest struct' internal/services/clusteriam/service.go

echo -e "\n=== CreateVPC function signature ==="
rg -B 2 -A 10 'func.*CreateVPC.*ctx.*CreateVPCRequest' internal/services/clustervpc/service.go

echo -e "\n=== CreateIAM function signature ==="
rg -B 2 -A 10 'func.*CreateIAM.*ctx.*CreateIAMRequest' internal/services/clusteriam/service.go

echo -e "\n=== Response structs ==="
rg -A 10 'type CreateVPCResponse struct' internal/services/clustervpc/service.go
rg -A 10 'type CreateOIDCResponse struct' internal/services/clusteroidc/service.go
rg -A 10 'type CreateIAMResponse struct' internal/services/clusteriam/service.go

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 2869


🏁 Script executed:

#!/bin/bash
# Search for createOIDCWithoutIAMUpdate function
echo "=== Searching for createOIDCWithoutIAMUpdate function ==="
rg 'func.*createOIDCWithoutIAMUpdate' --type go

echo -e "\n=== Checking clusteroidc package for OIDC functions ==="
rg 'func.*Create' internal/services/clusteroidc/service.go | head -20

echo -e "\n=== Looking at docs file for ProvisionInfrastructure function ==="
rg -B 5 -A 30 'func.*ProvisionInfrastructure' docs/deferred-iam-implementation.md

echo -e "\n=== Checking if functions are exported (capital first letter) ==="
rg '^func (CreateVPC|CreateIAM|CreateOIDC)' internal/services/clustervpc/service.go internal/services/clusteriam/service.go internal/services/clusteroidc/service.go

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 3397


🏁 Script executed:

#!/bin/bash
echo "=== Reading lines 131-192 from docs file ==="
sed -n '131,192p' docs/deferred-iam-implementation.md

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 2422


🏁 Script executed:

#!/bin/bash
echo "=== Searching entire codebase for createOIDCWithoutIAMUpdate ==="
rg 'createOIDCWithoutIAMUpdate' --type go

echo -e "\n=== Searching entire codebase for CreateOIDCWithoutIAMUpdate (capitalized) ==="
rg 'CreateOIDCWithoutIAMUpdate' --type go

echo -e "\n=== Searching docs directory for all function definitions ==="
rg 'func .*\(' docs/deferred-iam-implementation.md | grep -i oidc

echo -e "\n=== Check if docs file is intended as pseudocode or actual implementation ==="
head -50 docs/deferred-iam-implementation.md | grep -i 'pseudocode\|example\|implementation\|mock'

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 752


Fix function references to match actual service API.

The review raised valid concerns about function signature verification. Verification found:

clustervpc.CreateVPC() is exported with correct parameter struct
clusteriam.CreateIAM() is exported with correct parameter struct
Line 158: createOIDCWithoutIAMUpdate() does not exist in the codebase — it should be clusteroidc.CreateOIDC() instead

All request struct fields match the parameters shown. However, the docs pseudocode references createOIDCWithoutIAMUpdate (lowercase, unexported wrapper) which is not implemented. The actual service only provides clusteroidc.CreateOIDC().

Additionally, extractOIDCIssuerDomain() on line 171 is shown in the docs but has no implementation in the codebase.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deferred-iam-implementation.md` around lines 131 - 139, The
documentation references function names that do not exist in the actual
codebase. Replace the function call `createOIDCWithoutIAMUpdate()` with the
correct exported function `clusteroidc.CreateOIDC()` to match the actual service
API. Additionally, the function `extractOIDCIssuerDomain()` shown in the
documentation has no implementation in the codebase and should either be removed
from the pseudocode or replaced with the actual implementation if one exists,
ensuring all code examples in the documentation accurately reflect the real API
signatures and available functions.

Comment on lines +214 to +217
func createOIDCWithoutIAMUpdate(ctx context.Context, req *clusteroidc.CreateOIDCRequest) (*clusteroidc.CreateOIDCResponse, error) {
// TODO: Need to refactor clusteroidc.CreateOIDC to make IAM update optional
// For now, this is a placeholder showing the intent
return clusteroidc.CreateOIDC(ctx, req)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff

Placeholder implementation of createOIDCWithoutIAMUpdate won't actually skip IAM update.

Lines 214–217 define a wrapper function that calls clusteroidc.CreateOIDC(ctx, req) without using the SkipIAMUpdate flag introduced in the refactored CreateOIDCRequest (lines 231–237). This placeholder will not achieve the deferred-IAM flow because the IAM update will still run.

The correct implementation should:

  1. Set SkipIAMUpdate: true on the request, or
  2. Call a variant of CreateOIDC that explicitly skips the update, or
  3. Directly call the lower-level CloudFormation stack creation without the update step.

This needs to be resolved before the code is implemented to ensure the deferred-IAM flow actually eliminates the 10–15 minute eventual consistency delay.

💡 Suggested approach

If SkipIAMUpdate is passed through CreateOIDCRequest (as shown in lines 231–237), the wrapper should explicitly set it:

  func createOIDCWithoutIAMUpdate(ctx context.Context, req *clusteroidc.CreateOIDCRequest) (*clusteroidc.CreateOIDCResponse, error) {
-     // TODO: Need to refactor clusteroidc.CreateOIDC to make IAM update optional
-     // For now, this is a placeholder showing the intent
-     return clusteroidc.CreateOIDC(ctx, req)
+     req.SkipIAMUpdate = true
+     return clusteroidc.CreateOIDC(ctx, req)
  }

Alternatively, if clusteroidc.CreateOIDC will be fully refactored to check the flag internally (lines 256–262), ensure the flag is set by the caller in the ProvisionInfrastructure function at line 158.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deferred-iam-implementation.md` around lines 214 - 217, The
createOIDCWithoutIAMUpdate function is currently a placeholder that calls
clusteroidc.CreateOIDC without actually using the SkipIAMUpdate flag, so the IAM
update will still run. Modify the function to set the SkipIAMUpdate field to
true on the CreateOIDCRequest before passing it to clusteroidc.CreateOIDC. This
ensures that when the request reaches CreateOIDC, the flag will be checked and
the IAM update step will be skipped as intended for the deferred-IAM flow.

Comment on lines +226 to +269
### 2. Refactor `internal/services/clusteroidc/service.go`

Make IAM trust policy update optional:

```go
type CreateOIDCRequest struct {
ClusterName string
OIDCIssuerURL string
OIDCThumbprint string
AWSConfig aws.Config
SkipIAMUpdate bool // NEW: skip IAM trust policy update
}

func CreateOIDC(ctx context.Context, req *CreateOIDCRequest) (*CreateOIDCResponse, error) {
// ... existing thumbprint and template logic ...

// Create (or update) the OIDC provider stack
output, err := cfnClient.CreateStack(ctx, cfParams)
if err != nil {
var alreadyExists *cloudformation.StackAlreadyExistsError
if errors.As(err, &alreadyExists) {
output, err = updateOIDCStack(ctx, cfnClient, req, oidcStackName, templateBody, thumbprint)
if err != nil {
return nil, err
}
} else {
return nil, fmt.Errorf("failed to create OIDC stack: %w", err)
}
}

// NEW: Only update IAM stack if not skipped
if !req.SkipIAMUpdate {
iamStackName := fmt.Sprintf("rosa-%s-iam", req.ClusterName)
if err := updateIAMTrustPolicies(ctx, cfnClient, req.ClusterName, iamStackName, oidcIssuerDomain); err != nil {
return nil, fmt.Errorf("OIDC provider created but failed to update IAM trust policies: %w", err)
}
}

return &CreateOIDCResponse{
StackID: output.StackID,
Outputs: output.Outputs,
}, nil
}
```

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how clusteroidc.CreateOIDC currently handles missing IAM stack
rg -A 10 'StackNotFoundError' internal/services/clusteroidc/service.go

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 553


🏁 Script executed:

#!/bin/bash
# Get the full CreateOIDC function to understand the complete flow
rg -B 5 -A 40 'func CreateOIDC' internal/services/clusteroidc/service.go

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 1664


🏁 Script executed:

#!/bin/bash
# Get the IAM stack update logic in CreateOIDC
rg -B 2 -A 15 'updateIAMTrustPolicies\|iamStackName :=' internal/services/clusteroidc/service.go | head -60

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 69


🏁 Script executed:

#!/bin/bash
# Search for IAM-related logic after OIDC stack creation
rg -A 50 'func CreateOIDC' internal/services/clusteroidc/service.go | grep -A 20 'iamStackName\|IAM'

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 806


🏁 Script executed:

#!/bin/bash
# Get the complete CreateOIDC function - read the file directly
wc -l internal/services/clusteroidc/service.go

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 132


🏁 Script executed:

#!/bin/bash
# Read the entire service.go file to understand the complete flow
cat -n internal/services/clusteroidc/service.go

Repository: openshift-online/rosa-regional-platform-cli

Length of output: 7087


Clarify whether SkipIAMUpdate flag adds explicit control or merely formalizes error handling.

The current code (lines 94–98 and 130–163) always calls updateIAMTrustPolicies, which internally catches StackNotFoundError and silently skips the update. The proposed SkipIAMUpdate flag would allow callers to avoid the UpdateStack API call entirely when the IAM stack doesn't exist.

This is a valid enhancement for the deferred-IAM flow—it prevents unnecessary CloudFormation API calls when you know the IAM stack hasn't been created yet. However, the distinction matters for backward compatibility:

  • Current: Always attempts IAM update; skipping happens implicitly via error handling
  • Proposed: Skipping would be explicit and avoid the API call

The documentation should clarify: Is the flag meant to optimize the deferred-IAM flow (avoid unnecessary API calls), or just expose the existing implicit behavior as an explicit control? The two approaches have different implications for callers and error handling.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deferred-iam-implementation.md` around lines 226 - 269, The
documentation for the `SkipIAMUpdate` flag in the CreateOIDCRequest struct needs
clarification on its purpose and behavior. Currently, the code shows the flag
allows callers to skip the IAM trust policy update, but the documentation should
explicitly explain whether this flag is meant to optimize the deferred-IAM flow
by avoiding unnecessary CloudFormation API calls when the IAM stack doesn't
exist, or whether it simply exposes the existing implicit error-handling
behavior as an explicit control. Add a section that clearly distinguishes
between the current behavior (always attempting the update with internal error
handling) and the proposed behavior (explicit skipping via the flag), including
the backward compatibility implications for callers that depend on each
approach.

Comment on lines +688 to +724
func TestProvisionInfrastructure_DeferredIAM(t *testing.T) {
// Setup LocalStack
ctx := context.Background()
cfg := getLocalStackConfig(t)

req := &provision.ProvisionRequest{
ClusterName: "test-cluster",
Region: "us-east-1",
OIDCIssuerURL: "https://example.com/test-cluster",
// ... other params
AWSConfig: cfg,
}

resp, err := provision.ProvisionInfrastructure(ctx, req)
require.NoError(t, err)

// Verify stacks exist
cfnClient := cloudformation.NewClient(cfg)
vpcStack, err := cfnClient.DescribeStack(ctx, "rosa-test-cluster-vpc")
require.NoError(t, err)

oidcStack, err := cfnClient.DescribeStack(ctx, "rosa-test-cluster-oidc")
require.NoError(t, err)

iamStack, err := cfnClient.DescribeStack(ctx, "rosa-test-cluster-iam")
require.NoError(t, err)

// Verify IAM stack was created AFTER OIDC (check creation timestamps)
assert.True(t, iamStack.CreationTime.After(*oidcStack.CreationTime))

// Verify timing data
assert.Greater(t, resp.VPC.Duration, time.Duration(0))
assert.Greater(t, resp.OIDC.Duration, time.Duration(0))
assert.Greater(t, resp.IAM.Duration, time.Duration(0))
assert.Greater(t, resp.TotalDuration, time.Duration(0))
}
```

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Complete test example with missing imports and function definitions.

The test function TestProvisionInfrastructure_DeferredIAM (lines 688–724) uses:

  • getLocalStackConfig(t) — Not imported or defined
  • errors.As() — Requires import "errors", not shown
  • require.NoError() — Requires testify/require import
  • assert.True(), assert.Greater() — Requires testify/assert import

For the documentation example to be complete and correct, add the necessary imports and ensure getLocalStackConfig is defined or reference an existing helper in the test suite.

🔧 Proposed fix: Add imports and helper reference
  func TestProvisionInfrastructure_DeferredIAM(t *testing.T) {
+     // Load LocalStack configuration from test suite helper
      ctx := context.Background()
      cfg := getLocalStackConfig(t)  // From test/localstack/setup.go
      
      req := &provision.ProvisionRequest{

Ensure test/localstack/setup.go (or relevant test helper) exports GetLocalStackConfig function, or document where this should come from.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deferred-iam-implementation.md` around lines 688 - 724, The test
function TestProvisionInfrastructure_DeferredIAM is missing critical imports and
helper function definitions required for compilation. Add the necessary imports
at the top of the test file including the testing package, context package,
testify/require, testify/assert, and any AWS SDK packages for cloudformation and
config. For the getLocalStackConfig helper function that is called within
TestProvisionInfrastructure_DeferredIAM, either define it directly in the test
file or import it from the appropriate test utility package (such as
test/localstack/setup.go) and ensure it is properly exported with a capital
letter if it needs to be imported from another package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant