Skip to content

Conversation

@Pacho20
Copy link
Contributor

@Pacho20 Pacho20 commented Dec 2, 2025

By default, the ibmcloud provider uses the VPC's default Security Group, which does not allow connections to the cluster by default, but it can expose the VSIs to other services unnecessarily. With this change, CAA will use the cluster's security group as the default and always attach it to the VSIs. This ensures that the VSIs can always access the cluster and CAA.

The PR also introduces the ability to define multiple security groups instead of just one, and all of them will be attached to the VSI during creation. The IBMCLOUD_VPC_SG_ID variable is replaced by IBMCLOUD_SECURITY_GROUP_IDS. This field accepts a comma-separated list of security group IDs.

Example:

kind: ConfigMap
apiVersion: v1
metadata:
  name: peer-pods-cm
  namespace: openshift-sandboxed-containers-operator
data:
  ...
  IBMCLOUD_SECURITY_GROUP_IDS: '550e8400-e29b-41d4-a716-446655440000,123e4567-e89b-12d3-a456-426614174000,9f8c7d6e-5b4a-3c2d-1e0f-abcdefabcdef'
  ...

@Pacho20 Pacho20 force-pushed the ibmcloud-security-groups branch 2 times, most recently from 6a9dfa8 to 21bef26 Compare December 2, 2025 16:50
@Pacho20 Pacho20 marked this pull request as ready for review December 12, 2025 18:38
@Pacho20 Pacho20 requested a review from a team as a code owner December 12, 2025 18:38
@stevenhorsman
Copy link
Member

I don't know enough about the IBM Cloud SDK to tell if this is definitely the right approach, but it seems reasonably and isolated to the ibmcloud provider, so if it passes the build and unit tests then it's probably good enough. I think it will need a rebase to re-trigger the CI though.

@Pacho20 Pacho20 force-pushed the ibmcloud-security-groups branch from 7a24023 to 8be71ed Compare January 6, 2026 10:22
@Pacho20 Pacho20 force-pushed the ibmcloud-security-groups branch 3 times, most recently from a0c1c13 to 54e4033 Compare January 13, 2026 16:28
@stevenhorsman stevenhorsman self-requested a review January 22, 2026 10:24
@Pacho20 Pacho20 force-pushed the ibmcloud-security-groups branch from 54e4033 to 5680ad2 Compare January 22, 2026 13:08
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request improves IBM Cloud security group management by transitioning from a single security group to support for multiple security groups and automatically fetching and attaching the cluster's security group to pod VMs.

Changes:

  • Replaces single IBMCLOUD_VPC_SG_ID environment variable with IBMCLOUD_SECURITY_GROUP_IDS that accepts comma-separated list of security group IDs
  • Introduces automatic cluster security group detection and attachment via new Cluster V2 API integration
  • Removes dependency on VPC's default security group, replacing it with cluster-specific security group for better security isolation

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/cloud-providers/ibmcloud/types.go Adds new securityGroupIds type to support multiple security groups, replacing single PrimarySecurityGroupID field
src/cloud-providers/ibmcloud/provider.go Integrates cluster API to automatically fetch cluster security group, updates instance creation to attach all configured security groups
src/cloud-providers/ibmcloud/manager.go Updates command-line flags to use new IBMCLOUD_SECURITY_GROUP_IDS environment variable
src/cloud-providers/ibmcloud/cluster.go Implements new Cluster V2 API client for fetching cluster-specific security groups
src/cloud-providers/go.mod Changes google/uuid from indirect to direct dependency (used by cluster.go)
src/cloud-api-adaptor/test/provisioner/ibmcloud/provision_kustomize.go Updates test provisioner to recognize new environment variable name
src/cloud-api-adaptor/test/provisioner/ibmcloud/provision_initializer.go Renames property from SecurityGroupID to SecurityGroupIDs
src/cloud-api-adaptor/test/provisioner/ibmcloud/provision_ibmcloud.properties Updates property comments to reflect new behavior
src/cloud-api-adaptor/test/provisioner/ibmcloud/provision_common.go Removes logic for fetching VPC's default security group (no longer needed)
src/cloud-api-adaptor/install/overlays/ibmcloud/kustomization.yaml Updates configuration to use new comma-separated security group IDs field
src/cloud-api-adaptor/install/charts/peerpods/providers/ibmcloud.yaml Removes old IBMCLOUD_VPC_SG_ID config and adds new IBMCLOUD_SECURITY_GROUP_IDS config
src/cloud-api-adaptor/install/charts/peerpods/providers/gcp.yaml Adds auto-generated GCP_INSTANCE_TYPES configuration (unrelated to this PR)
src/cloud-api-adaptor/ibmcloud/ROKS_SETUP.md Removes manual cluster security group ID extraction from documentation
src/cloud-api-adaptor/ibmcloud/README.md Removes VPC_SECURITY_GROUP_ID from example properties file

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +156 to +189
return nil, err
}
config.SecurityGroupIds = append(config.SecurityGroupIds, sgID)
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code always fails provider initialization if the cluster security group cannot be fetched (lines 154-157). According to the PR description, the cluster security group should be used "by default" and "always attached" to ensure VSIs can access the cluster. However, this makes it impossible to use the provider if the cluster security group lookup fails for any reason (e.g., API issues, permissions, etc.).

Consider whether this should be a hard failure or if there should be a fallback mechanism. If the cluster security group is truly required for functionality, the current behavior is correct but should be clearly documented. If it's meant to be a best-practice default that can be overridden, consider making the fetch failure non-fatal and logging a warning instead, especially if the user has explicitly provided their own security groups via IBMCLOUD_SECURITY_GROUP_IDS.

Suggested change
return nil, err
}
config.SecurityGroupIds = append(config.SecurityGroupIds, sgID)
// If the user has provided security groups explicitly, treat failure to fetch
// the cluster security group as a non-fatal warning and fall back to the
// user-specified security groups. Otherwise, fail hard to avoid running
// without any security group associated with the cluster.
if len(config.SecurityGroupIds) == 0 {
return nil, err
}
log.Printf("warning, unable to automatically attach cluster security group for cluster %q: %v", config.ClusterID, err)
} else {
config.SecurityGroupIds = append(config.SecurityGroupIds, sgID)
}

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pacho20 wdyt about this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hard fail is an explicit request from others in IBM Cloud. If you don’t use the cluster’s security group, there’s a high chance that the connection between PeerPods and CAA won’t work, and they really want to prevent that.

Comment on lines 47 to 52
func (clusterApi *ClusterV2) GetClusterTypeSecurityGroups(clusterID string) (result []securityGroup, response *core.DetailedResponse, err error) {
builder := core.NewRequestBuilder(core.GET)
builder = builder.WithContext(context.Background())
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method uses a hardcoded context.Background() (line 49) instead of accepting a context parameter. This prevents callers from controlling timeouts, cancellation, or passing context values. Consider accepting a context.Context parameter and using it with builder.WithContext(). This would make the API more flexible and align with Go best practices for context propagation, especially for network operations that could potentially hang or need cancellation.

Suggested change
func (clusterApi *ClusterV2) GetClusterTypeSecurityGroups(clusterID string) (result []securityGroup, response *core.DetailedResponse, err error) {
builder := core.NewRequestBuilder(core.GET)
builder = builder.WithContext(context.Background())
func (clusterApi *ClusterV2) GetClusterTypeSecurityGroups(ctx context.Context, clusterID string) (result []securityGroup, response *core.DetailedResponse, err error) {
if ctx == nil {
ctx = context.Background()
}
builder := core.NewRequestBuilder(core.GET)
builder = builder.WithContext(ctx)

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduced a function called GetClusterTypeSecurityGroupsWithContext like in the SDKs. The normal GetClusterTypeSecurityGroups function is a wrapper around that.

Comment on lines +254 to +343
securityGroups := make([]vpcv1.SecurityGroupIdentityIntf, 0, len(p.serviceConfig.SecurityGroupIds))
for i := range p.serviceConfig.SecurityGroupIds {
securityGroups = append(securityGroups, &vpcv1.SecurityGroupIdentityByID{
ID: &p.serviceConfig.SecurityGroupIds[i],
})
}
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code iterates over p.serviceConfig.SecurityGroupIds and creates security group identity objects without validating that the IDs are non-empty strings. If the comma-separated input contains empty values (e.g., "id1,,id3" or trailing commas "id1,id2,"), this could result in creating SecurityGroupIdentityByID objects with empty ID pointers, which would likely cause API errors during instance creation. Consider filtering out empty strings before creating the SecurityGroupIdentityByID objects, or validating the input earlier in the configuration parsing stage.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the toList function that should also address this issue.

- IBMCLOUD_ZONE="" #set
- IBMCLOUD_VPC_SUBNET_ID="" #set
- IBMCLOUD_VPC_SG_ID="" #set
- IBMCLOUD_SECURITY_GROUP_IDS="" #optional, comma separated list
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "optional, comma separated list" is accurate but incomplete. According to the PR description and implementation (provider.go lines 154-158), the cluster's security group is automatically fetched and always added to the list. Users can optionally provide additional security group IDs here, but the cluster security group will be included automatically. Consider updating the comment to clarify this behavior, for example: "optional, comma separated list of additional security group IDs (cluster security group is automatically added)".

Suggested change
- IBMCLOUD_SECURITY_GROUP_IDS="" #optional, comma separated list
- IBMCLOUD_SECURITY_GROUP_IDS="" #optional, comma separated list of additional security group IDs (cluster security group is automatically added)

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pacho20 This makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed it.

# (default: "")
# IBMCLOUD_RESOURCE_GROUP_ID: ""

# List of Security Group IDs to be used for the Pod VM, comma separated
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "List of Security Group IDs to be used for the Pod VM, comma separated" doesn't clarify that the cluster's security group is automatically fetched and added. This could mislead users into thinking they need to manually specify the cluster security group. Consider updating the comment to match the actual behavior, for example: "List of additional Security Group IDs to be used for the Pod VM, comma separated (cluster security group is automatically added)".

Suggested change
# List of Security Group IDs to be used for the Pod VM, comma separated
# List of additional Security Group IDs to be used for the Pod VM, comma separated (cluster security group is automatically added)

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pacho20 This makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, changed it.

Comment on lines +79 to +92
func (i *securityGroupIds) Set(value string) error {
*i = append(*i, toList(value, ",")...)
return nil
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The toList function used by securityGroupIds.Set() splits on commas but doesn't trim whitespace from the resulting values. The PR description's example shows comma-separated IDs with spaces after commas ('550e8400-e29b-41d4-a716-446655440000,123e4567-e89b-12d3-a456-426614174000'), which would work correctly. However, if users provide values with spaces after commas (e.g., '550e8400-e29b-41d4-a716-446655440000, 123e4567-e89b-12d3-a456-426614174000'), the second ID would start with a space, potentially causing issues when these IDs are used in API calls. Consider trimming whitespace from each element after splitting, similar to how other CSV parsers handle this common user input pattern.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the implementation to trim the white spaces and also skip empty values. This addresses both this and the other comment.

if len(IBMCloudProps.SecurityGroupID) <= 0 {
log.Info("[warning] VPC_SECURITY_GROUP_ID was not set.")
if len(IBMCloudProps.SecurityGroupIDs) <= 0 {
log.Info("IBMCLOUD_SECURITY_GROUP_IDS was not set.")
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warning message format was changed from "[warning]" prefix to no prefix. This creates an inconsistency with the other log messages in the same block (lines 201, 207) which still use the "[warning]" prefix. For consistency, this message should either keep the "[warning]" prefix or all three messages should be updated to use the same format.

Suggested change
log.Info("IBMCLOUD_SECURITY_GROUP_IDS was not set.")
log.Info("[warning] IBMCLOUD_SECURITY_GROUP_IDS was not set.")

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Comment on lines 45 to 47
# Machine types to be used for the Pod VMs, comma separated
# (default: "")
# GCP_INSTANCE_TYPES: ""
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change appears to be unrelated to the IBM Cloud security group management improvements described in this PR. The addition of GCP_INSTANCE_TYPES configuration is for Google Cloud Platform, not IBM Cloud. While the file header indicates it's auto-generated ('# Auto-generated by: make sync-chart-values'), this unrelated change should ideally be in a separate PR or the auto-generation should be run separately to avoid mixing concerns. If this is an intentional inclusion due to shared tooling updates, it should be mentioned in the PR description.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevenhorsman asked me to include it in the PR.

Copy link
Contributor Author

@Pacho20 Pacho20 Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. Someone fixed already in another PR.

@Pacho20 Pacho20 force-pushed the ibmcloud-security-groups branch 2 times, most recently from 7a4dd58 to 6481deb Compare February 2, 2026 10:26
Add ClusterV2 service for retrieving security groups.
Implements a temporary service using the IBM Cloud Cluster v2 API
to fetch a cluster's security group, which can serve as the default.
This workaround is needed because current IBM Cloud SDKs
lack endpoints for security groups.

The plan is to add this functionalityto one of the
SDKs in the future and use that instead.

Signed-off-by: Patrik Fodor <[email protected]>
Introduce the Cluster v2 interface and retrieve the cluster's security
group using it instead of the VPC's.

Signed-off-by: Patrik Fodor <[email protected]>
Instead of attaching a single security group to VSIs, allow specifying
multiple security groups.

Signed-off-by: Patrik Fodor <[email protected]>
Update README and ROKS_SETUP to reflect recent changes
in security group configuration.

Signed-off-by: Patrik Fodor <[email protected]>
The toList helper did not correctly handle inputs containing empty
elements or values with surrounding whitespace. Comma‑separated lists
with such formatting caused parsing issues for all functions that rely
on the custom types using this utility.

This change adds trimming logic and skips empty items entirely,
ensuring that the returned list contains only meaningful, sanitized
values.

Signed-off-by: Patrik Fodor <[email protected]>
@Pacho20 Pacho20 force-pushed the ibmcloud-security-groups branch from 6481deb to f38e3b2 Compare February 9, 2026 08:29
@wainersm
Copy link
Member

@Pacho20 are you happy with this PR? We are closer than ever to make a 0.18 release so if we want that included then it would be nice to get the approvals. Yours stand @stevenhorsman ?

@stevenhorsman
Copy link
Member

@Pacho20 are you happy with this PR? We are closer than ever to make a 0.18 release so if we want that included then it would be nice to get the approvals. Yours stand @stevenhorsman ?

I'm okay with it, but Pradipta & co-pilot had other suggestions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants