-
Notifications
You must be signed in to change notification settings - Fork 123
Ibmcloud: improve security group management #2704
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Ibmcloud: improve security group management #2704
Conversation
6a9dfa8 to
21bef26
Compare
|
I don't know enough about the IBM Cloud SDK to tell if this is definitely the right approach, but it seems reasonably and isolated to the ibmcloud provider, so if it passes the build and unit tests then it's probably good enough. I think it will need a rebase to re-trigger the CI though. |
7a24023 to
8be71ed
Compare
a0c1c13 to
54e4033
Compare
54e4033 to
5680ad2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request improves IBM Cloud security group management by transitioning from a single security group to support for multiple security groups and automatically fetching and attaching the cluster's security group to pod VMs.
Changes:
- Replaces single
IBMCLOUD_VPC_SG_IDenvironment variable withIBMCLOUD_SECURITY_GROUP_IDSthat accepts comma-separated list of security group IDs - Introduces automatic cluster security group detection and attachment via new Cluster V2 API integration
- Removes dependency on VPC's default security group, replacing it with cluster-specific security group for better security isolation
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/cloud-providers/ibmcloud/types.go | Adds new securityGroupIds type to support multiple security groups, replacing single PrimarySecurityGroupID field |
| src/cloud-providers/ibmcloud/provider.go | Integrates cluster API to automatically fetch cluster security group, updates instance creation to attach all configured security groups |
| src/cloud-providers/ibmcloud/manager.go | Updates command-line flags to use new IBMCLOUD_SECURITY_GROUP_IDS environment variable |
| src/cloud-providers/ibmcloud/cluster.go | Implements new Cluster V2 API client for fetching cluster-specific security groups |
| src/cloud-providers/go.mod | Changes google/uuid from indirect to direct dependency (used by cluster.go) |
| src/cloud-api-adaptor/test/provisioner/ibmcloud/provision_kustomize.go | Updates test provisioner to recognize new environment variable name |
| src/cloud-api-adaptor/test/provisioner/ibmcloud/provision_initializer.go | Renames property from SecurityGroupID to SecurityGroupIDs |
| src/cloud-api-adaptor/test/provisioner/ibmcloud/provision_ibmcloud.properties | Updates property comments to reflect new behavior |
| src/cloud-api-adaptor/test/provisioner/ibmcloud/provision_common.go | Removes logic for fetching VPC's default security group (no longer needed) |
| src/cloud-api-adaptor/install/overlays/ibmcloud/kustomization.yaml | Updates configuration to use new comma-separated security group IDs field |
| src/cloud-api-adaptor/install/charts/peerpods/providers/ibmcloud.yaml | Removes old IBMCLOUD_VPC_SG_ID config and adds new IBMCLOUD_SECURITY_GROUP_IDS config |
| src/cloud-api-adaptor/install/charts/peerpods/providers/gcp.yaml | Adds auto-generated GCP_INSTANCE_TYPES configuration (unrelated to this PR) |
| src/cloud-api-adaptor/ibmcloud/ROKS_SETUP.md | Removes manual cluster security group ID extraction from documentation |
| src/cloud-api-adaptor/ibmcloud/README.md | Removes VPC_SECURITY_GROUP_ID from example properties file |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return nil, err | ||
| } | ||
| config.SecurityGroupIds = append(config.SecurityGroupIds, sgID) |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code always fails provider initialization if the cluster security group cannot be fetched (lines 154-157). According to the PR description, the cluster security group should be used "by default" and "always attached" to ensure VSIs can access the cluster. However, this makes it impossible to use the provider if the cluster security group lookup fails for any reason (e.g., API issues, permissions, etc.).
Consider whether this should be a hard failure or if there should be a fallback mechanism. If the cluster security group is truly required for functionality, the current behavior is correct but should be clearly documented. If it's meant to be a best-practice default that can be overridden, consider making the fetch failure non-fatal and logging a warning instead, especially if the user has explicitly provided their own security groups via IBMCLOUD_SECURITY_GROUP_IDS.
| return nil, err | |
| } | |
| config.SecurityGroupIds = append(config.SecurityGroupIds, sgID) | |
| // If the user has provided security groups explicitly, treat failure to fetch | |
| // the cluster security group as a non-fatal warning and fall back to the | |
| // user-specified security groups. Otherwise, fail hard to avoid running | |
| // without any security group associated with the cluster. | |
| if len(config.SecurityGroupIds) == 0 { | |
| return nil, err | |
| } | |
| log.Printf("warning, unable to automatically attach cluster security group for cluster %q: %v", config.ClusterID, err) | |
| } else { | |
| config.SecurityGroupIds = append(config.SecurityGroupIds, sgID) | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Pacho20 wdyt about this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hard fail is an explicit request from others in IBM Cloud. If you don’t use the cluster’s security group, there’s a high chance that the connection between PeerPods and CAA won’t work, and they really want to prevent that.
| func (clusterApi *ClusterV2) GetClusterTypeSecurityGroups(clusterID string) (result []securityGroup, response *core.DetailedResponse, err error) { | ||
| builder := core.NewRequestBuilder(core.GET) | ||
| builder = builder.WithContext(context.Background()) |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method uses a hardcoded context.Background() (line 49) instead of accepting a context parameter. This prevents callers from controlling timeouts, cancellation, or passing context values. Consider accepting a context.Context parameter and using it with builder.WithContext(). This would make the API more flexible and align with Go best practices for context propagation, especially for network operations that could potentially hang or need cancellation.
| func (clusterApi *ClusterV2) GetClusterTypeSecurityGroups(clusterID string) (result []securityGroup, response *core.DetailedResponse, err error) { | |
| builder := core.NewRequestBuilder(core.GET) | |
| builder = builder.WithContext(context.Background()) | |
| func (clusterApi *ClusterV2) GetClusterTypeSecurityGroups(ctx context.Context, clusterID string) (result []securityGroup, response *core.DetailedResponse, err error) { | |
| if ctx == nil { | |
| ctx = context.Background() | |
| } | |
| builder := core.NewRequestBuilder(core.GET) | |
| builder = builder.WithContext(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introduced a function called GetClusterTypeSecurityGroupsWithContext like in the SDKs. The normal GetClusterTypeSecurityGroups function is a wrapper around that.
| securityGroups := make([]vpcv1.SecurityGroupIdentityIntf, 0, len(p.serviceConfig.SecurityGroupIds)) | ||
| for i := range p.serviceConfig.SecurityGroupIds { | ||
| securityGroups = append(securityGroups, &vpcv1.SecurityGroupIdentityByID{ | ||
| ID: &p.serviceConfig.SecurityGroupIds[i], | ||
| }) | ||
| } |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code iterates over p.serviceConfig.SecurityGroupIds and creates security group identity objects without validating that the IDs are non-empty strings. If the comma-separated input contains empty values (e.g., "id1,,id3" or trailing commas "id1,id2,"), this could result in creating SecurityGroupIdentityByID objects with empty ID pointers, which would likely cause API errors during instance creation. Consider filtering out empty strings before creating the SecurityGroupIdentityByID objects, or validating the input earlier in the configuration parsing stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the toList function that should also address this issue.
| - IBMCLOUD_ZONE="" #set | ||
| - IBMCLOUD_VPC_SUBNET_ID="" #set | ||
| - IBMCLOUD_VPC_SG_ID="" #set | ||
| - IBMCLOUD_SECURITY_GROUP_IDS="" #optional, comma separated list |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment "optional, comma separated list" is accurate but incomplete. According to the PR description and implementation (provider.go lines 154-158), the cluster's security group is automatically fetched and always added to the list. Users can optionally provide additional security group IDs here, but the cluster security group will be included automatically. Consider updating the comment to clarify this behavior, for example: "optional, comma separated list of additional security group IDs (cluster security group is automatically added)".
| - IBMCLOUD_SECURITY_GROUP_IDS="" #optional, comma separated list | |
| - IBMCLOUD_SECURITY_GROUP_IDS="" #optional, comma separated list of additional security group IDs (cluster security group is automatically added) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Pacho20 This makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed it.
| # (default: "") | ||
| # IBMCLOUD_RESOURCE_GROUP_ID: "" | ||
|
|
||
| # List of Security Group IDs to be used for the Pod VM, comma separated |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment "List of Security Group IDs to be used for the Pod VM, comma separated" doesn't clarify that the cluster's security group is automatically fetched and added. This could mislead users into thinking they need to manually specify the cluster security group. Consider updating the comment to match the actual behavior, for example: "List of additional Security Group IDs to be used for the Pod VM, comma separated (cluster security group is automatically added)".
| # List of Security Group IDs to be used for the Pod VM, comma separated | |
| # List of additional Security Group IDs to be used for the Pod VM, comma separated (cluster security group is automatically added) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Pacho20 This makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, changed it.
| func (i *securityGroupIds) Set(value string) error { | ||
| *i = append(*i, toList(value, ",")...) | ||
| return nil |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The toList function used by securityGroupIds.Set() splits on commas but doesn't trim whitespace from the resulting values. The PR description's example shows comma-separated IDs with spaces after commas ('550e8400-e29b-41d4-a716-446655440000,123e4567-e89b-12d3-a456-426614174000'), which would work correctly. However, if users provide values with spaces after commas (e.g., '550e8400-e29b-41d4-a716-446655440000, 123e4567-e89b-12d3-a456-426614174000'), the second ID would start with a space, potentially causing issues when these IDs are used in API calls. Consider trimming whitespace from each element after splitting, similar to how other CSV parsers handle this common user input pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the implementation to trim the white spaces and also skip empty values. This addresses both this and the other comment.
| if len(IBMCloudProps.SecurityGroupID) <= 0 { | ||
| log.Info("[warning] VPC_SECURITY_GROUP_ID was not set.") | ||
| if len(IBMCloudProps.SecurityGroupIDs) <= 0 { | ||
| log.Info("IBMCLOUD_SECURITY_GROUP_IDS was not set.") |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The warning message format was changed from "[warning]" prefix to no prefix. This creates an inconsistency with the other log messages in the same block (lines 201, 207) which still use the "[warning]" prefix. For consistency, this message should either keep the "[warning]" prefix or all three messages should be updated to use the same format.
| log.Info("IBMCLOUD_SECURITY_GROUP_IDS was not set.") | |
| log.Info("[warning] IBMCLOUD_SECURITY_GROUP_IDS was not set.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
| # Machine types to be used for the Pod VMs, comma separated | ||
| # (default: "") | ||
| # GCP_INSTANCE_TYPES: "" |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change appears to be unrelated to the IBM Cloud security group management improvements described in this PR. The addition of GCP_INSTANCE_TYPES configuration is for Google Cloud Platform, not IBM Cloud. While the file header indicates it's auto-generated ('# Auto-generated by: make sync-chart-values'), this unrelated change should ideally be in a separate PR or the auto-generation should be run separately to avoid mixing concerns. If this is an intentional inclusion due to shared tooling updates, it should be mentioned in the PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stevenhorsman asked me to include it in the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. Someone fixed already in another PR.
7a4dd58 to
6481deb
Compare
Add ClusterV2 service for retrieving security groups. Implements a temporary service using the IBM Cloud Cluster v2 API to fetch a cluster's security group, which can serve as the default. This workaround is needed because current IBM Cloud SDKs lack endpoints for security groups. The plan is to add this functionalityto one of the SDKs in the future and use that instead. Signed-off-by: Patrik Fodor <[email protected]>
Introduce the Cluster v2 interface and retrieve the cluster's security group using it instead of the VPC's. Signed-off-by: Patrik Fodor <[email protected]>
Instead of attaching a single security group to VSIs, allow specifying multiple security groups. Signed-off-by: Patrik Fodor <[email protected]>
Update README and ROKS_SETUP to reflect recent changes in security group configuration. Signed-off-by: Patrik Fodor <[email protected]>
The toList helper did not correctly handle inputs containing empty elements or values with surrounding whitespace. Comma‑separated lists with such formatting caused parsing issues for all functions that rely on the custom types using this utility. This change adds trimming logic and skips empty items entirely, ensuring that the returned list contains only meaningful, sanitized values. Signed-off-by: Patrik Fodor <[email protected]>
6481deb to
f38e3b2
Compare
|
@Pacho20 are you happy with this PR? We are closer than ever to make a 0.18 release so if we want that included then it would be nice to get the approvals. Yours stand @stevenhorsman ? |
I'm okay with it, but Pradipta & co-pilot had other suggestions. |
By default, the ibmcloud provider uses the VPC's default Security Group, which does not allow connections to the cluster by default, but it can expose the VSIs to other services unnecessarily. With this change, CAA will use the cluster's security group as the default and always attach it to the VSIs. This ensures that the VSIs can always access the cluster and CAA.
The PR also introduces the ability to define multiple security groups instead of just one, and all of them will be attached to the VSI during creation. The
IBMCLOUD_VPC_SG_IDvariable is replaced byIBMCLOUD_SECURITY_GROUP_IDS. This field accepts a comma-separated list of security group IDs.Example: