
@kishen-v
Contributor

@kishen-v kishen-v commented Dec 8, 2025

What type of PR is this?

/kind bug

What this PR does / why we need it:
This PR mitigates the flakiness of passing on volume-related information in the early stages of creation, when several fields may still be unavailable because the volume is in the creating state, as observed in the recently failing integration test suites.
It adds supporting changes that capture the volume's information once it is available, before proceeding to the subsequent phases.
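
For illustration only, a minimal Go sketch of the idea described above: poll the volume until it reports the available state, and only then hand its details to later phases. The names `Disk`, `DiskGetter`, `WaitForAvailable`, and `stubCloud` are hypothetical and are not taken from the driver; the actual change in this PR may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Disk is a hypothetical, trimmed-down view of a volume.
// The driver's real type may carry different fields.
type Disk struct {
	ID    string
	Name  string
	State string
	WWN   string
}

// DiskGetter is a hypothetical stand-in for the cloud client.
type DiskGetter interface {
	GetDiskByID(id string) (*Disk, error)
}

// ErrNotReady is returned when the volume never reaches "available".
var ErrNotReady = errors.New("volume did not become available in time")

// WaitForAvailable polls the volume up to `attempts` times and returns it
// only once State is "available", so callers never see half-filled details.
func WaitForAvailable(c DiskGetter, id string, attempts int, interval time.Duration) (*Disk, error) {
	for i := 0; i < attempts; i++ {
		d, err := c.GetDiskByID(id)
		if err != nil {
			return nil, fmt.Errorf("get volume %s: %w", id, err)
		}
		if d.State == "available" {
			return d, nil
		}
		time.Sleep(interval)
	}
	return nil, ErrNotReady
}

// stubCloud is an in-memory stub for trying the sketch out: the volume
// reports "creating" (with an empty WWN) for the first readyAfter calls.
type stubCloud struct {
	readyAfter int
	calls      int
	disk       Disk
}

func (s *stubCloud) GetDiskByID(id string) (*Disk, error) {
	s.calls++
	d := s.disk // copy, so the stored template is not modified
	if s.calls <= s.readyAfter {
		d.State, d.WWN = "creating", ""
	} else {
		d.State = "available"
	}
	return &d, nil
}
```

With `stubCloud{readyAfter: 2}`, the third poll is the first one that returns the volume, and its WWN is populated at that point.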

Which issue(s) this PR fixes:

Fixes #1021

Special notes for your reviewer:

  • Ran IT and e2e locally multiple times; no issues observed.
  • No stale/duplicate volumes are observed on the cloud-side.
Integration tests:

Ran 1 of 1 Specs in 39.243 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
--- PASS: TestIntegration (39.24s)
PASS
ok  	sigs.k8s.io/ibm-powervs-block-csi-driver/tests/it	39.263s


[root@kishen-test ibm-powervs-block-csi-driver]# git log --shortstat -n1
commit 4c99745e8ee63a9f448801ca1642a4f653ed2856 (HEAD -> fix-it, origin/fix-it)
Author: Kishen Viswanathan <[email protected]>
Date:   Sun Nov 23 13:57:35 2025 +0530

    Pass on volume details when disk is in available state

 7 files changed, 65 insertions(+), 45 deletions(-)
[root@kishen-mayuka-powervs-csi--1 ibm-powervs-block-csi-driver]# make test-e2e
go test -v -timeout 100m sigs.k8s.io/ibm-powervs-block-csi-driver/tests/e2e -run ^TestE2E$
 W1208 09:08:13.972545 3889990 test_context.go:542] Unable to find in-cluster config, using default host : https://127.0.0.1:6443
 I1208 09:08:13.972626 3889990 test_context.go:565] The --provider flag is not set. Continuing as if --provider=skeleton had been used.
=== RUN   TestE2E
Running Suite: IBM PowerVS Block CSI Driver End-to-End Tests - /root/kishen/ibm-powervs-block-csi-driver/tests/e2e
==================================================================================================================
Random Seed: 1765202893

Will run 20 of 20 specs
•SSSS••••••••••••S••

Ran 15 of 20 Specs in 2047.842 seconds
SUCCESS! -- 15 Passed | 0 Failed | 0 Pending | 5 Skipped
--- PASS: TestE2E (2047.84s)
PASS
ok  	sigs.k8s.io/ibm-powervs-block-csi-driver/tests/e2e	2047.893s
[root@kishen-mayuka-powervs-csi--1 ibm-powervs-block-csi-driver]# git remote -v
origin	https://github.com/kishen-v/ibm-powervs-block-csi-driver.git (fetch)
origin	https://github.com/kishen-v/ibm-powervs-block-csi-driver.git (push)
[root@kishen-mayuka-powervs-csi--1 ibm-powervs-block-csi-driver]# git branch
* fix-it

Release note:

Fix broken integration tests

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Dec 8, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kishen-v
Once this PR has been reviewed and has the lgtm label, please assign yussufsh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 8, 2025
@kishen-v kishen-v changed the title Pass on volume details when disk is in available state [WIP]Pass on volume details when disk is in available state Dec 8, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 8, 2025
@kishen-v kishen-v force-pushed the fix-it branch 3 times, most recently from d8617c7 to db93dd6 Compare December 9, 2025 16:26
@kishen-v
Contributor Author

Retested with latest changes.
Both IT and E2E ran successfully, with no stale or duplicate volumes created.

@kishen-v kishen-v changed the title [WIP]Pass on volume details when disk is in available state Pass on volume details when disk is in available state Dec 10, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 10, 2025
@kishen-v
Contributor Author

e2e:
=== RUN   TestE2E
Running Suite: IBM PowerVS Block CSI Driver End-to-End Tests - /root/ibm-powervs-block-csi-driver/tests/e2e
===========================================================================================================
Random Seed: 1765334042

Will run 20 of 20 specs
••••••••••••S••SSSS•

Ran 15 of 20 Specs in 3212.428 seconds
SUCCESS! -- 15 Passed | 0 Failed | 0 Pending | 5 Skipped
--- PASS: TestE2E (3212.43s)
PASS
ok      sigs.k8s.io/ibm-powervs-block-csi-driver/tests/e2e      3212.478s
IT:

I1209 21:40:52.706278   36811 integration_test.go:124] Detached volume 3f43089c-5cb2-4a99-b7be-0fc81af1e59a from node ee8bcc12-3bda-44dc-9093-15442858ac4b in 9.247418773s
I1209 21:40:52.706305   36811 integration_test.go:80] Deleting volume 3f43089c-5cb2-4a99-b7be-0fc81af1e59a
I1209 21:40:54.401116   36811 integration_test.go:84] Deleted volume 3f43089c-5cb2-4a99-b7be-0fc81af1e59a in 1.694791753s
•I1209 21:40:54.401220   36811 driver.go:148] Stopping server


Ran 1 of 1 Specs in 44.083 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
--- PASS: TestIntegration (44.08s)
PASS
ok  	sigs.k8s.io/ibm-powervs-block-csi-driver/tests/it	44.103s
[root@kishen-csi-it ibm-powervs-block-csi-driver]# git log --shortstat -n1
commit 4537918def02e530d56eae29489297064533da37 (HEAD -> fix-it, origin/fix-it)
Author: Kishen Viswanathan <[email protected]>
Date:   Sun Nov 23 13:57:35 2025 +0530

    Pass on volume details when disk is in available state

 7 files changed, 131 insertions(+), 59 deletions(-)

if err != nil {
return nil, status.Errorf(codes.Internal, "Could not create volume %q: %v", volName, err)
if errors.Is(err, cloud.ErrNotFound) {
disk, err = d.cloud.CreateDisk(volName, opts)
Member

I am just worried that if Get fails because the creation was not yet initiated on the cloud side, this might create duplicate volumes?

Member

Since we expect this not to happen usually, can we sleep for 5s and try GET/POST again?
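
A rough Go sketch of that suggestion, for discussion only: look the volume up by name, wait once and look again if it is not found, and issue the create only if the second lookup also misses. `VolumeAPI`, `GetOrCreate`, `lateCloud`, and the local `ErrNotFound` are hypothetical names; `ErrNotFound` merely stands in for `cloud.ErrNotFound`, and the real `CreateDisk` also takes options that are omitted here.

```go
package main

import (
	"errors"
	"time"
)

// ErrNotFound stands in for cloud.ErrNotFound (assumption for this sketch).
var ErrNotFound = errors.New("not found")

// Volume is a hypothetical minimal volume record.
type Volume struct{ ID, Name string }

// VolumeAPI is a hypothetical slice of the cloud client; the real
// CreateDisk takes additional options.
type VolumeAPI interface {
	GetDiskByName(name string) (*Volume, error)
	CreateDisk(name string) (*Volume, error)
}

// GetOrCreate does GET, sleeps once on "not found", does GET again,
// and only then falls through to the create call.
func GetOrCreate(api VolumeAPI, name string, wait time.Duration) (*Volume, error) {
	for attempt := 0; attempt < 2; attempt++ {
		v, err := api.GetDiskByName(name)
		if err == nil {
			return v, nil // volume already exists, reuse it
		}
		if !errors.Is(err, ErrNotFound) {
			return nil, err // unrelated failure, do not create blindly
		}
		if attempt == 0 {
			time.Sleep(wait) // e.g. 5 * time.Second, as suggested above
		}
	}
	return api.CreateDisk(name)
}

// lateCloud is a stub whose volume becomes visible from the
// visibleOnGet-th GET call onward; 0 means it never appears.
type lateCloud struct {
	visibleOnGet  int
	gets, creates int
}

func (c *lateCloud) GetDiskByName(name string) (*Volume, error) {
	c.gets++
	if c.visibleOnGet > 0 && c.gets >= c.visibleOnGet {
		return &Volume{ID: "existing", Name: name}, nil
	}
	return nil, ErrNotFound
}

func (c *lateCloud) CreateDisk(name string) (*Volume, error) {
	c.creates++
	return &Volume{ID: "new", Name: name}, nil
}
```

This narrows the window for a duplicate create but does not close it: a create that only becomes visible after the second GET would still be missed.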

Contributor Author

Fair point.
The initial creation flow is:
-> disk, err := d.cloud.GetDiskByName(volName)
-> proceeds to create a volume
-> a duplicate call arrives in the meantime(?) and volume creation is re-triggered.
But the catch is, cloud doesn't allow volumes with duplicate names to be created.. We should be good here.

kishenv@kishen-mbp-m1 ~ % ibmcloud pi vol create test --size=1
Creating volume test under account PCLOUD Hypershift as user [email protected]...

ID                    93a78bd6-fff5-4f49-ac3a-1a4f5224c5f5
Name                  test
Profile               tier3
Status                creating
Size                  1
Created               2025-12-10T06:18:59.000Z
Updated               1970-01-01T00:00:00.000Z
Shareable             false
Bootable              false
Storage Pool          General-Flash-6
IO Throttle Rate      -
Replication Enabled   -
PVMInstanceIDs        -
WWN                   -
kishenv@kishen-mbp-m1 ~ % ibmcloud pi vol create test --size=1
Creating volume test under account PCLOUD Hypershift as user [email protected]...
FAILED
Failed to create volume.

FAILED
Error: [POST /pcloud/v2/cloud-instances/{cloud_instance_id}/volumes][400] pcloudV2VolumesPostBadRequest {"description":"Bad Request: test volume name already exists for cloud instance 384c8836936f4b658c5c628dc50efce0; duplicate names are not allowed","error":"Bad Request"}
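
As a hedged sketch of how a caller could lean on that behaviour: treat the duplicate-name rejection from create as a cue to fetch the existing volume instead of failing. Everything here is hypothetical (`Client`, `CreateIdempotent`, `IsDuplicateName`, `memCloud`), and matching on the message text from the output above is an assumption that is brittle; it also only helps if the cloud reliably enforces unique names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Vol is a hypothetical minimal volume record.
type Vol struct{ ID, Name string }

// Client is a hypothetical slice of the cloud client.
type Client interface {
	CreateDisk(name string) (*Vol, error)
	GetDiskByName(name string) (*Vol, error)
}

// IsDuplicateName reports whether err looks like the duplicate-name
// rejection shown in the CLI output. String matching is an assumption.
func IsDuplicateName(err error) bool {
	return err != nil && strings.Contains(err.Error(), "duplicate names are not allowed")
}

// CreateIdempotent creates the volume, or returns the existing one when
// the cloud rejects the request because the name is already taken.
func CreateIdempotent(c Client, name string) (*Vol, error) {
	v, err := c.CreateDisk(name)
	if err == nil {
		return v, nil
	}
	if IsDuplicateName(err) {
		return c.GetDiskByName(name)
	}
	return nil, err
}

// memCloud is an in-memory stub that enforces unique names, mimicking
// the rejection message from the output above.
type memCloud struct {
	vols map[string]*Vol
}

func (m *memCloud) CreateDisk(name string) (*Vol, error) {
	if _, ok := m.vols[name]; ok {
		return nil, fmt.Errorf("Bad Request: %s volume name already exists; duplicate names are not allowed", name)
	}
	v := &Vol{ID: fmt.Sprintf("id-%d", len(m.vols)+1), Name: name}
	m.vols[name] = v
	return v, nil
}

func (m *memCloud) GetDiskByName(name string) (*Vol, error) {
	if v, ok := m.vols[name]; ok {
		return v, nil
	}
	return nil, errors.New("not found")
}
```

Against the stub, two calls with the same name return the same volume and leave a single entry behind.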

Member

> Volume creation is re-triggered.
> But the catch is, cloud doesn't allow volumes with duplicate names to be created.. We should be good here.

Sorry, but I cannot trust this, because I have seen duplicate VM instances with the same name on the cloud before. 🙂

@yussufsh
Member

/hold
Needs more discussion.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 11, 2025


Development

Successfully merging this pull request may close these issues.

Integration tests are broken post code cleanup

3 participants