fabtests/cuda: cleanup tests and executable names #11889

Merged
shijin-aws merged 3 commits into ofiwg:main from aingerson:main
Feb 19, 2026
@aingerson
Contributor

Move tests to the appropriate locations and add the correct prefix so the binaries are ignored by .gitignore.

The test checks calling getinfo with FI_HMEM flags. Add fi_ prefix for the executable
because it uses OFI. Move the test to the unit/ folder since the common/ folder should not
have any tests in it.

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
@aingerson
Contributor Author

@shijin-aws Cleaning up a bit. Please take a look and let me know if that's ok. I'm tempted to remove the check_hmem test because it doesn't really do anything, but I see that you use it in a pytest check, so I don't want to break anything for you.

@shijin-aws
Contributor

shijin-aws commented Feb 13, 2026

AWS CI failure: is the PR removing the util binary check_cuda_dmabuf?

--------------------------------- Captured Log ---------------------------------

--------------------------------- Captured Out ---------------------------------
Running cuda_memory_type_validation() validation checks!
The ssh return is CompletedProcess(args='ssh 172.31.92.176 FI_LOG_LEVEL=warn timeout 360 /home/ec2-user/PortaFiducia/build/libraries/libfabric/pr11889-undebug/install/fabtests/bin/check_cuda_dmabuf -p efa', returncode=127, stdout='timeout: failed to run command ‘/home/ec2-user/PortaFiducia/build/libraries/libfabric/pr11889-undebug/install/fabtests/bin/check_cuda_dmabuf’: No such file or directory\n')
[warn] check_dmabuf returned unexpected code 127, treating as NOT_INITIALIZED
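
For context on the log above: 127 is the exit status that `timeout` and POSIX shells report when the command cannot be found, so the harness is folding a missing binary into NOT_INITIALIZED. A minimal sketch of how a check could separate the two cases, assuming a hypothetical `classify_check_result` helper and hypothetical status names (only NOT_INITIALIZED appears in the log, and treating exit code 0 as success is also an assumption):

```python
import subprocess
from enum import Enum


class DmabufStatus(Enum):
    # Hypothetical status names; only NOT_INITIALIZED appears in the CI log.
    SUPPORTED = "SUPPORTED"
    NOT_INITIALIZED = "NOT_INITIALIZED"
    BINARY_MISSING = "BINARY_MISSING"


def classify_check_result(result: subprocess.CompletedProcess) -> DmabufStatus:
    """Map a finished check process to a status (illustrative only)."""
    if result.returncode == 0:
        # Assumption: the util binary exits with 0 on success.
        return DmabufStatus.SUPPORTED
    if result.returncode == 127:
        # 127 means "command not found" for timeout(1) and POSIX shells,
        # which points at a renamed or uninstalled binary.
        return DmabufStatus.BINARY_MISSING
    # Any other code keeps the fallback behaviour seen in the log.
    return DmabufStatus.NOT_INITIALIZED


if __name__ == "__main__":
    missing = subprocess.CompletedProcess(
        args="check_cuda_dmabuf -p efa", returncode=127, stdout=""
    )
    print(classify_check_result(missing).value)
```

Reporting the missing-binary case separately would make a rename like this one show up as a hard failure instead of a warning.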

@shijin-aws
Contributor

Ok I see 08a7167 renamed it. Please update its references as well :)
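
Leftover references to a renamed binary can be found mechanically before CI does. A rough sketch, assuming a hypothetical `find_references` helper that scans a source tree for the old executable name (the name `check_cuda_dmabuf` comes from the log; the helper itself is not part of fabtests or the AWS CI):

```python
import os


def find_references(root: str, name: str) -> list[str]:
    """Return sorted paths of files under root whose text mentions name."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                # Decode leniently so binary files do not raise.
                with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                    if name in handle.read():
                        hits.append(path)
            except OSError:
                # Skip unreadable files (broken symlinks, permissions).
                continue
    return sorted(hits)


if __name__ == "__main__":
    # Print every file that still mentions the old binary name.
    for path in find_references(".", "check_cuda_dmabuf"):
        print(path)
```

An empty result for the old name suggests the references have all been updated.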

Move test to component/dmabuf-rdma since it is a CUDA-direct test targeting dmabuf support

Modify test executable name to have cuda prefix to match other iface-specific tests and
add that prefix to the gitignore to ignore the binary

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
@aingerson
Contributor Author

@shijin-aws Oops, sorry! Updated!

@aingerson
Contributor Author

@shijin-aws Ok to merge?

@shijin-aws shijin-aws merged commit 44de819 into ofiwg:main Feb 19, 2026
22 of 24 checks passed