[BUG] Java test fails on GB100 and CUDA 12.8 (branch-25.02) #18103

Open
@ttnghia

When running the Java tests (branch-25.02) on GB100 with CUDA 12.8, the following tests failed:

[ERROR] Errors: 
[ERROR]   TableTest.testReadAvro:1788 » CudaFatal reduce failed to synchronize: cudaErro...
[INFO] 
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
[ERROR] Errors: 
[ERROR]   TableTest.testReadAvroFromDataSource:1808 » CudaFatal reduce failed to synchronize: cudaErro...
[INFO] 
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
[ERROR] Errors: 
[ERROR]   TableTest.testReadAvroFull:1848 » CudaFatal reduce failed to synchronize: cuda...
[INFO] 
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
[ERROR] Errors: 
[ERROR]   TableTest.testReadAvroBuffer:1827 » CudaFatal reduce failed to synchronize: cu...
[INFO] 
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0

I ran each of these tests individually, one at a time, to make sure the failures are not caused by a previous test, and found this:

Note that I have not tested other cudf branches, other CUDA versions, or similar GPUs such as GB200.
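
For anyone trying to reproduce the isolated runs mentioned above, here is a minimal sketch of running each failing test in its own Maven invocation. It assumes the Java module uses the standard Maven Surefire plugin, whose `-Dtest=Class#method` filter selects a single test method. The helper name `surefire_cmd` is hypothetical, and any extra build flags a particular environment needs are not shown.

```shell
#!/bin/sh
# Hypothetical helper (not part of cudf): prints the Maven Surefire command
# that runs a single TableTest method. Assumes a standard Surefire setup.
surefire_cmd() {
  printf 'mvn test -Dtest=TableTest#%s\n' "$1"
}

# The four failing tests listed in the report above, one command per test.
for t in testReadAvro testReadAvroFromDataSource testReadAvroFull testReadAvroBuffer; do
  surefire_cmd "$t"
done
```

The script only prints the commands. Piping its output to `sh` from the Java module directory would run them one after another, each in a separate Maven process, so no test runs in the same process as a previous one.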
