
Seeing failure in reduction tests on Perlmutter-CPU with nvidia #161

@xylar

I just ran the CTests on Perlmutter-CPU with nvidia, and the reduction test fails with the following output:

1: Global sum I4:    PASS (exp,act=2,2)
1: Global sum I8:    PASS (exp,act=4,4)
1: Global sum R4:    PASS (exp,act=6.000002,6.000002)
1: Global sum R8:    PASS (exp,act=8.000000000000201,8.000000000000201)
1: Global sum real:  PASS (exp,act=10.000002000000000,10.000002000000000)
1: Global sum A1DI4: PASS (exp,act=90,90)
1: Global sum A2DI4: PASS (exp,act=9900,9900)
1: Global sum A1DI8: PASS (exp,act=90,90)
1: Global sum A2DI8: PASS (exp,act=9900,9900)
1: Global sum A1DR4: PASS (exp,act=90.0001983643,90.0001983643)
1: Global sum A2DR4: PASS (exp,act=9900.098633,9900.098633)
1: Global sum A1DR8: PASS (exp,act=90.0000000000020,90.0000000000020)
1: Global sum A2DR8: PASS (exp,act=9900.0000000009859,9900.0000000009859)
1: Global min I4:    PASS (exp,act=0,0)
1: Global max I4:    PASS (exp,act=1,1)
1: Global min R8:    PASS (exp,act=4.0000000000001,4.0000000000001)
1: Global max R8:    PASS (exp,act=5.0000000000001,5.0000000000001)
1: Global min A1DI4: PASS
1: Global max A1DI4: FAIL
1: Global sum device A1DI4: PASS (exp,act=90,90)
1: Global sum device A2DI4: PASS (exp,act=9900,9900)
1: Global sum device A1DR4: PASS (exp,act=90.0001983643,90.0001983643)
0: Global sum I4:    PASS (exp,act=2,2)
0: Global sum I8:    PASS (exp,act=4,4)
0: Global sum R4:    PASS (exp,act=6.000002,6.000002)
0: Global sum R8:    PASS (exp,act=8.000000000000201,8.000000000000201)
0: Global sum real:  PASS (exp,act=10.000002000000000,10.000002000000000)
0: Global sum A1DI4: PASS (exp,act=90,90)
0: Global sum A2DI4: PASS (exp,act=9900,9900)
0: Global sum A1DI8: PASS (exp,act=90,90)
0: Global sum A2DI8: PASS (exp,act=9900,9900)
0: Global sum A1DR4: PASS (exp,act=90.0001983643,90.0001983643)
0: Global sum A2DR4: PASS (exp,act=9900.098633,9900.098633)
0: Global sum A1DR8: PASS (exp,act=90.0000000000020,90.0000000000020)
0: Global sum A2DR8: PASS (exp,act=9900.0000000009859,9900.0000000009859)
0: Global min I4:    PASS (exp,act=0,0)
0: Global max I4:    PASS (exp,act=1,1)
0: Global min R8:    PASS (exp,act=4.0000000000001,4.0000000000001)
0: Global max R8:    PASS (exp,act=5.0000000000001,5.0000000000001)
0: Global min A1DI4: PASS
0: Global max A1DI4: FAIL
0: Global sum device A1DI4: PASS (exp,act=90,90)
0: Global sum device A2DI4: PASS (exp,act=9900,9900)
0: Global sum device A1DR4: PASS (exp,act=90.0001983643,90.0001983643)
srun: error: nid004451: tasks 0-1: Exited with exit code 10
srun: Terminating StepId=32907910.24

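Note: Global max A1DI4 reports FAIL on both MPI tasks (0 and 1), and srun then exits with code 10. Unlike the scalar min/max tests, the A1DI4 min/max lines do not print expected and actual values.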

All other tests are passing.
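
For anyone triaging similar output, here is a minimal sketch of a helper that tallies which tests report FAIL and on which task. It is not part of the test suite; the function name `failures_by_test` and the regular expression are assumptions based only on the line format shown in the log above (task rank, test name, then PASS or FAIL).

```python
import re
from collections import defaultdict

# Matches lines such as "1: Global max A1DI4: FAIL" or
# "0: Global sum I4:    PASS (exp,act=2,2)".
# Assumed format: "<task rank>: <test name>: <PASS|FAIL> ...".
LINE_RE = re.compile(r"^(\d+): (.+?):\s+(PASS|FAIL)\b")


def failures_by_test(log_text):
    """Return {test name: sorted list of task ranks that reported FAIL}."""
    failed = defaultdict(set)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(3) == "FAIL":
            failed[match.group(2)].add(int(match.group(1)))
    return {name: sorted(ranks) for name, ranks in failed.items()}


# A few lines copied from the output above.
sample = """\
1: Global min A1DI4: PASS
1: Global max A1DI4: FAIL
0: Global sum I4:    PASS (exp,act=2,2)
0: Global min A1DI4: PASS
0: Global max A1DI4: FAIL
srun: error: nid004451: tasks 0-1: Exited with exit code 10
"""

print(failures_by_test(sample))
```

Run against the full log, this should list only Global max A1DI4, with tasks 0 and 1. Lines that do not start with a task rank, such as the srun messages, are ignored.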

Labels: bug (Something isn't working)
