Incorrect initialization of max_val in the softmax_cuda kernel function may lead to inaccurate calculation results.

https://github.com/Infatoshi/cuda-course/blob/master/08_Triton/02_softmax.cu
Line 12: float max_val = input[offset + tid];

In the softmax_cuda kernel (line 12), each thread initializes max_val with its own element, input[offset + tid]. This leads to incorrect results because a thread's max_val then reflects only the values that thread compares, not the maximum over the entire batch row.

Problem:
max_val is calculated per thread instead of once per row, so threads can disagree on the value that is subtracted before exponentiation.
Only the thread whose own element is the row maximum ends up with the correct max_val; the other threads compute their softmax terms from a wrong value.
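To make the failure mode concrete, here is a short worked derivation. It rests on an assumption the issue does not confirm, because the rest of the kernel is not shown: that thread t keeps its own value m_t = x_t as max_val, and that the per-thread exponentials are added into one shared sum. Softmax is unchanged by subtracting a constant m, but only if the same m appears in the numerator and in every term of the denominator:

$$
\mathrm{softmax}(x)_i = \frac{e^{x_i - m}}{\sum_{j=1}^{N} e^{x_j - m}} \quad \text{for any constant } m .
$$

Under the assumption above, every thread's term collapses to $e^{x_t - x_t} = 1$, so

$$
S = \sum_{j=1}^{N} e^{x_j - m_j} = N, \qquad \frac{e^{x_i - m_i}}{S} = \frac{1}{N},
$$

which would give a uniform row regardless of the input. If the kernel instead lets each thread compute the full sum with its own shift, the result would still be mathematically equal to softmax and only the overflow protection would be lost, so the exact symptom depends on code not quoted here.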

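Solution:
Initialize max_val from the first element of the row, so that every thread starts from the same value:
float max_val = input[offset];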
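For context, here is a minimal sketch of how the one-line change might sit inside a kernel. It is not the course's actual code: the kernel name softmax_rowmax_sketch, the host helper run_softmax_sketch, the one-block-per-row layout, and the block size of 128 are all assumptions made for illustration. Every thread scans the whole row, so all threads agree on max_val and on the sum. That is redundant work (a shared-memory reduction would be faster), but it keeps the point about the shared maximum easy to see. Error checking of the CUDA calls is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <vector>

// Hypothetical kernel for illustration only (not the repository's code).
// Assumes one block per row of a row-major B x N matrix.
__global__ void softmax_rowmax_sketch(const float* input, float* output,
                                      int B, int N) {
    int bid = blockIdx.x;
    int tid = threadIdx.x;
    if (bid >= B) return;
    int offset = bid * N;

    // Every thread starts from the same element and scans the full row,
    // so all threads in the block end up with the same max_val.
    float max_val = input[offset];
    for (int i = 1; i < N; ++i) {
        max_val = fmaxf(max_val, input[offset + i]);
    }

    // Every thread computes the same denominator using that shared shift.
    float sum = 0.0f;
    for (int i = 0; i < N; ++i) {
        sum += expf(input[offset + i] - max_val);
    }

    // Threads split the output elements of the row among themselves.
    for (int i = tid; i < N; i += blockDim.x) {
        output[offset + i] = expf(input[offset + i] - max_val) / sum;
    }
}

// Hypothetical host helper: copies data to the device, launches the
// kernel with one block per row, and copies the result back.
std::vector<float> run_softmax_sketch(const std::vector<float>& h_in,
                                      int B, int N) {
    std::vector<float> h_out(h_in.size());
    float* d_in = nullptr;
    float* d_out = nullptr;
    size_t bytes = h_in.size() * sizeof(float);
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    softmax_rowmax_sketch<<<B, 128>>>(d_in, d_out, B, N);
    // This copy on the default stream waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

Calling run_softmax_sketch on a 2 x 3 input such as {1, 2, 3, 1000, 1000, 1000} should give rows that each sum to 1, with the second row uniform; the large values in the second row are there to exercise the max subtraction, since expf(1000.0f) alone would overflow.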
