
Incorrect initialization of max_val in the softmax_cuda kernel function may lead to inaccurate calculation results. #10

@jokerD888

Description

https://github.com/Infatoshi/cuda-course/blob/master/08_Triton/02_softmax.cu
Line 12: float max_val = input[offset + tid];

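In the softmax_cuda kernel (line 12), each thread initializes max_val with its own element, input[offset + tid], instead of a value shared by the whole row. The threads therefore start the max search from different seeds, and a thread can finish with a max_val that is not the maximum of the row starting at offset. (If the loop that follows visited every element of the row, including index 0, the seed would not matter.)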

Problem:
max_val is seeded per thread, so different threads can subtract different values in the softmax calculation.
Only a thread whose search actually reaches the row maximum gets the correct max_val; the other threads may produce inaccurate results.
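
To make the problem concrete, here is a minimal CPU sketch in plain C++ that emulates the max search of each thread. It is not the kernel from the repository: the issue does not show the loop after line 12, so the sketch assumes that loop starts at i = 1 (which is what would make the seed matter). The function names thread_max_own_seed and per_thread_max are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// CPU emulation of the max search done by one GPU thread.
// ASSUMPTION: the loop after line 12 starts at i = 1; the issue does not show it.
float thread_max_own_seed(const std::vector<float>& input, int offset, int n, int tid) {
    assert(n > 0 && tid >= 0 && tid < n);
    float max_val = input[offset + tid];   // seed taken from the thread's own element
    for (int i = 1; i < n; ++i) {          // index 0 is never visited by this loop
        max_val = std::max(max_val, input[offset + i]);
    }
    return max_val;
}

// Runs the emulated search once per thread of a single row and
// returns the max_val each thread would end up with.
std::vector<float> per_thread_max(const std::vector<float>& input, int offset, int n) {
    std::vector<float> result(n);
    for (int tid = 0; tid < n; ++tid) {
        result[tid] = thread_max_own_seed(input, offset, n, tid);
    }
    return result;
}
```

For the row {9, 1, 2, 3}, under this assumption thread 0 ends with max_val = 9 while threads 1 to 3 end with max_val = 3, because none of them ever reads element 0.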

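Solution:
Initialize max_val from the first element of the row, so that every thread starts from the same seed:
float max_val = input[offset];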
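
The effect of the proposed initialization can be checked on the CPU with a small C++ sketch. It assumes, since the issue does not show the rest of the kernel, that each thread scans the whole row for the maximum and the sum and then writes only its own output element. softmax_element and softmax_row are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// CPU emulation of one thread's work with the proposed seed.
// ASSUMPTION: each thread scans the full row for max and sum, then
// produces only element tid; the issue does not show this part of the kernel.
float softmax_element(const std::vector<float>& input, int offset, int n, int tid) {
    assert(n > 0 && tid >= 0 && tid < n);
    float max_val = input[offset];          // common seed: first element of the row
    for (int i = 1; i < n; ++i) {           // remaining elements 1..n-1
        max_val = std::max(max_val, input[offset + i]);
    }
    float sum = 0.0f;
    for (int i = 0; i < n; ++i) {
        sum += std::exp(input[offset + i] - max_val);   // exponents are <= 0
    }
    return std::exp(input[offset + tid] - max_val) / sum;
}

// Emulates all threads of one row and collects their outputs.
std::vector<float> softmax_row(const std::vector<float>& input, int offset, int n) {
    std::vector<float> out(n);
    for (int tid = 0; tid < n; ++tid) {
        out[tid] = softmax_element(input, offset, n, tid);
    }
    return out;
}
```

With the common seed, every thread subtracts the true row maximum, so no exponent is positive and the outputs stay finite. For a row such as {1000, 1, 2, 3}, a thread that subtracted 3 instead of 1000 would have to evaluate exp(997), which overflows single precision.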
