
[Polite inquiry]: What are the differences between ggml_qgemm_lut() and ggml_vec_dot_i2_i8_s()? #170

Open
@WanRui37

  1. In ggml_compute_forward_mul_mat() in ggml.c, ggml_qgemm_lut() is called first. As far as I can tell, it is an accumulation operation.

  2. After that, ggml_compute_forward_mul_mat_one_chunk() calls ggml_vec_dot_i2_i8_s(), which performs a multiply-accumulate over ternary weights and 8-bit data.

My understanding is that the first call has already computed the matrix multiplication of the ternary and 8-bit data through table lookups and accumulation. If so, why does the second function perform another multiply-accumulate on the same data?
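To make the question concrete, here is a minimal sketch of the two styles of computation as I understand them. It is not the real ggml code: the names dot_direct, pack_pair, build_lut and dot_lut are hypothetical, the pairing of two weights per table index is an assumption chosen for illustration, and n is assumed to be even.

```c
#include <assert.h>
#include <stdint.h>

/* Direct path: one multiply-accumulate per element.
   w[i] is a ternary weight in {-1, 0, +1}, x[i] is an int8 activation. */
static int32_t dot_direct(const int8_t *w, const int8_t *x, int n) {
    int32_t acc = 0;
    for (int i = 0; i < n; ++i) {
        acc += (int32_t)w[i] * (int32_t)x[i];
    }
    return acc;
}

/* Map a pair of ternary weights to a table index in 0..8.
   (Hypothetical encoding, not the packing ggml uses.) */
static uint8_t pack_pair(int8_t w0, int8_t w1) {
    assert(w0 >= -1 && w0 <= 1 && w1 >= -1 && w1 <= 1);
    return (uint8_t)((w0 + 1) * 3 + (w1 + 1));
}

/* LUT path, step 1: for every pair of activations, precompute all 9
   possible values of w0*x0 + w1*x1. lut must hold (n/2)*9 entries. */
static void build_lut(const int8_t *x, int n, int16_t *lut) {
    for (int g = 0; g < n / 2; ++g) {
        for (int idx = 0; idx < 9; ++idx) {
            int w0 = idx / 3 - 1;
            int w1 = idx % 3 - 1;
            lut[g * 9 + idx] = (int16_t)(w0 * x[2 * g] + w1 * x[2 * g + 1]);
        }
    }
}

/* LUT path, step 2: no multiplications, only lookups and additions. */
static int32_t dot_lut(const uint8_t *widx, const int16_t *lut, int n) {
    int32_t acc = 0;
    for (int g = 0; g < n / 2; ++g) {
        acc += lut[g * 9 + widx[g]];
    }
    return acc;
}
```

In this sketch both paths return the same dot product for the same inputs, and the table built from one activation vector can be reused for every weight row. That is why I would expect a given weight tensor to need only one of the two paths, not both in sequence.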
