[Polite inquiry]: What are the differences between ggml_qgemm_lut() and ggml_vec_dot_i2_i8_s()? #170

1. In the ggml_compute_forward_mul_mat() function in ggml.c, ggml_qgemm_lut() is executed first, which I believe is a lookup-table-based accumulation.

2. Subsequently, in the ggml_compute_forward_mul_mat_one_chunk() function, ggml_vec_dot_i2_i8_s() is executed, which performs a multiply-accumulate operation on ternary and 8-bit data.

My understanding is that the former has already computed the matrix multiplication of ternary and 8-bit data through a lookup table and accumulation. So why is there another multiply-accumulate operation in the following function?
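To make the question concrete, here is a toy sketch of the two approaches as I understand them. It is only an illustration under my own assumptions: `dot_mac`, `build_lut`, and `dot_lut` are hypothetical names, the weights are stored unpacked as one `int8_t` per ternary value, and the vector length is assumed to be even. None of this is taken from the actual ggml_qgemm_lut() or ggml_vec_dot_i2_i8_s() code.

```c
#include <stdint.h>

/* Toy sketch only: hypothetical helpers, NOT the real ggml kernels. */

/* Direct multiply-accumulate: w[i] is ternary (-1, 0, +1), a[i] is int8. */
int32_t dot_mac(const int8_t *w, const int8_t *a, int n) {
    int32_t sum = 0;
    for (int i = 0; i < n; i++) {
        sum += (int32_t)w[i] * (int32_t)a[i];
    }
    return sum;
}

/* Build a lookup table from the activations alone.
 * For each pair (a[2g], a[2g+1]) store the partial sum for all
 * 9 possible ternary weight pairs. lut must hold (n / 2) * 9 entries.
 * Assumes n is even. */
void build_lut(const int8_t *a, int n, int32_t *lut) {
    for (int g = 0; g < n / 2; g++) {
        for (int k = 0; k < 9; k++) {
            int w0 = k / 3 - 1; /* first weight of the pair:  -1, 0, +1 */
            int w1 = k % 3 - 1; /* second weight of the pair: -1, 0, +1 */
            lut[g * 9 + k] = w0 * a[2 * g] + w1 * a[2 * g + 1];
        }
    }
}

/* LUT-based dot product: the weights only select table entries,
 * so the inner loop is lookup + add, with no multiplication. */
int32_t dot_lut(const int8_t *w, const int32_t *lut, int n) {
    int32_t sum = 0;
    for (int g = 0; g < n / 2; g++) {
        int idx = (w[2 * g] + 1) * 3 + (w[2 * g + 1] + 1);
        sum += lut[g * 9 + idx];
    }
    return sum;
}
```

In this sketch both functions return the same value for the same inputs, and the table depends only on the activations, so it could be built once and reused for every weight row. That is why I would expect only one of the two paths to be needed for a given matrix multiplication.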
