Efficient Permute/Transpose Operations on 4-Bit Quantized Tensors #1591

yhyang201 · 2025-04-11T18:29:26Z

I have a tensor A that has already been quantized into a 4-bit format, resulting in a quantized representation A_bit.
Now, I would like to apply a permute or transpose operation to tensor A, in order to obtain a new tensor B and its corresponding 4-bit quantized form B_bit.

Is there a simpler or more efficient way to perform these operations directly on A_bit, without first dequantizing it back to floating point (via dequantize_nf4), applying the permute or transpose, and then requantizing the result (via quantize_nf4)?
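To make the question concrete, here is a minimal pure-Python sketch of what operating directly on the packed codes could look like. It is an illustration under stated assumptions, not the bitsandbytes implementation or API: the helper names (unpack_nibbles, pack_nibbles, transpose_packed) are hypothetical, and the layout (two 4-bit codes per byte, high nibble first, row-major order) is assumed. The sketch only moves the 4-bit codes. With block-wise scaling, each code is relative to its block's absmax value, so after a transpose the codes inside one block no longer share a single scale. Moving codes alone is therefore only valid when all elements share one scale or when whole blocks are moved as units.

```python
def unpack_nibbles(packed):
    """Split each byte into two 4-bit codes (high nibble first; an assumed layout)."""
    codes = []
    for byte in packed:
        codes.append(byte >> 4)
        codes.append(byte & 0x0F)
    return codes


def pack_nibbles(codes):
    """Pack an even-length sequence of 4-bit codes into bytes, high nibble first."""
    if len(codes) % 2:
        raise ValueError("need an even number of 4-bit codes")
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


def transpose_packed(packed, rows, cols):
    """Transpose a row-major (rows x cols) matrix of 4-bit codes without decoding them."""
    codes = unpack_nibbles(packed)
    if len(codes) != rows * cols:
        raise ValueError("shape does not match the number of codes")
    # Element (r, c) of the input becomes element (c, r) of the output.
    transposed = [codes[r * cols + c] for c in range(cols) for r in range(rows)]
    return pack_nibbles(transposed)


# Hypothetical example: a 2 x 4 matrix of 4-bit codes, packed into 4 bytes.
a_codes = [1, 2, 3, 4, 5, 6, 7, 8]
a_bit = pack_nibbles(a_codes)
b_bit = transpose_packed(a_bit, rows=2, cols=4)  # packed 4 x 2 matrix
```

Transposing b_bit again with rows=4, cols=2 returns the original a_bit, which is an easy sanity check for any packed-layout permute. A practical version would do the same index arithmetic on the GPU and would still have to decide how to recompute or regroup the per-block scales.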

Replies: 0 comments
