
quantize_4bit/dequantize_4bit gives wrong output on non-contiguous tensor #1342

Open
@chenqianfzh

Description


System Info

Linux

Reproduction

I hit this bug during my coding work and verified it with the script below. With bitsandbytes 0.43.3, half of the elements produced by quantizing and dequantizing a non-contiguous tensor differ from those produced from its contiguous counterpart.

import torch
from bitsandbytes.functional import quantize_4bit, dequantize_4bit

# Create a contiguous tensor
original_tensor = torch.randn(3, 4, 6).to('cuda')

# Create a non-contiguous tensor by slicing (skipping every other row)
non_contiguous_tensor = original_tensor[:, ::2, :]
compressed_non_contiguous_tensor, quant_state = quantize_4bit(non_contiguous_tensor)
uncompressed_non_contiguous_tensor = dequantize_4bit(compressed_non_contiguous_tensor, quant_state)

# Create a contiguous tensor by calling contiguous()
contiguous_tensor = non_contiguous_tensor.contiguous()
compressed_contiguous_tensor, quant_state = quantize_4bit(contiguous_tensor)
uncompressed_contiguous_tensor = dequantize_4bit(compressed_contiguous_tensor, quant_state)

# it is expected to be the same, but half of the elements are different
num_different_elements = (uncompressed_contiguous_tensor != uncompressed_non_contiguous_tensor).sum().item()
print(f"The number of different elements is: {num_different_elements}")
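A plausible explanation (an assumption on my part, not verified against the bitsandbytes CUDA source) is that the quantization kernel reads the tensor's underlying buffer linearly and ignores strides, so a sliced view feeds it the wrong elements. The snippet below is a NumPy sketch of that stride-oblivious-reader mechanism; `original`, `view`, `raw_read`, and `correct` are illustrative names, not bitsandbytes internals.

```python
import numpy as np

# Sketch of a stride-oblivious reader (assumed kernel behavior, for
# illustration only): it consumes the first view.size elements of the
# parent buffer in memory order instead of following the view's strides.
original = np.arange(3 * 4 * 6).reshape(3, 4, 6)
view = original[:, ::2, :]  # non-contiguous: skips every other row

# The slice starts at offset 0, so a raw linear read begins at the same
# address as the parent buffer.
raw_read = original.ravel()[:view.size]  # what a stride-oblivious reader sees
correct = view.ravel()                   # the element order the caller expects

print(f"{(raw_read != correct).sum()} of {view.size} elements differ")
```

Only the first row of each 4x6 block lines up; after that the linear read and the strided view diverge, which is consistent with a large fraction of elements differing, as in the repro above.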

Expected behavior

Non-contiguous tensors are expected to give the same quant/dequant results as their contiguous counterparts, as long as the elements are the same.
