
make smoothquant more PT2 friendly #1639

Open
@vkuzo

Description

torchao's smoothquant recently broke after a change to PyTorch core: pytorch/pytorch#145733 . We should apply the updates suggested by @anijain2305 in that issue to our code. I actually think we should go a bit further and move to something like:

#
# before
#

# imports assumed for this snippet; module paths may differ across torchao versions
import torch
from torchao.dtypes import to_affine_quantized_intx, to_affine_quantized_intx_static
from torchao.quantization.quant_primitives import MappingType
from torchao.quantization.utils import _get_per_token_block_size

class _ActQuantizer:
    def __init__(self, target_dtype, quant_min=-127):
        self.target_dtype = target_dtype
        self.quant_min = quant_min

    def dynamic_quantize(self, input):
        # per-token symmetric dynamic quantization
        return to_affine_quantized_intx(
            input,
            MappingType.SYMMETRIC,
            _get_per_token_block_size(input),
            self.target_dtype,
            self.quant_min,
        )

    def static_quantize(self, input, scale, zero_point):
        # per-tensor static quantization with precomputed scale/zero_point
        return to_affine_quantized_intx_static(
            input,
            scale,
            zero_point,
            list(input.shape),
            self.target_dtype,
            self.quant_min,
        )

#
# after
#

from dataclasses import dataclass

@dataclass
class _ActQuantConfig:
    target_dtype: torch.dtype
    quant_min: int = -127

# then, logic elsewhere chooses whether to call static or dynamic quant based on the contents of an instance of `_ActQuantConfig`

My feedback here is similar in spirit to #1595 - IMO it's simpler and safer to pass around dumb config objects and use them to choose which function to call, instead of encoding the "which function to call" information in the config as a callable object.
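
For illustration, a minimal sketch of what that dispatch could look like. The `static_scale`/`static_zero_point` fields and the `_quantize_activation` helper are hypothetical names introduced here, not existing torchao APIs; they show one way the contents of an `_ActQuantConfig` instance could drive the static-vs-dynamic choice:

# sketch only: _quantize_activation, static_scale, and static_zero_point
# are hypothetical names, not existing torchao APIs
import torch
from dataclasses import dataclass
from typing import Optional

from torchao.dtypes import to_affine_quantized_intx, to_affine_quantized_intx_static
from torchao.quantization.quant_primitives import MappingType
from torchao.quantization.utils import _get_per_token_block_size

@dataclass
class _ActQuantConfig:
    target_dtype: torch.dtype
    quant_min: int = -127
    # when these are set, static quantization is selected
    static_scale: Optional[torch.Tensor] = None
    static_zero_point: Optional[torch.Tensor] = None

def _quantize_activation(input, config: _ActQuantConfig):
    # plain branching on plain data, which torch.compile can trace and
    # guard on without special handling for bound methods or callables
    if config.static_scale is not None:
        return to_affine_quantized_intx_static(
            input,
            config.static_scale,
            config.static_zero_point,
            list(input.shape),
            config.target_dtype,
            config.quant_min,
        )
    return to_affine_quantized_intx(
        input,
        MappingType.SYMMETRIC,
        _get_per_token_block_size(input),
        config.target_dtype,
        config.quant_min,
    )

This keeps the config a plain, serializable value object: the static-vs-dynamic decision lives in ordinary branching code rather than in a stored callable, which is the same direction as #1595.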
