[Feature]: Support Re‑Quantizing Per‑Tensor FP8 Models to Other Dtypes #1337

@yiliu30

Description

Feature Description

As the title says: support taking models whose weights are already quantized to FP8 with per‑tensor granularity and re‑quantizing them to other dtypes (e.g. W4A16).

Motivation and Use Case

  • Input:
    A model whose weights have already been quantized to FP8 with per‑tensor granularity, e.g. https://huggingface.co/mistralai/Devstral-2-123B-Instruct-2512
  • Output:
    The same model re‑quantized to another dtype, e.g. W4A16.

  • The relevant implementation starting point is here (a per‑tensor sketch follows the snippet):

    if layer.__class__.__name__ == "CompressedLinear":
        dq_weight = layer.compressor.decompress_module(layer)
    else:
        weight_scale = layer.weight_scale if hasattr(layer, "weight_scale") else layer.weight_scale_inv
        data_type = getattr(layer, "data_type", None)
        dq_weight = dequant_block_fp8_weight(layer.weight, weight_scale, layer.block_size, data_type=data_type)
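
    For the per‑tensor FP8 models this issue targets, the dequantization step is simpler than the block‑wise path above, since the whole weight tensor shares a single scale. Below is a minimal sketch of the intended flow (dequantize per‑tensor FP8, then re‑quantize the weights to 4‑bit with group‑wise scales for W4A16); the helpers dequant_per_tensor_fp8_weight and quant_w4a16_sym are hypothetical names used only for illustration, not existing APIs.

    import torch

    def dequant_per_tensor_fp8_weight(weight: torch.Tensor, weight_scale: torch.Tensor) -> torch.Tensor:
        # Hypothetical helper: a per-tensor FP8 weight shares one scalar scale,
        # so dequantization is an upcast followed by a single multiply.
        return weight.to(torch.bfloat16) * weight_scale.to(torch.bfloat16)

    def quant_w4a16_sym(weight: torch.Tensor, group_size: int = 128):
        # Hypothetical helper: symmetric 4-bit group-wise re-quantization
        # (the weight half of W4A16). Returns int4 values stored in int8
        # plus the per-group scales.
        out_features, in_features = weight.shape
        w = weight.float().reshape(out_features, in_features // group_size, group_size)
        scale = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 7.0  # int4 symmetric range is [-8, 7]
        qweight = torch.clamp(torch.round(w / scale), -8, 7).to(torch.int8)
        return qweight.reshape(out_features, in_features), scale.squeeze(-1)

    # Toy example: per-tensor FP8 in, W4A16-style tensors out.
    fp8_weight = torch.randn(512, 512).to(torch.float8_e4m3fn)
    weight_scale = torch.tensor(0.05)
    dq_weight = dequant_per_tensor_fp8_weight(fp8_weight, weight_scale)
    qweight, scales = quant_w4a16_sym(dq_weight)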

Alternatives Considered

No response

Definition of Done

  • Allow taking mistralai/Devstral-2-123B-Instruct-2512 as input and producing a W4A16 model as output.
  • Unit tests (UT).
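
To make the first acceptance criterion concrete, end‑to‑end usage could look roughly like the sketch below. The AutoRound constructor and quantize_and_save call mirror the existing W4A16 flow, but the exact entry point and arguments for per‑tensor FP8 inputs are assumptions here and are precisely what this issue needs to define (including how the FP8 checkpoint is loaded and dequantized before tuning).

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from auto_round import AutoRound

    # Per-tensor FP8 checkpoint to re-quantize. How the FP8 weights are loaded
    # and dequantized internally is an open question for this feature.
    model_name = "mistralai/Devstral-2-123B-Instruct-2512"
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Assumed flow: dequantize the per-tensor FP8 weights, then re-quantize to
    # W4A16 (4-bit weights with group-wise scales, 16-bit activations).
    autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
    autoround.quantize_and_save("Devstral-2-123B-Instruct-2512-w4a16")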

Additional Context

No response
