[Model Bringup] - openai/gpt-oss-20b #2744

@meenakshiramanathan1

Description

Model card - https://huggingface.co/blog/welcome-openai-gpt-oss
The transformers version needs to be updated to support this model. When testing with transformers upgraded to 4.55.0 and loading the model in fp32 or bf16, the following issue arises:

Traceback (most recent call last):
  File "/opt/ttforge-toolchain/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/opt/ttforge-toolchain/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/ttforge-toolchain/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 316, in _wrapper
    return func(*args, **kwargs)
  File "/opt/ttforge-toolchain/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4879, in from_pretrained
    hf_quantizer.validate_environment(
  File "/opt/ttforge-toolchain/venv/lib/python3.10/site-packages/transformers/quantizers/quantizer_mxfp4.py", line 60, in validate_environment
    raise RuntimeError("Using MXFP4 quantized models requires a GPU")
RuntimeError: Using MXFP4 quantized models requires a GPU
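
For reference, a minimal repro along the lines of what the traceback suggests (a hypothetical sketch, assuming the model is loaded through the transformers `pipeline` API as in the stack trace):

```python
# Hypothetical repro sketch: loading gpt-oss-20b in bf16 via the
# transformers pipeline API. from_pretrained still runs the MXFP4
# quantizer's validate_environment(), which raises on CPU-only hosts.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",
    torch_dtype=torch.bfloat16,  # fp32 hits the same check
)
```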

It looks like the model ships with MXFP4 quantization, which requires a GPU to run. Since we plan to use the model in BF16 or FP32 and don't actually need quantization, we're checking whether we can simply remove the quantization_config when loading the model; that should let us load it without triggering the MXFP4 GPU check. A sketch of the idea is below.
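
A minimal sketch of that workaround (not yet verified): strip `quantization_config` from the loaded config and pass the cleaned config to `from_pretrained`, so the MXFP4 quantizer is never instantiated. Whether transformers fully honors the stripped config here still needs to be confirmed.

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("openai/gpt-oss-20b")
# Drop the MXFP4 quantization_config so from_pretrained does not pick up
# the MXFP4 quantizer (whose environment check hard-requires a GPU).
if hasattr(config, "quantization_config"):
    del config.quantization_config

model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    config=config,
    torch_dtype=torch.bfloat16,  # or torch.float32
)
```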
