Description
System Info
NVIDIA-SMI 580.126.18
Driver Version: 580.126.18
CUDA Version: 13.0
GPU: 2 NVIDIA RTX PRO 6000
Memory-size: 97887MiB each
CPU Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Who can help?
I'm trying to host gpt-oss-120b on my 2x RTX PRO 6000 GPUs. Sometimes I get the 'content' key in the output and sometimes not. The 'reasoning_content' key is always there, but the final answer, i.e. 'content', is missing. What could be the issue? I read that reducing stream_interval would help, but it is still very inconsistent.
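One way to narrow this down is to request the completion with "stream": true and accumulate the per-channel deltas, to check whether the final channel ever emits tokens or the stream simply ends after the reasoning channel. A minimal sketch of the accumulation step; the chunk shape below mirrors OpenAI-style streaming chunks and is an assumption, and accumulate_deltas is my own helper name:

```python
# Hypothetical helper: collect streamed delta text per channel so we can
# see whether 'content' ever receives tokens or only 'reasoning_content' does.
# Chunk shape assumed to follow the OpenAI-style chat.completion.chunk format.

def accumulate_deltas(chunks):
    """Sum streamed delta text into 'reasoning_content' and 'content' buckets."""
    out = {"reasoning_content": "", "content": ""}
    for chunk in chunks:
        delta = chunk.get("choices", [{}])[0].get("delta", {})
        for key in out:
            piece = delta.get(key)
            if piece:
                out[key] += piece
    return out

# Example with two reasoning deltas and no final-answer delta -- the symptom
# described above would show up as an empty 'content' string.
sample = [
    {"choices": [{"delta": {"reasoning_content": "We need to output JSON"}}]},
    {"choices": [{"delta": {"reasoning_content": " with the score."}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
print(accumulate_deltas(sample))
```

If 'content' stays empty across the whole stream, the final channel is being dropped server-side rather than lost in client parsing.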
MODEL_DIR="/workspace/models/gpt-oss-120b"
EXTRA_CONFIG="/workspace/extra_config.yml"
TP_SIZE=2
MAX_BATCH_SIZE=64
MAX_INPUT_LEN=64000
MAX_NUM_TOKEN=64000
extra_config.yml:
disable_overlap_scheduler: true
speculative_config:
  decoding_type: Eagle3
  max_draft_len: 6
  speculative_model: /workspace/models/eagle3-draft
enable_chunked_prefill: true
stream_interval: 1
python3 -m dynamo.trtllm \
  --model-path "${MODEL_DIR}" \
  --tensor-parallel-size "${TP_SIZE}" \
  --expert-parallel-size "2" \
  --max-batch-size "${MAX_BATCH_SIZE}" \
  --max-num-tokens "${MAX_NUM_TOKEN}" \
  --max-seq-len "${MAX_INPUT_LEN}" \
  --free-gpu-memory-fraction 0.85 \
  --dyn-tool-call-parser harmony \
  --store-kv etcd \
  --extra-engine-args "${EXTRA_CONFIG}" \
  --dyn-reasoning-parser gpt_oss
When sending a POST request to the server:
import json
import requests
headers = {
'Content-Type': 'application/json',
}
data = {
"model": "/workspace/models/gpt-oss-120b",
"messages": [
{
"role": "user",
"content": test_prompt,
}],
"max_tokens": 8192,
}
response = requests.post('http://localhost:8000/v1/chat/completions', headers=headers, json=data)
res = response.json()['choices'][0]['message']['content']
print(res)
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Run:
python3 -m dynamo.trtllm \
  --model-path "${MODEL_DIR}" \
  --tensor-parallel-size "${TP_SIZE}" \
  --expert-parallel-size "2" \
  --max-batch-size "${MAX_BATCH_SIZE}" \
  --max-num-tokens "${MAX_NUM_TOKEN}" \
  --max-seq-len "${MAX_INPUT_LEN}" \
  --free-gpu-memory-fraction 0.85 \
  --dyn-tool-call-parser harmony \
  --store-kv etcd \
  --extra-engine-args "${EXTRA_CONFIG}" \
  --dyn-reasoning-parser gpt_oss
and then send the POST request shown above.
Expected behavior
Both the 'content' and 'reasoning_content' keys should be present inside the 'message' key of the final output.
Actual behavior
Current:
{'id': 'chatcmpl-b8a2ae2b-106d-4892-a04d-1b7ee1e8abaf',
'choices': [{'index': 0,
'message': {'role': 'assistant',
'reasoning_content': 'We need to output JSON with standard_declaration_key, confidence_score, justification. ....... So 9.\n\nConfidence_score integer 95.\n\nJustification: mention that filled and matches template 9.\n\nReturn JSON.\n\n'},
'finish_reason': 'stop'}],
'created': 1773046229,
'model': '/workspace/models/gpt-oss-120b',
'object': 'chat.completion',
'usage': {'prompt_tokens': 3027,
'completion_tokens': 602,
'total_tokens': 3629,
'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 3027}}}
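On a response like the one above, the reproduction script's res = response.json()['choices'][0]['message']['content'] raises KeyError, since 'content' is absent entirely rather than null. A defensive client-side read that makes the failure observable while debugging; this is a workaround sketch, not the intended output, and read_answer is my own hypothetical helper:

```python
# Hypothetical defensive read: returns (content, reasoning_content) and
# tolerates the missing-'content' case instead of raising KeyError.

def read_answer(response_json):
    message = response_json["choices"][0]["message"]
    # .get() returns None when the final channel was dropped by the server.
    content = message.get("content")
    reasoning = message.get("reasoning_content", "")
    return content, reasoning

# Shape taken from the actual-behavior payload above (trimmed for brevity).
resp = {
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "reasoning_content": "We need to output JSON ...",
        },
        "finish_reason": "stop",
    }]
}

answer, reasoning = read_answer(resp)
print(answer)  # prints "None" when the bug reproduces
```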
additional notes
Using PyTorch as the backend.
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.