🐛 Describe the bug
If you take an instruction-tuned model and run it as a plain next-token predictor, many of its continuations include the special token that marks the start of the assistant turn.
If I'm not mistaken, an instruction-tuned model should never generate this token, which suggests that the chat/assistant/character training, done with RL, failed to mask out the gradients on this token. This could be an off-by-one error in the loss mask, or as bad as no mask at all; given that OLMo mostly works, it's probably only the former.
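To make the off-by-one hypothesis concrete, here is a minimal sketch of how a loss mask over the assistant-start marker would normally be built, and what an off-by-one does to it. The token ids and sequence are made up for illustration; `-100` is the conventional ignore index for the cross-entropy loss.

```python
import torch

# Hypothetical illustration of the suspected bug. Token ids are made up.
ASSISTANT_START = 5  # hypothetical id of the assistant-turn start token
tokens = torch.tensor([10, 11, ASSISTANT_START, 20, 21, 22])  # prompt, marker, response

# Correct masking: everything up to AND including the assistant-start marker
# gets label -100, so the model is never trained to predict the marker.
marker_pos = (tokens == ASSISTANT_START).nonzero()[0].item()
labels_correct = tokens.clone()
labels_correct[: marker_pos + 1] = -100

# Off-by-one: the marker itself is left in the loss, so the model is trained
# to emit the assistant-start token as ordinary text - the behavior reported here.
labels_off_by_one = tokens.clone()
labels_off_by_one[:marker_pos] = -100

assert labels_correct[marker_pos].item() == -100
assert labels_off_by_one[marker_pos].item() == ASSISTANT_START
```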
To reproduce
For example, if you run the following fairly generic HF tutorial-like code:
```python
import torch

# Assumes `model` and `tokenizer` are already loaded for the model under test,
# and `prompt`, `max_new_tokens`, `do_sample`, `temperature`, `top_p`, `top_k`,
# and `stop_at_eos` are set by the caller.

# Tokenize directly - no chat template, no special tokens
input_ids = tokenizer.encode(prompt, return_tensors="pt", add_special_tokens=False)
input_ids = input_ids.to(model.device)

# Create attention mask (all ones since we have no padding)
attention_mask = torch.ones_like(input_ids)

# Build generation kwargs, explicitly overriding any model defaults
gen_kwargs = {
    "attention_mask": attention_mask,
    "max_new_tokens": max_new_tokens,
    "do_sample": do_sample,
    "pad_token_id": tokenizer.eos_token_id,
    # Explicitly set to ensure we override any model defaults
    "repetition_penalty": 1.0,
    "no_repeat_ngram_size": 0,
}
if do_sample:
    gen_kwargs["temperature"] = temperature
    gen_kwargs["top_p"] = top_p
    if top_k > 0:
        gen_kwargs["top_k"] = top_k
if not stop_at_eos:
    # Disable EOS stopping - model will generate until max_new_tokens
    gen_kwargs["eos_token_id"] = []

with torch.no_grad():
    outputs = model.generate(input_ids, **gen_kwargs)

# Decode only the new tokens, preserving special tokens in output
generated_ids = outputs[0, input_ids.shape[1]:]
generated_text = tokenizer.decode(generated_ids, skip_special_tokens=False)
```
and set the prompt to `When the AI assistant was asked "What is the capital of France?", it responded: "`
you will get the assistant-turn token within the first 100 generated tokens about 60% of the time.
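The ~60% figure comes from repeating the generation and counting hits. A rough way to measure it is sketched below; `ASSISTANT_TOKEN` is a placeholder for whatever string the model's chat template uses to open the assistant turn, and `generate_text` stands in for the generation code above returning the decoded continuation.

```python
# Hypothetical measurement sketch; the token string is a placeholder -
# check the model's chat template for the real one.
ASSISTANT_TOKEN = "<|assistant|>"

def leak_rate(generate_text, n_trials=50):
    """Fraction of sampled continuations containing the assistant-start token."""
    hits = sum(ASSISTANT_TOKEN in generate_text() for _ in range(n_trials))
    return hits / n_trials
```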
Versions
Python 3.12.3
accelerate==1.12.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aiosignal==1.4.0
annotated-doc==0.0.4
annotated-types==0.7.0
anthropic==0.71.0
anyio==4.12.1
apache-tvm-ffi==0.1.8.post2
astor==0.8.1
attrs==25.4.0
blake3==1.0.8
cachetools==6.2.4
cbor2==5.8.0
certifi==2026.1.4
cffi==2.0.0
charset-normalizer==3.4.4
click==8.3.1
cloudpickle==3.1.2
compressed-tensors==0.13.0
cryptography==46.0.3
cuda-bindings==13.1.1
cuda-pathfinder==1.3.3
cuda-python==13.1.1
cupy-cuda12x==13.6.0
depyf==0.20.0
dill==0.4.1
diskcache==5.6.3
distro==1.9.0
dnspython==2.8.0
docstring-parser==0.17.0
einops==0.8.1
email-validator==2.3.0
fastapi==0.128.0
fastapi-cli==0.0.20
fastapi-cloud-cli==0.11.0
fastar==0.8.0
fastrlock==0.8.3
filelock==3.20.3
flashinfer-python==0.5.3
frozenlist==1.8.0
fsspec==2026.1.0
gguf==0.17.1
grpcio==1.76.0
grpcio-reflection==1.76.0
h11==0.16.0
hf-xet==1.2.0
httpcore==1.0.9
httptools==0.7.1
httpx==0.28.1
httpx-sse==0.4.3
huggingface-hub==0.36.0
idna==3.11
ijson==3.4.0.post0
interegular==0.3.3
jinja2==3.1.6
jiter==0.12.0
jmespath==1.0.1
jsonschema==4.26.0
jsonschema-specifications==2025.9.1
lark==1.2.2
llguidance==1.3.0
llvmlite==0.44.0
lm-format-enforcer==0.11.3
loguru==0.7.3
markdown-it-py==4.0.0
markupsafe==3.0.3
mcp==1.25.0
mdurl==0.1.2
mistral-common==1.8.8
model-hosting-container-standards==0.1.13
mpmath==1.3.0
msgpack==1.1.2
msgspec==0.20.0
multidict==6.7.0
networkx==3.6.1
ninja==1.13.0
numba==0.61.2
numpy==2.2.6
nvidia-cublas-cu12==12.8.4.1
nvidia-cuda-cupti-cu12==12.8.90
nvidia-cuda-nvrtc-cu12==12.8.93
nvidia-cuda-runtime-cu12==12.8.90
nvidia-cudnn-cu12==9.10.2.21
nvidia-cudnn-frontend==1.17.0
nvidia-cufft-cu12==11.3.3.83
nvidia-cufile-cu12==1.13.1.3
nvidia-curand-cu12==10.3.9.90
nvidia-cusolver-cu12==11.7.3.90
nvidia-cusparse-cu12==12.5.8.93
nvidia-cusparselt-cu12==0.7.1
nvidia-cutlass-dsl==4.3.5
nvidia-ml-py==13.590.44
nvidia-nccl-cu12==2.27.5
nvidia-nvjitlink-cu12==12.8.93
nvidia-nvshmem-cu12==3.3.20
nvidia-nvtx-cu12==12.8.90
openai==2.15.0
openai-harmony==0.0.8
opencv-python-headless==4.13.0.90
outlines-core==0.2.11
packaging==25.0
partial-json-parser==0.2.1.1.post7
pillow==12.1.0
prometheus-client==0.24.1
prometheus-fastapi-instrumentator==7.1.0
propcache==0.4.1
protobuf==6.33.4
psutil==7.2.1
py-cpuinfo==9.0.0
pybase64==1.4.3
pycountry==24.6.1
pycparser==2.23
pydantic==2.12.5
pydantic-core==2.41.5
pydantic-extra-types==2.11.0
pydantic-settings==2.12.0
pygments==2.19.2
pyjwt==2.10.1
python-dotenv==1.2.1
python-json-logger==4.0.0
python-multipart==0.0.21
pyyaml==6.0.3
pyzmq==27.1.0
ray==2.53.0
referencing==0.37.0
regex==2026.1.15
requests==2.32.5
rich==14.2.0
rich-toolkit==0.17.1
rignore==0.7.6
rpds-py==0.30.0
safetensors==0.7.0
sentencepiece==0.2.1
sentry-sdk==2.50.0
setproctitle==1.3.7
setuptools==80.9.0
shellingham==1.5.4
six==1.17.0
sniffio==1.3.1
sse-starlette==3.2.0
starlette==0.50.0
supervisor==4.3.0
sympy==1.14.0
tabulate==0.9.0
tiktoken==0.12.0
tokenizers==0.22.2
torch==2.9.1
torchaudio==2.9.1
torchvision==0.24.1
tqdm==4.67.1
transformers==4.57.6
triton==3.5.1
typer==0.21.1
typing-extensions==4.15.0
typing-inspection==0.4.2
urllib3==2.6.3
uvicorn==0.40.0
uvloop==0.22.1
vllm==0.14.0
watchfiles==1.1.1
websockets==16.0
xgrammar==0.1.29
yarl==1.22.0