
🚨 [generate] Never use cache_position anymore in generation#44816

Open
Cyrilvallez wants to merge 11 commits into main from fully-remove-cache-pos-from-generate

Conversation


@Cyrilvallez Cyrilvallez commented Mar 18, 2026

What does this PR do?

As per the title. This is the last of many PRs to remove cache_position. At this point, all models have already been updated to not use it, and it is fully ignored in all modeling code. This PR therefore removes its creation and usage in generate, so it is no longer passed as a kwarg anywhere.
This is fully safe, as all models already ignore it.

Note: the 🚨 marker is ONLY FOR REMOTE CODE. On the main repo, all models were previously adapted as explained, so there are no BC issues. For remote code, however, as with most things, this can break if the code uses cache_position in a weird way and does not provide a creation fallback inside the model.
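For remote-code authors, such a creation fallback can be as simple as deriving the positions from the past length. A minimal sketch (hypothetical helper name; plain Python lists stand in for the torch tensors used in practice):

```python
def cache_position_fallback(seq_length, past_length=0, cache_position=None):
    # Hypothetical fallback for a remote-code model: recreate the 1D
    # cache_position (a plain list here; torch.arange in practice) when
    # generate() no longer supplies it as a kwarg.
    if cache_position is None:
        cache_position = list(range(past_length, past_length + seq_length))
    return cache_position
```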

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Cyrilvallez Cyrilvallez changed the title [generate] Never use cache_position anymore in generation [generate] 🚨 Never use cache_position anymore in generation Mar 18, 2026
@Cyrilvallez Cyrilvallez changed the title [generate] 🚨 Never use cache_position anymore in generation 🚨 [generate] Never use cache_position anymore in generation Mar 18, 2026
@zucchini-nlp zucchini-nlp left a comment


While I still remember it, let's remove it from the docs as well please, and if needed add correct examples with the cache

@Cyrilvallez
Member Author

run-slow: dia

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/dia"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

Context | Commit   | Description
RUN     | 37af686e | workflow commit (merge commit)
PR      | 5cb41c5b | branch commit (from PR)
main    | 4ec84a02 | base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

@huggingface huggingface deleted a comment from github-actions bot Mar 18, 2026
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: csm, dia, ernie4_5_vl_moe, glm46v, glm4v, glm4v_moe, glm_image, glm_ocr, janus, paddleocr_vl, qwen2_5_omni, qwen2_5_vl, qwen2_vl, qwen3_5, qwen3_5_moe, qwen3_vl

Contributor

@vasqu vasqu left a comment


Not approving yet because I want to discuss the deprecation a bit more:

  1. I still found references where, imo, they shouldn't be there.
  2. Do we keep cache positions in the generate preparation (as an alias for position ids)? I think we will get a remote-model apocalypse otherwise, and vLLM already showed how brittle this is.
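To illustrate point 2, a toy sketch of where position ids and cache positions coincide and where they do not (hypothetical helper; plain lists instead of tensors): without left padding the two are the same consecutive range, which is what would make an alias viable.

```python
def positions_with_padding(attention_mask_row, past_seen=0):
    # cache_position indexes slots in the cache and ignores padding,
    # while position_ids skip padded slots; with no padding they coincide.
    seq_len = len(attention_mask_row)
    cache_position = list(range(past_seen, past_seen + seq_len))
    position_ids, seen = [], 0
    for m in attention_mask_row:
        position_ids.append(seen if m else 0)
        seen += m
    return cache_position, position_ids
```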

Comment on lines -2071 to -2073
# build `cache_position` on the fly
seq_length = inputs["input_ids"].shape[1]
inputs = self.model._get_initial_cache_position(seq_length, self.model.device, inputs)
Contributor


Just for my peace of mind, can we run-slow with whisper?

Comment on lines -933 to -941
# Cache position (always 1D)
if (cache_position := model_kwargs.get("cache_position")) is not None:
    next_cache_position = (
        torch.arange(num_new_tokens, dtype=cache_position.dtype, device=cache_position.device)
        + cache_position[-1]
        + 1
    )
    next_cache_position = torch.cat((cache_position, next_cache_position))
    model_kwargs["cache_position"] = next_cache_position
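For reference, the removed extension logic boils down to appending consecutive positions after the last existing one; a pure-Python mirror (lists instead of torch tensors):

```python
def extend_cache_position(cache_position, num_new_tokens):
    # Mirror of the removed torch logic: the new positions continue
    # directly after the last existing cache position.
    next_positions = [cache_position[-1] + 1 + i for i in range(num_new_tokens)]
    return cache_position + next_positions
```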
Contributor


I think complete removal of cache position will be too breaking for remote code, and there is still quite a lot of it out there, just looking at all the vLLM stuff we had to fix 😭

Shouldn't position ids be the same as cache positions now? What do you think about passing this as an alias kwarg as well? We really need to check with a remote model, e.g. the deepseek v3 remote code maybe?
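The alias idea could look roughly like this in the input-preparation step (hypothetical helper; assumes 2D position_ids that are identical across the batch, so row 0 can stand in as the 1D cache_position):

```python
def add_cache_position_alias(model_inputs):
    # Hypothetical alias kwarg: reuse row 0 of position_ids as the 1D
    # cache_position, so unmodified remote code keeps receiving it.
    position_ids = model_inputs.get("position_ids")
    if position_ids is not None and model_inputs.get("cache_position") is None:
        model_inputs = {**model_inputs, "cache_position": position_ids[0]}
    return model_inputs
```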

Contributor


General question: Are we deprecating it everywhere?

I think I still see a few occurrences:

  • Mask creation
  • Within this executorch integration
  • Models
    • Lfm2
    • Ministral3
    • Mistral4
  • Tests

Imo, only the mask occurrence might be critical and might need to be kept a bit longer. Wdyt?
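On why mask creation is the sticky occurrence: the causal mask only needs each query token's absolute position, whether that comes from cache_position or from past_length plus an arange. A toy sketch (plain nested lists instead of tensors):

```python
def causal_mask_rows(query_positions, kv_length):
    # A key/value slot is visible iff its index does not exceed the
    # query token's absolute position.
    return [[1 if kv <= q else 0 for kv in range(kv_length)] for q in query_positions]
```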



4 participants