bug fix - remove attn.bias keys from GPT state dict in 'from_pretrine… by amnonbleich · Pull Request #122 · karpathy/minGPT

amnonbleich · 2023-08-21T09:44:38Z

Bug fix: remove the attn.bias keys from the GPT state dict in 'from_pretrained'; otherwise the assertion fails. If this is not a bug, I would be happy to hear the reasoning. In addition, these keys are not used anywhere else, only in the assertion.


erno123 · 2024-01-16T19:21:20Z

The root cause of the problem is that persistent=False is set for the attn.bias buffers in the original Hugging Face code (https://github.com/huggingface/transformers/blob/main/src/transformers/models/gpt2/modeling_gpt2.py, line 133). As a result, these keys are not included in the state dictionary on the HF side, while they still are in minGPT. That is why the assertion fails in line 200 of minGPT/model.py.

So a better solution is to also set the same persistent=False option for the attn.bias keys in line 48 of minGPT/model.py, like this:
self.register_buffer("bias", torch.tril(torch.ones(config.block_size, config.block_size))
.view(1, 1, config.block_size, config.block_size), persistent=False)
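To illustrate what persistent=False changes, here is a minimal sketch with a toy module. The class name `TinyAttn` and the block size of 4 are made up for the example and are not taken from minGPT or Hugging Face; only `register_buffer` and its `persistent` argument are real PyTorch API.

```python
import torch
import torch.nn as nn


class TinyAttn(nn.Module):
    """Hypothetical stand-in for an attention layer that only holds the causal mask."""

    def __init__(self, block_size, persistent):
        super().__init__()
        # Lower-triangular causal mask, shaped like the buffer in the suggestion above
        mask = torch.tril(torch.ones(block_size, block_size)).view(1, 1, block_size, block_size)
        self.register_buffer("bias", mask, persistent=persistent)


# A persistent buffer shows up in the state dict, a non-persistent one does not
keys_persistent = list(TinyAttn(4, persistent=True).state_dict().keys())
keys_non_persistent = list(TinyAttn(4, persistent=False).state_dict().keys())

print(keys_persistent)      # ['bias']
print(keys_non_persistent)  # []

# The buffer itself is still available on the module either way
mask_shape = tuple(TinyAttn(4, persistent=False).bias.shape)
print(mask_shape)           # (1, 1, 4, 4)
```

In both cases the mask stays usable in the forward pass; only its presence in the state dict, and therefore the key count, differs.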

Also, the attn.masked_bias keys get the same persistent=False option in the HF code, so they are not included in the HF state dictionary either. Excluding them in line 196 of minGPT/model.py is therefore unnecessary. Consequently, the keys variable is not needed at all; sd_hf can be used directly everywhere.
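As a rough sketch of why the exclusion becomes a no-op: the dictionary below is a hypothetical stand-in for the HF state dict, with key names made up for illustration, and the filter is only an assumption about how such an exclusion might be written.

```python
# Hypothetical stand-in for the Hugging Face state dict (values omitted)
sd_hf = {
    "transformer.h.0.attn.c_attn.weight": None,
    "transformer.h.0.attn.c_attn.bias": None,
    "transformer.h.0.attn.c_proj.weight": None,
}

# Assumed form of the exclusion: drop any key ending in attn.masked_bias
keys = [k for k in sd_hf if not k.endswith("attn.masked_bias")]

# No such key exists, so filtering leaves the key list unchanged
filter_is_noop = keys == list(sd_hf)
print(filter_is_noop)  # True
```

When no attn.masked_bias key is present, iterating over sd_hf directly gives the same result as iterating over the filtered list.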

Commit 88ba2a4: bug fix - remove attn.bias keys from GPT state dict in 'from_pretrine…d'. otherwise assertion fails

danra mentioned this pull request May 23, 2026

AssertionError when run generate.ipynb with default parameter #120

Open


amnonbleich wants to merge 1 commit into karpathy:master from amnonbleich:master
