
AssertionError in position embedding (potentially due to missing clear_cache between batches of data) #1340

@ameyagodbole

Description


Describe the bug
I have been trying to use lm-evaluation-harness with gpt-neox/eval.py. As far as I can tell, the query types other than generate_until work fine. With generate_until, I run into the following assertion (in the position embedding module) after a couple of examples have been processed:

```python
assert seq_len <= self.max_seq_len
```

In my testing, the model is about to generate, say, token 48, and I have verified that token_index_to_generate in gpt-neox/megatron/text_generation_utils.py is indeed 48. Yet somehow RotaryEmbedding tries to create an embedding for position 1025 (beyond the model_max_length).
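To make the failure mode concrete, here is a minimal, self-contained sketch of how a stale generation cache could push the rotary position past the limit. This is not the actual gpt-neox code: the name RotaryEmbedding, max_seq_len, and the failing assert come from the issue, while the offset bookkeeping and all other names are hypothetical illustration.

```python
import torch

class RotaryEmbeddingSketch(torch.nn.Module):
    """Hypothetical stand-in for the real RotaryEmbedding module."""

    def __init__(self, dim: int, max_seq_len: int = 1024):
        super().__init__()
        self.max_seq_len = max_seq_len
        inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq)

    def forward(self, x: torch.Tensor, offset: int = 0) -> torch.Tensor:
        # The absolute position of the last token is the number of
        # previously cached tokens (offset) plus the new tokens in x.
        seq_len = x.shape[1] + offset
        assert seq_len <= self.max_seq_len  # the assert that fires
        t = torch.arange(seq_len, device=x.device).float()
        return torch.einsum("i,j->ij", t, self.inv_freq)

rope = RotaryEmbeddingSketch(dim=64, max_seq_len=1024)

# Batch 1: a 1000-token example; the generation cache ends up
# holding 1000 positions.
rope(torch.randn(1, 1000, 64), offset=0)

# Batch 2: if the cache is NOT cleared between batches, the stale
# offset (1000) is added to the new example, so generating token 48
# asks for position 1048 > 1024 and the assert fires.
rope(torch.randn(1, 48, 64), offset=1000)  # AssertionError
```

This would explain the symptom above: token_index_to_generate is a small number like 48 within the current example, but the position handed to the rotary embedding is 48 plus whatever the previous batch left behind in the cache.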

To Reproduce
Will fill in reproducible configs. Currently, I'm using a model with a custom config (but trained in neox) and evaluating on a QA dataset (where eval-harness uses generate_until).

Proposed solution
I suspect the issue is caused by a missing clear_cache() between batches of data. Adding model.module.clear_cache() at the start of stream_tokens in gpt-neox/megatron/text_generation_utils.py seems to fix it on my side (see the sketch below).

I am unsure whether this is the right fix and whether it is complete. The same clear_cache call is invoked in generate_samples_interactive, but not in generate_samples_from_prompt.
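For reference, a sketch of where the call would go. The real signature of stream_tokens in gpt-neox takes more arguments than shown here, so the argument list below is only illustrative; clear_cache and stream_tokens are the names referenced above.

```python
# In gpt-neox/megatron/text_generation_utils.py (paraphrased).
def stream_tokens(neox_args, model, context_tokens, **kwargs):
    # Proposed fix: drop cached key/value state (and with it the rotary
    # position offset) carried over from the previous batch, so each
    # batch starts generating from position 0 of its own context.
    model.module.clear_cache()
    # ... existing token-streaming loop continues unchanged ...
```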

Environment (please complete the following information):


Labels: bug
