Setting ignore_eos for llama serverless endpoint #51

Closed
@vidhishanair

ignore_eos: str = "false"

LlamaServerlessAzureRestEndpointModel, which is used to run the 405B models, sets ignore_eos: str = "false" by default. The value is passed to the API as a string and, as a consequence, is not set correctly. This causes the model to continue past EOS and generate random tokens until it hits the max_token limit.

Fix: set ignore_eos: bool = False. I have tested this fix on Calendar Planning.
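To illustrate the difference, here is a minimal sketch of how the two defaults serialize into a JSON request body. The dataclasses are hypothetical stand-ins, not the real LlamaServerlessAzureRestEndpointModel, and the truthiness check at the end is only one plausible way a server could misread the string; how the endpoint actually parses it is an assumption.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StrFlagParams:
    # Hypothetical stand-in for the current default (string-typed flag).
    ignore_eos: str = "false"


@dataclass
class BoolFlagParams:
    # Hypothetical stand-in for the proposed fix (real boolean).
    ignore_eos: bool = False


str_payload = json.dumps(asdict(StrFlagParams()))
bool_payload = json.dumps(asdict(BoolFlagParams()))

print(str_payload)   # {"ignore_eos": "false"}  (a JSON string)
print(bool_payload)  # {"ignore_eos": false}    (a JSON boolean)

# Assumption: if the receiving side applies a plain truthiness check,
# the non-empty string "false" counts as True, while the boolean does not.
str_flag_is_truthy = bool(json.loads(str_payload)["ignore_eos"])
bool_flag_is_truthy = bool(json.loads(bool_payload)["ignore_eos"])
print(str_flag_is_truthy, bool_flag_is_truthy)  # True False
```

With the string default the payload carries the JSON string "false" rather than the JSON literal false, which matches the symptom described above.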

We also need to test the other boolean flags, such as skip_special_tokens and use_beam_search. The Mistral model class has similar str flags too.
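One way to find the remaining flags is to scan the request parameters for values that are strings spelling a boolean. The helper below is a rough sketch; find_stringified_bools and the example parameter values are hypothetical and not taken from the repository.

```python
from typing import Any, Dict, List


def find_stringified_bools(params: Dict[str, Any]) -> List[str]:
    """Return the names of parameters whose value is the string 'true' or 'false'."""
    return sorted(
        name
        for name, value in params.items()
        if isinstance(value, str) and value.strip().lower() in {"true", "false"}
    )


# Hypothetical request parameters, for illustration only.
params = {
    "ignore_eos": "false",
    "skip_special_tokens": "true",
    "use_beam_search": "false",
    "max_tokens": 2000,
    "temperature": 0.0,
}

suspect = find_stringified_bools(params)
print(suspect)  # ['ignore_eos', 'skip_special_tokens', 'use_beam_search']
```

Running a check like this against the Llama and Mistral model classes would list the flags that should be converted to bool and retested.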
