Setting ignore_eos for llama serverless endpoint #51

https://github.com/microsoft/eureka-ml-insights/blob/1713e793815fe8e0e4891c187617fcd0ab973cd0/eureka_ml_insights/models/models.py#L279

LlamaServerlessAzureRestEndpointModel, which is used to run 405B models, sets ignore_eos: str = "false" by default. The value is passed to the API as a string, so the flag is not set correctly. As a result, the model continues generating past EOS and produces random tokens until it reaches the max_token limit.

Fix: set ignore_eos: bool = False. I have tested this fix on Calendar Planning.
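
A minimal sketch of why the type matters, using only the standard library. The payload fragments below are assumptions for illustration and are not the exact request body built in models.py. The point is only that a Python str and a Python bool serialize to different JSON values, and that a non-empty string is truthy.

```python
import json

# Hypothetical payload fragments; the real request body has more fields.
payload_str = {"ignore_eos": "false"}   # current default: a string
payload_bool = {"ignore_eos": False}    # proposed default: a boolean

as_str = json.dumps(payload_str)
as_bool = json.dumps(payload_bool)

print(as_str)   # the value is serialized as a quoted JSON string
print(as_bool)  # the value is serialized as a JSON boolean

# Any consumer that only checks truthiness will treat the string "false" as true.
string_is_truthy = bool("false")
print(string_is_truthy)
```

How the endpoint actually interprets the string is not confirmed here; the observed behavior (generation continuing past EOS) is consistent with the string not being read as false.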

We also need to test the other boolean flags, such as skip_special_tokens and use_beam_search. The Mistral model class has similar str flags.
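
One possible way to audit the remaining flags is sketched below. ExampleEndpointConfig and find_string_bool_flags are hypothetical names made up for this example, and the defaults shown are assumed rather than copied from the repository. The sketch also assumes the model classes are dataclasses.

```python
from dataclasses import dataclass, fields


@dataclass
class ExampleEndpointConfig:
    # Hypothetical stand-in for a model class; not the real definition.
    ignore_eos: str = "false"
    skip_special_tokens: str = "false"
    use_beam_search: str = "false"
    temperature: float = 0.0


def find_string_bool_flags(config):
    """Return names of fields whose value is the string 'true' or 'false'."""
    suspicious = []
    for f in fields(config):
        value = getattr(config, f.name)
        if isinstance(value, str) and value.strip().lower() in {"true", "false"}:
            suspicious.append(f.name)
    return suspicious


flagged = find_string_bool_flags(ExampleEndpointConfig())
print(flagged)
```

Each flagged field would still need to be tested against the endpoint after switching it to bool, as was done for ignore_eos.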

