Open
Description
Llama3 Tokenizer was updated to support add_end_token in tokenize_messages to support correct generation #1494. These changes need to be made to Mistral.
Changes:
- Update tokenize_message to use add_start_tokens and add_end_tokens like in Make final EOT tag optional for Llama3 tokenizer #1494
- Replace add_eos with add_end_tokens and update tokenize_messages as in Make final EOT tag optional for Llama3 tokenizer #1494
- In call update tokens, mask = self.tokenize_messages(messages) to tokens, mask = self.tokenize_messages(messages, add_end_tokens=not inference)