Commit 8520eb8

Update tokenize.py for common >= 1.4.0
1 parent 41c793f commit 8520eb8

1 file changed: 4 additions, 0 deletions

finetune/data/tokenize.py

Lines changed: 4 additions & 0 deletions
@@ -311,6 +311,10 @@ def tokenize_instruct(
                 is_first=msg_idx == first_user_idx,
                 system_prompt=sample.system_prompt,
             )
+            if isinstance(curr_tokens, tuple):
+                # Versions of mistral_common>1.3.4 return a tuple of tokens (text), tokens (image), spans (image)
+                curr_tokens = curr_tokens[0]
+
             curr_masks = [False] * len(curr_tokens)  # only predict bot answers
         elif isinstance(message, ToolMessage):
             curr_tokens = instruct_tokenizer.encode_tool_message(
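
The added guard can be read as a small normalization step: accept either return shape and keep only the text tokens. Below is a minimal standalone sketch of that pattern. The helper name `text_tokens` and the sample values are hypothetical and are not part of this commit or of mistral_common; the only assumption taken from the diff comment is that the newer return value is a tuple whose first element holds the text tokens.

```python
from typing import Any, List, Tuple, Union


def text_tokens(result: Union[List[int], Tuple[Any, ...]]) -> List[int]:
    """Return the text tokens from either return shape.

    Hypothetical helper: a plain list is returned unchanged, while a tuple
    is assumed to carry the text tokens as its first element.
    """
    if isinstance(result, tuple):
        return result[0]
    return result


# Made-up token ids, used only to exercise both shapes.
old_style = [1, 5, 9]            # flat list of text tokens
new_style = ([1, 5, 9], [], [])  # (text tokens, image tokens, image spans)

# Both shapes yield the same text tokens, so the mask length stays correct.
masks = [False] * len(text_tokens(new_style))
```

Checking the type at the call site, rather than pinning a library version, keeps the code working against both return shapes, which appears to be the intent of the `isinstance` check in the diff.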
