
Commit 16c972f (1 parent: 3631096)

fix: custom tokenizer truncates inputs to model max input length (kubernetes-sigs#266)

File tree

1 file changed: +1 −1


inference_perf/utils/custom_tokenizer.py (1 addition, 1 deletion)

```diff
@@ -24,7 +24,7 @@ def __init__(self, config: CustomTokenizerConfig) -> None:
     def count_tokens(self, text: str) -> int:
         if text == "":
             return 0
-        return len(self.tokenizer(text).input_ids)
+        return len(self.tokenizer(text, truncation=True, max_length=self.tokenizer.model_max_length).input_ids)

     def get_tokenizer(self) -> PreTrainedTokenizerBase:
         return self.tokenizer
```
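The fix passes `truncation=True` and `max_length=self.tokenizer.model_max_length` so token counting never exceeds the model's input limit. The sketch below illustrates that behavior with a toy whitespace tokenizer that mimics the Hugging Face `__call__` signature; `ToyTokenizer`, `_Encoding`, and the `model_max_length=8` limit are hypothetical stand-ins, not part of the real `PreTrainedTokenizerBase`.

```python
# Minimal sketch of the fixed count_tokens behavior. ToyTokenizer is a
# hypothetical stand-in for a Hugging Face tokenizer: it splits on
# whitespace and honors truncation/max_length like the real call does.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class _Encoding:
    input_ids: List[int]


@dataclass
class ToyTokenizer:
    model_max_length: int = 8  # hypothetical model input limit

    def __call__(self, text: str, truncation: bool = False,
                 max_length: Optional[int] = None) -> _Encoding:
        ids = list(range(len(text.split())))  # one "token" per word
        if truncation and max_length is not None:
            ids = ids[:max_length]  # drop tokens past the limit
        return _Encoding(input_ids=ids)


def count_tokens(tokenizer: ToyTokenizer, text: str) -> int:
    """Count tokens, truncated to the model max length (as in the fix)."""
    if text == "":
        return 0
    return len(
        tokenizer(text, truncation=True,
                  max_length=tokenizer.model_max_length).input_ids
    )


tok = ToyTokenizer(model_max_length=8)
long_text = " ".join(["word"] * 20)
print(count_tokens(tok, long_text))  # 8: capped at model_max_length
print(count_tokens(tok, ""))         # 0: empty input short-circuits
```

Without the `truncation=True` change, the count for `long_text` would be 20, which can overstate token usage for inputs longer than the model accepts.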
