fallback for finding padToken #461

Merged: davidkoski merged 1 commit into main from fix457 on Jan 14, 2026

Conversation

@davidkoski (Collaborator)

this fixes e.g. nomic-ai/nomic-embed-text-v1.5

Proposed changes

Please include a description of the problem or feature this PR is addressing. If there is a corresponding issue, include the issue #.

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

- fixes #457
- per @rudrankriyam "It looks for explicit [PAD] token (BERT standard), falls back to EOS token ID (autoregressive models like Qwen)"
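
The lookup order quoted above can be sketched as follows. This is a minimal illustration in Python, not the code merged in this PR: the function name `find_pad_token_id`, the plain-dictionary vocabulary, and the `eos_token_id` parameter are all hypothetical stand-ins for whatever the real tokenizer exposes. The error message is the one reported in the linked issue.

```python
from typing import Mapping, Optional


def find_pad_token_id(vocab: Mapping[str, int], eos_token_id: Optional[int]) -> int:
    """Pick a padding token ID (hypothetical helper, for illustration only).

    1. Prefer an explicit "[PAD]" entry in the vocabulary (BERT-style models).
    2. Otherwise fall back to the EOS token ID (autoregressive models).
    3. If neither is available, raise the error reported in the issue.
    """
    pad_id = vocab.get("[PAD]")
    if pad_id is not None:
        return pad_id
    if eos_token_id is not None:
        return eos_token_id
    raise ValueError("Could not determine a padding token from the tokenizer.")


# A BERT-style vocabulary with an explicit [PAD] entry uses that entry.
bert_like = {"[PAD]": 0, "[CLS]": 101, "[SEP]": 102}
print(find_pad_token_id(bert_like, eos_token_id=None))

# A vocabulary without [PAD] falls back to the EOS token ID that is passed in.
decoder_like = {"hello": 10, "world": 11}
print(find_pad_token_id(decoder_like, eos_token_id=7))
```

Comparing against `None` rather than relying on truthiness matters here, because a padding ID of 0 is valid and common.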

davidkoski requested a review from awni on January 14, 2026 16:55
@awni (Member) left a comment:

LGTM!

davidkoski merged commit 44b14cf into main on Jan 14, 2026
2 checks passed
davidkoski deleted the fix457 branch on January 14, 2026 18:14

Development

Successfully merging this pull request may close these issues.

[BUG] embedder-tool throws "Could not determine a padding token from the tokenizer." errors

2 participants