[GRPO] Fix re-tokenization bug in tool-calling loop by concatenating token IDs #5242
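A minimal sketch of the kind of mismatch the title refers to, using a toy tokenizer rather than TRL's actual code. The vocabulary, `encode`, and `decode` below are hypothetical and exist only for illustration. It shows that decoding token IDs to text and encoding the text again may not return the IDs the model actually sampled, whereas concatenating the ID lists preserves them.

```python
# Toy vocabulary (hypothetical, not a real tokenizer): "ab" also exists as a merged token.
VOCAB = {"a": 0, "b": 1, "ab": 2}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}


def encode(text):
    """Greedy longest-match encoding over the toy vocabulary."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate piece first, then shorter ones.
        for length in range(len(text) - pos, 0, -1):
            piece = text[pos:pos + length]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos += length
                break
        else:
            raise ValueError(f"cannot encode {text[pos:]!r}")
    return ids


def decode(ids):
    """Map IDs back to text by joining the token strings."""
    return "".join(ID_TO_TOKEN[i] for i in ids)


prompt_ids = encode("a")   # prompt tokenized once, up front
completion_ids = [1]       # assume the model sampled "b" as a separate token

# Approach in the PR title: keep working in ID space.
concatenated = prompt_ids + completion_ids

# Round trip through text: the merged token "ab" replaces the sampled pair.
retokenized = encode(decode(prompt_ids) + decode(completion_ids))

print(concatenated)  # [0, 1]
print(retokenized)   # [2]
```

Both sequences decode to the same string, but only the concatenated one matches the tokens that were actually generated, which matters when per-token quantities such as log-probabilities are computed on the sequence.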
Open
qgallouedec wants to merge 102 commits into main from fix-retokenization-tool-loop
+63 −45
Changes from 97 commits
f10285e support prompts or token IDs in VLLMClient and update API request han… (qgallouedec)
7d2bb67 test (qgallouedec)
3b356ac consistency (qgallouedec)
82c4508 fix (qgallouedec)
3ea2fcf another fix (qgallouedec)
445f4ba fix docstring (qgallouedec)
8c6c88d Add support for multi-modal inputs in VLLMClient and vllm_serve (qgallouedec)
f617b2d Merge branch 'main' into vllm-accept-token-ids (qgallouedec)
eaffd67 Merge branch 'main' into vllm-accept-token-ids (qgallouedec)
f3f6a5d Move `rollout_func` from `_generate_single_turn` to `_generate` (qgallouedec)
d417543 fix style (qgallouedec)
4b927d6 support multi-image (qgallouedec)
029fc1f style (qgallouedec)
20b4039 Merge branch 'vllm-accept-token-ids' into vllm-support-image-with-raw… (qgallouedec)
b8e3912 Merge branch 'vllm-support-image-with-raw-token' into move-rollout-func (qgallouedec)
07181cb Fix handling of images in OnlineDPOTrainer to ensure proper structure… (qgallouedec)
6ff1e56 Merge branch 'main' into vllm-accept-token-ids (qgallouedec)
9f340e4 Merge branch 'vllm-accept-token-ids' into vllm-support-image-with-raw… (qgallouedec)
d138be7 Merge branch 'vllm-support-image-with-raw-token' into move-rollout-func (qgallouedec)
09128d6 Move tokenization before vLLM generation call (qgallouedec)
7fd1711 Fix deadlock issue by ensuring images are always gathered in VLLMGene… (qgallouedec)
3ab04b0 Unify tokenization across all generation backends in _generate_single… (qgallouedec)
5d6d067 Extract tokenization out of _generate_single_turn into _tokenize_prompts (qgallouedec)
b4d2c34 Enhance multimodal input handling in GRPO and RLOO trainers by adding… (qgallouedec)
4922362 style (qgallouedec)
37c48b3 Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
3375aea Fix re-tokenization bug in tool-calling loop by concatenating token IDs (qgallouedec)
638f88a Enhance _tool_call_loop to support multimodal inputs by adding images… (qgallouedec)
9825358 Refactor generation methods in GRPO and RLOO trainers to remove unuse… (qgallouedec)
65d62db Refactor GRPOTrainer generation methods to remove unused extra_fields… (qgallouedec)
d1685b1 multimodal (qgallouedec)
71de8c0 fix (qgallouedec)
0a264a2 Fix tokenization padding issue in GRPOTrainer to handle unpadded inpu… (qgallouedec)
0aa0e30 style (qgallouedec)
b490357 Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
6fd47dc Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
8fecba1 align rloo (qgallouedec)
6c093dd style (qgallouedec)
a9a91c7 Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
934aae7 Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
7e863e1 fix (qgallouedec)
f033e63 revert doc modif (qgallouedec)
5a1f609 Merge branch 'vllm-accept-token-ids' into vllm-support-image-with-raw… (qgallouedec)
1eb3540 Merge branch 'vllm-support-image-with-raw-token' into move-rollout-func (qgallouedec)
498a564 Merge branch 'move-rollout-func' into vllm-generate-with-token-ids (qgallouedec)
be2ff99 Merge branch 'vllm-generate-with-token-ids' into unify-tokenization-g… (qgallouedec)
5df2069 Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
ae8767f Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
d3f7971 Merge branch 'main' into vllm-support-image-with-raw-token (qgallouedec)
319d52a simplify multimodal (qgallouedec)
d5e1906 Merge branch 'main' into vllm-support-image-with-raw-token (qgallouedec)
4ccadcf Merge branch 'vllm-support-image-with-raw-token' into move-rollout-func (qgallouedec)
2a80df9 Merge branch 'move-rollout-func' into vllm-generate-with-token-ids (qgallouedec)
a0df552 Merge branch 'vllm-generate-with-token-ids' into unify-tokenization-g… (qgallouedec)
3350588 Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
19ffe9e Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
0558dc9 Merge branch 'main' into move-rollout-func (qgallouedec)
6ebb681 Merge branch 'move-rollout-func' into vllm-generate-with-token-ids (qgallouedec)
93640e4 Merge branch 'vllm-generate-with-token-ids' into unify-tokenization-g… (qgallouedec)
1c009b0 Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
0c1fe0f Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
97a813b Merge branch 'main' into vllm-generate-with-token-ids (qgallouedec)
83ab9bd Merge branch 'vllm-generate-with-token-ids' into unify-tokenization-g… (qgallouedec)
408fb2e Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
087b5e9 Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
ade2831 Merge branch 'main' into vllm-generate-with-token-ids (qgallouedec)
258e0a8 Update trl/trainer/grpo_trainer.py (qgallouedec)
ef96048 Update trl/trainer/rloo_trainer.py (qgallouedec)
0ee6495 Merge branch 'vllm-generate-with-token-ids' into unify-tokenization-g… (qgallouedec)
bb6dc69 Update trl/trainer/grpo_trainer.py (qgallouedec)
0effa0d Update trl/trainer/rloo_trainer.py (qgallouedec)
fad1fdd Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
f2d1e01 Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
b35f250 Remove unused chat/tool configuration parameters from VLLM and RLOO t… (qgallouedec)
040e392 Update trl/generation/vllm_generation.py (qgallouedec)
ca2cae3 Update trl/trainer/rloo_trainer.py (qgallouedec)
fee553d Merge branch 'main' into vllm-generate-with-token-ids (qgallouedec)
90df2de Merge branch 'vllm-generate-with-token-ids' into unify-tokenization-g… (qgallouedec)
f36c0ea Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
8678382 Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
fdaa90a fix (qgallouedec)
6f10cd2 style (qgallouedec)
533c337 Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
50418e0 Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
7e7e3b3 Merge branch 'main' into unify-tokenization-generate (qgallouedec)
31d8a0c Merge branch 'unify-tokenization-generate' into extract-tokenize-prompts (qgallouedec)
e88987f Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
8b4f6af Merge branch 'main' into extract-tokenize-prompts (qgallouedec)
a704d89 Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
81cf273 Merge branch 'main' into extract-tokenize-prompts (qgallouedec)
918686b Remove dead code: eliminate prompt tokenization logic from GRPOTraine… (qgallouedec)
9b8de83 remove unused extra_fields from _generate_single_turn return value (qgallouedec)
6c8f55c style (qgallouedec)
130d974 Merge branch 'extract-tokenize-prompts' into fix-retokenization-tool-… (qgallouedec)
8b27397 properly merge upstream (qgallouedec)
6c9db28 fix (qgallouedec)
441725b Merge branch 'main' into fix-retokenization-tool-loop (qgallouedec)
367a79e align with main (qgallouedec)
f3f0f8d fix (qgallouedec)
5147625 Merge branch 'main' into fix-retokenization-tool-loop (qgallouedec)
10708ca Merge branch 'main' into fix-retokenization-tool-loop (qgallouedec)
f81f6a9 Merge branch 'main' into fix-retokenization-tool-loop (qgallouedec)