Add batched inference possibility #15

Merged
aerdem4 merged 1 commit into NVIDIA:main from alonsosilvaallende:add-batched-inference
Apr 26, 2025

alonsosilvaallende commented Apr 25, 2025

If the logits processors provide a `clone` attribute, they get copied for every item in the batch.
I found this in the vLLM code (I don't think it is documented):
https://github.com/vllm-project/vllm/blob/19dcc02a72e3ed52e3bf95aae44ea1f40ce42ea0/vllm/sampling_params.py#L537-L550

Solves #14
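
The copy-per-request behaviour described above can be illustrated with a minimal, self-contained sketch. It does not import vLLM. It assumes the linked lines check each logits processor for a `clone` attribute when the sampling parameters are copied, and the names `SharedProcessor`, `PerRequestProcessor`, and `ToyParams` are hypothetical stand-ins, not classes from vLLM or from this repository.

```python
import copy


class SharedProcessor:
    """Hypothetical stateful logits processor with no clone(); name is illustrative."""

    def __init__(self):
        self.calls = 0

    def __call__(self, token_ids, logits):
        self.calls += 1  # state that leaks across sequences if the object is shared
        return logits


class PerRequestProcessor(SharedProcessor):
    """Same processor, but opting in to per-request copies via clone()."""

    def clone(self):
        return PerRequestProcessor()  # fresh state for each request


class ToyParams:
    """Stand-in for a sampling-parameters object; not vLLM's class."""

    def __init__(self, logits_processors):
        self.logits_processors = logits_processors

    def clone(self):
        # Assumption: mirrors a hasattr(lp, "clone") check like the linked lines.
        # Pre-seeding deepcopy's memo makes it return the chosen object for each
        # processor instead of deep-copying it.
        memo = {
            id(lp): lp.clone() if hasattr(lp, "clone") else lp
            for lp in self.logits_processors
        }
        return copy.deepcopy(self, memo=memo)


params = ToyParams([PerRequestProcessor(), SharedProcessor()])
first, second = params.clone(), params.clone()

# Only the first copy runs its processors.
first.logits_processors[0]([1, 2], [0.1, 0.9])
first.logits_processors[1]([1, 2], [0.1, 0.9])
```

In this sketch the processor with `clone` ends up with independent state in `first` and `second`, while the one without it is the same object in both copies, so its counter is visible from `second` as well. That sharing is the kind of cross-sequence state a stateful processor would presumably need to avoid in batched inference.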

aerdem4 commented Apr 25, 2025

Great work!

Can you please

  1. Sign your commits
  2. Make sure flake8 doesn't complain
  3. Increment the package version so that I can update it on PyPI after it is merged?

alonsosilvaallende commented

Thank you very much, @aerdem4.
I have fixed the issues. Let me know if something else is needed.

aerdem4 commented Apr 26, 2025

Thanks, @alonsosilvaallende. All commits must have verified signatures for us to merge it. You could squash them into one signed commit.

alonsosilvaallende commented

I tried to do what you mentioned, but I don't think I succeeded.

aerdem4 commented Apr 26, 2025

Now it has a merge conflict. You may get help from an LLM to make it work.

alonsosilvaallende commented

I tried with the help of Claude. Does it work now?

aerdem4 merged commit 5d65e36 into NVIDIA:main Apr 26, 2025
1 check passed
alonsosilvaallende deleted the add-batched-inference branch April 26, 2025 20:47

aerdem4 commented Apr 26, 2025

Thank you. I just published the new version on PyPI.
