Feature request
Add OpenAI-compatible Responses and Chat Completions endpoints to vllm_serve.py, so that applications implementing a custom rollout function can talk to it as if it were an OpenAI or vLLM server exposing these endpoints. This should also support tool call formatting. It would make training existing agents much easier.
Motivation
Writing a custom rollout function around an agent that is built against OpenAI-compatible Responses and Chat Completions endpoints is challenging when vllm_serve.py does not provide those endpoints itself.
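For concreteness, here is a minimal sketch of what a custom rollout function could look like if such an endpoint existed. The base URL, endpoint path, model name, and tool definition are all hypothetical assumptions (the standard OpenAI chat completions request shape), not actual TRL or vLLM API:

```python
# Hypothetical sketch: a custom rollout function calling an OpenAI-compatible
# chat completions endpoint that vllm_serve.py might expose.
# VLLM_BASE_URL, the model name, and the tool definition are all assumptions.
import json
import urllib.request

VLLM_BASE_URL = "http://localhost:8000/v1"  # assumed server address


def build_chat_request(messages, tools=None, model="my-model"):
    """Build an OpenAI-style chat completions payload, optionally with tools."""
    payload = {"model": model, "messages": messages}
    if tools:
        payload["tools"] = tools
    return payload


def rollout(prompt):
    """Hypothetical rollout step: one chat completions round trip."""
    payload = build_chat_request(
        messages=[{"role": "user", "content": prompt}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_weather",  # example tool, purely illustrative
                "parameters": {"type": "object", "properties": {}},
            },
        }],
    )
    req = urllib.request.Request(
        f"{VLLM_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    # with urllib.request.urlopen(req) as resp:   # would hit the real server
    #     return json.load(resp)["choices"][0]["message"]
    return payload  # returned here so the sketch runs without a server
```

The point is that an existing agent built on the OpenAI client could be pointed at vllm_serve.py unchanged, instead of being rewritten against TRL's internal generation API.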
Your contribution
This PR appears to implement what I am requesting: #3469