Merged
17 changes: 16 additions & 1 deletion matrix/app_server/llm/openai.proto
@@ -42,11 +42,26 @@ message Usage {
   int32 total_tokens = 3;
 }

+message FunctionCall {
+  string name = 1;
+  string arguments = 2;
+}
+message ToolCall {
+  string id = 1;
+  string type = 2; // Only "function" is valid
+  FunctionCall function = 3;
+}
+
Comment on lines +45 to +54
Contributor

How are multiple arguments represented in the proto?

 // CompletionMessage is the message that is sent to the server.
 message CompletionMessage {
   string role = 1;
   string content = 2;
-  string name = 3; // omitempty will need to be handled in implementation
+  string refusal = 3;
+  repeated FunctionCall function_call = 4;
+  repeated ToolCall tool_calls = 5;
+
+  // vLLM-specific fields that are not in OpenAI spec
+  string reasoning_content = 6;
 }

 message ChatCompletionResponseChoice {
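Regarding the question above: the proto does not model multiple arguments as separate fields. As in the OpenAI API, FunctionCall.arguments is a single string that is expected to carry one JSON-encoded object, so multiple arguments travel inside that string. A minimal sketch with the generated Python bindings, assuming the repository root is importable and the generated classes match the messages above (the tool name and argument values are made up for illustration):

import json

from matrix.app_server.llm import openai_pb2

# Multiple arguments are packed into one JSON string inside FunctionCall.arguments.
call = openai_pb2.ToolCall(
    id="call_0",
    type="function",  # only "function" is valid per the proto comment
    function=openai_pb2.FunctionCall(
        name="get_weather",  # hypothetical tool name
        arguments=json.dumps({"city": "Paris", "unit": "celsius"}),
    ),
)

message = openai_pb2.CompletionMessage(
    role="assistant",
    tool_calls=[call],
)

# Reading the arguments back is a plain json.loads on the string field.
args = json.loads(message.tool_calls[0].function.arguments)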
58 changes: 31 additions & 27 deletions matrix/app_server/llm/openai_pb2.py

Some generated files are not rendered by default.

8 changes: 0 additions & 8 deletions matrix/app_server/llm/ray_serve_vllm.py
@@ -460,14 +460,6 @@ async def CreateCompletion(self, request):
         await self.openai_serving_chat.models.init_static_loras()
         generator = await self.openai_serving_completion.create_completion(
             completion_request,
-            Request( # this Request is purely dummy, it is changed to optional in vllm's recent pull https://github.com/vllm-project/vllm/pull/12503
Contributor

Does removing it work for vLLM 0.8.3 too?

Contributor Author

Yes, this was updated a long time ago.

-                scope={
-                    "type": "http",
-                    "method": "GET",
-                    "path": "",
-                    "headers": [],
-                }
-            ),
         )
         if isinstance(generator, ErrorResponse):
             if hasattr(generator, "error"):
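For reference, a minimal sketch of the simplified call after this deletion, assuming a recent vLLM where raw_request is optional (per the vLLM pull request linked in the removed comment); serving_completion and completion_request stand in for the handler's own objects:

from vllm.entrypoints.openai.protocol import ErrorResponse


async def run_completion(serving_completion, completion_request):
    # No dummy starlette Request is needed: raw_request defaults to None in recent vLLM.
    generator = await serving_completion.create_completion(completion_request)
    if isinstance(generator, ErrorResponse):
        # Mirror the handler's check: newer vLLM nests details under .error.
        detail = generator.error if hasattr(generator, "error") else generator
        raise RuntimeError(str(detail))
    return generator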