Description
System Info
langchain 1.0, xinference v1.17.0, CUDA 12.9
Running Xinference with Docker?
- docker
- pip install
- installation from source
Version info
xinference v1.17.0
The command used to start Xinference
Run with: xinference-local --host 0.0.0.0 --port 9997
Reproduction
from typing import List, Optional

from pydantic import BaseModel, Field
from langchain_core.messages import HumanMessage, SystemMessage

# StateType, llm, system_prompt, and user_input are defined elsewhere in the
# caller's code; llm is a langchain-xinference chat model (see traceback below).

class DicInfo(BaseModel):
    """xxxx"""
    category: StateType = Field(description="xxxx")
    confidence: Optional[float] = Field(
        default=0.0,
        ge=0.0,
        le=1.0,
        description="xxxx",
    )
    detail_categories: Optional[List[str]] = Field(
        default_factory=list,
        description="xxxx",
    )

structured_llm = llm.with_structured_output(DicInfo)
messages = [SystemMessage(content=system_prompt), HumanMessage(content=user_input)]
final_answer = structured_llm.invoke(messages)
Traceback (most recent call last):
File "/xxxx/xxxx.py", line 130, in classification_execution
final_answer = structured_llm.invoke(messages)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/xxxx/site-packages/langchain_core/runnables/base.py", line 3149, in invoke
input_ = context.run(step.invoke, input_, config, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/xxxx/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 402, in invoke
self.generate_prompt(
File "/xxxx/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 1121, in generate_prompt
return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/xxxx/site-packages/langchain_core/language_models/chat_models.py", line 931, in generate
self._generate_with_cache(
File "/xxxx/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 1225, in _generate_with_cache
result = self._generate(
^^^^^^^^^^^^^^^
File "/xxxx/lib/python3.12/site-packages/langchain_xinference/chat_models.py", line 226, in _generate
final_chunk = self._chat_with_aggregation(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/xxxx/lib/python3.12/site-packages/langchain_xinference/chat_models.py", line 310, in _chat_with_aggregation
for stream_resp in response:
^^^^^^^^
File "/xxxx/lib/python3.12/site-packages/xinference/client/common.py", line 62, in streaming_response_iterator
raise Exception(str(error))
Exception: [address=127.0.0.1:39175, pid=167075] Expecting ',' delimiter: line 3 column 1 (char 120)
When calling xinference through langchain's llm.with_structured_output under concurrency, there is a small probability that the returned JSON is missing a comma, raising the error above.
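For reference, a minimal sketch of the kind of concurrent invocation that triggers this intermittently; the classify helper, worker count, and placeholder inputs are illustrative, not from the original code:

from concurrent.futures import ThreadPoolExecutor, as_completed

from langchain_core.messages import HumanMessage, SystemMessage

# Hypothetical harness: fire the same structured call from several threads.
# Assumes structured_llm and system_prompt are defined as in the snippet above.
def classify(text: str) -> DicInfo:
    messages = [SystemMessage(content=system_prompt), HumanMessage(content=text)]
    return structured_llm.invoke(messages)

inputs = ["sample input"] * 50  # placeholder payloads
with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(classify, text) for text in inputs]
    for fut in as_completed(futures):
        try:
            fut.result()
        except Exception as exc:
            # Occasionally: Expecting ',' delimiter: line 3 column 1 (char 120)
            print("failed:", exc)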
Expected behavior
The complete structured output should be returned normally.
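Until the server-side cause is fixed, a retry wrapper can paper over the intermittent parse failure. This is only a workaround sketch: the invoke_with_retry helper is hypothetical, and it catches the broad Exception that xinference's client raises per the traceback above.

import time

# Hypothetical workaround, not part of the original report: retry the
# structured call when the streamed JSON comes back malformed.
def invoke_with_retry(runnable, messages, attempts=3, backoff=0.5):
    last_exc = None
    for i in range(attempts):
        try:
            return runnable.invoke(messages)
        except Exception as exc:
            last_exc = exc
            time.sleep(backoff * (2 ** i))  # exponential backoff between tries
    raise last_exc

final_answer = invoke_with_retry(structured_llm, messages)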