xinference + langchain llm.with_structured_output: under concurrent requests, a missing-comma JSON error is occasionally raised (roughly 3 out of 100 calls) #4512

@berserker3912

Description


System Info

langchain 1.0, xinference v1.17.0, CUDA 12.9

Running Xinference with Docker?

  • docker
  • pip install
  • installation from source

Version info

xinference v1.17.0

The command used to start Xinference

Started with: xinference-local --host 0.0.0.0 --port 9997

Reproduction

from typing import List, Optional
from pydantic import BaseModel, Field

# StateType, llm, system_prompt and user_input are defined elsewhere in the reporter's code
class DicInfo(BaseModel):
    """xxxx"""
    category: StateType = Field(description="xxxx")
    confidence: Optional[float] = Field(
        default=0.0,
        ge=0.0,
        le=1.0,
        description="xxxx"
    )
    detail_categories: Optional[List[str]] = Field(
        default_factory=list,
        description="xxxx"
    )

structured_llm = llm.with_structured_output(DicInfo)
messages = [SystemMessage(content=system_prompt), HumanMessage(content=f"{user_input}")]
final_answer = structured_llm.invoke(messages)
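As a stopgap while the root cause is investigated, this kind of intermittent parse failure can be retried at the call site. A minimal sketch, not part of the original report: `retry_invoke` is a hypothetical helper, and since xinference re-raises the server-side `json.JSONDecodeError` as a plain `Exception` (see the traceback below), matching on the message text is an assumption. The stub `flaky` stands in for `structured_llm.invoke`.

```python
import re


def retry_invoke(invoke, arg, max_attempts=3):
    """Retry a call when the error message looks like malformed JSON
    from the model (e.g. "Expecting ',' delimiter"). Hypothetical helper."""
    last_err = None
    for _ in range(max_attempts):
        try:
            return invoke(arg)
        except Exception as err:
            # The server-side JSONDecodeError arrives as a plain Exception,
            # so the message text is the only thing we can match on.
            if re.search(r"Expecting .* delimiter", str(err)):
                last_err = err
                continue
            raise
    raise last_err


# Stub that fails once with the reported error, then succeeds:
calls = {"n": 0}


def flaky(_):
    calls["n"] += 1
    if calls["n"] == 1:
        raise Exception("Expecting ',' delimiter: line 3 column 1 (char 120)")
    return {"category": "ok"}


result = retry_invoke(flaky, None)
print(result)  # {'category': 'ok'}
```

Retrying only masks the underlying concurrency issue, but it keeps the call success rate near 100% when the failure probability per call is a few percent.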

Traceback (most recent call last):
  File "/xxxx/xxxx.py", line 130, in classification_execution
    final_answer = structured_llm.invoke(messages)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/xxxx/site-packages/langchain_core/runnables/base.py", line 3149, in invoke
    input_ = context.run(step.invoke, input_, config, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/xxxx/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 402, in invoke
    self.generate_prompt(
  File "/xxxx/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 1121, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/xxxx/site-packages/langchain_core/language_models/chat_models.py", line 931, in generate
    self._generate_with_cache(
  File "/xxxx/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 1225, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/xxxx/lib/python3.12/site-packages/langchain_xinference/chat_models.py", line 226, in _generate
    final_chunk = self._chat_with_aggregation(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/xxxx/lib/python3.12/site-packages/langchain_xinference/chat_models.py", line 310, in _chat_with_aggregation
    for stream_resp in response:
        ^^^^^^^^
  File "/xxxx/lib/python3.12/site-packages/xinference/client/common.py", line 62, in streaming_response_iterator
    raise Exception(str(error))
Exception: [address=127.0.0.1:39175, pid=167075] Expecting ',' delimiter: line 3 column 1 (char 120)
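The message text comes from Python's json module: the server fails to parse model output that is missing a comma between two fields. A minimal reproduction of just the parse error, using an illustrative payload (not the actual model output):

```python
import json

# Two fields with the separating comma missing, the kind of output
# the model occasionally emits under concurrency (illustrative only).
bad = '{\n"category": "billing"\n"confidence": 0.9\n}'

msg = ""
try:
    json.loads(bad)
except json.JSONDecodeError as err:
    msg = str(err)

print(msg)  # Expecting ',' delimiter: line 3 column 1 ...
```

This suggests the bug is in how the generated text is produced or assembled server-side under load, not in the client-side parsing itself.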

With xinference + langchain llm.with_structured_output under concurrent requests, there is a small chance the response is missing a comma and fails to parse with the error above.

Expected behavior

Structured output should parse successfully and return the complete data on every call.
