
Messages Trimming Index out of range if include_system=True, but empty human message #30376

Open
@komikndr

Description


Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

I am currently testing a bunch of configurations and numerous chatbot backends. I implemented both a custom token counter and a prebuilt token counter (using the LLM as the token counter) based on https://python.langchain.com/docs/how_to/trim_messages.
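For context, the custom counter follows the tiktoken approach from that guide. A simplified sketch of what my `msg_token_counter_factory` does (names and the fallback encoding here are illustrative, not the exact implementation):

```python
import tiktoken
from langchain_core.messages import BaseMessage

def msg_token_counter_factory(model_name: str):
    # Illustrative sketch: build a counter that sums tiktoken tokens over
    # all message contents, falling back to cl100k_base for models that
    # tiktoken does not recognize.
    try:
        enc = tiktoken.encoding_for_model(model_name)
    except KeyError:
        enc = tiktoken.get_encoding("cl100k_base")

    def count(messages: list[BaseMessage]) -> int:
        return sum(len(enc.encode(str(m.content))) for m in messages)

    return count
```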

This is my test code:

```python
@pytest.mark.asyncio
async def test_chat_node_empty_state():
    workflow = DB2Chat()
    state = GraphState(
        name="CHAT",
        messages=[],
        chat_model="(openai)gpt-4o-mini"
    )
    config = {"configurable": {"session_id": "test_session"}}

    result = await workflow.chat_node(state, config)
    assert isinstance(result, dict)
    assert "messages" in result
"""

And the actual code itself:

```python
    async def chat_node(self, state: GraphState, config: RunnableConfig) -> GraphState:
        prompt = ChatPromptTemplate.from_messages(
            [
                SystemMessage(content="""
                    You are assistant chatbot
                              """),
                MessagesPlaceholder(variable_name="messages"),
            ]
        )
        llm = llm_factory.create_model(
            self.output_chat_model, model=state["chat_model"], tools=self.tools
        )

        if state["chat_model"].startswith("(openai)"):
            trimmer = trim_messages(
                token_counter=llm,
                strategy="last",
                max_tokens=32000,
                start_on="human",
                end_on=("human", "tool"),
                include_system=True,
            )
            chain: Runnable = prompt | trimmer | llm
            return {"messages": [await chain.ainvoke(state, config=config)]}

        else:
            model_name = re.sub(r"^\([^)]*\)", "", state["chat_model"]).removesuffix(":latest")
            token_count = msg_token_counter_factory(model_name)
            trimmer = trim_messages(
                token_counter=token_count,
                strategy="last",
                max_tokens=32000,
                start_on="human",
                end_on=("human", "tool"),
                include_system=True,
            )
            chain: Runnable = prompt | trimmer | llm
            return {"messages": [await chain.ainvoke(state, config=config)]}
```
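For what it's worth, the failure should not need the full workflow. If the analysis below is right, this standalone snippet hits the same path (`token_counter=len` is just the stand-in counter from the trim_messages docs):

```python
from langchain_core.messages import SystemMessage, trim_messages

# Only a system message reaches the trimmer when the state's "messages"
# list is empty. With end_on=("human", "tool"), the trimmer pops it
# (a SystemMessage matches neither type), leaving an empty list before
# messages[0] is read for the include_system swap.
trim_messages(
    [SystemMessage(content="You are assistant chatbot")],
    token_counter=len,  # the counter never runs; the crash happens earlier
    strategy="last",
    max_tokens=32000,
    start_on="human",
    end_on=("human", "tool"),
    include_system=True,
)
# -> IndexError: list index out of range
```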

This is likely because a SystemMessage is a valid message, so the input passes the initial empty-list check; the `end_on` loop then pops it (a SystemMessage matches neither "human" nor "tool"), and `messages[0]` fails because the list is empty again. See `langchain_core/messages/utils.py`, line 1302:

```python
def _last_max_tokens(
    messages: Sequence[BaseMessage],
    *,
    max_tokens: int,
    token_counter: Callable[[list[BaseMessage]], int],
    text_splitter: Callable[[str], list[str]],
    allow_partial: bool = False,
    include_system: bool = False,
    start_on: Optional[
        Union[str, type[BaseMessage], Sequence[Union[str, type[BaseMessage]]]]
    ] = None,
    end_on: Optional[
        Union[str, type[BaseMessage], Sequence[Union[str, type[BaseMessage]]]]
    ] = None,
) -> list[BaseMessage]:
    messages = list(messages)
    print("This is the length of the message",len(messages))
    print(messages[0])
    if len(messages) == 0:
        return []
    if end_on:
        while messages and not _is_message_type(messages[-1], end_on):
            messages.pop()
    swapped_system = include_system and isinstance(messages[0], SystemMessage)
    reversed_ = messages[:1] + messages[1:][::-1] if swapped_system else messages[::-1]

    reversed_ = _first_max_tokens(
        reversed_,
        max_tokens=max_tokens,
        token_counter=token_counter,
        text_splitter=text_splitter,
        partial_strategy="last" if allow_partial else None,
        end_on=start_on,
    )
    if swapped_system:
        return reversed_[:1] + reversed_[1:][::-1]
    else:
        return reversed_[::-1]
```
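A minimal guard would be to re-check for emptiness after the `end_on` loop, since that loop can drain the whole list. Just a sketch of the idea, not a proposed official patch:

```python
    if end_on:
        while messages and not _is_message_type(messages[-1], end_on):
            messages.pop()
    # Re-check here: the end_on loop can empty the list, e.g. when the
    # only message left is a SystemMessage and end_on=("human", "tool").
    if not messages:
        return []
    swapped_system = include_system and isinstance(messages[0], SystemMessage)
```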

Error Message and Stack Trace (if applicable)

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "<stdin>", line 9, in test_chat_node_empty_state
  File "/home/demon/research-onyx-ai/onyx_assistant/src/chat_workflow/workflows/db2_chat.py", line 80, in chat_node
    return {"messages": [await chain.ainvoke(state, config=config)]}
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3066, in ainvoke
    input = await asyncio.create_task(part())
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 4741, in ainvoke
    return await self._acall_with_config(
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 1978, in _acall_with_config
    output = await coro
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 4665, in _ainvoke
    output = await acall_func_with_variable_args(
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 4635, in f
    return await run_in_executor(config, func, *args, **kwargs)
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 588, in run_in_executor
    return await asyncio.get_running_loop().run_in_executor(
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 579, in wrapper
    return func(*args, **kwargs)
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 4629, in func
    return call_func_with_variable_args(
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 396, in call_func_with_variable_args
    return func(input, **kwargs)  # type: ignore[call-arg]
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/messages/utils.py", line 869, in trim_messages
    return _last_max_tokens(
  File "/home/demon/miniconda3/envs/chainlit/lib/python3.10/site-packages/langchain_core/messages/utils.py", line 1304, in _last_max_tokens
    swapped_system = include_system and isinstance(messages[0], SystemMessage)
IndexError: list index out of range
```

Description

I am currently testing a bunch of configurations and numerous chatbot backends. I implemented both a custom token counter and a prebuilt token counter (using the LLM as the token counter) based on https://python.langchain.com/docs/how_to/trim_messages. But message trimming raises an IndexError (list index out of range) when include_system=True and the input contains no human messages.
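For anyone hitting the same thing, a possible user-side workaround is to bypass the trimmer when only the system message is present. A sketch under the setup above (`safe_trim` is a hypothetical helper, not a LangChain API; `trimmer` is the runnable built with trim_messages earlier):

```python
from langchain_core.messages import SystemMessage
from langchain_core.prompt_values import PromptValue
from langchain_core.runnables import RunnableLambda

def safe_trim(prompt_value: PromptValue):
    # Hypothetical guard: end_on=("human", "tool") would pop a lone
    # SystemMessage and crash on the empty list, so skip trimming then.
    messages = prompt_value.to_messages()
    if all(isinstance(m, SystemMessage) for m in messages):
        return messages
    return trimmer.invoke(messages)  # `trimmer` from the snippet above

chain = prompt | RunnableLambda(safe_trim) | llm
```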

System Info

System Information

OS: Linux
OS Version: #1 SMP Tue Nov 5 00:21:55 UTC 2024
Python Version: 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0]

Package Information

langchain_core: 0.3.31
langchain: 0.3.7
langchain_community: 0.3.4
langsmith: 0.1.139
langchain_anthropic: 0.2.4
langchain_google_genai: 2.0.4
langchain_google_vertexai: 2.0.7
langchain_groq: 0.2.1
langchain_ollama: 0.2.0
langchain_openai: 0.2.5
langchain_text_splitters: 0.3.2
langgraph_sdk: 0.1.51

Optional packages not installed

langserve

Other Dependencies

aiohttp: 3.10.10
anthropic: 0.39.0
anthropic[vertexai]: Installed. No version info available.
async-timeout: 4.0.3
dataclasses-json: 0.6.7
defusedxml: 0.7.1
google-cloud-aiplatform: 1.71.1
google-cloud-storage: 2.18.2
google-generativeai: 0.8.3
groq: 0.11.0
httpx: 0.27.2
httpx-sse: 0.4.0
jsonpatch: 1.33
langchain-mistralai: Installed. No version info available.
numpy: 1.26.4
ollama: 0.3.3
openai: 1.54.0
orjson: 3.10.11
packaging: 23.2
pillow: 11.1.0
pydantic: 2.9.2
pydantic-settings: 2.6.1
PyYAML: 6.0.2
requests: 2.32.3
requests-toolbelt: 1.0.0
SQLAlchemy: 2.0.36
tenacity: 9.0.0
tiktoken: 0.8.0
typing-extensions: 4.12.2


Labels

investigate (Flagged for investigation), 🤖:bug (Related to a bug, vulnerability, unexpected error with an existing feature)
