
[Bug]: ag-ui get_default_workflow_factory shares the initial_state dict across every AGUIChatWorkflow it produces #22069

Description

@Fr3ya

Bug Description

get_default_workflow_factory in llama-index-protocols-ag-ui/llama_index/protocols/ag_ui/router.py defines an inner workflow_factory() that closes over the operator's initial_state dict and constructs AGUIChatWorkflow(initial_state=initial_state, ...) per request. AGUIChatWorkflow.__init__ stores the parameter by reference: self.initial_state = initial_state or {}. Every workflow the factory yields therefore aliases the same dict on self.initial_state — and that dict is the operator's original config object.

Two leak surfaces follow: (1) any mutation through self.initial_state[...] on one workflow is visible on every other workflow the factory has produced (and on the operator's config); (2) the workflow's per-request derivation inside the chat step uses a shallow state = self.initial_state.copy(), which copies only the top-level keys, so nested mutable values (items: [], user: {"roles": []}) are aliased across all in-flight requests. The factory pattern exists precisely to give each request an isolated workflow, and the by-reference storage in __init__ combined with the shallow copy at the use-site defeats that isolation.

Found via static analysis (factory that captures a mutable container and hands it to every instance it produces, combined with shallow .copy() at use-site) plus runtime verification against the published wheel. Suggested fix: copy.deepcopy(initial_state) inside the factory, and replace the chat step's .copy() with copy.deepcopy(...).
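The suggested fix can be sketched without the library installed. This is a minimal illustration, not a patch: StandInWorkflow, derive_request_state, and make_isolated_factory are hypothetical names that mirror only the state handling described above, and the factory here is synchronous for brevity, whereas the real one is awaited in the reproducer below.

```python
import copy
from typing import Any, Callable, Dict, Optional


class StandInWorkflow:
    """Hypothetical stand-in for AGUIChatWorkflow (state handling only)."""

    def __init__(self, initial_state: Optional[Dict[str, Any]] = None) -> None:
        # Same by-reference storage the issue describes for __init__.
        self.initial_state = initial_state or {}

    def derive_request_state(self) -> Dict[str, Any]:
        # Proposed change: deep copy instead of the shallow .copy().
        return copy.deepcopy(self.initial_state)


def make_isolated_factory(
    initial_state: Optional[Dict[str, Any]] = None,
) -> Callable[[], StandInWorkflow]:
    """Hypothetical factory that hands each workflow its own state."""

    def workflow_factory() -> StandInWorkflow:
        # Deep-copy per call so no workflow shares containers with
        # another workflow or with the operator's original dict.
        return StandInWorkflow(initial_state=copy.deepcopy(initial_state))

    return workflow_factory


shared = {"counter": 0, "items": [], "user": {"name": "default", "roles": []}}
factory = make_isolated_factory(shared)
a, b = factory(), factory()

# Repeat the mutations from the reproducer against the isolated version.
a.initial_state["secret_for_alice"] = "API-KEY-ALICE-12345"
a_state = a.derive_request_state()
b_state = b.derive_request_state()
a_state["items"].append({"order_id": "ALICE-77"})
a_state["user"]["roles"].append("admin")

print("a.initial_state IS b.initial_state:", a.initial_state is b.initial_state)
print("secret leaked to b:", "secret_for_alice" in b.initial_state)
print("secret leaked to operator dict:", "secret_for_alice" in shared)
print("b_state['items']:", b_state["items"])
print("b_state['user']['roles']:", b_state["user"]["roles"])
```

One trade-off to weigh: copy.deepcopy raises on values that cannot be copied (locks, open connections, some client objects), so an initial_state holding such values would need different handling.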

Version

llama-index-protocols-ag-ui==0.3.1

Steps to Reproduce

Reproducer uses the real get_default_workflow_factory and real AGUIChatWorkflow from the installed wheel. Only the LLM is stubbed (its abstract methods are never invoked).

import asyncio
from llama_index.core.base.llms.types import LLMMetadata
from llama_index.core.llms.function_calling import FunctionCallingLLM
from llama_index.protocols.ag_ui.router import get_default_workflow_factory

class _LLM(FunctionCallingLLM):
    @property
    def metadata(self):
        return LLMMetadata(is_function_calling_model=True, model_name="fake")
    def chat(self, *a, **k): raise NotImplementedError
    async def achat(self, *a, **k): raise NotImplementedError
    def stream_chat(self, *a, **k): raise NotImplementedError
    async def astream_chat(self, *a, **k): raise NotImplementedError
    def complete(self, *a, **k): raise NotImplementedError
    async def acomplete(self, *a, **k): raise NotImplementedError
    def stream_complete(self, *a, **k): raise NotImplementedError
    async def astream_complete(self, *a, **k): raise NotImplementedError
    def _prepare_chat_with_tools(self, *a, **k): raise NotImplementedError

async def main():
    shared = {"counter": 0, "items": [], "user": {"name": "default", "roles": []}}
    factory = get_default_workflow_factory(llm=_LLM(), initial_state=shared, timeout=30)
    a = await factory()
    b = await factory()
    print("a.initial_state IS b.initial_state:", a.initial_state is b.initial_state)

    # Direct leak: mutate a -> visible on b and on the operator dict.
    a.initial_state["secret_for_alice"] = "API-KEY-ALICE-12345"
    print("b.initial_state['secret_for_alice']:", b.initial_state.get("secret_for_alice"))

    # Shallow-copy leak: the exact line from agent.py:187 the chat step uses.
    a_state = a.initial_state.copy()
    b_state = b.initial_state.copy()
    print("a_state['items'] IS b_state['items']:", a_state["items"] is b_state["items"])
    a_state["items"].append({"order_id": "ALICE-77"})
    a_state["user"]["roles"].append("admin")
    print("b_state['items']:        ", b_state["items"])
    print("b_state['user']['roles']:", b_state["user"]["roles"])

asyncio.run(main())
# CONFIRMED BUG: identity is True; both leaks fire.

Relevant Logs/Tracebacks

a.initial_state IS b.initial_state: True
b.initial_state['secret_for_alice']: API-KEY-ALICE-12345
a_state['items'] IS b_state['items']: True
b_state['items']:         [{'order_id': 'ALICE-77'}]
b_state['user']['roles']: ['admin']
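The identity checks in the output above could be generalized into a regression check that walks two state objects and reports whether they share any mutable container. The helper below is a sketch under that assumption; shares_mutables and _mutable_ids are hypothetical names, not part of llama-index, and the walk covers only the built-in dict, list, set, and tuple types.

```python
import copy
from typing import Any, Iterator, Set


def _mutable_ids(obj: Any, seen: Set[int]) -> Iterator[int]:
    """Yield id() of every dict/list/set reachable from obj (cycle-safe)."""
    if id(obj) in seen:
        return
    if isinstance(obj, (dict, list, set)):
        seen.add(id(obj))
        yield id(obj)
    if isinstance(obj, dict):
        children = obj.values()
    elif isinstance(obj, (list, tuple, set)):
        # Tuples are immutable but can still hold mutable children.
        children = obj
    else:
        return
    for child in children:
        yield from _mutable_ids(child, seen)


def shares_mutables(left: Any, right: Any) -> bool:
    """Return True if left and right reach a common mutable container."""
    left_ids = set(_mutable_ids(left, set()))
    return any(i in left_ids for i in _mutable_ids(right, set()))


base = {"items": [], "user": {"name": "default", "roles": []}}

# Shallow copies are distinct dicts but still share the nested containers.
shallow_a, shallow_b = base.copy(), base.copy()
# Deep copies share nothing mutable.
deep_a, deep_b = copy.deepcopy(base), copy.deepcopy(base)

print("shallow copies share mutables:", shares_mutables(shallow_a, shallow_b))
print("deep copies share mutables:   ", shares_mutables(deep_a, deep_b))
```

In a test suite, the same check could be applied to the initial_state of two workflows produced by one factory, and to the per-request state derived from each.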
