Skip to content

Thread-unsafe global mutable state in agent.py breaks multi-agent concurrency #1145

@MervinPraison

Description

@MervinPraison

Problem

The core agent.py contains multiple global and class-level mutable variables accessed without any thread synchronization, violating the "multi-agent + async safe by default" engineering principle.

This causes race conditions when multiple agents run concurrently — a core use case for PraisonAI.

Specific Locations

1. Unprotected HTTP server globals (agent.py lines 143-145)

_server_started = {}   # Dict of port -> started boolean (NO LOCK)
_registered_agents = {} # Dict of port -> Dict (NO LOCK)
_shared_apps = {}       # Dict of port -> FastAPI app (NO LOCK)

Risk: Multiple agents calling .launch() on different ports concurrently will race on dict updates.

2. Non-atomic class counter (agent.py lines 157, 1254)

_agent_counter = 0  # Class-level, incremented without lock
# ...
Agent._agent_counter += 1  # NOT atomic — two threads can get duplicate indices

3. Thread-unsafe lazy loaders (agent.py lines 16-94)

Six manual lazy-loader functions (_get_console, _get_live, _get_llm_functions, _get_display_functions, _get_hooks_module, _get_stream_emitter) use classic TOCTOU pattern:

_rich_console = None
def _get_console():
    global _rich_console
    if _rich_console is None:     # Thread A checks
        from rich.console import Console  # Thread B also checks, both import
        _rich_console = Console()
    return _rich_console

Compare to the correct pattern already in _lazy.py (lines 61-74) with double-checked locking.

4. Same issue in context_agent.py (lines 22-57)

Four more manual lazy loaders (_get_subprocess, _get_glob, _get_ast, _get_asyncio) with the same unsafe pattern.

5. Tools module instance cache (tools/__init__.py line 188, 256-259)

_instances = {}  # Module-level dict accessed without locks
# ...
_instances[class_name] = class_()  # No lock protection

Expected Behavior

All global/class-level mutable state should be protected with threading.Lock() or threading.RLock(), consistent with the existing thread-safe patterns in:

  • _lazy.py (double-checked locking)
  • tools/registry.py (RLock on all operations)

Proposed Fix

  1. Replace manual lazy loaders in agent.py and context_agent.py with the centralized _lazy.py utilities (lazy_import() or create_lazy_getattr())
  2. Add threading.Lock() around _server_started, _registered_agents, _shared_apps mutations
  3. Protect _agent_counter increment with a class-level lock
  4. Add lock to tools/__init__.py _instances dict

Impact

  • Severity: High — affects any multi-agent deployment
  • Principle violated: "Multi-agent + async safe by default"
  • Existing fix pattern: Already implemented correctly in _lazy.py and tools/registry.py

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingenhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions