Thread/async safety: Global mutable state across core SDK is not multi-agent safe #1158

@MervinPraison

Problem

The core SDK (praisonaiagents) uses unprotected global mutable state in multiple critical locations. This violates the project's "Multi-agent + async safe by default" engineering principle. In concurrent multi-agent scenarios, these shared globals can cause race conditions, data corruption, and unpredictable behavior.

Affected Locations

1. Unprotected Global Callbacks and Error Logs

File: src/praisonai-agents/praisonaiagents/main.py:24-31

error_logs = []                    # Global list — no synchronization
sync_display_callbacks = {}        # Global dict — no synchronization
async_display_callbacks = {}       # Global dict — no synchronization
approval_callback = None           # Global var — no synchronization
  • Risk: Multiple agents running concurrently will read/write these globals simultaneously, causing lost updates, mixed error logs between agents, and callback registration races.
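The mixing of error logs described above can be reproduced with a minimal sketch. It does not import the SDK: `error_logs` only mirrors the shape of the global in main.py, and `run_agent` is a hypothetical stand-in for an agent that records errors while it runs.

```python
import threading

# Module-level list, mirroring the shape of the global in main.py
# (the real module is not imported here).
error_logs = []


def run_agent(agent_name: str, n_errors: int) -> None:
    """Hypothetical stand-in for an agent that records errors as it runs."""
    for i in range(n_errors):
        error_logs.append(f"{agent_name}: error {i}")


threads = [
    threading.Thread(target=run_agent, args=(name, 3))
    for name in ("agent_a", "agent_b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both agents' entries land in the same list. The only way to recover
# "agent_a's errors" is to parse the message text.
owners = {entry.split(":")[0] for entry in error_logs}
```

After both threads finish, the single list holds entries from both agents, so any consumer that reads `error_logs` sees a combined view rather than a per-agent one.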

2. Non-Thread-Safe Lazy Loading Caches (10+ locations)

Lazy-loading cache dicts (_lazy_cache, _module_cache, and similar) are used across many __init__.py files without any locking:

File                                        Variable(s)
praisonaiagents/__init__.py:104             _lazy_cache = {}
praisonaiagents/agent/__init__.py:4         _lazy_cache = {}
praisonaiagents/tools/__init__.py:6,188     _tools_lazy_cache = {}, _instances = {}
praisonaiagents/ui/__init__.py:16           _lazy_cache = {}
praisonaiagents/llm/__init__.py:13          _lazy_cache = {}
praisonaiagents/planning/__init__.py:35     _lazy_cache = {}
praisonaiagents/gateway/__init__.py:25      _lazy_cache = {}
praisonaiagents/rag/__init__.py:70          _cache = {}
praisonaiagents/flow_display.py:12          _rich_cache = {}
praisonaiagents/session/__init__.py:36      _module_cache = {}
praisonaiagents/scheduler/__init__.py:31    _module_cache = {}
  • Risk: Concurrent imports from multiple threads/async tasks can cause the same module to be loaded multiple times or can corrupt the cache.
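The duplicate-load risk above is a check-then-set race: two threads both see a cache miss and both run the load. One way to close it is double-checked locking, sketched below. This is an illustration only: `lazy_get`, `_expensive_load`, and `load_count` are hypothetical names, not the SDK's actual `__getattr__` code.

```python
import threading
import time

_lazy_cache = {}
_lazy_lock = threading.Lock()
load_count = 0  # counts how often the expensive load actually runs


def _expensive_load(name: str) -> str:
    """Hypothetical stand-in for importing a module and fetching an attribute."""
    global load_count
    load_count += 1  # only ever called while _lazy_lock is held
    time.sleep(0.01)  # widen the window in which a race could occur
    return f"loaded:{name}"


def lazy_get(name: str) -> str:
    # Fast path: no lock needed once the value is cached.
    try:
        return _lazy_cache[name]
    except KeyError:
        pass
    with _lazy_lock:
        # Re-check under the lock: another thread may have filled the
        # cache while this one was waiting.
        if name not in _lazy_cache:
            _lazy_cache[name] = _expensive_load(name)
        return _lazy_cache[name]


threads = [threading.Thread(target=lazy_get, args=("Agent",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place, eight concurrent first accesses trigger a single load. Note that `threading.Lock` is not reentrant: if a lazy import can call back into the same loader, `threading.RLock` would be the safer choice.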

3. Duplicated Server State Globals

Files:

  • src/praisonai-agents/praisonaiagents/agent/agent.py:156-160
  • src/praisonai-agents/praisonaiagents/agents/agents.py:34-36

Both define nearly identical globals:

_server_started = {}        # Dict of port -> started boolean
_registered_agents = {}     # Dict of port -> Dict of path -> agent_id
_shared_apps = {}           # Dict of port -> FastAPI app
  • agent.py protects these with _server_lock (good), but agents.py duplicates the pattern independently.
  • Risk: Two separate lock domains manage the same conceptual resource (server ports). Because each module keeps its own copies of these dicts, neither sees the other's registrations, so concurrent use of both Agent and Agents on the same port can lead to conflicts.

Recommended Approach

Immediate Fixes

  1. Protect main.py globals with threading.Lock() or use contextvars.ContextVar to make them per-agent/per-session:

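    import contextvars
    # No default: default=[] would be one list object shared by every context.
    # Each agent/session run should call error_logs.set([]) before logging.
    error_logs: contextvars.ContextVar[list] = contextvars.ContextVar('error_logs')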
  2. Add locks to lazy caches — use threading.Lock around cache dict writes in all __getattr__ implementations:

    import threading

    _lazy_lock = threading.Lock()
    def __getattr__(name):
        with _lazy_lock:
            if name in _lazy_cache:
                return _lazy_cache[name]
            # ... import and cache
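The contextvars option in fix 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the SDK's implementation: `get_error_logs` and `run_agent` are hypothetical helpers. The variable is declared without a default, because a mutable default would be a single list object visible from every context.

```python
import contextvars

# No default here: a default=[] would be a single list shared by every context.
_error_logs: contextvars.ContextVar[list] = contextvars.ContextVar("error_logs")


def get_error_logs() -> list:
    """Return the current context's log list, creating it on first use."""
    try:
        return _error_logs.get()
    except LookupError:
        logs: list = []
        _error_logs.set(logs)
        return logs


def run_agent(agent_name: str) -> list:
    """Hypothetical agent run that records one error and returns its own log."""
    get_error_logs().append(f"{agent_name}: something failed")
    return list(get_error_logs())


# Each run executes in its own copy of the context, so set() calls made
# inside one run do not leak into the caller or into the other run.
logs_a = contextvars.copy_context().run(run_agent, "agent_a")
logs_b = contextvars.copy_context().run(run_agent, "agent_b")
```

Each run sees only its own entry. The list has to be created inside the run: copying a context is shallow, so a list that was set before the copy would still be shared by every copy.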

Architectural Fix

  1. Unify server state into a single ServerRegistry class with proper locking, used by both Agent and Agents:

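    import threading
    from typing import Dict

    class ServerRegistry:
        _lock = threading.Lock()
        _servers: Dict[int, "ServerInfo"] = {}  # port -> server state; ServerInfo is a placeholder type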
  2. Scope error logs and callbacks per session/agent using contextvars so multi-agent runs don't leak state between agents.
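A slightly fuller sketch of the unified registry in item 1 might look like the following. The `ServerInfo` fields and the `register` / `mark_started` methods are assumptions made for illustration; they fold the three existing dicts (started flag, registered agents, shared app) into one record per port behind a single lock.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ServerInfo:
    """Hypothetical per-port record; field names are illustrative."""
    started: bool = False
    app: Optional[object] = None  # e.g. the FastAPI app serving this port
    agents: Dict[str, str] = field(default_factory=dict)  # path -> agent_id


class ServerRegistry:
    _lock = threading.Lock()
    _servers: Dict[int, ServerInfo] = {}

    @classmethod
    def register(cls, port: int, path: str, agent_id: str) -> bool:
        """Claim `path` on `port`; return False if the path is already taken."""
        with cls._lock:
            info = cls._servers.setdefault(port, ServerInfo())
            if path in info.agents:
                return False
            info.agents[path] = agent_id
            return True

    @classmethod
    def mark_started(cls, port: int) -> bool:
        """Return True only for the first caller, which should start the server."""
        with cls._lock:
            info = cls._servers.setdefault(port, ServerInfo())
            if info.started:
                return False
            info.started = True
            return True
```

Because both Agent and Agents would go through the same class-level lock, a path claimed by one is visible to the other, and only one caller per port is told to start the server.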

Impact

  • Race conditions in multi-agent concurrent execution
  • Corrupted error logs mixing output from different agents
  • Server port conflicts when Agent and Agents are used together
  • Potential double-imports and cache corruption under concurrent load

Labels: bug, enhancement