
title	Runtime Resolution
sidebarTitle	Runtime Resolution
description	General runtime-resolution subsystem for resolving agent runtimes at turn-time
icon	rotate

The runtime-resolution subsystem maps agent and model references to concrete runtime instances at turn-time, enabling dynamic model switching and custom routing logic.

**Handoffs no longer use this subsystem.** As of [PR #2178](MervinPraison/PraisonAI#2178), handoffs always delegate directly to `agent.chat()` / `agent.achat()`. The runtime-resolution layer is used by other parts of the SDK for agent-level runtime configuration. If you came here for handoff execution behaviour, see [Agent Handoffs](/docs/features/handoffs).

graph LR
    subgraph "Runtime Resolution"
        A[🤖 Agent] --> B[🔍 Resolve Runtime]
        B --> C[(Cache)]
        C -->|hit| E[✅ Execute]
        C -->|miss| F[🏗️ Create Runtime]
        F --> C
        F --> E
    end

    classDef agent fill:#8B0000,stroke:#7C90A0,color:#fff
    classDef process fill:#F59E0B,stroke:#7C90A0,color:#fff
    classDef cache fill:#189AB4,stroke:#7C90A0,color:#fff
    classDef success fill:#10B981,stroke:#7C90A0,color:#fff

    class A agent
    class B,F process
    class C cache
    class E success

Quick Start

Update the target model at any time — the next invocation automatically picks up the change because agent.chat() reads the live llm value.

from praisonaiagents import Agent

researcher = Agent(
    name="Researcher",
    instructions="Research the topic and summarise it",
    llm="gpt-4o-mini",
)

writer = Agent(
    name="Writer",
    instructions="Write a polished article",
    llm="gpt-4o-mini",
    handoffs=[researcher],
)

# Swap the researcher to a different model at any point
researcher.llm = "claude-3-sonnet"

# The next invocation automatically uses claude-3-sonnet for the researcher
writer.start("Research and write about ocean currents")

Use get_runtime_cache and clear_runtime_cache to debug or force a fresh resolution.

from praisonaiagents.runtime import get_runtime_cache, clear_runtime_cache

# See what runtimes are cached across sessions
cache = get_runtime_cache()
for session_id, entries in cache.items():
    for cache_key, (runtime, cached_at) in entries.items():
        print(f"{cache_key}: {runtime.provider}/{runtime.model_ref}")

# Force fresh resolution for a specific session
clear_runtime_cache(session_id="session_123")

# Clear all cached runtimes
clear_runtime_cache()

Override the built-in resolver to control how models map to runtimes.

from praisonaiagents import Agent
from praisonaiagents.runtime import (
    set_global_resolver,
    SessionContext,
)
from praisonaiagents.runtime.resolve import (
    RuntimeResolver,
    AgentRuntimeProtocol,
    LLMRuntimeWrapper,
)
from praisonaiagents.llm.llm import LLM

class MyResolver(RuntimeResolver):
    def supports_model(self, model_ref: str) -> bool:
        return model_ref.startswith(("gpt-", "claude-", "my-model-"))

    def resolve(self, agent_id: str, model_ref: str, session_ctx: SessionContext, **kwargs) -> AgentRuntimeProtocol:
        # Route "my-model-*" to a custom endpoint
        if model_ref.startswith("my-model-"):
            llm = LLM(model="gpt-4o-mini", api_base="https://my.endpoint/v1")
        else:
            llm = LLM(model=model_ref)
        return LLMRuntimeWrapper(llm=llm, model_ref=model_ref, agent_id=agent_id)

set_global_resolver(MyResolver())

agent = Agent(name="MyAgent", instructions="Help users", llm="my-model-fast")
agent.start("Hello!")

How It Works

The subsystem reads the agent's current llm (or model) attribute at invocation time, not at construction time.
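The difference between invocation-time and construction-time reads can be illustrated with a small standalone sketch. `SketchAgent` is a hypothetical stand-in written for this example, not the `praisonaiagents` `Agent` class; it only shows why reassigning a plain attribute is enough for the next call to see the new model.

```python
class SketchAgent:
    """Hypothetical stand-in for an agent; not the praisonaiagents Agent class."""

    def __init__(self, name: str, llm: str) -> None:
        self.name = name
        self.llm = llm  # plain attribute, may be reassigned at any time

    def chat(self, prompt: str) -> str:
        # Read self.llm here, at call time, instead of capturing it in __init__.
        model_ref = self.llm
        return f"[{model_ref}] {prompt}"


agent = SketchAgent(name="Researcher", llm="gpt-4o-mini")
first = agent.chat("hello")

agent.llm = "gpt-4o"  # swap the model between invocations
second = agent.chat("hello")

print(first)
print(second)
```

Because nothing is captured at construction, no rebuild or restart step is needed between the two calls.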

sequenceDiagram
    participant User
    participant Agent
    participant RuntimeSubsystem
    participant Cache

    User->>Agent: start("Research and write...")
    Agent->>RuntimeSubsystem: resolve(agent_id, current_llm)
    RuntimeSubsystem->>Cache: check TTL cache
    alt cache hit (< 5 min)
        Cache-->>RuntimeSubsystem: cached runtime
    else cache miss or expired
        RuntimeSubsystem->>RuntimeSubsystem: create LLMRuntimeWrapper
        RuntimeSubsystem->>Cache: store & return
    end
    RuntimeSubsystem-->>Agent: runtime instance
    Agent->>Agent: agent.chat(prompt)
    Agent-->>User: response

Handoffs always execute via `agent.chat()` / `agent.achat()` directly — they do not call into this subsystem. Runtime resolution is used for agent-level model configuration and caching, not for handoff execution.

Configuration Options

`SessionContext`

Passed to resolve_runtime to scope caching and track depth.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `session_id` | `str` | (no default) | Required. Used as the first segment of the cache key |
| `timestamp` | `float` | `time.time()` if `<= 0` | Session start time |
| `parent_agent_id` | `Optional[str]` | `None` | Name of the agent that triggered resolution |
| `handoff_depth` | `int` | `0` | Current nesting depth |
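The field table above maps naturally onto a dataclass. The sketch below is an assumption about shape only: `SessionContextSketch` is a hypothetical name, and the real `SessionContext` may be defined differently. It shows the documented defaults, including the timestamp being filled in when the supplied value is `<= 0`.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionContextSketch:
    """Hypothetical mirror of the documented fields; not the SDK class."""

    session_id: str                         # required, first segment of the cache key
    timestamp: float = 0.0                  # replaced with time.time() when <= 0
    parent_agent_id: Optional[str] = None   # agent that triggered resolution
    handoff_depth: int = 0                  # current nesting depth

    def __post_init__(self) -> None:
        if self.timestamp <= 0:
            self.timestamp = time.time()


ctx = SessionContextSketch(session_id="session_123")

# A nested resolution could reuse the session and bump the depth.
child = SessionContextSketch(
    session_id=ctx.session_id,
    timestamp=ctx.timestamp,
    parent_agent_id="Coordinator",
    handoff_depth=ctx.handoff_depth + 1,
)
```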

Cache constants

| Constant | Value | Meaning |
| --- | --- | --- |
| `_cache_ttl_seconds` | `300` | Each cached runtime lives for 5 minutes |
| `_cleanup_interval` | `600` | Background cleanup daemon runs every 10 minutes |

Cache keys use the format "{session_id}:{agent_id}:{model_ref}" — caches are session-isolated so different conversations never share runtimes.
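To make the key format and session isolation concrete, here is a small standalone sketch. The helper names (`make_cache_key`, `store`, `clear`) are hypothetical and exist only in this example; the nested shape (session id, then cache key, then a `(runtime, cached_at)` tuple) is assumed from the way `get_runtime_cache()` is iterated in Quick Start.

```python
from typing import Dict, Optional, Tuple

# session_id -> {cache_key -> (runtime, cached_at)}
SessionCache = Dict[str, Dict[str, Tuple[str, float]]]


def make_cache_key(session_id: str, agent_id: str, model_ref: str) -> str:
    """Build a key in the documented "{session_id}:{agent_id}:{model_ref}" format."""
    return f"{session_id}:{agent_id}:{model_ref}"


def store(cache: SessionCache, session_id: str, agent_id: str,
          model_ref: str, runtime: str, cached_at: float) -> str:
    key = make_cache_key(session_id, agent_id, model_ref)
    cache.setdefault(session_id, {})[key] = (runtime, cached_at)
    return key


def clear(cache: SessionCache, session_id: Optional[str] = None) -> None:
    """Drop one session's entries, or everything when no session is given."""
    if session_id is None:
        cache.clear()
    else:
        cache.pop(session_id, None)


cache: SessionCache = {}
key_a = store(cache, "session_a", "Analyst", "gpt-4o", "runtime-a", 1000.0)
key_b = store(cache, "session_b", "Analyst", "gpt-4o", "runtime-b", 1000.0)

# Same agent and model, different sessions: two separate entries.
clear(cache, session_id="session_a")
remaining_sessions = sorted(cache)
```

Because the session id is part of the key, the same agent and model in two conversations never collide, and one session can be evicted without touching the other.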

Common Patterns

Mid-conversation model swap

from praisonaiagents import Agent

analyst = Agent(name="Analyst", instructions="Analyse data", llm="gpt-4o-mini")
coordinator = Agent(name="Coordinator", instructions="Coordinate", handoffs=[analyst])

# First few turns use gpt-4o-mini
coordinator.start("Quick summary of Q1 sales")

# Upgrade to a more capable model for a detailed report
analyst.llm = "gpt-4o"
coordinator.start("Full analysis of Q1 vs Q2 with trend forecasting")

Force cache refresh

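from praisonaiagents import Agent
from praisonaiagents.runtime import clear_runtime_cache

agent = Agent(name="Assistant", instructions="Help users", llm="gpt-4o-mini")

# After rotating API keys or changing model config, evict every cached runtime
clear_runtime_cache()

# The next call resolves a fresh runtime
agent.start("Continue with the updated model settings")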

Introspect resolved runtimes

from praisonaiagents.runtime import get_runtime_cache

cache = get_runtime_cache()
total = sum(len(entries) for entries in cache.values())
print(f"Active runtimes: {total} across {len(cache)} sessions")
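When introspecting, it can also help to separate entries that are still within the TTL from ones that are only waiting for cleanup. The sketch below runs against a hand-built dictionary rather than the live cache; `summarize` and `sample_cache` are hypothetical, and the nested `(runtime, cached_at)` shape is assumed from the Quick Start loop.

```python
from typing import Dict, Tuple

CACHE_TTL_SECONDS = 300  # assumption: the 5-minute TTL listed under Cache constants


def summarize(
    cache: Dict[str, Dict[str, Tuple[object, float]]],
    now: float,
    ttl: float = CACHE_TTL_SECONDS,
) -> Dict[str, int]:
    """Count sessions plus live and expired entries at a given point in time."""
    live = 0
    expired = 0
    for entries in cache.values():
        for _runtime, cached_at in entries.values():
            if now - cached_at < ttl:
                live += 1
            else:
                expired += 1
    return {"sessions": len(cache), "live": live, "expired": expired}


# Hand-built sample data standing in for get_runtime_cache() output.
sample_cache = {
    "s1": {"s1:Analyst:gpt-4o": ("runtime-a", 990.0)},
    "s2": {
        "s2:Analyst:gpt-4o": ("runtime-b", 1250.0),
        "s2:Writer:gpt-4o-mini": ("runtime-c", 900.0),
    },
}
summary = summarize(sample_cache, now=1300.0)
print(summary)
```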

Best Practices

Model re-resolution happens at each invocation boundary. Changing `agent.llm` is effective immediately for the next call; no restart is needed.

The 5-minute TTL means old runtimes may linger after you rotate API keys. Call `clear_runtime_cache()` to evict all entries and force fresh connections.

Return `False` from `supports_model` for models you do not handle. The built-in `DefaultRuntimeResolver` acts as the final fallback, so returning `False` simply delegates back to it.

If you need to change how a handoff target executes, configure the target agent itself (instructions, tools, llm). The target agent's full `chat()` pipeline runs on every handoff; this subsystem is not in that path.
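The `supports_model` fallback described above can be pictured as a short chain in which a custom resolver is asked first and a catch-all default is asked last. This is a sketch of the idea only: the classes and `pick_resolver` are hypothetical names, and the SDK's actual dispatch between a custom resolver and `DefaultRuntimeResolver` may be implemented differently.

```python
from typing import List


class ResolverSketch:
    """Hypothetical base class; not the SDK's RuntimeResolver."""

    name = "base"

    def supports_model(self, model_ref: str) -> bool:
        raise NotImplementedError

    def resolve(self, model_ref: str) -> str:
        # A real resolver would return a runtime; a label is enough here.
        return f"{self.name}:{model_ref}"


class CustomResolverSketch(ResolverSketch):
    name = "custom"

    def supports_model(self, model_ref: str) -> bool:
        # Claim only the models this resolver knows how to route.
        return model_ref.startswith("my-model-")


class DefaultResolverSketch(ResolverSketch):
    name = "default"

    def supports_model(self, model_ref: str) -> bool:
        return True  # final fallback accepts everything


def pick_resolver(model_ref: str, resolvers: List[ResolverSketch]) -> ResolverSketch:
    """Return the first resolver in the chain that claims the model."""
    for resolver in resolvers:
        if resolver.supports_model(model_ref):
            return resolver
    raise LookupError(f"no resolver supports {model_ref!r}")


chain = [CustomResolverSketch(), DefaultResolverSketch()]
custom_result = pick_resolver("my-model-fast", chain).resolve("my-model-fast")
fallback_result = pick_resolver("gpt-4o", chain).resolve("gpt-4o")
```

Keeping `supports_model` narrow means unrelated models continue to take the default path untouched.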

Public API

All names are importable from praisonaiagents.runtime:

from praisonaiagents.runtime import (
    resolve_runtime,
    SessionContext,
    RuntimeProtocol,
    get_runtime_cache,
    clear_runtime_cache,
    set_global_resolver,
)

`resolve_runtime`

def resolve_runtime(
    agent_id: str,
    model_ref: str,
    session_ctx: SessionContext,
    **kwargs,
) -> AgentRuntimeProtocol:
    ...

The main entry point. Checks the TTL cache first; creates and caches a new runtime if none exists or the entry expired.

`RuntimeProtocol` / `AgentRuntimeProtocol`

Protocols that custom runtimes must satisfy. Only relevant when building a custom resolver.

class RuntimeProtocol(Protocol):
    def execute(self, prompt: str, **kwargs) -> Any: ...
    async def aexecute(self, prompt: str, **kwargs) -> Any: ...
    @property
    def model_ref(self) -> str: ...
    @property
    def provider(self) -> str: ...

class AgentRuntimeProtocol(RuntimeProtocol):
    @property
    def supports_streaming(self) -> bool: ...
    @property
    def supports_tools(self) -> bool: ...
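As an illustration of what satisfying these protocols involves, the sketch below declares a local copy of the base protocol and a toy runtime that conforms to it structurally. `RuntimeProtocolSketch` and `EchoRuntime` are hypothetical names for this example, and the use of `runtime_checkable` is a choice made here to allow an `isinstance` check; it is not a claim about how the SDK declares its protocols.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class RuntimeProtocolSketch(Protocol):
    """Local copy of the documented members, for illustration only."""

    def execute(self, prompt: str, **kwargs: Any) -> Any: ...
    async def aexecute(self, prompt: str, **kwargs: Any) -> Any: ...
    @property
    def model_ref(self) -> str: ...
    @property
    def provider(self) -> str: ...


class EchoRuntime:
    """Toy runtime that echoes the prompt; no inheritance from the protocol needed."""

    def __init__(self, model_ref: str, provider: str) -> None:
        self._model_ref = model_ref
        self._provider = provider

    def execute(self, prompt: str, **kwargs: Any) -> str:
        return f"{self._provider}/{self._model_ref}: {prompt}"

    async def aexecute(self, prompt: str, **kwargs: Any) -> str:
        # Reuse the sync path; a real runtime would await a network call here.
        return self.execute(prompt, **kwargs)

    @property
    def model_ref(self) -> str:
        return self._model_ref

    @property
    def provider(self) -> str:
        return self._provider

    # The two extra members listed on AgentRuntimeProtocol.
    @property
    def supports_streaming(self) -> bool:
        return False

    @property
    def supports_tools(self) -> bool:
        return False


runtime = EchoRuntime(model_ref="echo-1", provider="local")
sync_result = runtime.execute("ping")
async_result = asyncio.run(runtime.aexecute("ping"))
```

The `isinstance` check against a runtime-checkable protocol only verifies that the members exist, not their signatures, so a static type checker remains the stronger guarantee.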

Related

Agent-to-agent delegation: handoffs always use agent.chat()

HandoffConfig reference

Secure tool boundaries during handoff

Filter context passed during handoff
