| title | Runtime Resolution |
|---|---|
| sidebarTitle | Runtime Resolution |
| description | General runtime-resolution subsystem for resolving agent runtimes at turn-time |
| icon | rotate |
The runtime-resolution subsystem maps agent and model references to concrete runtime instances at turn-time, enabling dynamic model switching and custom routing logic.
**Handoffs no longer use this subsystem.** As of [PR #2178](MervinPraison/PraisonAI#2178), handoffs always delegate directly to `agent.chat()` / `agent.achat()`. The runtime-resolution layer is used by other parts of the SDK for agent-level runtime configuration. If you came here for handoff execution behaviour, see [Agent Handoffs](/docs/features/handoffs).

graph LR
subgraph "Runtime Resolution"
A[🤖 Agent] --> B[🔍 Resolve Runtime]
B --> C[(Cache)]
C -->|hit| E[✅ Execute]
C -->|miss| F[🏗️ Create Runtime]
F --> C
F --> E
end
classDef agent fill:#8B0000,stroke:#7C90A0,color:#fff
classDef process fill:#F59E0B,stroke:#7C90A0,color:#fff
classDef cache fill:#189AB4,stroke:#7C90A0,color:#fff
classDef success fill:#10B981,stroke:#7C90A0,color:#fff
class A agent
class B,F process
class C cache
class E success
Update the target model at any time — the next invocation automatically picks up the change because agent.chat() reads the live llm value.
from praisonaiagents import Agent
researcher = Agent(
    name="Researcher",
    instructions="Research the topic and summarise it",
    llm="gpt-4o-mini",
)
writer = Agent(
    name="Writer",
    instructions="Write a polished article",
    llm="gpt-4o-mini",
    handoffs=[researcher],
)
# Swap the researcher to a different model at any point
researcher.llm = "claude-3-sonnet"
# The next invocation automatically uses claude-3-sonnet for the researcher
writer.start("Research and write about ocean currents")

Use `get_runtime_cache` and `clear_runtime_cache` to debug or force a fresh resolution.
from praisonaiagents.runtime import get_runtime_cache, clear_runtime_cache
# See what runtimes are cached across sessions
cache = get_runtime_cache()
for session_id, entries in cache.items():
    for cache_key, (runtime, cached_at) in entries.items():
        print(f"{cache_key}: {runtime.provider}/{runtime.model_ref}")
# Force fresh resolution for a specific session
clear_runtime_cache(session_id="session_123")
# Clear all cached runtimes
clear_runtime_cache()

Override the built-in resolver to control how models map to runtimes.
from praisonaiagents import Agent
from praisonaiagents.runtime import (
    set_global_resolver,
    SessionContext,
)
from praisonaiagents.runtime.resolve import (
    RuntimeResolver,
    AgentRuntimeProtocol,
    LLMRuntimeWrapper,
)
from praisonaiagents.llm.llm import LLM

class MyResolver(RuntimeResolver):
    def supports_model(self, model_ref: str) -> bool:
        return model_ref.startswith(("gpt-", "claude-", "my-model-"))

    def resolve(self, agent_id, model_ref, session_ctx, **kwargs):
        # Route "my-model-*" to a custom endpoint
        if model_ref.startswith("my-model-"):
            llm = LLM(model="gpt-4o-mini", api_base="https://my.endpoint/v1")
        else:
            llm = LLM(model=model_ref)
        return LLMRuntimeWrapper(llm=llm, model_ref=model_ref, agent_id=agent_id)

set_global_resolver(MyResolver())
agent = Agent(name="MyAgent", instructions="Help users", llm="my-model-fast")
agent.start("Hello!")

The subsystem reads the agent's current `llm` (or `model`) attribute at invocation time, not at construction time.
sequenceDiagram
participant User
participant Agent
participant RuntimeSubsystem
participant Cache
User->>Agent: start("Research and write...")
Agent->>RuntimeSubsystem: resolve(agent_id, current_llm)
RuntimeSubsystem->>Cache: check TTL cache
alt cache hit (< 5 min)
Cache-->>RuntimeSubsystem: cached runtime
else cache miss or expired
RuntimeSubsystem->>RuntimeSubsystem: create LLMRuntimeWrapper
RuntimeSubsystem->>Cache: store & return
end
RuntimeSubsystem-->>Agent: runtime instance
Agent->>Agent: agent.chat(prompt)
Agent-->>User: response
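To make the invocation-time lookup concrete, here is a minimal sketch that does not use the SDK at all. `FakeAgent` and `resolve_for_turn` are hypothetical names invented for illustration. The only point is that the model reference is read when the turn runs, so an assignment made after construction is visible to the next call.

```python
class FakeAgent:
    """Hypothetical stand-in for an agent; not the SDK's Agent class."""

    def __init__(self, name: str, llm: str) -> None:
        self.name = name
        self.llm = llm


def resolve_for_turn(agent: FakeAgent) -> str:
    """Read the model reference at call time instead of capturing it at construction."""
    return f"{agent.name}:{agent.llm}"


agent = FakeAgent(name="Researcher", llm="gpt-4o-mini")
first_turn = resolve_for_turn(agent)

agent.llm = "gpt-4o"  # swap the model between turns
second_turn = resolve_for_turn(agent)

print(first_turn)
print(second_turn)
```

Had `resolve_for_turn` captured `agent.llm` once at construction, the second turn would still report the old model.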
`SessionContext` is passed to `resolve_runtime` to scope caching and track depth.

| Field | Type | Default | Description |
|---|---|---|---|
| `session_id` | `str` | (none) | Required. Used as the first segment of the cache key |
| `timestamp` | `float` | `time.time()` if `<= 0` | Session start time |
| `parent_agent_id` | `Optional[str]` | `None` | Name of the agent that triggered resolution |
| `handoff_depth` | `int` | `0` | Current nesting depth |

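As a rough illustration of the fields above, the following sketch mirrors them in a plain dataclass. `SessionContextSketch` is a hypothetical stand-in, not the SDK's class, and the `__post_init__` logic is only an assumption about how the "`time.time()` if `<= 0`" default might be applied.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionContextSketch:
    """Hypothetical mirror of the documented fields; not the SDK class."""

    session_id: str
    timestamp: float = 0.0
    parent_agent_id: Optional[str] = None
    handoff_depth: int = 0

    def __post_init__(self) -> None:
        # Assumed behaviour: a non-positive timestamp is replaced with "now".
        if self.timestamp <= 0:
            self.timestamp = time.time()


ctx = SessionContextSketch(session_id="session_123")
explicit = SessionContextSketch(session_id="session_456", timestamp=5.0)

print(ctx.session_id, ctx.parent_agent_id, ctx.handoff_depth)
print(explicit.timestamp)
```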
| Constant | Value | Meaning |
|---|---|---|
| `_cache_ttl_seconds` | `300` | Each cached runtime lives for 5 minutes |
| `_cleanup_interval` | `600` | Background cleanup daemon runs every 10 minutes |

Cache keys use the format `"{session_id}:{agent_id}:{model_ref}"`. Caches are session-isolated, so different conversations never share runtimes.
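The key format and TTL can be sketched with a small in-memory cache. This is an illustrative model only, not the SDK's implementation: `make_key`, `put`, and `get` are hypothetical helpers, and the nested `session_id -> {cache_key -> (runtime, cached_at)}` layout is assumed from the shape that the `get_runtime_cache()` example iterates over.

```python
import time
from typing import Any, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 300  # mirrors the documented 5-minute TTL

# session_id -> {cache_key -> (runtime, cached_at)}
_cache: Dict[str, Dict[str, Tuple[Any, float]]] = {}


def make_key(session_id: str, agent_id: str, model_ref: str) -> str:
    return f"{session_id}:{agent_id}:{model_ref}"


def put(session_id: str, agent_id: str, model_ref: str, runtime: Any,
        now: Optional[float] = None) -> None:
    cached_at = time.time() if now is None else now
    key = make_key(session_id, agent_id, model_ref)
    _cache.setdefault(session_id, {})[key] = (runtime, cached_at)


def get(session_id: str, agent_id: str, model_ref: str,
        now: Optional[float] = None) -> Optional[Any]:
    current = time.time() if now is None else now
    entries = _cache.get(session_id, {})
    key = make_key(session_id, agent_id, model_ref)
    hit = entries.get(key)
    if hit is None:
        return None
    runtime, cached_at = hit
    if current - cached_at >= CACHE_TTL_SECONDS:
        del entries[key]  # expired: evict and report a miss
        return None
    return runtime


put("s1", "Analyst", "gpt-4o-mini", "runtime-A", now=0.0)
fresh = get("s1", "Analyst", "gpt-4o-mini", now=10.0)          # within the TTL
other_session = get("s2", "Analyst", "gpt-4o-mini", now=10.0)  # different session
expired = get("s1", "Analyst", "gpt-4o-mini", now=301.0)       # past the TTL

print(fresh, other_session, expired)
```

Because `model_ref` is part of the key, changing an agent's model produces a different key and therefore a cache miss, even before the old entry expires.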
from praisonaiagents import Agent
analyst = Agent(name="Analyst", instructions="Analyse data", llm="gpt-4o-mini")
coordinator = Agent(name="Coordinator", instructions="Coordinate", handoffs=[analyst])
# First few turns use gpt-4o-mini
coordinator.start("Quick summary of Q1 sales")
# Upgrade to a more capable model for a detailed report
analyst.llm = "gpt-4o"
coordinator.start("Full analysis of Q1 vs Q2 with trend forecasting")

from praisonaiagents.runtime import clear_runtime_cache
# After rotating API keys or changing model config
clear_runtime_cache()
agent.start("Continue with the updated model settings")

from praisonaiagents.runtime import get_runtime_cache
cache = get_runtime_cache()
total = sum(len(entries) for entries in cache.values())
print(f"Active runtimes: {total} across {len(cache)} sessions")

Model re-resolution happens at each invocation boundary. Changing `agent.llm` is effective immediately for the next call, with no restart needed.

The 5-minute TTL means old runtimes may linger after you rotate API keys. Call `clear_runtime_cache()` to evict all entries and force fresh connections.

Return `False` from `supports_model` for models you do not handle. The built-in `DefaultRuntimeResolver` acts as the final fallback, so returning `False` simply delegates back to it.

If you need to change how a handoff target executes, configure the target agent itself (instructions, tools, llm). The target agent's full `chat()` pipeline runs on every handoff; this subsystem is not in that path.
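The `supports_model` fallback described above can be pictured with a toy dispatcher. Everything here is hypothetical (`PrefixResolver`, `FallbackResolver`, `resolve_with_fallback`) and says nothing about how the SDK wires its resolvers internally. It only shows the general pattern of a custom resolver declining a model so that a catch-all handles it.

```python
class PrefixResolver:
    """Hypothetical custom resolver that handles one model-name prefix."""

    def __init__(self, prefix: str, label: str) -> None:
        self.prefix = prefix
        self.label = label

    def supports_model(self, model_ref: str) -> bool:
        return model_ref.startswith(self.prefix)

    def resolve(self, model_ref: str) -> str:
        return f"{self.label}({model_ref})"


class FallbackResolver:
    """Hypothetical catch-all, standing in for a default resolver."""

    def supports_model(self, model_ref: str) -> bool:
        return True

    def resolve(self, model_ref: str) -> str:
        return f"default({model_ref})"


def resolve_with_fallback(model_ref: str, custom: PrefixResolver,
                          fallback: FallbackResolver) -> str:
    # Ask the custom resolver first; if it declines, use the fallback.
    if custom.supports_model(model_ref):
        return custom.resolve(model_ref)
    return fallback.resolve(model_ref)


custom = PrefixResolver(prefix="my-model-", label="custom")
fallback = FallbackResolver()

handled = resolve_with_fallback("my-model-fast", custom, fallback)
delegated = resolve_with_fallback("gpt-4o-mini", custom, fallback)

print(handled)
print(delegated)
```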
All names are importable from praisonaiagents.runtime:
from praisonaiagents.runtime import (
    resolve_runtime,
    SessionContext,
    RuntimeProtocol,
    get_runtime_cache,
    clear_runtime_cache,
    set_global_resolver,
)

def resolve_runtime(
    agent_id: str,
    model_ref: str,
    session_ctx: SessionContext,
    **kwargs,
) -> AgentRuntimeProtocol:
    ...

The main entry point. Checks the TTL cache first; creates and caches a new runtime if none exists or the entry has expired.
Protocols that custom runtimes must satisfy. Only relevant when building a custom resolver.
class RuntimeProtocol(Protocol):
    def execute(self, prompt: str, **kwargs) -> Any: ...
    async def aexecute(self, prompt: str, **kwargs) -> Any: ...
    @property
    def model_ref(self) -> str: ...
    @property
    def provider(self) -> str: ...

class AgentRuntimeProtocol(RuntimeProtocol):
    @property
    def supports_streaming(self) -> bool: ...
    @property
    def supports_tools(self) -> bool: ...
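For a feel of what satisfying the `RuntimeProtocol` shape involves, the sketch below redefines the protocol locally (as `RuntimeProtocolSketch`, so that it runs without the SDK) and checks a toy runtime against it. `EchoRuntime` is hypothetical, and `@runtime_checkable` is added here only to allow the `isinstance` check; whether the SDK's own protocols are runtime-checkable is not stated in this page.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class RuntimeProtocolSketch(Protocol):
    """Local copy of the documented protocol shape, for illustration only."""

    def execute(self, prompt: str, **kwargs: Any) -> Any: ...
    async def aexecute(self, prompt: str, **kwargs: Any) -> Any: ...
    @property
    def model_ref(self) -> str: ...
    @property
    def provider(self) -> str: ...


class EchoRuntime:
    """Hypothetical runtime that echoes the prompt instead of calling a model."""

    def __init__(self, model_ref: str, provider: str) -> None:
        self._model_ref = model_ref
        self._provider = provider

    @property
    def model_ref(self) -> str:
        return self._model_ref

    @property
    def provider(self) -> str:
        return self._provider

    def execute(self, prompt: str, **kwargs: Any) -> Any:
        return f"[{self._model_ref}] {prompt}"

    async def aexecute(self, prompt: str, **kwargs: Any) -> Any:
        return self.execute(prompt, **kwargs)


runtime = EchoRuntime(model_ref="my-model-fast", provider="custom")
conforms = isinstance(runtime, RuntimeProtocolSketch)  # checks member presence only
sync_result = runtime.execute("Hello")
async_result = asyncio.run(runtime.aexecute("Hello"))

print(conforms, sync_result, async_result)
```

The `isinstance` check only verifies that the members exist; it does not validate signatures or return types, so a static type checker remains the more thorough test.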