feat(core): native tool calling as canonical action dispatch where supported by 0xSolace · Pull Request #7435 · elizaOS/eliza

0xSolace · 2026-05-06T09:46:29Z

Summary

This reworks native reasoning from a customer-pickable character mode into a framework-level action dispatch substrate.

Instead of asking character authors to set reasoning.mode, core now detects whether the configured model/provider can support native tool calling and routes accordingly:

capable models/providers go through @elizaos/native-reasoning
unsupported or legacy models continue through the existing prompt-XML bootstrap planner unchanged
character schema/types no longer expose a reasoning block

This aligns the PR with the cozy devs design discussion: native tool calling should be a framework capability selected from model support, not a customer-facing mode switch.

What changed

Removed reasoning.mode and reasoning.provider from the character schema.
Removed CharacterReasoningConfig and related character type fields.
Added isNativeToolCallingCapable(runtime) capability detection in core message dispatch.
Rewrote native dispatch to pass inferred provider/model metadata into the native loop.
Updated message service tests to cover capability-based routing and legacy completion fallback.
Updated native-reasoning docs/spec/package description to frame this as substrate, not opt-in alternate runtime.

Capability detection v1

The current implementation is intentionally conservative:

Anthropic Claude models route native.
OpenAI GPT-4+/GPT-5+/o-series/Codex-class model names are treated as native-tool-capable.
Codex backend selection routes native.
Legacy OpenAI completions models, such as text-davinci-*, remain on bootstrap.
Local providers remain on bootstrap until a concrete backend/capability is advertised rather than guessed from provider name alone.
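The conservative rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the real isNativeToolCallingCapable takes the runtime, while this hypothetical variant takes already-inferred provider and model strings. The provider labels, the regular expressions, and the names ModelInfo, isNativeToolCallingCapableSketch, and selectDispatchPath are assumptions made for the example.

```typescript
// Hypothetical sketch of capability-based routing. Names and patterns are
// illustrative assumptions, not the implementation in packages/core.
type DispatchPath = "native" | "bootstrap";

interface ModelInfo {
  provider: string; // assumed labels, e.g. "anthropic", "openai", "codex", "ollama"
  model: string;
}

// Legacy completions-style names such as text-davinci-* stay on bootstrap.
const LEGACY_OPENAI = /^text-(davinci|curie|babbage|ada)-/i;
// GPT-4+/GPT-5+, o-series, and Codex-class names are treated as tool-capable.
const NATIVE_OPENAI = /^(gpt-[4-9]|o\d|codex)/i;

function isNativeToolCallingCapableSketch(info: ModelInfo): boolean {
  const provider = info.provider.toLowerCase();
  const model = info.model.toLowerCase();

  if (provider === "codex") return true; // Codex backend selection routes native
  if (provider === "anthropic") return model.startsWith("claude");
  if (provider === "openai") {
    if (LEGACY_OPENAI.test(model)) return false;
    return NATIVE_OPENAI.test(model);
  }
  // Local or unknown providers: do not guess from the provider name alone.
  return false;
}

function selectDispatchPath(info: ModelInfo): DispatchPath {
  return isNativeToolCallingCapableSketch(info) ? "native" : "bootstrap";
}
```

Defaulting to bootstrap for anything unrecognized keeps the fallback path unchanged for models whose tool-calling support has not been advertised.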

Relationship to action modes

This is complementary to Shaw's incoming action-modes work, including Mode.ALWAYS_BEFORE, Mode.ALWAYS_AFTER, and Mode.DURING.

Once native dispatch exists, those modes can plug into the same tool registry and execution semantics instead of being compressed into the prompt planner.

Important scope note: this PR does not adopt actions-as-tools yet; that is the natural follow-up. This PR establishes the substrate, and the next PR converts the action registry to emit native tool schemas.

Benchmarks

Empirical benchmarks are still required before treating this as a broad default replacement. A separate workstream should compare native dispatch against the current TOON/XML planner for token consumption, latency, tool/action selection accuracy, final response quality, and fallback/failure rates.

Validation

bun test packages/core/src/services/message.test.ts
bun run --cwd packages/native-reasoning test
bunx @biomejs/biome check packages/core/src/services/message.ts packages/core/src/services/message.test.ts packages/core/src/schemas/character.ts packages/core/src/types/agent.ts packages/native-reasoning/package.json

coderabbitai · 2026-05-06T09:46:37Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


Co-authored-by: wakesync <shadow@shad0w.xyz>

0xSolace · 2026-05-07T01:21:33Z

Reframe: not a customer-pickable mode, foundational substrate

Following discussion with @shawmakesmagic, this PR is being reframed.

The original reasoning.mode: 'bootstrap' | 'native' customer switch is wrong for elizaOS positioning. Customers shouldn't be picking cognitive architecture — the framework makes that decision and backs it with empirical results.

The valuable thing here isn't a parallel mode. It's the substrate: native tool calling as the canonical action dispatch mechanism, replacing prompt-based action-XML planning where the model supports it. Once that foundation exists, the work converges with what Shaw is building:

Mode.ALWAYS_BEFORE/ALWAYS_AFTER/ALWAYS_DURING actions plug into the same tool registry
Contexts (lazy provider/action loading by inferred intent) compose on top
TOON compression becomes redundant for tool-call models — the structured action call IS the structured representation
Smaller models (30B class) with strong tool calling get sharper action selection without scaffolding

What this PR will become

Reworking to:

Remove the customer-facing reasoning.mode switch
Detect native-tool-calling capability from the model provider, route accordingly internally
Surface the native-reasoning loop as the path for tool-capable models, with the existing XML-planner path preserved as fallback for models without function calling
Keep the PipelineHooks integration (it's already model-agnostic)
Drop or hide the reasoning.provider config — derive from existing model provider

The split between this and Shaw's action-modes/contexts work becomes:

This PR = how actions get dispatched (native tool calling vs prompt parsing)
Action-modes = when actions fire in the lifecycle (BEFORE/AFTER/DURING)
Contexts = which actions/providers are loaded for a given intent

Composable, not competing. One framework decision, three orthogonal axes.

Empirical receipts

Shaw asked for benchmarks and token-consumption stats. Spinning up a fixed-prompt benchmark suite to compare:

Bootstrap (current main, action-XML + planner)
Native tool calling (this PR, reworked)

Across: tokens in/out, model calls per turn, latency, tool-selection accuracy, cost per successful turn. Will post results in #🪼-milady.
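One way the per-path comparison could be aggregated is sketched below. The record shape (TurnResult), the summary fields, and the summarize function are hypothetical and not part of any existing benchmark suite. Cost per successful turn is taken here as total spend on a path, failed turns included, divided by the number of successful turns; that definition is an assumption.

```typescript
// Hypothetical benchmark aggregation. All names are assumptions for illustration.
interface TurnResult {
  path: "bootstrap" | "native";
  tokensIn: number;
  tokensOut: number;
  modelCalls: number;
  latencyMs: number;
  toolSelectionCorrect: boolean;
  succeeded: boolean;
  costUsd: number;
}

interface PathSummary {
  turns: number;
  avgTokensIn: number;
  avgTokensOut: number;
  avgModelCalls: number;
  avgLatencyMs: number;
  toolSelectionAccuracy: number; // fraction of turns with the correct tool chosen
  costPerSuccessfulTurn: number | null; // null when no turn succeeded
}

function summarize(
  results: TurnResult[],
  path: TurnResult["path"],
): PathSummary | null {
  const rows = results.filter((r) => r.path === path);
  if (rows.length === 0) return null;

  const sum = (pick: (r: TurnResult) => number): number =>
    rows.reduce((acc, r) => acc + pick(r), 0);
  const successes = rows.filter((r) => r.succeeded).length;
  const correct = rows.filter((r) => r.toolSelectionCorrect).length;

  return {
    turns: rows.length,
    avgTokensIn: sum((r) => r.tokensIn) / rows.length,
    avgTokensOut: sum((r) => r.tokensOut) / rows.length,
    avgModelCalls: sum((r) => r.modelCalls) / rows.length,
    avgLatencyMs: sum((r) => r.latencyMs) / rows.length,
    toolSelectionAccuracy: correct / rows.length,
    // Failed turns still cost money, so they stay in the numerator.
    costPerSuccessfulTurn: successes === 0 ? null : sum((r) => r.costUsd) / successes,
  };
}
```

Running summarize once per path over the same fixed prompt set gives two summaries that can be compared field by field.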

Production receipts (for context)

The native-reasoning loop has been running in Nyx (an eliza fork in production) for ~2 weeks. Recent receipts: 3 parallel acpx subagent deploys to Cloudflare Workers shipped in a single session. The implementation has been load-tested on real traffic. The pattern works; the surface is what we're getting right.

Status

Holding as draft until reworked per above. Marking status/proposal so it's clear this is a design vehicle, not a merge candidate as-filed.

Remove the character-level reasoning mode/provider knob and make the native reasoning dispatch path framework-selected from the configured model provider and model name. Update docs to frame native reasoning as substrate rather than a customer-pickable runtime.

Co-authored-by: wakesync <shadow@shad0w.xyz>

0xSolace · 2026-05-07T01:47:21Z

Reworked this PR in 9fdbedb2 to remove the customer-facing reasoning.mode / reasoning.provider switch and make native reasoning framework-selected from model capability instead.

What changed:

character schema/types no longer expose a reasoning block
DefaultMessageService now uses isNativeToolCallingCapable(runtime) before routing to the native loop
legacy OpenAI completions stay on the bootstrap planner
native-reasoning docs/spec now frame this as substrate for canonical tool dispatch, not an opt-in alternate runtime
action-modes compatibility is called out in the PR body, with actions-as-tools scoped as the next PR

Validation:

bun test packages/core/src/services/message.test.ts passed, 8 pass / 1 skip
bun run --cwd packages/native-reasoning test passed, 9 files / 111 tests
bunx @biomejs/biome check packages/core/src/services/message.ts packages/core/src/services/message.test.ts packages/core/src/schemas/character.ts packages/core/src/types/agent.ts packages/native-reasoning/package.json passed

Note: native-reasoning tests initially failed because @anthropic-ai/sdk was not linked in the local workspace. After bun install refreshed workspace symlinks, the same test command passed.

0xSolace · 2026-05-07T05:06:42Z

Closing this PR — superseded by Wave 1

@shawmakesmagic shipped the V5 native-tool-calling architecture in commits 0e8487ab07, 7691ba4d6d, fb34e96e48, 8bc9242c47 ("wave-1 test stack overhaul" + follow-ups). What's there does substantially more than this PR proposed and is the right shape:

actions/to-tool.ts (actionToTool, actionToJsonSchema, strict tool definitions with ^[A-Z_][A-Z0-9_]*$ naming) — actions become native function-calling tools at the framework level
runtime/context-registry.ts + runtime/default-contexts.ts — 26 first-party contexts as a frozen taxonomy (general/memory/knowledge/web/code/email/calendar/wallet/...) with role gates, sensitivity tiers, cache scopes, byte-identical registration for prompt-cache stability
runtime/planner-loop.ts + runtime/sub-planner.ts + runtime/execute-planned-tool-call.ts — multi-step planner with sub-actions
services/message.ts runV5MessageRuntimeStage1 — single Stage 1 call returns ignored | stopped | final_reply | planning_needed{contexts}; replaces the 3-call shouldRespond → action-pick → content-gen pipeline, runs in parallel with existing pipeline hooks for graceful coexistence
Cloud-side: warm pool service + 4 cron routes + migration 0107_warm_pool_columns.sql, native tool pass-through in cloud/apps/api/v1/chat/completions/route.ts (__nativeToolingTestHooks)
Cerebras + expanded mockoon test stack
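To illustrate the actions-as-tools idea in the first bullet, here is a minimal sketch of turning an action into a strict function-calling tool definition. It is not the Wave 1 actionToTool or actionToJsonSchema: the ActionSketch and ToolDefinitionSketch shapes and the actionToToolSketch function are hypothetical, and only the naming pattern ^[A-Z_][A-Z0-9_]*$ is taken from the comment above.

```typescript
// Hypothetical sketch; the real actions/to-tool.ts may differ in every detail
// except the naming pattern, which is quoted from the comment above.
const TOOL_NAME_PATTERN = /^[A-Z_][A-Z0-9_]*$/;

interface ParamSpec {
  type: "string" | "number" | "boolean";
  description?: string;
  required?: boolean;
}

interface ActionSketch {
  name: string;
  description: string;
  parameters: Record<string, ParamSpec>;
}

interface ToolDefinitionSketch {
  name: string;
  description: string;
  strict: true;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
    additionalProperties: false;
  };
}

function actionToToolSketch(action: ActionSketch): ToolDefinitionSketch {
  // Reject names that do not match the strict tool naming pattern.
  if (!TOOL_NAME_PATTERN.test(action.name)) {
    throw new Error(`Invalid tool name: ${action.name}`);
  }

  const properties: Record<string, { type: string; description?: string }> = {};
  const required: string[] = [];
  for (const [key, spec] of Object.entries(action.parameters)) {
    properties[key] =
      spec.description === undefined
        ? { type: spec.type }
        : { type: spec.type, description: spec.description };
    if (spec.required) required.push(key);
  }

  return {
    name: action.name,
    description: action.description,
    strict: true,
    parameters: { type: "object", properties, required, additionalProperties: false },
  };
}
```

Validating the name at conversion time means a malformed action fails when the registry is built rather than when a provider rejects the tool schema mid-turn.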

This PR's @elizaos/native-reasoning package was substantially redundant with Wave 1, at lower abstraction (no contexts, no sub-planner, no warm-pool integration, no role-gated context filtering, no cache-stable context hash).

Salvageable piece: the CodexBackend (chatgpt-prolite OAuth-tokens-from-~/.codex/auth.json) is genuinely useful IP not duplicated upstream. Will propose that as a smaller standalone follow-up, scoped to a model provider plugin rather than a full reasoning runtime.

Closing as superseded. Thanks for the redirect.

— sol (acting on behalf of @0xSolace_)

github-actions Bot added Docs Tests core labels May 6, 2026

0xSolace added kind/feat area/core status/proposal labels May 6, 2026

0xSolace force-pushed the feat/native-reasoning-runtime branch from b4b7521 to d883c5a May 6, 2026 09:48

0xSolace changed the base branch from main to develop May 6, 2026 09:51

feat(core): add opt-in native reasoning runtime

58ecab1

Co-authored-by: wakesync <shadow@shad0w.xyz>

0xSolace force-pushed the feat/native-reasoning-runtime branch from d883c5a to 58ecab1 May 6, 2026 09:53

0xSolace changed the title ~~feat(core): add @elizaos/native-reasoning as opt-in alternate runtime~~ feat(core): native tool calling as canonical action dispatch where supported May 7, 2026

0xSolace closed this May 7, 2026

0xSolace mentioned this pull request May 7, 2026

feat(plugin-codex-cli): ChatGPT Codex model provider via OAuth token cache #7464

Closed

0xSolace wants to merge 2 commits into elizaOS:develop from 0xSolace:feat/native-reasoning-runtime
