
fix(core): preserve speaker names in prior dialogue context #7895

Open
RemilioNubilio wants to merge 2 commits into elizaOS:develop from RemilioNubilio:nubs/prior-dialogue-speaker-context-20260523

Conversation

Contributor

@RemilioNubilio RemilioNubilio commented May 23, 2026

Summary

  • preserves connector-provided sender names when RECENT_MESSAGES history is rendered as structured prior_message:user blocks
  • keeps the existing provider-text dedupe behavior, so full # Conversation Messages provider text still stays out of Stage 1 prompts
  • adds a regression test for the Discord failure shape where a later turn asks about a named participant from earlier chat context

Why

A live Discord exchange showed Remilio/Nubilio losing speaker attribution after earlier chat context was converted into anonymous prior_message:user blocks. The model saw prior text, but not who said it, so it failed follow-up questions like "look in the chat" / "her and botdick" even though the relevant messages were present.

This patch is connector-agnostic: it reads the existing metadata.sender / entity-name fields already written by connectors and prefixes prior dialogue content with the sender name. Reply references already use this shape.
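The prefixing idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the patch: the `PriorMessage` shape, the exact field names under `metadata.sender`, and the helper names `speakerNameOf` and `renderPriorDialogue` are all hypothetical.

```typescript
// Hypothetical, simplified message shape; the real core memory type is richer.
interface PriorMessage {
  text: string;
  entityName?: string;
  metadata?: { sender?: { name?: string; username?: string } };
}

// Return the first non-empty connector-provided name, if any.
function speakerNameOf(message: PriorMessage): string | undefined {
  const candidates = [
    message.metadata?.sender?.name,
    message.metadata?.sender?.username,
    message.entityName,
  ];
  for (const candidate of candidates) {
    const trimmed = candidate?.trim();
    if (trimmed) return trimmed;
  }
  return undefined;
}

// Prefix the text with "name: " when a name is known; otherwise leave it unchanged.
function renderPriorDialogue(message: PriorMessage): string {
  const name = speakerNameOf(message);
  return name ? `${name}: ${message.text}` : message.text;
}
```

Messages without any sender metadata fall through unchanged, so connectors that never wrote a name would keep producing the same blocks as before.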

Validation

  • Synced branch to latest origin/develop (fa83ddee6f) via fast-forward before patching
  • bun vitest run src/__tests__/message-runtime-stage1.test.ts src/features/basic-capabilities/providers/recentMessages.test.ts src/runtime/__tests__/planner-loop.test.ts from packages/core: 94 tests passed
  • bun run typecheck from packages/core: passed
  • bun run build from packages/core: passed
  • Baseline after sync before patch:
    • core focused suite: 119 tests passed
    • orchestrator focused suite: 120 tests passed
  • Live Discord smoke after restart from local source:
    • setup message: 1507674411552084059
    • prompt: 1507674422939750474
    • bot reply: 1507674432225939507
    • trajectory: tj-224f2528525703
    • model: gpt-oss-120b
    • tool calls: 0, failures: 0
    • result: answered the shebotdick/botdick compatibility question using the recent channel context instead of claiming missing information

Greptile Summary

This PR fixes speaker attribution in prior-dialogue context by prefixing each prior_message:user block with the sender's name drawn from metadata.sender / entity-name fields, and adds a core.contextual_identity_lookup_requires_recall evaluator that reroutes short, locally-scoped "who is X?" questions through memory/messaging rather than letting the simple shortcut guess.

  • Speaker attribution (priorDialogueSpeakerName + priorDialogueContent): reads connector-supplied sender metadata and renders prior-dialogue segments as "name: text", preserving the existing dedupe behavior for provider text.
  • Identity-lookup evaluator (extractContextualIdentityLookupSubject + isShortContextualLookupSubject): detects single-turn questions about short, local-looking handles and injects an identity_lookup_policy context slice, routing the planner toward recall before answering.
  • Tests: three new Stage 1 test cases cover the Discord failure shape, the routing trigger, and a negative case for public-entity questions.

Confidence Score: 4/5

Safe to merge for the speaker-attribution fix; the identity-lookup evaluator has a case-sensitivity inconsistency that causes it to silently skip capitalized chat handles.

The speaker-attribution change is narrow and well-tested. In the identity-lookup evaluator, isShortContextualLookupSubject tests the third looksLikeLocalName condition against the original-case subject instead of the lowercased normalized value. As a result, a Discord display name like Shebotdick (capital S) would not trigger recall routing while the lowercase spelling would, so capitalized handles quietly fall back to the guessing path the feature is meant to prevent.
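The effect of the inconsistency can be illustrated with a small sketch. The regexes, the length limit, and the function names below are invented for illustration and are not the patterns in message.ts; the only point is that checking one condition against the original-case string makes the result depend on capitalization.

```typescript
// Hypothetical stand-ins for the "looks like a local chat handle" conditions.
const CONTAINS_DIGIT = /[0-9]/;
const CONTAINS_SEPARATOR = /[_.-]/;
const LOWERCASE_BOT_WORD = /^[a-z]+bot[a-z]*$/;

// Buggy variant: the third condition sees the original-case subject.
function looksLocalBuggy(subject: string): boolean {
  const normalized = subject.trim().toLowerCase();
  if (normalized.length === 0 || normalized.length > 24) return false;
  return (
    CONTAINS_DIGIT.test(normalized) ||
    CONTAINS_SEPARATOR.test(normalized) ||
    LOWERCASE_BOT_WORD.test(subject) // original case: "Shebotdick" fails here
  );
}

// Fixed variant: every condition sees the normalized subject.
function looksLocalFixed(subject: string): boolean {
  const normalized = subject.trim().toLowerCase();
  if (normalized.length === 0 || normalized.length > 24) return false;
  return (
    CONTAINS_DIGIT.test(normalized) ||
    CONTAINS_SEPARATOR.test(normalized) ||
    LOWERCASE_BOT_WORD.test(normalized)
  );
}
```

Under these assumed patterns, `looksLocalBuggy("shebotdick")` is true but `looksLocalBuggy("Shebotdick")` is false, while `looksLocalFixed` accepts both spellings and still rejects a subject such as "Obama".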

packages/core/src/services/message.ts — specifically the isShortContextualLookupSubject function and its looksLikeLocalName regex.

Important Files Changed

  • packages/core/src/services/message.ts: Adds speaker-name attribution to prior-dialogue context blocks and a new core.contextual_identity_lookup_requires_recall evaluator; contains a case-sensitivity inconsistency in isShortContextualLookupSubject where the last looksLikeLocalName regex tests the original-case subject instead of normalized, causing capitalized chat handles to bypass recall routing.
  • packages/core/src/__tests__/message-runtime-stage1.test.ts: Adds three new test cases: speaker-name preservation in prior dialogue blocks, contextual-identity routing through recall, and a negative case for public-entity questions; tests are well-scoped and directly exercise the changed code paths.

Reviews (2): Last reviewed commit: "fix(core): route local identity lookups ..."

@github-actions github-actions Bot added the Tests label May 23, 2026
Comment thread packages/core/src/services/message.ts
Comment thread packages/core/src/services/message.ts
@RemilioNubilio
Contributor Author

Additional trajectory evidence from the live Discord investigation:

Before

Source channel: 1481030966565797888

Failing exchange:

  • user prompt 1507667963766116494: whats the compatibility between her and botdick
  • bot reply 1507667972901044325: said it did not have enough information about who her and botdick were
  • trajectory tj-0acd867623935c

Follow-up:

  • user prompt 1507668050311254127: look in the chat bozo
  • bot reply 1507668083551244350: claimed it scanned recent chat but found no mention
  • trajectory tj-0b1dd2e55d91ad

What the trajectories showed: the relevant text was present in the Stage 1 prompt, but structured history had been reduced to anonymous prior_message:user blocks, so the model saw message contents without reliable speaker identity.

After

Smoke channel: 1490836448755060920

Live verification after restarting from local source:

  • setup message 1507674411552084059: botdick, meet shebotdick...
  • prompt 1507674422939750474: what is the compatibility between her and botdick?
  • bot reply 1507674432225939507: correctly answered that shebotdick and botdick complement each other
  • trajectory tj-224f2528525703
  • model: gpt-oss-120b
  • tool calls: 0
  • tool failures: 0

Note: in the live "after" smoke test, the setup message was in the connector-provided recent channel context, which already includes names. The new unit regression directly covers the lower-level prior_message:user path from the failing trajectories and asserts that it now renders botdick: ... / 1gig: ... while still suppressing duplication of the full provider text.

@coderabbitai
Contributor

coderabbitai Bot commented May 23, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@RemilioNubilio
Contributor Author

Follow-up pushed in b2d62be0cd after another live Discord failure exposed a related routing gap.

What changed:

  • Added a built-in response-handler evaluator for short local/chat identity lookups such as "who is queuebot?".
  • If Stage 1 tries to answer these through the simple shortcut, the evaluator clears the ungrounded reply and routes through the planner with general context plus a MESSAGE action hint when available.
  • The guard is generic and intentionally narrow: it applies to local-looking short subjects only. A regression test keeps ordinary public questions like "who is obama?" on the direct reply path.
  • Added planner context-slice rendering so evaluator-added grounding policy is visible to Stage 2.
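The routing decision described in this list can be sketched roughly as below. The question pattern, the `Route` type, and both function names are hypothetical, and the "local-looking" check is passed in as a predicate because its real heuristics are not shown in this conversation.

```typescript
type Route = "direct_reply" | "planner";

// Hypothetical pattern for single-turn "who is X?" questions.
const WHO_IS_PATTERN = /^\s*who\s+is\s+(.+?)\s*\?*\s*$/i;

// Extract the subject of a "who is X?" question, or undefined if it is not one.
function extractLookupSubject(text: string): string | undefined {
  const match = WHO_IS_PATTERN.exec(text);
  return match ? match[1] : undefined;
}

// Send local-looking identity lookups to the planner; leave everything else
// on the direct reply path.
function routeForMessage(
  text: string,
  isLocalLooking: (subject: string) => boolean,
): Route {
  const subject = extractLookupSubject(text);
  return subject !== undefined && isLocalLooking(subject)
    ? "planner"
    : "direct_reply";
}
```

With a stand-in predicate such as `(s) => s.toLowerCase().endsWith("bot")`, the question "who is queuebot?" routes to the planner while "who is obama?" stays on the direct reply path.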

Evidence:

  • Before: Discord prompt 1507683002614812734, bot reply 1507683017735278716, trajectory tj-41867f0b8ac1e7. Stage 1 used gpt-oss-120b, plannerIterations=0, toolCallsExecuted=0, and guessed that shebotdick was the same bot/persona as botdick. The original defining Discord message 1507664178456825866 was outside the bounded Stage 1 input.
  • Second failure found while inspecting latest trajectories: tj-6054febf8f622e showed the first version cleared the bad reply but selected unavailable memory/messaging contexts in the live guest role, producing an empty send. Fixed by always including general as the portable planner context.
  • After: test-channel prompt 1507693238914387978, bot reply 1507693264726130698, trajectory tj-66c28d4517bbf8. Route is now messageHandler -> toolSearch -> planner, plannerIterations=1, toolCallsExecuted=0, toolCallFailures=0, and the trajectory contains core.contextual_identity_lookup_requires_recall, identity_lookup_policy, and MESSAGE in the action surface.
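The context-selection fix from the second bullet above can be sketched as follows. The function name `selectLookupContexts`, the `available` parameter, and the use of plain strings for context names are assumptions based on the description, not the code in the PR.

```typescript
// Contexts the evaluator would prefer for recall, per the description above.
const PREFERRED_LOOKUP_CONTEXTS = ["memory", "messaging"] as const;

// Always include "general", then add the preferred contexts only when the
// current role actually has them available.
function selectLookupContexts(available: ReadonlySet<string>): string[] {
  const selected = ["general"];
  for (const context of PREFERRED_LOOKUP_CONTEXTS) {
    if (available.has(context)) selected.push(context);
  }
  return selected;
}
```

For a guest role where only `general` is available, this yields `["general"]` rather than an empty or unusable selection, which is the failure shape seen in tj-6054febf8f622e.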

Validation:

  • bunx @biomejs/biome check --write src/services/message.ts src/__tests__/message-runtime-stage1.test.ts
  • bun vitest run src/__tests__/message-runtime-stage1.test.ts src/features/basic-capabilities/providers/recentMessages.test.ts src/runtime/__tests__/response-handler-evaluators.test.ts — 66 tests passed
  • bun run typecheck from packages/core
  • bun run build from packages/core
  • Branch rechecked against upstream: origin/develop...HEAD is 0 behind / 2 ahead.

2 participants