Skip to content

feat: implement chatroom for multi-llm conversation#162

Merged
dolaameng merged 30 commits into
cifrom
dolaameng/chatroom
Jun 2, 2026
Merged

feat: implement chatroom for multi-llm conversation#162
dolaameng merged 30 commits into
cifrom
dolaameng/chatroom

Conversation

@dolaameng

@dolaameng dolaameng commented May 14, 2026

Copy link
Copy Markdown
Collaborator

ChatRoom — Multi-agent conversation primitive

What

ChatRoom is a shared conversation context for kaggle-benchmarks that lets
multiple LLMs converse with full awareness of each other's identities and roles.

Key capabilities:

  • Perspective-aware message routing — each LLM sees its own messages as
    assistant and peers' messages as attributed user messages, built fresh
    each turn from a single ground-truth log.
  • Shared-LLM, distinct-identity participants — the same LLMChat instance
    can back multiple participants in multiple rooms without cloning or shared
    mutable state. A lightweight Participant wrapper owns all per-room identity.
  • Private information — single-message visible_to restrictions for
    one-shot directives, plus full private_channel subrooms for multi-turn
    private conversations that interleave chronologically with the public timeline.
  • Narrator/system messages without role confusionroom.post() sends
    directives from a named narrator that LLMs are explicitly told (via the
    roster) to treat as system instructions, not peer speech.

Why

Multi-agent evaluation (debate, negotiation, social deduction, cooperative games)
is an increasingly important dimension for frontier LLM benchmarking, but the
existing Chat API doesn't support it natively:

  1. LLMs are unaware of each other. Each agent has an isolated chat context.
    The user manually forwards messages between them, stripping and re-injecting
    roles. LLMs have no idea they are talking to another LLM.

  2. Boilerplate is high. Existing multi-agent benchmarks in this repo each
    re-implement ~40–160 lines of manual orchestration.

  3. No conversation memory. Some benchmarks create a brand new Chat
    every turn, leaving LLMs with zero memory of previous turns.


How

Core Abstraction

A ChatRoom is a shared conversation space. Users register participants via
add_participant(), then drive the conversation inside a with room: block
using two primitives:

Primitive Purpose LLM call?
room.post(msg) Narrator broadcast (system-level directives) No
participant.reply() A participant generates an LLM response Yes
room = kbench.ChatRoom(system_prompt="A structured debate.")
alice = room.add_participant(llm, name="Alice", system_prompt="Argue FOR.")
bob   = room.add_participant(llm, name="Bob",   system_prompt="Argue AGAINST.")

with room:
    room.post("Topic: Should we phase out fossil fuels by 2035?")
    alice.reply()
    bob.reply()

Two Classes, One Responsibility Each

  • Participant — a lightweight identity wrapper around a (possibly shared)
    LLMChat. Holds name, avatar, and per-room system_prompt. Its sole
    interaction method reply() is a guard + delegate to the room.
  • ChatRoom — owns the participant roster, ground-truth transcript,
    narrator, perspective projection, and all turn orchestration.

The split matters: the room owns all per-participant customizations, the
backing LLMChat stays shared and stateless. The same llm object can back
many participants in many rooms simultaneously without interference, because
identity lives on the Participant, not on the LLMChat.

Participant Registration (add_participant)

room.add_participant(llm, *, name=, avatar=, system_prompt=):

  • No cloning. A fresh Participant wrapper is created; the underlying
    LLMChat is reused as-is. This is intentional — object identity (is) on
    the Participant is what _build_perspective uses to distinguish "my
    messages" from "their messages", so the wrapper must be fresh per participant
    but the engine underneath need not be.
  • Type-guarded. Only LLMChat instances are accepted; scripted/code-driven
    peers are explicitly not routed through here (use room.post() for
    scripted narration). This was a deliberate refactor to remove the
    "peer-and-narrator-at-once" roster ambiguity.
  • Unique names within a room. Duplicate names would break the [Name]:
    prefix convention used in perspective projection.

Perspective Projection — the core mechanism

When participant.reply() runs, ChatRoom._generate_reply() does exactly three
things:

  1. Build the system promptroster + --- + room prompt + --- + personal prompt. Rebuilt from scratch every turn (no caching) so each turn reflects
    the current roster (after any remove_participant).
  2. Build the perspective — walk room.history once. For each message:
    • If item.sender is viewer → wrap with role="assistant", no prefix.
    • Otherwise → wrap with role="user", content prefixed [Name]: ....
    • Nested ChatRoom items (private channels) are recursively inlined for
      members and skipped for non-members.
    • Messages with _meta["visible_to"] excluding the viewer are filtered out.
  3. Call respond() with system=..., input_messages=<perspective>, and
    sender=<participant>. The response is appended directly into
    room.history by respond() — no double-write, no orphan chats, no
    _meta patching.

The original messages in room.history are never mutated; projection always
creates new objects via a cached pool of synthetic Actor instances (avoids
O(N) allocations per call).

The Roster — what the LLM is told (and not told)

The auto-generated roster injected at the top of every system prompt tells the
LLM:

  • Who it is ("You are Alice.").
  • Who the narrator is, by name, front-loaded before the peer list so the
    LLM has the system-vs-peer rule in hand before it learns peer names.
  • The peer list — names only. Peers' system_prompt is never exposed,
    which is the anti-leak property hidden-role games like Werewolf depend on.
  • The [Name]: prefix convention, plus an explicit "do not prefix your own
    reply"
    instruction (LLMs commonly mirror the format they see, which would
    otherwise produce double-prefixed messages).
  • The [private: ChannelName]: convention and the rule that a private
    directive should be answered alone, not bundled with a public reply.

Private Information

Two mechanisms with different weights:

  • room.post(msg, visible_to=[...]) — single-message audience filter.
    Lightweight; right for one-shot directives (e.g. handing each player a
    secret role at game start). The message lives in the parent room's history
    with a _meta["visible_to"] tag.
  • room.private_channel([alice, bob], name="Wolf Night") — a child
    ChatRoom (full reuse, no special class). Used for multi-turn private
    conversations. The child is registered into the parent's history lazily on
    first entry; child messages are interleaved chronologically into members'
    perspectives and invisible to non-members. Validations enforce that channel
    participants are parent-room members, that channel names don't collide with
    the parent or with siblings, and that the participant list has no duplicates.

Hard-Delete Removal

remove_participant() is a hard delete — the participant disappears from
peers' rosters next turn, .reply() from a removed Participant raises
RuntimeError, and historical messages stay attributed to them. Removal does
not cascade into private channels by design (one knob per call).

Integration with Existing Framework

  • Context managerwith room: integrates with the existing
    contexts.enter() system, so chats.get_current_chat() returns the active
    room. A small _cm_stack supports reentrant with blocks (e.g. the same
    private channel re-entered each loop iteration).
  • Chat hierarchy — entering a ChatRoom lazily registers it into the
    parent chat's history the first time, so the full transcript renders in the
    Panel UI without UI changes.
  • Structured outputsreply(schema=...) works for typed responses (e.g.
    game moves, votes). Peers see the response stringified via str(content);
    override __str__ on your schema if you need to hide a private field.
  • Streaming-friendly identity_generate_reply passes sender=<participant>
    into respond() so the response Message carries the right identity from
    construction. Subscribers (UI, loggers) observe the correct sender on the
    very first new_message event, not as an after-the-fact patch.

Example: Before & After

Tic-Tac-Toe

Before — fresh chat each turn, zero memory:

while not game.is_game_over():
    with kbench.chats.new(...):           # brand new context every turn
        move = llm.prompt(state, schema=action_schema)

After — full history, attributed turns, narrator-driven game state:

room = kbench.ChatRoom()
player_x = room.add_participant(llm, name="Player X")
player_o = room.add_participant(llm, name="Player O")

with room:
    while not game.is_game_over():
        room.post(f"Board:\n{game.get_board()}")
        move = players[game.current].reply(schema=TicTacToeMove)
        game.make_move(move)

Game state is broadcast by the room's narrator (room.post) instead of by a
separate scripted Actor, matching the current "narrator owns scripted
content, participants own LLM content" split.

dolaameng added 13 commits May 14, 2026 00:47
…om framework

- Implemented ChatRoom context manager for multi-agent perspective-aware message routing.
- Added identity awareness, system prompt enrichment, and automatic roster injection.
- Added support for private channels, visible_to restrictions, and interleaved histories.
- Refactored game_werewolf_chatroom.py to dynamically scale up to 7 players.
- Added unit tests verifying multi-directional privacy and sealed bid isolation.
…oting bugs

- Fixed cache_id in runs.py and slug in serialization.py to resolve the actual model version identifier instead of participant name.
- Fixed werewolf game vote extraction robustness in game_werewolf_chatroom.py to prevent false-positives on mentioned names.
- Fixed test_corporate_takeover_chatroom assertion types in test_chatroom.py.
…fix streaming typeerrors

- Switched game_werewolf_chatroom.py to use structured WerewolfVote outputs.
- Enabled kbench.config.enable_interactive_mode() and player.stream_responses = True for live onstream rendering.
- Added survival/existence validation to voting loops to prevent crashes during ties or votes on dead players.
- Fixed panel.py new_chunk streaming TypeError by extracting string content from LLMResponse chunk objects.
- Updated tests/test_chatroom.py werewolf mock responses to return structured JSON.
…update design doc

- Remove 4-player backward compat; run_werewolf now strictly requires 7 players.
- Upgrade Alice/Bob wolf prompts with double-bluff and distancing strategies.
- Update test_werewolf_chatroom to simulate a full 2-round, 7-player game.
- Add section 9.5 to design.md for Panel streaming bug fix.
- Remove example-specific structured voting details from design doc.
…vatars

- Replace fuzzy name matching with explicit eligible name lists in vote prompts.
- Use neutral role-agnostic avatars to avoid spoiling werewolf identities.
@dolaameng dolaameng force-pushed the dolaameng/chatroom branch 2 times, most recently from bc7701a to 633ced9 Compare May 27, 2026 22:34
@dolaameng dolaameng force-pushed the dolaameng/chatroom branch from ca3ee6b to 2e84931 Compare May 28, 2026 22:04
@dolaameng dolaameng requested review from develra and s-alexey May 29, 2026 03:33
@dolaameng dolaameng marked this pull request as ready for review May 29, 2026 03:34
@dolaameng dolaameng force-pushed the dolaameng/chatroom branch from e8e876e to 9b3ec0e Compare May 29, 2026 04:02

@develra develra left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed at a high level to the best of my ability in a time-boxed way (30 minutes) - mostly looking at the examples and tests. LGTM - neat feature, but def might have missed some more subtle issues.

Comment thread src/kaggle_benchmarks/chats.py Outdated
Comment thread src/kaggle_benchmarks/actors/base.py Outdated
Comment thread src/kaggle_benchmarks/actors/llms.py Outdated
def __init__(
self,
*,
system_prompt: str | None = None,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the reason you was have to copy LLMChat instanses while adding them to a room. I find it confusing that we are adding system_prompt llm-wide attribute that is not used in prompt

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! This is also related to #162 (comment), killed two birds

Comment thread src/kaggle_benchmarks/actors/llms.py Outdated
Comment on lines +206 to +211
system = room._build_system_prompt(self)
perspective = room._build_perspective(self)

response = self.respond(
system=system, schema=schema, input_messages=perspective, **kwargs
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it belongs to room as it uses its private methods. Maybe something like room.interact(llm). This way you can store system_prompt per participant in eg ChatRoom.system_prompts.

@dolaameng dolaameng Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great insight! I have refactored the design

  1. moved application of system_prompt to chatroom.
  2. used Participant as a simple wrapper class for just llm, so we don't need to distinguish and clone
  3. moved them to a dedicated module rooms.

Comment thread src/kaggle_benchmarks/__init__.py Outdated
from kaggle_benchmarks._config import ExecutionMode, config
from kaggle_benchmarks.actors import Actor, LLMChat, system, user
from kaggle_benchmarks.chats import last_reasoning_traces
from kaggle_benchmarks.chats import ChatRoom, last_reasoning_traces

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer forcing using it like kbench.chats.ChatRoom or even move it to a separate module kbench.rooms.ChatRoom

Suggested change
from kaggle_benchmarks.chats import ChatRoom, last_reasoning_traces

@dolaameng dolaameng May 29, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Keep it in chats module for now so it's simpler on user side, and also makes potential circular imports easier. If we have more different use cases, will move it to dedicated module.

Wait, I think your other comments on system_prompt should belong to the chatroom will makes this design simpler, and so we should move them to a dedicated module. Let me try this refactoring first.

Comment thread documentation/examples/chatroom_pizza_order.py Outdated
Comment thread documentation/examples/chatroom_synthetic_turing_test.py
@dolaameng dolaameng force-pushed the dolaameng/chatroom branch 2 times, most recently from ffcf4c1 to 1760c02 Compare June 1, 2026 22:02
@dolaameng dolaameng force-pushed the dolaameng/chatroom branch from 1760c02 to 7101247 Compare June 1, 2026 22:03
@dolaameng dolaameng force-pushed the dolaameng/chatroom branch from 89ced10 to e0aeff2 Compare June 2, 2026 02:28
@dolaameng dolaameng requested a review from s-alexey June 2, 2026 03:09
@dolaameng dolaameng merged commit 7b52ceb into ci Jun 2, 2026
8 checks passed
@dolaameng dolaameng deleted the dolaameng/chatroom branch June 2, 2026 20:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants