
add: LLM Prisoner's Dilemma — when agents reason about trust, not just strategy #378

Open

abhinavk0220 wants to merge 2 commits into mesa:main from abhinavk0220:add/llm-prisoners-dilemma

Conversation

@abhinavk0220

What this is

An iterated Prisoner's Dilemma simulation where agents use LLM
Chain-of-Thought reasoning to decide whether to cooperate or defect
instead of following fixed strategies like tit-for-tat or always-defect.

Why this is scientifically interesting

The Prisoner's Dilemma has been studied for 70+ years because it
captures a fundamental tension in social systems: individual rationality
(defect for personal gain) conflicts with collective optimality
(cooperate for mutual benefit).

Classical ABM approaches this with fixed rules:

  • Always defect: pure self-interest, the Nash equilibrium
  • Tit-for-tat: copy the partner's last move
  • Pavlov: win-stay, lose-shift

These strategies are elegant but behaviorally brittle. Real humans
don't follow fixed rules: they reason about context, history,
reputation, and intent.

LLM agents do the same:

"My partner cooperated last 3 rounds but defected once when I
defected first. They seem reactive rather than purely selfish.
Cooperating now signals good faith and likely leads to stable
mutual cooperation long-term."

This produces emergent social dynamics (trust building, strategic
betrayal, forgiveness, reputation management) that no fixed strategy
can replicate.

Technical implementation

  • PrisonerAgent extends LLMAgent with CoT reasoning
  • Each round, agents receive their full interaction history as
    internal state: score, last action, partner's recent moves
  • Random pairing each round prevents fixed-partner exploitation
  • The payoff matrix is applied after both agents decide independently
    (simultaneous decisions: no sequential advantage)
  • Tracks cooperation_rate and avg_score as model-level metrics
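The round loop described above might be sketched roughly like this (an illustrative sketch of my own, not the PR's actual classes; `decide` stands in for the LLM chain-of-thought call, and agents are plain dicts for brevity):

```python
import random

# (my_move, partner_move) -> (my_points, partner_points), per the payoff matrix
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def play_round(agents, decide):
    """Randomly pair agents, collect both moves, then apply payoffs.

    Both moves are gathered before any payoff is applied, so decisions
    are effectively simultaneous: neither agent sees the other's move.
    """
    random.shuffle(agents)  # random pairing prevents fixed-partner exploitation
    for a, b in zip(agents[0::2], agents[1::2]):
        move_a = decide(a, b)  # in the real model, an LLM reasoning call
        move_b = decide(b, a)
        pts_a, pts_b = PAYOFFS[(move_a, move_b)]
        a["score"] += pts_a
        b["score"] += pts_b
        a["last_action"], b["last_action"] = move_a, move_b
```

Keeping the payoff application outside `decide` is what guarantees the no-sequential-advantage property the bullet list mentions.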

Payoff matrix

|               | Partner Cooperates | Partner Defects |
|---------------|--------------------|-----------------|
| You Cooperate | 3, 3               | 0, 5            |
| You Defect    | 5, 0               | 1, 1            |
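These numbers satisfy the standard dilemma ordering: defecting strictly dominates for either agent (5 > 3 against a cooperator, 1 > 0 against a defector), yet mutual cooperation beats mutual defection. A quick sanity check (my own sketch, not project code):

```python
# Payoff to the row player: payoff[my_move][partner_move]
payoff = {"C": {"C": 3, "D": 0}, "D": {"C": 5, "D": 1}}

# Defection strictly dominates against every partner move...
assert all(payoff["D"][p] > payoff["C"][p] for p in ("C", "D"))
# ...yet mutual cooperation beats mutual defection: the dilemma.
assert payoff["C"]["C"] > payoff["D"]["D"]
```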

Visualization

  • 🟢 Green agents: cooperated last round
  • 🔴 Red agents: defected last round
  • Agent size scales with cumulative score
  • Cooperation rate plot: tracks social trust over time
  • Average score plot: tracks collective welfare over time
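The two plotted metrics could be computed along these lines (a minimal sketch assuming each agent tracks a `last_action` and a cumulative `score`; function and field names are illustrative, not the PR's actual code):

```python
def cooperation_rate(agents):
    """Fraction of agents whose most recent move was cooperation."""
    moved = [a for a in agents if a["last_action"] is not None]
    if not moved:
        return 0.0  # before the first round, no one has acted yet
    return sum(a["last_action"] == "C" for a in moved) / len(moved)

def avg_score(agents):
    """Mean cumulative score across agents: a proxy for collective welfare."""
    return sum(a["score"] for a in agents) / len(agents)
```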

Setup

cp .env.example .env   # fill in your API key
pip install -r requirements.txt
solara run app.py

Supports Gemini, OpenAI, Anthropic, and Ollama via .env.example.

Reference

Axelrod, R. (1984). The Evolution of Cooperation. Basic Books.

abhinavKumar0206 and others added 2 commits March 12, 2026 10:40
…terated Prisoner's Dilemma simulation where agents use LLM
Chain-of-Thought reasoning to decide whether to cooperate or defect
each round, instead of fixed strategies like tit-for-tat or
always-defect.

Agents reason about partner history, trust signals, and long-term
payoff before deciding — producing emergent negotiation and
trust-building behavior that fixed-strategy models cannot capture.

Payoff matrix:
- Both cooperate: 3, 3
- Defect vs cooperate: 5, 0
- Both defect: 1, 1

Visualization tracks cooperation rate and average score over rounds.
Includes .env.example for Gemini, OpenAI, Anthropic, and Ollama.
Reference: Axelrod, R. (1984). The Evolution of Cooperation.