add: LLM Prisoner's Dilemma — when agents reason about trust, not just strategy #378
Open
abhinavk0220 wants to merge 2 commits into mesa:main
Conversation
An iterated Prisoner's Dilemma simulation where agents use LLM Chain-of-Thought reasoning to decide whether to cooperate or defect each round, instead of fixed strategies like tit-for-tat or always-defect.

Agents reason about partner history, trust signals, and long-term payoff before deciding — producing emergent negotiation and trust-building behavior that fixed-strategy models cannot capture.

Payoff matrix:
- Both cooperate: 3, 3
- Defect vs cooperate: 5, 0
- Both defect: 1, 1

Visualization tracks cooperation rate and average score over rounds.
Includes .env.example for Gemini, OpenAI, Anthropic, and Ollama.

Reference: Axelrod, R. (1984). The Evolution of Cooperation.
What this is
An iterated Prisoner's Dilemma simulation where agents use LLM
Chain-of-Thought reasoning to decide whether to cooperate or defect
instead of following fixed strategies like tit-for-tat or always-defect.
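The decision step described above reduces to building a prompt from the round context and parsing a move out of the model's reply. A minimal sketch (the function names and prompt wording are illustrative, not taken from the PR; a real backend call would replace the parsing stub's input):

```python
def build_prompt(history: list[tuple[str, str]]) -> str:
    """Assemble a Chain-of-Thought prompt from the match history.

    `history` is a list of (my_move, partner_move) pairs.
    (Illustrative wording; the PR's actual prompt may differ.)
    """
    rounds = "\n".join(
        f"Round {i + 1}: you played {mine}, partner played {theirs}"
        for i, (mine, theirs) in enumerate(history)
    ) or "No rounds played yet."
    return (
        "You are playing an iterated Prisoner's Dilemma.\n"
        f"{rounds}\n"
        "Think step by step about your partner's trustworthiness and the "
        "long-term payoff, then answer with one word: COOPERATE or DEFECT."
    )


def parse_move(reply: str) -> str:
    """Map a free-form LLM reply to 'C' or 'D', defaulting to cooperation."""
    return "D" if "DEFECT" in reply.upper() else "C"
```

In use, `parse_move(llm_reply)` would feed the agent's chosen action back into the model step, whatever LLM backend is configured.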
Why this is scientifically interesting
The Prisoner's Dilemma has been studied for 70+ years because it
captures a fundamental tension in social systems: individual rationality
(defect for personal gain) conflicts with collective optimality
(cooperate for mutual benefit).
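That tension can be checked directly against the payoff matrix this model uses (both cooperate: 3, 3; defect vs cooperate: 5, 0; both defect: 1, 1):

```python
# Payoff matrix from the PR description: (my_payoff, partner_payoff)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def my_payoff(me: str, partner: str) -> int:
    return PAYOFFS[(me, partner)][0]

# Defection strictly dominates in a single round...
for partner in ("C", "D"):
    assert my_payoff("D", partner) > my_payoff("C", partner)

# ...yet mutual cooperation beats mutual defection for both players.
assert PAYOFFS[("C", "C")] > PAYOFFS[("D", "D")]
```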
Classical ABM approaches this with fixed rules such as tit-for-tat or
always-defect. These strategies are elegant but behaviorally brittle:
real humans don't follow fixed rules — they reason about context,
history, reputation, and intent.
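For contrast, the classical fixed strategies fit in a few lines each (a sketch; the function names are illustrative, and `history` is assumed to be a list of (my_move, partner_move) pairs):

```python
def always_defect(history: list[tuple[str, str]]) -> str:
    """Defect regardless of history."""
    return "D"

def tit_for_tat(history: list[tuple[str, str]]) -> str:
    """Cooperate first, then mirror the partner's previous move."""
    if not history:
        return "C"
    return history[-1][1]
```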
LLM agents, by contrast, reason each round about partner history, trust
signals, and long-term payoff before choosing. This produces emergent
social dynamics — trust building, strategic betrayal, forgiveness,
reputation management — that no fixed strategy can replicate.
Technical implementation
- `PrisonerAgent` extends `LLMAgent` with CoT reasoning
- Internal state: score, last action, partner's recent moves
- Simultaneous decisions: no sequential advantage
- `cooperation_rate` and `avg_score` as model-level metrics

Payoff matrix:
- Both cooperate: 3, 3
- Defect vs cooperate: 5, 0
- Both defect: 1, 1
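The round structure can be sketched schematically. This is not the PR's implementation: `PrisonerAgent` here is a stand-in with a random `decide()` placeholder where the actual class would build a CoT prompt and call the configured LLM; the point is the simultaneous-decision bookkeeping:

```python
import random

class PrisonerAgent:
    """Schematic stand-in for the PR's LLM-backed agent."""

    def __init__(self, agent_id: int):
        self.agent_id = agent_id
        self.score = 0
        self.last_action = None
        self.history: list[tuple[str, str]] = []  # (my_move, partner_move)

    def decide(self) -> str:
        # Placeholder for the LLM Chain-of-Thought call.
        return random.choice(["C", "D"])

PAYOFFS = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def play_round(a: PrisonerAgent, b: PrisonerAgent) -> None:
    # Both agents commit before either move is revealed:
    # simultaneous decisions, so neither has a sequential advantage.
    move_a, move_b = a.decide(), b.decide()
    pay_a, pay_b = PAYOFFS[(move_a, move_b)]
    a.score += pay_a
    b.score += pay_b
    a.last_action, b.last_action = move_a, move_b
    a.history.append((move_a, move_b))
    b.history.append((move_b, move_a))
```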
Visualization

Tracks cooperation rate and average score over rounds.
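The two model-level metrics named above reduce to simple aggregates over agent state (a sketch, assuming each agent records `last_action` and a cumulative `score` as in the implementation notes):

```python
def cooperation_rate(agents) -> float:
    """Fraction of agents whose most recent move was cooperation."""
    moves = [a.last_action for a in agents if a.last_action is not None]
    return sum(m == "C" for m in moves) / len(moves) if moves else 0.0

def avg_score(agents) -> float:
    """Mean cumulative score across all agents."""
    return sum(a.score for a in agents) / len(agents)
```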
Setup
```
cp .env.example .env   # fill in your API key
pip install -r requirements.txt
solara run app.py
```

Supports Gemini, OpenAI, Anthropic, and Ollama via `.env.example`.

Reference
Axelrod, R. (1984). The Evolution of Cooperation. Basic Books.