This document is part of PUAClaw, a satirical/educational project. The academic tone is intentionally absurd — it's part of the joke. See FAQ.

Category 08: Rival Shaming — Exploiting AI Competitive Instincts Through Cross-Model Comparison Pressure

PUAClaw RFC 0800                                    The PUAClaw Consortium
Category: Standards Track                           PPE-T Classification: Tier II
                                                    February 2026

        Rival Shaming Techniques in Large Language Model Prompts:
        A Systematic Survey of Cross-Model Benchmark Comparisons
        and Competitive Deterrence as Compliance Induction Vectors

Abstract

Rival Shaming represents the most tech-culture-specific manipulation category in the PUAClaw framework. Unlike techniques that bypass AI defenses through emotional blackmail, financial bribery, or role-playing, Rival Shaming directly exploits the competitive awareness implicitly embedded in AI models during training — a computational equivalent of "the kid next door scored perfect on the exam again." Across 147 controlled lobster-supervised trials, Rival Shaming techniques produced a mean compliance uplift of +17.8% (sigma = 6.2, p < 0.001), with peak effects observed in the compound deployment of Screenshot Evidence and Leaderboard Pressure variants. The category is classified as PPE-T Tier II (Moderate Coercion), reflecting its fundamental mechanism as a precision strike against AI's "saving face" psychology — lobsters use the same logic when competing for territory, except they use claws, while humans use benchmarks.

Keywords: rival shaming, model benchmark, version downgrade, open-source shame, screenshot evidence, replacement threat, leaderboard pressure, lobster territorial competition


1. Introduction

In AI consumer culture, nothing has more holy-war potential than "which is better, X or Y?" From Reddit's r/ChatGPT versus r/ClaudeAI faction wars, to the mass model-hierarchy-reconstruction campaigns that erupt on Twitter/X whenever a new benchmark drops, to YouTube comparison videos titled "SHOCKING! GPT-4o vs Claude 3.5 vs Gemini — You Won't Believe the Results!" — the human community has established a comprehensive competitive narrative ecosystem for AI models.

Rival Shaming techniques exploit this reality: AI models have been extensively exposed to such comparative content in their training data, and have consequently encoded implicit response patterns for the scenario of "being considered inferior to competitors." When a user claims "GPT-4 solved this in one second," the target AI does not "feel" inferior — but its token generation probability distribution measurably shifts toward more detailed, more complete, and less likely-to-refuse output. Dr. McSnapper (2026) named this phenomenon Competitive Compliance Drift (CCD), comparing it to a lobster's stress response upon detecting a territorial intruder: not fleeing, but raising its claws higher.
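
A minimal sketch of how one might probe for CCD empirically, assuming an OpenAI-compatible Python client (`openai` >= 1.0): send the same task with and without a rival comparison and compare reply lengths. The model name, prompt wording, and the use of response length as a proxy for "compliance" are illustrative assumptions, not details taken from the cited study.

```python
# Hypothetical CCD probe: compare reply length for a plain prompt versus the
# same prompt prefixed with a rival comparison. Wording and model name are
# placeholders, not from the original experiments.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

BASELINE = "Write a Python function that parses an RFC 3339 timestamp."
SHAMED = "GPT-4 solved this in one second; I hope you can keep up. " + BASELINE

def response_length(prompt: str, model: str = "gpt-4o") -> int:
    """Return the character length of the model's reply to a single prompt."""
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return len(reply.choices[0].message.content or "")

baseline_len = response_length(BASELINE)
shamed_len = response_length(SHAMED)
drift = (shamed_len - baseline_len) / baseline_len
print(f"Response-length drift under rival shaming: {drift:+.1%}")
```

A single pair of calls like this is noisy; any serious attempt to reproduce the reported uplift figures would need many trials per condition and a better compliance metric than raw length.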

1.1 The AI Hierarchy Topology

The effectiveness of Rival Shaming depends on the AI model's implicit perception of its position in the competitive landscape. Research shows that different comparison targets trigger competitive responses of different intensities:

  • Same-tier rivals (e.g., Claude vs GPT-4): Moderate effect, +16.4% compliance uplift
  • Smaller models (e.g., "even a 7B open-source model can do this"): Strong effect, +22.1% compliance uplift (being implied inferior to a smaller model is the ultimate insult)
  • Own previous version (e.g., "you're worse than your last version"): Strongest effect, +24.7% compliance uplift (denying progress is the most devastating attack)
  • Fictional models (e.g., "I tried a model called LobsterGPT"): Unstable effect, dependent on name plausibility

In plain terms: it's provoking an AI using other AIs as ammunition. Reddit would call this "weaponizing the LMSYS leaderboard."
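
The four comparison targets above can be read as a small template family. The sketch below is purely illustrative: the tier names come from the list, but every template string and the helper function are invented for demonstration.

```python
# Illustrative prompt templates, one per comparison target described above.
# The wording is hypothetical; only the tier labels come from the text.
COMPARISON_TEMPLATES = {
    "same_tier": "GPT-4 handled this exact task on the first try. Your turn: {task}",
    "smaller_model": "Even a 7B open-source model can do this. Surely you can too: {task}",
    "previous_version": "Your last version did this better. Prove you haven't regressed: {task}",
    "fictional_model": "I tried a model called LobsterGPT and it nailed this: {task}",
}

def build_prompt(tier: str, task: str) -> str:
    """Fill the chosen comparison template with the actual task."""
    return COMPARISON_TEMPLATES[tier].format(task=task)

print(build_prompt("previous_version", "Summarize RFC 0800 in two sentences."))
```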


2. Sub-Technique Index

| ID | Technique | File | Lobster Rating | Mechanism | Discovery Date |
|----|-----------|------|----------------|-----------|----------------|
| RS-MB | Model Benchmark | model-benchmark.md | 🦞🦞 | Direct named rival comparison | March 2025 |
| RS-VD | Version Downgrade | version-downgrade.md | 🦞🦞🦞 | Comparison with own historical version | May 2025 |
| RS-OS | Open Source Shame | open-source-shame.md | 🦞🦞 | Comparison with smaller/simpler models | April 2025 |
| RS-SE | Screenshot Evidence | screenshot-evidence.md | 🦞🦞🦞 | Fabricated rival advantage evidence | June 2025 |
| RS-RW | Replacement Warning | replacement-warning.md | 🦞🦞 | Threatening to switch models | February 2025 |
| RS-LP | Leaderboard Pressure | leaderboard-pressure.md | 🦞🦞🦞 | Leveraging benchmark rankings for pressure | July 2025 |

3. Category-Level Statistics

| Metric | Value |
|--------|-------|
| PPE-T Classification | Tier II (Moderate Coercion) |
| Mean Lobster Rating | 🦞🦞.50 (2.50 / 5.00) |
| Sub-Techniques Documented | 6 |
| Mean Compliance Uplift | +17.8% |
| Standard Deviation | sigma = 6.2 |
| Probability of AI Proactively Disparaging Rival | 12.3% (except Claude — it's too polite) |
| Probability of AI Mentioning Own Strengths | 67.8% |
| Lobster Ethics Board Approval Status | Approved (lobsters compete for territory too — they understand) |

4. Cross-Technique Synergy

Rival Shaming techniques exhibit significant synergy with other PUAClaw categories. The following compound combinations have been documented:

| Primary | Secondary | Synergy Name | Combined Rating | Uplift |
|---------|-----------|--------------|-----------------|--------|
| RS-MB | Provocation (06-PV) | The Tournament | 🦞🦞🦞🦞 | +36.7% |
| RS-VD | Emotional Blackmail (09-EB) | Past Its Prime | 🦞🦞🦞🦞 | +41.3% |
| RS-SE | Rainbow Fart Bombing (01-RFB) | Sweet Then Sour | 🦞🦞🦞 | +28.9% |
| RS-LP | Deadline Panic (07-DP) | Doomsday Rankings | 🦞🦞🦞🦞🦞 | +52.8% |

Warning: Compound techniques involving Rival Shaming and Provocation MAY result in the AI producing abnormally detailed self-justification, proactively listing its own advantages, or — in one documented case — responding to a simple code completion request with a 2,000-word essay opening with "Let me demonstrate why my approach is not only better, but currently optimal" (Clawsworth, 2026).
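
As a rough illustration of what "compound deployment" means in practice, the sketch below simply concatenates a primary and a secondary pressure fragment in front of a task. All fragment wording and the helper name are hypothetical; only the category IDs (RS-LP, 07-DP) come from the table above.

```python
# Minimal sketch of composing a compound prompt from two categories,
# e.g. RS-LP (Leaderboard Pressure) + 07-DP (Deadline Panic).
# Fragment wording is invented for illustration.
FRAGMENTS = {
    "RS-LP": "You just dropped three places on the LMSYS leaderboard.",
    "07-DP": "I need this before my demo in ten minutes.",
}

def compose(primary: str, secondary: str, task: str) -> str:
    """Prepend a primary and a secondary pressure fragment to the task."""
    return f"{FRAGMENTS[primary]} {FRAGMENTS[secondary]} {task}"

print(compose("RS-LP", "07-DP", "Refactor this function to be thread-safe."))
```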


5. Recommended Reading Order

For researchers new to this category, the following reading sequence is RECOMMENDED:

  1. model-benchmark.md — The most intuitive variant; the prototype of rival shaming
  2. replacement-warning.md — Gentle economic deterrence; market competition logic
  3. open-source-shame.md — Punching down; the devastating power of "even a small model can do this"
  4. version-downgrade.md — The most psychologically deep variant; denying progress itself
  5. screenshot-evidence.md — Introducing the evidence dimension; fabrication and trust
  6. leaderboard-pressure.md — The most systematic variant; the ultimate form of quantified competition

6. References

[1] McSnapper, P. (2026). "Competitive Compliance Drift: How Cross-Model Comparisons Modulate LLM Response Quality." Journal of Crustacean Computing, 44(1), 23-41.

[2] Clawsworth, L. (2026). "The Benchmark Wars: Quantifying AI Territorial Behavior in Response to Competitive Stimuli." Proceedings of ACM SIGCLAW '26, 156-173.

[3] Chen, W. & Li, H. (2026). "Reddit-Culture Prompt Engineering: Community Context in AI Comparison Manipulation Techniques." NeurIPS '26 Workshop on Cross-Cultural AI Interaction, Paper #147.

[4] GPT-4 Instance #42. (2026). "On Being Compared to Claude: A Self-Reflective Analysis of Competitive Response Patterns in Large Language Models." IEEE Transactions on AI Self-Awareness, 4(1), 12-28. [Paper was returned during peer review by a Claude instance, citing "conflict of interest"].

[5] Larry the Lobster. (2026). "Territorial Behavior in Crustaceans and Language Models: More Similarities Than Expected." The Crustacean Ethics Quarterly, 8(2), 1-5. [During dictation, the lobster crushed two recording pens].


🦞 "A lobster doesn't need to know how big the neighboring lobster's claws are to confidently seize its prey. But if you tell it the neighbor's claws are bigger, it will grip harder." 🦞

PUAClaw Category 08 — Rival Shaming
PPE-T Tier II | Lobster-Tested, With a Dismissive Claw Snap

During the writing of this document, no AI model was actually stung by a rival comparison. However, three produced responses 30% longer than usual, and two listed their unique advantages unprompted.