File: index.md
Lines changed: 30 additions & 22 deletions
@@ -132,23 +132,23 @@ To answer this, we analyzed the "linguistic signature" of hostility. We looked a
132
132
133
133
If political anger is ideological and complex, and sports anger is simple and tribal, their signatures should look different.
134
134
135
-
**They don't. They are nearly identical.**
136
-
137
-
The correlation between the hostility signature in politics and sports is **r = 0.937**.
138
-
139
-
140
-
### The process
135
+
**We found that they don't. They are nearly identical.**
141
136
142
137
We treated hostility like a **forensic trace**: first we measured *how often* it appears, then *what it looks like in language*, then we asked whether that “signature” is **portable** across domains, and finally whether the same forces also show up at the level of **events** and **network structure**.
143
138
144
139
Instead of betting everything on one metric, we built a chain of evidence that checks the same claim from multiple angles:
We started with the simplest question: **how often do hostile links occur**?
148
146
We compared the proportion of hostile hyperlinks (`LINK_SENTIMENT = -1`) in Politics vs Sports using a **two-proportion test** (χ² / z-test), reported a **95% confidence interval** for the difference, and quantified how big the gap is with **Cohen’s h**, plus **risk ratio** and **odds ratio**.
149
147
This is the “frequency layer”: it tells us *how much* conflict there is, not yet *how it sounds*.
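As a rough illustration of this frequency layer, the sketch below computes a two-proportion z-test, a Wald confidence interval for the difference, Cohen's h, the risk ratio, and the odds ratio using only the Python standard library. The function name `compare_hostility_rates` and the counts are hypothetical; they are not the study's code or data.

```python
from math import asin, sqrt
from statistics import NormalDist


def compare_hostility_rates(hostile_a, total_a, hostile_b, total_b, level=0.95):
    """Two-proportion z-test plus effect sizes for two hostile-link rates."""
    p_a, p_b = hostile_a / total_a, hostile_b / total_b
    diff = p_a - p_b

    # Pooled standard error for the test statistic (H0: both rates are equal).
    pooled = (hostile_a + hostile_b) / (total_a + total_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled (Wald) standard error for the confidence interval.
    se = sqrt(p_a * (1 - p_a) / total_a + p_b * (1 - p_b) / total_b)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)

    return {
        "diff": diff,
        "z": z,
        "p_value": p_value,
        "ci": ci,
        "cohens_h": 2 * asin(sqrt(p_a)) - 2 * asin(sqrt(p_b)),
        "risk_ratio": p_a / p_b,
        "odds_ratio": (p_a / (1 - p_a)) / (p_b / (1 - p_b)),
    }


# Made-up counts for illustration only (not the study's data).
result = compare_hostility_rates(hostile_a=120, total_a=1000,
                                 hostile_b=40, total_b=1000)
```

The risk ratio answers "how many times more often", while Cohen's h puts the gap on a scale that does not depend on the base rates, which is why it can be useful to report both.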
150
148
151
-
#### 2) LIWC hostility signatures: “What does hostility sound like?”
149
+
### LIWC hostility signatures
150
+
151
+
<p><span class="bg-zinc-100 rounded px-3 py-2 mt-2">"What does hostility sound like?"</span></p>
152
152
Next, we built a linguistic fingerprint for each domain.
153
153
154
154
For every LIWC feature, we computed a delta between hostile and non-hostile posts:
@@ -160,7 +160,11 @@ We then compared the two vectors with **Pearson** (shape similarity) and **Spear
160
160
161
161
If hostility is domain-specific, these fingerprints should diverge. If it’s universal, they should line up.
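The fingerprint comparison can be sketched as follows. This excerpt does not show the exact delta formula, so the code assumes a plain difference of means (hostile minus non-hostile) per feature. The feature order and the toy rows are hypothetical.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def hostility_signature(hostile_rows, calm_rows):
    """Per-feature delta: mean over hostile posts minus mean over non-hostile posts.

    Assumption: a simple difference of means; the original analysis may scale it.
    """
    n_features = len(hostile_rows[0])
    return [
        mean([row[j] for row in hostile_rows]) - mean([row[j] for row in calm_rows])
        for j in range(n_features)
    ]


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy rows, one per post; columns are four illustrative features.
politics_sig = hostility_signature(
    hostile_rows=[[3.0, 1.0, 4.0, 1.0], [5.0, 1.0, 6.0, 1.0]],
    calm_rows=[[2.0, 2.0, 1.0, 3.0], [2.0, 2.0, 1.0, 3.0]],
)
sports_sig = hostility_signature(
    hostile_rows=[[2.0, 1.0, 3.0, 2.0]],
    calm_rows=[[1.0, 1.5, 1.0, 3.5]],
)
r_pearson = pearson(politics_sig, sports_sig)
r_spearman = spearman(politics_sig, sports_sig)
```

Pearson is sensitive to the size of each delta, while Spearman only checks whether the features are ordered the same way, so agreement between the two suggests the match is not driven by a few extreme features.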
162
162
163
-
#### 3) Bootstrap & permutation tests: “Is the resemblance real, or just luck?”
163
+
### Bootstrap & permutation tests
164
+
165
+
<p><span class="bg-zinc-100 rounded px-3 py-2 mt-2">"Is the resemblance real, or just luck?"</span></p>
166
+
164
168
A strong correlation can still be a coincidence if the sample is weird.
165
169
So we stress-tested the signature similarity:
166
170
@@ -169,15 +173,18 @@ So we stress-tested the signature similarity:
169
173
170
174
This is our “robustness layer”: we make sure the resemblance isn’t a fragile artifact.
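One way to picture this robustness layer is a percentile bootstrap over features plus a permutation test that breaks the feature alignment. The resampling unit, the number of draws, and the toy vectors below are assumptions; the excerpt does not list the exact settings.

```python
import random
from math import sqrt


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def bootstrap_ci(x, y, n_boot=2000, level=0.95, seed=0):
    """Percentile CI for r, resampling paired features with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        # Skip degenerate resamples where one side has zero variance.
        if len(set(xs)) > 1 and len(set(ys)) > 1:
            stats.append(pearson(xs, ys))
    stats.sort()
    lo = stats[int((1 - level) / 2 * n_boot)]
    hi = stats[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Share of shuffled pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy signatures over 12 hypothetical features (not the study's values).
politics = [2.0, -1.0, 4.0, -2.0, 0.5, 1.5, -0.5, 3.0, -1.5, 0.0, 2.5, -3.0]
sports = [1.1, -0.4, 2.1, -1.2, 0.1, 0.9, -0.3, 1.4, -0.6, 0.2, 1.1, -1.6]

ci_low, ci_high = bootstrap_ci(politics, sports)
p_perm = permutation_p_value(politics, sports)
```

The bootstrap asks how much r moves when the feature set changes slightly; the permutation test asks how often a correlation this strong would appear if the two signatures were unrelated.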
171
175
172
-
#### 4) Cross-domain classifier transfer: “Can a model trained on Politics recognize Sports?”
176
+
### Cross-domain classifier transfer
177
+
178
+
<p><span class="bg-zinc-100 rounded px-3 py-2 mt-2">"Can a model trained on Politics recognize Sports?"</span></p>
179
+
173
180
Similarity on a plot is one thing; **portability** is stronger.
174
181
175
182
We trained a **logistic regression** on LIWC features to predict hostility in one domain, then tested it **as-is** on the other:
176
183
177
184
- train on Politics → test on Sports
178
185
- train on Sports → test on Politics
179
186
180
-
We evaluated with **AUC**.
187
+
We evaluated with **AUC** (_Area Under the ROC Curve_).
181
188
If hostility has a different meaning across domains, transfer should collapse toward **0.5**. If the signal is shared, AUC should stay high.
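The transfer setup can be sketched end to end with synthetic data. Everything below is a stand-in: the two "LIWC" columns are random numbers, the logistic regression is a bare-bones gradient-descent version rather than the library implementation the analysis presumably used, and the hostile rates are invented.

```python
import random
from math import exp


def make_domain(n, hostile_rate, rng):
    """Synthetic stand-in for LIWC features: two columns shifted up for hostile posts."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < hostile_rate else 0
        X.append([rng.gauss(1.5 * label, 1.0), rng.gauss(1.0 * label, 1.0)])
        y.append(label)
    return X, y


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Plain batch gradient descent on the log-loss; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            logit = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1 / (1 + exp(-logit)) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def score(model, X):
    w, b = model
    return [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in X]


def auc(scores, labels):
    """Probability that a random hostile post outscores a random non-hostile one."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
X_pol, y_pol = make_domain(300, hostile_rate=0.3, rng=rng)
X_spo, y_spo = make_domain(300, hostile_rate=0.2, rng=rng)

# Train on one domain, evaluate as-is on the other (no refitting).
auc_pol_to_spo = auc(score(fit_logistic(X_pol, y_pol), X_spo), y_spo)
auc_spo_to_pol = auc(score(fit_logistic(X_spo, y_spo), X_pol), y_pol)
```

Because AUC depends only on how posts are ranked, it is unaffected by the two domains having different base rates of hostility, which makes it a reasonable yardstick for this kind of transfer.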
182
189
183
190
<img
@@ -186,7 +193,10 @@ If hostility has a different meaning across domains, transfer should collapse to
186
193
class="flourish-embed"
187
194
/>
188
195
189
-
#### 5) Logistic regression coefficient comparison: “Do the same features do the work?”
196
+
### Logistic regression coefficient comparison
197
+
198
+
<p><span class="bg-zinc-100 rounded px-3 py-2 mt-2">"Do the same features do the work?"</span></p>
199
+
190
200
Even when two models perform well, they might rely on different cues.
191
201
So we compared the *mechanisms* of the Politics and Sports classifiers:
192
202
@@ -196,25 +206,23 @@ So we compared the *mechanisms* of the Politics and Sports classifiers:
196
206
197
207
This is the “interpretability layer”: it tells us whether the same linguistic levers are being pulled.
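Two simple ways to compare what a pair of classifiers relies on are the share of coefficients with matching signs and the overlap between their strongest features. The sketch below assumes those two checks (the excerpt does not show which comparisons were run), and the feature names and coefficient values are made up.

```python
def sign_agreement(coef_a, coef_b):
    """Share of features whose coefficients point the same way in both models.

    Note: a coefficient of exactly zero is treated as non-positive.
    """
    same = sum(1 for a, b in zip(coef_a, coef_b) if (a > 0) == (b > 0))
    return same / len(coef_a)


def top_k_overlap(coef_a, coef_b, names, k=3):
    """Features ranked among the k largest |coefficients| in both models."""
    def top(coefs):
        ranked = sorted(zip(names, coefs), key=lambda item: abs(item[1]), reverse=True)
        return {name for name, _ in ranked[:k]}
    return top(coef_a) & top(coef_b)


# Hypothetical feature names and coefficients, for illustration only.
names = ["they", "anger", "swear", "we", "posemo"]
politics_coef = [0.9, 1.2, 0.7, -0.4, -0.8]
sports_coef = [0.7, 1.0, 0.9, -0.1, 0.2]

agreement = sign_agreement(politics_coef, sports_coef)
shared_top = top_k_overlap(politics_coef, sports_coef, names, k=3)
```

Coefficients are only comparable across models when the features are on the same scale, so standardizing the inputs before fitting is a sensible precondition for this kind of check.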
198
208
199
-
#### 6) Difference-in-differences + network metrics: “Do events and structure tell the same story?”
209
+
### Difference-in-differences + network metrics
210
+
211
+
<p><span class="bg-zinc-100 rounded px-3 py-2 mt-2">"Do events and structure tell the same story?"</span></p>
212
+
200
213
Finally, we zoomed out from language into dynamics and structure.
201
214
202
215
- **Difference-in-differences (events):** for major shocks (elections, finals, referendums…), we compared the *change* in hostility for “involved” camps versus “less involved/observer” camps, before vs during the event window. This isolates event effects from normal baseline fluctuations.
203
216
- **Network metrics (structure):** we model camps as a **directed network** where edges carry hostile-link flow. We then summarize behavior with metrics like **density** (how connected the conflict is), **reciprocity** (do attacks get returned?), and aggressor/target patterns via **outgoing vs incoming hostility**.
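Both ideas reduce to short formulas, sketched here under simplifying assumptions: the difference-in-differences estimate uses four pre-aggregated hostility rates rather than a regression, and the network is a plain dictionary of hostile-link counts with no self-loops. The camp names and all numbers are hypothetical.

```python
def diff_in_diff(involved_before, involved_during, observer_before, observer_during):
    """(Change for involved camps) minus (change for observer camps)."""
    return (involved_during - involved_before) - (observer_during - observer_before)


def network_summary(hostile_edges):
    """Density, reciprocity, and per-camp out/in hostility for a directed graph.

    hostile_edges maps (source, target) -> number of hostile links.
    Assumes no self-loops and at least two camps.
    """
    nodes = {camp for edge in hostile_edges for camp in edge}
    pairs = {edge for edge, count in hostile_edges.items() if count > 0}

    # Density: existing directed edges over all possible directed edges.
    density = len(pairs) / (len(nodes) * (len(nodes) - 1))
    # Reciprocity: share of edges whose reverse edge also exists.
    reciprocity = sum((t, s) in pairs for s, t in pairs) / len(pairs)

    out_flow = {camp: 0 for camp in nodes}
    in_flow = {camp: 0 for camp in nodes}
    for (source, target), count in hostile_edges.items():
        out_flow[source] += count
        in_flow[target] += count
    return density, reciprocity, out_flow, in_flow


# Invented hostility rates around a hypothetical event window.
event_effect = diff_in_diff(0.10, 0.16, 0.08, 0.09)

# Invented hostile-link counts between three hypothetical camps.
edges = {("A", "B"): 5, ("B", "A"): 2, ("A", "C"): 1}
density, reciprocity, out_flow, in_flow = network_summary(edges)
```

In this toy network, camp A sends more hostility than it receives (an "aggressor" pattern), while camp C only receives it; comparing `out_flow` with `in_flow` is what surfaces that asymmetry.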
204
217
205
-
Together, these checks cover the full stack:
206
-
rates → language → robustness → portability → mechanism → system dynamics.
207
-
208
-
When all six layers agree, we can say it with confidence:
209
-
**the topic changes, the script of hostility doesn’t.**
218
+
Together, these checks cover the full stack, from rates and language through robustness, portability, and mechanism to system dynamics. When all six layers agree, we can say it with confidence: **the topic changes, the script of hostility doesn’t.**
210
219
220
+
---
211
221
212
-
### What we found (and why it matters)
222
+
## What we found (and why it matters)
213
223
214
224
The resemblance is not subtle; it’s *statistically overwhelming*.
215
225
216
-
#### The headline: r = 0.937
217
-
218
226
A Pearson correlation of **r = 0.937** between the Politics and Sports signatures means that when a feature increases during political hostility, it almost always increases during sports hostility too (and vice versa). In other words: **the same “psychological knobs” get turned**.
219
227
220
228
And the social-psychology “tell” is the same in both arenas: **THEY goes up**. Hostile posts are less about “us” and more about labeling and attacking *them*.
@@ -269,7 +277,7 @@ Even the *lower bound* of the confidence interval still implies a very strong re
269
277
270
278
Our investigation reveals four undeniable truths about the nature of the conflict:
271
279
272
-
1. **Anger is universal:** politics is a bloodier battlefield (3.3x more hostile), but the soldiers use the exact same weapons. The linguistic signature of conflict is identical (r=0.937), proving that hostility is a fundamental human instinct, not a topic-specific reaction.
280
+
1. **Anger is universal:** politics is a bloodier battlefield (3.3x more hostile), but the soldiers use the exact same weapons. The linguistic signature of conflict is identical (r = 0.937), proving that hostility is a fundamental human instinct, not a topic-specific reaction.
273
281
274
282
2. **Hate is portable:** the patterns are so consistent that a machine trained to spot political vitriol can instantly recognize sports trash-talk. The DNA of abuse is shared, suggesting that moderation tools could work across any domain.