Longtail + resolution-reading research plan (v0.1) + open questions by Soli22de · Pull Request #1 · WW-shan/poly_strategy

Soli22de · 2026-05-11T14:37:06Z

Summary

Drafts the next research phase: finding mispricings in long-tail Polymarket markets via LLM-driven resolution-criteria reading and relation discovery. The plan is research-mode only — no live execution path is touched.

Two docs added under docs/plans/:

2026-05-11-longtail-resolution-thesis.md (622 lines) — full spec.
- Why this direction now: main-market YES+NO arb half-life is <1 min post-Oct 2024 (Anatomy of Polymarket paper); 63% of short-term markets had zero 24h volume; new category-tiered Polymarket fees in 2026 make our current fees.py constant wrong.
- Four work streams: T1 long-tail tier filter, T2 resolution-criteria reader, T3 internal same-event detector, T4 LLM-as-judge rule-eval harness.
- Cross-cutting: fee model upgrade (Polymarket 2026 category fees).
- Decision gates G1-G6, kill criteria, risk table.
- All research-mode; no live execution path.
2026-05-11-longtail-thesis-open-questions.md (8 open decisions) — Gate G1 needs all 8 filled before any work stream is dispatched. Use this PR's comment thread to discuss; commit decisions back to the branch as we converge.
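The fee-model upgrade named in the spec summary amounts to replacing a single constant with a per-category lookup. The following is a minimal sketch of that idea, not the contents of `fees.py`: the rates are the ones quoted under "Key facts" in this PR description, while the names `CATEGORY_FEE_RATE`, `fee_rate`, and `fee_on_notional`, and the assumption that the fee is simply rate × notional, are hypothetical.

```python
from decimal import Decimal

# Rates as quoted in this PR description. The mapping name and the
# lowercase keys are illustrative, not taken from the repository.
CATEGORY_FEE_RATE = {
    "crypto": Decimal("0.0180"),
    "politics": Decimal("0.0100"),
    "sports": Decimal("0.0075"),
    "geopolitics": Decimal("0"),
}


def fee_rate(category: str) -> Decimal:
    """Return the fee rate for a category, failing loudly on unknown ones."""
    key = category.strip().lower()
    if key not in CATEGORY_FEE_RATE:
        raise KeyError(f"no fee rate configured for category {category!r}")
    return CATEGORY_FEE_RATE[key]


def fee_on_notional(category: str, notional_usd: Decimal) -> Decimal:
    """Assumption: fee = rate * notional. The real schedule may differ."""
    return fee_rate(category) * notional_usd


if __name__ == "__main__":
    for cat in ("Crypto", "Politics", "Sports", "Geopolitics"):
        print(cat, fee_on_notional(cat, Decimal("100")))
```

Raising on an unknown category, rather than falling back to a default rate, keeps a newly added category from being silently backtested with the wrong fee.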

Also adds .deepseek/ to .gitignore (local DS scratch dir, not for sharing).

Key facts driving the plan (all cited in §10 of the main doc)

Polymarket 2026 fees are category-tiered (Crypto 1.80% / Politics 1.00% / Sports 0.75% / Geopolitics 0%) — fees.py needs an update.
YES+NO arb half-life <1 min by Oct 2024 (arxiv 2603.03136).
~$39.6M of NegRisk rebalancing arb already captured by pros (2024-04 → 2025-04).
Lopez-Lira: LLM headline strategies decay Sharpe 6.54→1.22 in 3 years — kill criteria are explicit.
Paradigm 2025-12: Dune dashboards double-count Polymarket volume ~2x — don't source thresholds from Dune.
LLM-as-judge: balanced accuracy / Youden's J, not F1; ensemble 3 models (arxiv 2512.08121, 2512.16041).
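For the LLM-as-judge item, balanced accuracy is the mean of the true-positive and true-negative rates, and Youden's J is their sum minus one (so J = 2 × balanced accuracy − 1). The sketch below computes both for binary labels. The PR only says "ensemble 3 models"; combining them by majority vote is an assumption here, and all function names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, fp, tn, fn) for two equal-length sequences of booleans."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif not truth and pred:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def _rates(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    return tp / (tp + fn), tn / (tn + fp)  # (TPR, TNR)


def balanced_accuracy(y_true, y_pred):
    tpr, tnr = _rates(y_true, y_pred)
    return (tpr + tnr) / 2


def youdens_j(y_true, y_pred):
    tpr, tnr = _rates(y_true, y_pred)
    return tpr + tnr - 1


def majority_vote(per_model_preds):
    """Assumed ensemble rule: an item is positive if most models say so."""
    return [sum(votes) * 2 > len(votes) for votes in zip(*per_model_preds)]
```

Unlike F1, both metrics give the negative class equal weight, which matters when most judged pairs are expected to be negatives.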

What this PR is NOT

Not an implementation PR. No code changes outside .gitignore and docs/plans/.
Not a request to merge before discussion. Treat this as the start of Gate G1.

How to review

Read the main thesis doc top-to-bottom (≈30 min). The §1 "why this direction" and §7 "kill criteria" are the most important sections.
Go through the 8 questions in the open-questions doc. Drop comments inline on each **Decision:** line.
Once all 8 have consensus, commit the decisions to this branch and we merge.

Test plan

Q1 — T1 tier thresholds confirmed
Q2 — T2 model choice confirmed
Q3 — T3 embedding model confirmed
Q4 — T4 labeling protocol confirmed
Q5 — code-review flow confirmed
Q6 — DS task-pack granularity confirmed
Q7 — sync cadence confirmed
Q8 — failure logging method confirmed
4 reviewers acknowledged → Gate G1 passes
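The Gate G1 condition in this checklist (all 8 decisions filled, 4 reviewers acknowledged) can be checked mechanically against the open-questions doc. This is a rough sketch under two assumptions: that each decision sits on a line beginning with `**Decision:**`, as the doc describes, and that acknowledgements are tracked as a list of reviewer names. The function names are hypothetical.

```python
import re

# Matches a line such as "**Decision:** use percentiles"; group 1 is the answer.
DECISION_RE = re.compile(r"^\s*\*\*Decision:\*\*(.*)$")


def decision_status(markdown_text):
    """Return one boolean per Decision line: True when something is filled in."""
    status = []
    for line in markdown_text.splitlines():
        match = DECISION_RE.match(line)
        if match:
            status.append(bool(match.group(1).strip()))
    return status


def gate_g1_passes(markdown_text, acknowledgements,
                   required_decisions=8, required_acks=4):
    """True when every expected Decision is filled and enough distinct reviewers acked."""
    status = decision_status(markdown_text)
    return (
        len(status) == required_decisions
        and all(status)
        and len(set(acknowledgements)) >= required_acks
    )
```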

Drafts the v0.1 spec for the next research phase: long-tail watchlist tiers, resolution criteria reader, internal multi-market detector, and LLM-as-judge evaluation harness. Gate G1 (threshold confirmation) must pass before any work stream is dispatched. Also ignores local .deepseek/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Extracts the 8 open decisions from §9 of the v0.1 plan into a focused discussion artifact. Gate G1 passes once all 8 Decision fields are filled and acknowledged by 4 reviewers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Brings in WW-shan's 3 commits from 2026-05-11 (maker no-fill diagnostics, opportunity optimization ranking, opportunity chain diagnostics) so the cross-fork PR shows only the new plan + open-questions docs.

Copilot

Pull request overview

Adds a research-only planning spec for the next phase of long-tail Polymarket mispricing research (LLM-driven resolution-criteria reading + intra-event relation discovery), plus an open-questions checklist to drive Gate G1 alignment. Also updates .gitignore to exclude a local DeepSeek scratch directory.

Changes:

Add a detailed long-tail resolution-reading research thesis/spec (v0.1).
Add a Gate G1 “open questions” document with eight decisions to resolve via PR discussion.
Ignore .deepseek/ local scratch directory.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| docs/plans/2026-05-11-longtail-resolution-thesis.md | New long-tail research thesis/spec and proposed workstreams (T1–T4) + gates/kill-criteria. |
| docs/plans/2026-05-11-longtail-thesis-open-questions.md | New Gate G1 decision checklist to converge on thresholds, models, labeling, cadence, and logging. |
| .gitignore | Add `.deepseek/` to ignored paths. |


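+# 2026-05-11 Long-Tail + Resolution-Reading Research Plan
+
+This document is the spec for the next phase of research work. Until the team has aligned on the *Open Questions* section, **do not hand any section to DS or other agents for execution**. All I/O contracts, thresholds, and acceptance criteria must be settled first.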
+
+---


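+# Open Questions: Long-Tail + Resolution-Reading Plan (v0.1)
+
+**Related document**: [`2026-05-11-longtail-resolution-thesis.md`](./2026-05-11-longtail-resolution-thesis.md)
+
+This file is a standalone discussion draft of §9 of that plan, for the team to comment on and decide. Each question ends with a `**Decision:**` line; fill it in once discussion converges, then update the main plan to v1.0 to match.
+
+How to discuss: comment on each question in this PR. **Gate G1 pass condition**: all 8 Decision lines filled in, with written acknowledgement from 4 people.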
+
+---
+
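+## Q1. T1 long-tail tier thresholds
+
+Main plan draft:
+
+| Tier | 24h volume | 7d volume | Spread | Time to resolution |
+|---|---|---|---|---|
+| headline | ≥ $50k | ≥ $200k | ≤ 1¢ | any |
+| mid | $5k-$50k | $20k-$200k | 1-3¢ | any |
+| longtail | $100-$5k | $1k-$20k | 3-10¢ | 14-90 days |
+| dead | < $100 | < $1k | > 10¢ | any |
+
+**Questions**:
+- Are these numbers reasonable? Or should we first pull a week of actual distribution data from Gamma and set the thresholds from its percentiles?
+- "14-90 days to resolution" builds on the academic literature's finding that the 30-14 day window is the least efficient; we widened it to 14-90 days to keep more samples. Is this range right?
+- Do we need a separate tier for neg-risk sub-markets (a neg-risk group may be liquid as a whole while its individual sub-markets are long-tail)?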
+
+**Decision:**
+
+---
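The Q1 draft table can be read as a classification rule, sketched below. The table does not say how the columns combine or which side of a boundary a value falls on, so this sketch assumes that every column of a tier must match, that volume ranges are half-open (lower bound inclusive), and that anything matching no tier is returned as `"unclassified"`. `Market` and `classify_tier` are hypothetical names, and the thresholds are the draft values still under discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Market:
    volume_24h_usd: float
    volume_7d_usd: float
    spread_cents: float
    days_to_resolution: float


def classify_tier(m: Market) -> str:
    """Apply the draft Q1 thresholds; all columns of a tier must hold (assumption)."""
    v24, v7, spread, days = (
        m.volume_24h_usd, m.volume_7d_usd, m.spread_cents, m.days_to_resolution,
    )
    if v24 >= 50_000 and v7 >= 200_000 and spread <= 1:
        return "headline"
    if 5_000 <= v24 < 50_000 and 20_000 <= v7 < 200_000 and 1 < spread <= 3:
        return "mid"
    if (100 <= v24 < 5_000 and 1_000 <= v7 < 20_000
            and 3 < spread <= 10 and 14 <= days <= 90):
        return "longtail"
    if v24 < 100 and v7 < 1_000 and spread > 10:
        return "dead"
    # Mixed signals, e.g. long-tail volume with a tight spread.
    return "unclassified"
```

Counting how many real markets land in `"unclassified"` is one way to approach the first Q1 question: a large share would suggest the draft thresholds do not match the actual distribution.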
+
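+## Q2. T2 model choice
+
+Main plan draft: Haiku 4.5 as the primary model (extraction), with Sonnet 4.6 reviewing when ambiguity is high.
+
+**Questions**:
+- Do we agree with this tiering?
+- Should we try DeepSeek? Rationale: it may cost less, and DS already does our work in this project. Cost: how well it understands English financial text compared with the Claude family is unknown.
+- Do we need a dual run (Haiku + DeepSeek) during prompt tuning for a head-to-head comparison before choosing the primary model?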
+
+**Decision:**
+
+---
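The two-tier arrangement in the Q2 draft (a cheaper primary model, with a stronger model reviewing ambiguous cases) can be sketched as a small routing function. Nothing here calls a real model API: the two extractors are passed in as plain callables, and the `"ambiguity"` score in [0, 1], the 0.5 threshold, and the name `read_resolution_criteria` are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict

Extractor = Callable[[str], Dict[str, Any]]


def read_resolution_criteria(
    text: str,
    primary: Extractor,
    reviewer: Extractor,
    ambiguity_threshold: float = 0.5,
) -> Dict[str, Any]:
    """Run the primary extractor; escalate to the reviewer on high ambiguity."""
    result = primary(text)
    # A missing score is treated as maximally ambiguous, so it escalates.
    if result.get("ambiguity", 1.0) >= ambiguity_threshold:
        return {**reviewer(text), "escalated": True}
    return {**result, "escalated": False}
```

Because the extractors are injected, the same function would also serve the head-to-head comparison raised in the questions above: swap the `primary` callable and compare outputs and escalation rates.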
+
+## Q3. T3 embedding model
+
+Main plan draft: OpenAI `text-embedding-3-small` ($0.00002/1k tokens; budget ~$0.20 per full run).
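As a check on the Q3 budget figure: at $0.00002 per 1k tokens, $0.20 buys 10 million tokens. The helper below is a small sketch of that arithmetic using the price quoted in the draft. The function names are illustrative, and the excerpt gives no corpus size, so none is assumed.

```python
from decimal import Decimal

# Price quoted in the Q3 draft for text-embedding-3-small.
PRICE_PER_1K_TOKENS_USD = Decimal("0.00002")


def embedding_cost_usd(total_tokens: int) -> Decimal:
    """Cost of embedding total_tokens at the quoted per-1k-token price."""
    return PRICE_PER_1K_TOKENS_USD * Decimal(total_tokens) / Decimal(1000)


def tokens_for_budget(budget_usd: Decimal) -> int:
    """How many tokens a budget covers at the quoted price."""
    return int(budget_usd / PRICE_PER_1K_TOKENS_USD * 1000)


if __name__ == "__main__":
    print(tokens_for_budget(Decimal("0.20")))  # 10000000
    print(embedding_cost_usd(10_000_000))
```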


+
+Once all 8 Decisions are filled in:
+
+1. I (Claude) sync the decisions into the main plan `2026-05-11-longtail-resolution-thesis.md`, bumping the version from v0.1 to v1.0.


Adds DS task pack #1 for feeSchedule metadata preservation and maker rebate diagnostics.

张靖恒 and others added 3 commits May 11, 2026 22:23

Merge upstream/main into discuss/longtail-thesis

2cf08f4


WW-shan requested a review from Copilot May 11, 2026 14:40

Copilot started reviewing on behalf of WW-shan May 11, 2026 14:41

Copilot AI reviewed May 11, 2026


Fix longtail plan fee and labeling assumptions

b6aba62

WW-shan merged commit c7e0ec3 into WW-shan:main May 11, 2026
1 check passed

Soli22de deleted the discuss/longtail-thesis branch May 12, 2026 03:07

WW-shan pushed a commit that referenced this pull request May 12, 2026

Add fee schedule metadata task pack

94b90e4


Soli22de mentioned this pull request May 13, 2026

Arb-persistence study + James Bond depth verdict (94/yr ceiling) #9

Merged

4 tasks


WW-shan merged 4 commits into WW-shan:main from Soli22de:discuss/longtail-thesis


3 participants

