Longtail + resolution-reading research plan (v0.1) + open questions #1
Merged
Drafts the v0.1 spec for the next research phase: long-tail watchlist tiers, resolution criteria reader, internal multi-market detector, and LLM-as-judge evaluation harness. Gate G1 (threshold confirmation) must pass before any work stream is dispatched. Also ignores local .deepseek/. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extracts the 8 open decisions from §9 of the v0.1 plan into a focused discussion artifact. Gate G1 passes once all 8 Decision fields are filled and acknowledged by 4 reviewers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Brings in WW-shan's 3 commits from 2026-05-11 (maker no-fill diagnostics, opportunity optimization ranking, opportunity chain diagnostics) so the cross-fork PR shows only the new plan + open-questions docs.
Pull request overview
Adds a research-only planning spec for the next phase of long-tail Polymarket mispricing research (LLM-driven resolution-criteria reading + intra-event relation discovery), plus an open-questions checklist to drive Gate G1 alignment. Also updates .gitignore to exclude a local DeepSeek scratch directory.
Changes:
- Add a detailed long-tail resolution-reading research thesis/spec (v0.1).
- Add a Gate G1 “open questions” document with eight decisions to resolve via PR discussion.
- Ignore the `.deepseek/` local scratch directory.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docs/plans/2026-05-11-longtail-resolution-thesis.md | New long-tail research thesis/spec and proposed workstreams (T1–T4) + gates/kill-criteria. |
| docs/plans/2026-05-11-longtail-thesis-open-questions.md | New Gate G1 decision checklist to converge on thresholds, models, labeling, cadence, and logging. |
| .gitignore | Add .deepseek/ to ignored paths. |
Comment on lines +1 to +5
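# 2026-05-11 Long-tail + close rule reading research plan

This file is the specification (spec) for the next phase of research work. Until the *Open questions* section has been aligned with the team, **do not hand any section to DS / other agents for execution**. All I/O contracts, thresholds, and acceptance criteria must be settled first.

---
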
Comment on lines +1 to +46
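# Open questions: long-tail + close rule reading plan (v0.1)

**Related document**: [`2026-05-11-longtail-resolution-thesis.md`](./2026-05-11-longtail-resolution-thesis.md)

This file is a standalone discussion draft of §9 of that plan, for the team to comment on and decide. Each question is followed by a `**Decision:**` line; fill it in once discussion converges, and update the main plan to v1.0 in step.

How to discuss: comment on each question in this PR. **Gate G1 pass condition**: all 8 Decisions filled in, plus written confirmation from 4 people.
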
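
The Gate G1 condition above could be checked mechanically. The following is a minimal sketch only, not part of the plan: it assumes each decision is written on the same line as its `**Decision:**` label, and that reviewer confirmations are collected separately as a set of names. The function name `gate_g1_status` and its parameters are hypothetical.

```python
import re

# Matches a line that starts with the Decision label and captures whatever follows it.
DECISION_RE = re.compile(r"^\*\*Decision:\*\*[ \t]*(.*)$", re.MULTILINE)


def gate_g1_status(
    markdown: str,
    acks: set[str],
    required_decisions: int = 8,  # from the Gate G1 pass condition
    required_acks: int = 4,       # from the Gate G1 pass condition
) -> dict:
    """Count filled Decision lines and compare against the Gate G1 condition."""
    decisions = DECISION_RE.findall(markdown)
    filled = sum(1 for d in decisions if d.strip())
    passed = (
        len(decisions) >= required_decisions
        and filled == len(decisions)
        and len(acks) >= required_acks
    )
    return {
        "total": len(decisions),
        "filled": filled,
        "acks": len(acks),
        "passed": passed,
    }
```

Calling `gate_g1_status(text, {"reviewer_a", "reviewer_b"})` on the file's contents reports how many Decision lines are still empty, which may be handy as a pre-merge check.
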
---

## Q1. T1 long-tail tier thresholds

Draft in the main plan:

| Tier | 24h volume | 7d volume | Spread | Time to resolution |
|---|---|---|---|---|
| headline | ≥ $50k | ≥ $200k | ≤ 1¢ | any |
| mid | $5k-$50k | $20k-$200k | 1-3¢ | any |
| longtail | $100-$5k | $1k-$20k | 3-10¢ | 14-90 days |
| dead | < $100 | < $1k | > 10¢ | any |

**Questions**:
- Are the numbers reasonable? Or should we first pull a week of actual distribution data from Gamma and set the thresholds after looking at the percentiles?
- "14-90 days to resolution" is based on the least-efficient window of 30-14 days reported in the academic literature, widened to 14-90 days to keep more samples. Is this range right?
- Do we need a separate tier for neg-risk sub-markets (a neg-risk group as a whole may be liquid while its individual sub-markets are long-tail)?

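
For discussion, the draft table can be sketched as a classifier. This is a minimal sketch under assumptions the draft does not state: all conditions in a row must hold together, ranges are half-open with the lower bound inclusive for volume and the upper bound inclusive for spread, and a market that matches no row is reported as `"unclassified"`. The `MarketStats` fields and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketStats:
    vol_24h_usd: float
    vol_7d_usd: float
    spread_cents: float
    days_to_resolution: float


def classify_tier(m: MarketStats) -> str:
    """Apply the draft Q1 thresholds row by row (assumed AND semantics)."""
    if m.vol_24h_usd >= 50_000 and m.vol_7d_usd >= 200_000 and m.spread_cents <= 1:
        return "headline"
    if (5_000 <= m.vol_24h_usd < 50_000
            and 20_000 <= m.vol_7d_usd < 200_000
            and 1 < m.spread_cents <= 3):
        return "mid"
    if (100 <= m.vol_24h_usd < 5_000
            and 1_000 <= m.vol_7d_usd < 20_000
            and 3 < m.spread_cents <= 10
            and 14 <= m.days_to_resolution <= 90):
        return "longtail"
    if m.vol_24h_usd < 100 and m.vol_7d_usd < 1_000 and m.spread_cents > 10:
        return "dead"
    # Mixed signals, e.g. long-tail volume but more than 90 days to resolution.
    return "unclassified"
```

Under these assumptions a market with long-tail volume and spread but 120 days to resolution matches no row, so the share of `"unclassified"` markets in real Gamma data may be worth measuring before the thresholds are fixed.
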
**Decision:**

---

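## Q2. T2 model selection

Draft in the main plan: Haiku 4.5 as the primary run (extraction), with Sonnet 4.6 reviewing when ambiguity is high.

**Questions**:
- Do we agree with this tiering?
- Should we try DeepSeek? Rationale: cost may be lower, and in this project DS is already doing the work for us. Trade-off: how well it understands English financial text compared with the Claude family is unknown.
- Should we dual-run (Haiku + DeepSeek) during prompt tuning for a head-to-head comparison before choosing the primary model?
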
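
The two-tier arrangement in the draft can be sketched as a small routing function. This is an illustrative sketch, not the plan's implementation: the `ambiguity` score, its 0 to 1 scale, and the `0.5` threshold are placeholders, and the model calls are passed in as plain callables so that no vendor API is assumed.

```python
from typing import Callable, NamedTuple


class Extraction(NamedTuple):
    fields: dict
    ambiguity: float  # hypothetical score in [0, 1]; higher means less clear


def read_criteria(
    text: str,
    extract: Callable[[str], Extraction],             # primary model (e.g. Haiku 4.5)
    review: Callable[[str, Extraction], Extraction],  # reviewer (e.g. Sonnet 4.6)
    ambiguity_threshold: float = 0.5,                 # placeholder, not from the plan
) -> tuple[Extraction, str]:
    """Run the primary extraction; escalate to review only when ambiguity is high."""
    first = extract(text)
    if first.ambiguity >= ambiguity_threshold:
        return review(text, first), "reviewed"
    return first, "primary"
```

Because both models are injected, the same function could serve a Haiku versus DeepSeek head-to-head run by swapping the `extract` callable and comparing outputs on the same inputs.
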
**Decision:**

---

## Q3. T3 embedding model

Draft in the main plan: OpenAI `text-embedding-3-small` ($0.00002/1k tokens, budget ~$0.20 per full run).

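
The budget figure in the draft can be sanity-checked with simple arithmetic. The sketch below uses only the quoted price; the function names are hypothetical and the token count of an actual full run is not known from this document.

```python
PRICE_PER_1K_TOKENS_USD = 0.00002  # price quoted in the draft


def embedding_cost_usd(total_tokens: int) -> float:
    """Cost of embedding `total_tokens` tokens at the quoted price."""
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS_USD


def tokens_for_budget(budget_usd: float) -> float:
    """Number of tokens a given budget covers at the quoted price."""
    return budget_usd / PRICE_PER_1K_TOKENS_USD * 1000
```

At this price a budget of about $0.20 corresponds to roughly 10 million tokens per full run, so the estimate holds as long as the combined market text stays near that size.
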

Once all 8 Decisions are filled in:

1. I (Claude) sync the decisions to the main plan `2026-05-11-longtail-resolution-thesis.md` and bump the version from v0.1 to v1.0.
WW-shan pushed a commit that referenced this pull request May 12, 2026
Adds DS task pack #1 for feeSchedule metadata preservation and maker rebate diagnostics.
4 tasks
Summary
Drafts the next research phase: finding mispricings in long-tail Polymarket markets via LLM-driven resolution-criteria reading and relation discovery. The plan is research-mode only — no live execution path is touched.
Two docs added under `docs/plans/`:
- `2026-05-11-longtail-resolution-thesis.md` (622 lines): full spec.
  - … `fees.py` constant wrong.
- `2026-05-11-longtail-thesis-open-questions.md` (8 open decisions): Gate G1 needs all 8 filled before any work stream is dispatched. Use this PR's comment thread to discuss; commit decisions back to the branch as we converge.

Also adds `.deepseek/` to `.gitignore` (local DS scratch dir, not for sharing).

Key facts driving the plan (all cited in §10 of the main doc)
- … `fees.py` needs an update.

What this PR is NOT
- … `.gitignore` and `docs/plans/`.

How to review
- … `**Decision:**` line.

Test plan