Quick decisions for §9 (Gate G1 prep, 2-person team) by Soli22de · Pull Request #3 · WW-shan/poly_strategy

Soli22de · 2026-05-12T03:35:24Z

What this PR is

A condensed decision sheet for the 8 open questions in docs/plans/2026-05-11-longtail-thesis-open-questions.md. Instead of long written answers, each Q now has a 建议 (recommended) / 替代 (alternative) / Decision triplet, so we can resolve each one with a short reply.

Key adjustment

The original Gate G1 said "4 reviewers acknowledge", but the team is 2 people, so G1 becomes "both acknowledge". Q4 (labeling) is the most affected: the sample size drops from 100 rules to 50, because with 2 reviewers double-labeling every rule, 2 × 50 = 100 labels covers 50 rules × 2 labels each.
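The double-labeling arithmetic above can be sketched as follows. This is only an illustration: the function name and parameters are hypothetical, and it assumes each reviewer labels 50 rules and every rule receives exactly two independent labels, as the parenthetical suggests.

```python
def rules_covered(reviewers: int, labels_per_reviewer: int, labels_per_rule: int = 2) -> int:
    """Return how many distinct rules can each receive `labels_per_rule` labels.

    Hypothetical helper: assumes a reviewer labels a given rule at most once,
    so a rule cannot need more labels than there are reviewers.
    """
    if labels_per_rule > reviewers:
        raise ValueError("a rule cannot get more labels than there are reviewers")
    total_labels = reviewers * labels_per_reviewer
    return total_labels // labels_per_rule


# Two reviewers, 50 labels each: 100 labels in total, 2 per rule.
print(rules_covered(reviewers=2, labels_per_reviewer=50))  # 50
```

With two people, both reviewers end up labeling the same 50 rules, which is what makes every rule double-labeled.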

Q1 stays open

Q1 (tier thresholds) is the only Q that genuinely needs data, not opinion. Recommendation: wait on it until DS task pack #2 (Gamma distribution reality-check) runs and gives us actual percentiles. The other 7 can be resolved by reading this PR.

How to review

Open docs/plans/2026-05-12-q9-quick-decisions.md. For each Q:

If the 建议 (recommended option) sounds right → reply "yes" inline, or just commit Decision: 建议 to fill the placeholder
If you prefer the 替代 (alternative) → reply "alt" or commit Decision: 替代
If you want something else → reply with your version
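The reply convention above maps onto the Decision: placeholder roughly as in this sketch. The function name is hypothetical and no such script is part of this PR; it only illustrates how "yes", "alt", and a free-form reply would each fill the line.

```python
RECOMMENDED = "建议"  # label used in the decision sheet for the recommended option
ALTERNATIVE = "替代"  # label used in the decision sheet for the alternative


def decision_line(reply: str) -> str:
    """Turn a reviewer's short reply into the text of a `Decision:` line.

    Hypothetical helper, assuming replies are matched case-insensitively.
    """
    text = reply.strip()
    if not text:
        raise ValueError("empty reply: no decision recorded")
    key = text.lower()
    if key == "yes":
        return f"Decision: {RECOMMENDED}"
    if key == "alt":
        return f"Decision: {ALTERNATIVE}"
    # Anything else is treated as the reviewer's own version.
    return f"Decision: {text}"


print(decision_line("yes"))
print(decision_line("alt"))
print(decision_line("label 75 rules instead"))
```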

I'll sync decisions back to the main plan (v1.0 bump) once all 7 are filled.

Test plan

Q2 — T2 model choice resolved
Q3 — embedding model resolved
Q4 — labeling protocol resolved (key one given team size)
Q5 — review flow resolved
Q6 — DS task pack granularity resolved
Q7 — sync cadence resolved
Q8 — failure logging resolved
(later) Q1 resolved once Gamma distribution data is in

Compresses 7 open questions into "recommended default + one alternative" form so the 2-person team can resolve via yes/swap. Q1 is left waiting on Gamma distribution data (DS pkg WW-shan#2). Also flags that the plan's "4-reviewer Gate G1" should be revised to "2-reviewer" given actual team size.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The author (Soli22de) is the PR author and one of the 2 team members, so he is filling in the preference/flow questions unilaterally and leaving the rest for WW to react to.

- Q1: deferred to Gamma distribution data (PR WW-shan#4 output)
- Q2: REVISED from original "Haiku + Sonnet" to OpenRouter Gemini Flash V2 main + same-model V1 prompt fallback, per PR WW-shan#6 memo
- Q3: accepted (OpenAI text-embedding-3-small)
- Q4: deferred to WW (his labeling time investment)
- Q5: accepted (peer review + Claude sanity + 24h auto-merge)
- Q6: accepted (one workstream = one DS pack, split if >300 LOC)
- Q7: accepted (Mon evening 30min + biweekly kill criteria review)
- Q8: accepted (append to main plan, no separate decisions-log)

WW should react via a fix commit if he disagrees with any of the 6 self-answered Qs; otherwise they take effect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per dash-ocr-pipeline's current production code (`structured_stage12.py`, `retry_silent_empty_checkpoint.py`), the silent-empty fallback uses the *same* Gemini 2.0 Flash model with a permissive V1 prompt, not a switch to Qwen 2.5-72B. The two-stage Gemini→Qwen pairer path was retired: single-call Gemini Flash F1 = 0.965 beat the Qwen pairer F1 = 0.950 in production validation 2026-05-06.

Updated:

- §3.2 budget: T2 V1 fallback uses Gemini Flash, not Qwen (saved ~$0.01)
- §3.2 budget: T3 LLM verification uses Gemini Flash, not Qwen
- §4 defaults: T2 fallback row corrected
- §7 Q2 implication: simplified to single-model V2/V1 prompt pair

Also notes that PR WW-shan#3 has already been updated with the corrected Q2 Decision (commit b299269) so the two docs now agree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
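The single-model V2/V1 prompt pair described in this commit message could look roughly like the sketch below. It is not the dash-ocr-pipeline code: `call_model`, the prompt constants, and the reading of "silent-empty" as "the call succeeds but returns only whitespace" are all assumptions made for illustration.

```python
from typing import Callable

# Placeholder prompt texts; the real V2 (main) and V1 (permissive) prompts are not shown in this PR.
V2_PROMPT = "V2: main prompt"
V1_PROMPT = "V1: permissive fallback prompt"


def read_with_fallback(call_model: Callable[[str, str], str], document: str) -> tuple[str, str]:
    """Call the same model twice at most: V2 prompt first, V1 prompt on a silent-empty result.

    `call_model(prompt, document)` is a hypothetical wrapper around one chat call.
    Returns the model output and the prompt version that produced it.
    """
    result = call_model(V2_PROMPT, document)
    if result.strip():
        return result, "V2"
    # Silent-empty: no error was raised but nothing usable came back,
    # so retry with the permissive prompt on the same model.
    return call_model(V1_PROMPT, document), "V1"


def fake_model(prompt: str, document: str) -> str:
    """Stand-in model that returns nothing for the V2 prompt."""
    return "" if prompt == V2_PROMPT else f"extracted: {document}"


print(read_with_fallback(fake_model, "resolution criteria text"))
```

Keeping both attempts on one model means a single routing target and a single cost line to budget, which matches the simplification noted for §7 Q2.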

Soli22de · 2026-05-12T06:08:24Z

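I (Soli22de) have filled in the 6 Qs I could answer myself (commit b299269):

Q1 waiting on data from PR "DS task pack #02: Gamma distribution reality-check spec" #4
Q2 original recommendation withdrawn; changed to OpenRouter Gemini 2.0 Flash V2 + same-model V1 prompt fallback, no Qwen (see the memo in PR "Production patterns memo (from dash-ocr-pipeline) + OpenRouter routing decision" #6, §3.2 / §4)
Q3 ✓ OpenAI text-embedding-3-small
Q4 waiting for you to confirm how many labels you are willing to take on
Q5 ✓ peer review + auto-merge after a 24h timeout
Q6 ✓ one workstream = one DS pack
Q7 ✓ every Monday evening, 30 minutes
Q8 ✓ append to the main plan

If you disagree with any of the 6 self-answered ones, push a fix commit to this branch that changes the Decision: ... line (same pattern as your fix b6aba62 yesterday); if you have no objections, skip it, and after 48 hours I'll treat them as accepted by default.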
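The 48-hour default-acceptance rule in the comment above amounts to a simple deadline check. The sketch below assumes the window starts at the comment's timestamp (2026-05-12T06:08:24Z); the comment itself does not state the exact start, and the names here are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the clock starts when the comment was posted.
COMMENT_POSTED = datetime(2026, 5, 12, 6, 8, 24, tzinfo=timezone.utc)
ACCEPT_WINDOW = timedelta(hours=48)


def default_accepted(now: datetime, posted: datetime = COMMENT_POSTED) -> bool:
    """True once the 48-hour window has elapsed with no fix commit pushed."""
    return now >= posted + ACCEPT_WINDOW


deadline = COMMENT_POSTED + ACCEPT_WINDOW
print(deadline.isoformat())  # 2026-05-14T06:08:24+00:00
```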

Replaces the "wait for data" placeholder with actual P10/P50/P90 values from the experiment 1 run (PR WW-shan#7). The original draft thresholds were 100× too high: the real Polymarket distribution has a P50 vol24hr of only ~$40/day. Key implementation note baked in: because P10 vol24hr = $0, the dead/longtail boundary should use liquidity (P10 = $787) rather than volume.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
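The implementation note about the dead/longtail boundary can be illustrated with a small sketch. Only the two P10 figures come from the commit message; the function names and the choice of a strict `<` comparison are assumptions.

```python
P10_VOL24HR_USD = 0.0      # measured P10 of 24h volume, per the commit message
P10_LIQUIDITY_USD = 787.0  # measured P10 of liquidity, per the commit message


def is_dead_by_volume(vol24hr_usd: float) -> bool:
    """Volume-based test: degenerate, since volume is never below a $0 threshold."""
    return vol24hr_usd < P10_VOL24HR_USD


def is_dead(liquidity_usd: float) -> bool:
    """Liquidity-based test (assumed strict comparison against the P10 value)."""
    return liquidity_usd < P10_LIQUIDITY_USD


# A market with zero volume and thin liquidity: only the liquidity test flags it.
print(is_dead_by_volume(0.0), is_dead(500.0))  # False True
```

Because the volume threshold is exactly $0, no market can ever fall below it, so a volume cutoff would never place anything in the dead tier.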

Soli22de · 2026-05-12T06:19:28Z

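Q1 is now filled in with measured data (commit on this branch); see the experiment report in PR #7. Decision status for all 8 Qs:

Q1 ✅ measured data
Q2 ✅ Gemini Flash V2+V1 (corrected)
Q3 ✅ text-embedding-3-small
Q4 ⏳ waiting on your confirmation. Note that PR "Experiment: Gamma distribution baseline + OpenRouter calibration script" #7 shows there are already 10k+ structured mutex pairs, so the T4 corpus no longer needs a rule_discovery run; you only need to decide how much time per week you are willing to spend on manual labeling
Q5 ✅ peer review + 24h auto-merge
Q6 ✅ one workstream = one DS pack
Q7 ✅ Monday evening, 30 minutes
Q8 ✅ append to the main plan

If you disagree with any of the 7 self-answered Qs, push a fix commit; otherwise they are treated as accepted. Gate G1 now only needs your sign-off on Q4.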

Soli22de mentioned this pull request May 12, 2026

Production patterns memo (from dash-ocr-pipeline) + OpenRouter routing decision #6

Merged

Soli22de mentioned this pull request May 12, 2026

Experiment: Gamma distribution baseline + OpenRouter calibration script #7

Merged

4 tasks

Soli22de mentioned this pull request May 12, 2026

DS task pack #03: T2 resolution reader spec (OpenRouter + verbatim grounding) #8

Merged

3 tasks

WW-shan pushed a commit that referenced this pull request May 12, 2026

Add T2 resolution reader task pack

e6ca513

Adds DS task pack #3 for the T2 resolution criteria reader, including OpenRouter chat-mode wiring, verbatim grounding, cost guardrails, and mocked test requirements.

WW-shan merged commit 028eb97 into WW-shan:main May 12, 2026
1 check passed

WW-shan merged 3 commits into WW-shan:main from Soli22de:prep/q9-quick-decisions
