Academic basis (arXiv → gate mapping)

Top spec: claude-code-parity-apr-poc.md | Falsification conditions | Scope extensions

Per-paper mapping to the gates (CCPA-001..013) and sub-extensions (numerical-parity work) it grounds. Hinton 2015 (1503.02531) provides the distillation framing; SWE-bench (2310.06770) the corpus methodology; METTLE (1807.10453) + LLMORPH (2603.23611) the metamorphic-relations framework for tool-call equivalence.

Academic basis (arXiv → gate mapping)

| arXiv | Title (short) | Applies to gate(s) | Why |
| --- | --- | --- | --- |
| 1503.02531 | Hinton et al., Distilling the Knowledge in a Neural Network | CCPA-008 | The `parity_score` is a distillation loss on the action stream. This framing sets the convergence criterion. |
| 1807.10453 | Segura et al., METTLE: Metamorphic Testing of ML Systems | CCPA-004, CCPA-005 | Tool-call equivalence and post-state equivalence are metamorphic relations under the operational semantics of each tool, which is the canonical way to test ML systems without ground truth. |
| 2207.11976 | Differential Testing of DL Frameworks | CCPA-002, CCPA-004 | Teacher↔student parity is the textbook differential-testing setup; replay determinism is the precondition. |
| 2310.06770 | Jimenez et al., SWE-bench | CCPA-007 | Justifies the fixture-corpus methodology: ≥1 task per capability row, recorded once, replayed forever, no live network. |
| 2505.03096 | Chaos Engineering for LLM Systems | CCPA-006 | Sovereignty enforcement under fault injection: "what if the env leaks an Anthropic key on a replay run" is exactly a chaos-test class. |
| 2603.23611 | LLMORPH: Cataloged Metamorphic Relations for NLP | CCPA-004 | 191 cataloged metamorphic relations directly populate per-tool `equality` rules in `tool_equivalence_rules`. |
| 2102.05351 (referenced in apr-cli-qa-spec) | Coverage-completeness invariants | CCPA-010, CCPA-011 | Background for the 100%-coverage / 100%-comply invariants. |
| 2605.03546 | Yang et al., ProgramBench: Can Language Models Rebuild Programs From Scratch? | CCPA-016 (M152 outcome-parity gate); future CCPA-017 project-scale outcome parity | Validates the M154 test-survival methodology at full-project scale: 200 real-world programs (FFmpeg / SQLite / PHP interpreter) where LMs must reconstruct codebases from executable + docs, scored via agent-generated behavioral-equivalence tests. Headline empirical finding: no task was fully resolved (0%), and the best model reached a 95% test-pass rate on only 3% of tasks. Validates the Phase 3 outcome-parity-results.md "what this does NOT prove" caveats (project-scale architecture reconstruction is out of scope for M150-M154's function-level POC). Its authoring methodology, coverage-guided fuzzing for behavioral test generation, is the natural M158+ extension of our `phase-3-test-survival.sh` swap pattern. |
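
The CCPA-004 rows above describe tool-call equivalence as a metamorphic relation driven by per-tool `equality` rules. The following is a minimal sketch of that idea in Python, not the project's implementation: the rule table shape, the tool name `read_file`, the `path` argument, and the function `tool_calls_equivalent` are all hypothetical, since the real `tool_equivalence_rules` schema is not shown in this document.

```python
import posixpath


def _normalize_path_args(args):
    """Hypothetical rule: two paths are equal if they normalize to the same string."""
    out = dict(args)
    if "path" in out:
        out["path"] = posixpath.normpath(out["path"])
    return out


# Hypothetical stand-in for `tool_equivalence_rules`: tool name -> normalizer.
# Tools without an entry fall back to exact argument equality.
TOOL_EQUIVALENCE_RULES = {
    "read_file": _normalize_path_args,
}


def tool_calls_equivalent(teacher_call, student_call):
    """Return True when both calls name the same tool and their arguments
    are equal after applying that tool's normalization rule."""
    if teacher_call["tool"] != student_call["tool"]:
        return False
    normalize = TOOL_EQUIVALENCE_RULES.get(teacher_call["tool"], dict)
    return normalize(teacher_call["args"]) == normalize(student_call["args"])


teacher = {"tool": "read_file", "args": {"path": "./src/../src/main.rs"}}
student = {"tool": "read_file", "args": {"path": "src/main.rs"}}
print(tool_calls_equivalent(teacher, student))
```

The relation here is "rewriting a path into an equivalent spelling must not change the verdict". Note that `posixpath.normpath` collapses `..` purely lexically, so a real rule would need to account for symlinks; the sketch ignores that.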

Scope-extension citations (M32 numerical-parity work)

| arXiv | Title (short) | Applies to | Why |
| --- | --- | --- | --- |
| 1701.06538 | Shazeer et al., Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | qwen3-moe-forward-v1 (Sub-extension 1+3) | Original gated-router top-k MoE formulation. Defines the softmax-then-top-k semantics the `moe-router-v1` and `moe-expert-dispatch-v1` contracts compose, and the per-expert SwiGLU dispatch the M32c.2.2.* chain implements. |
| 2101.03961 | Fedus et al., Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | qwen3-moe-forward-v1 (Sub-extension 1+2) | Modern MoE forward-path conventions (per-token routing, per-expert FFN, weighted aggregation): the algorithmic shape that `forward_qwen3_moe` mirrors for Qwen3-Coder-30B-A3B. |
| 2202.09368 | Zoph et al., ST-MoE: Designing Stable and Transferable Sparse Expert Models | qwen3-moe-forward-v1 (Sub-extension 3) | Router-stability instrumentation precedent: per-token router-output logging, top-k expert ids, and per-expert contribution norms. This is the diagnostic surface M32d Step 4 was scoped to add. |
| 1910.07467 | Zhang & Sennrich, Root Mean Square Layer Normalization | qwen3-moe-forward-v1 (Sub-extension 1, M32d Step 5) | Per-head Q/K RMSNorm was empirically load-bearing in M32d (rank-3 prior, 15% of FAST PATH cost). The contract's `qkv_q_norm` + `qkv_k_norm` equations follow this paper's RMSNorm definition. |
| 2104.09864 | Su et al., RoFormer: Enhanced Transformer with Rotary Position Embedding | qwen3-moe-forward-v1 (Sub-extension 1, M32d Step 5b) | Rotary Position Embedding (RoPE). M32d Step 5b's `rope_theta` default change (10K → 1M) for the `qwen3_moe`/`qwen3` arches stems from this paper's θ scaling for long-context positional encoding (rank-4 prior, 10% of FAST PATH cost). |
| 2210.17323 | Frantar et al., GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers | qwen3-moe-forward-v1 (Sub-extension 1, formal cosine gate) | Quantization-aware reference-comparison framework: cosine ≥ 0.99 vs FP16 ground truth at the LM-head logits is the cross-implementation parity criterion on which the formal flip of v1.4.0 ACTIVE_ALGORITHM_LEVEL → ACTIVE_RUNTIME relied. DISCHARGED 2026-05-09 at M109: cos_sim 0.995384 ≥ 0.99 measured on lambda-vector RTX 4090; aprender contract flipped to v1.5.0 ACTIVE_RUNTIME via aprender PR #1597 squash `3fb04ef86`. |
| 2305.18398 | Dao, FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning | qwen3-moe-forward-v1 (Sub-extension 2) | Fused-kernel parity discipline: the GPU MoE kernel for `forward_qwen3_moe_gpu` should preserve numerical equivalence with the CPU LAZY-FUSED-MATVEC path, validated via the same cosine ≥ 0.99 gate (Sub-extension 1). |
| 2305.05176 | Aminabadi et al., DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale | qwen3-moe-forward-v1 (Sub-extension 2) | Sparse-MoE GPU dispatch precedent. A production-grade `forward_qwen3_moe_gpu` will need expert-parallel scheduling analogous to DeepSpeed-MoE's expert-slot routing. |
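
The cosine ≥ 0.99 gate cited in the table above compares a candidate implementation's LM-head logits against a reference. As a rough illustration of that check, here is a small pure-Python sketch. The function names `cosine_similarity` and `passes_cosine_gate` and the sample vectors are assumptions made for this example; the actual gate in the contract may differ in how it selects positions or aggregates over tokens.

```python
import math

COSINE_GATE = 0.99  # threshold quoted in the table above


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("logit vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def passes_cosine_gate(candidate_logits, reference_logits, threshold=COSINE_GATE):
    """Hypothetical gate: True when the candidate is close enough in direction
    to the reference logits."""
    return cosine_similarity(candidate_logits, reference_logits) >= threshold


# Made-up logits for illustration only; not measurements from any run.
reference = [1.0, 2.0, 3.0]
candidate = [1.01, 1.98, 3.02]
print(passes_cosine_gate(candidate, reference))
```

Cosine similarity is scale-invariant, so a gate like this catches directional drift in the logits but not a uniform rescaling; whether that matters depends on how the logits are consumed downstream.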
