@@ -32,7 +32,7 @@ Always run `cargo clippy`, `cargo fmt --check`, and `cargo test` before consider
 
 ## Current Status
 
-**278 tests passing, 0 clippy warnings.** SE: 8796/8926 compare_sam.py (98.5%), 2.2% splice rate (STAR: 2.2%), 66 shared junctions, **100.0% MAPQ agreement, MAPQ inflation: 0, deflation: 0**. 127 position disagreements (ALL verified as genuine ties). 1 CIGAR-only disagree (ERR12389696.13573895, insertion placement, seed-level tie). **0 STAR-only / 0 ruSTAR-only SE reads**. PE: **8337 both-mapped** (STAR: 8390), **0 half-mapped**, 2 MAPQ inflations / 4 deflations, **98.9% per-mate position agreement**, **98.800% PE exact faithfulness** (pos+CIGAR+MAPQ+proper+NH), **0 proper-pair diffs**. Phase 17.A: `scoreSeedBest` pre-extension. Phase 17.B: per-mate seeding. Phase 17.C: STAR-faithful SCORE-GATE + mappedFilter. Phase 17.D: combined-span penalty fix + dedup ordering. Phase 17.8: `--quantMode GeneCounts`. Phase E fix (2026-04-21): mate_id-aware diagonal dedup. Phase E2 (2026-04-22): STAR-faithful combined-read seeding (97.370%→98.211%). Phase E3 (2026-04-22): combined-threshold for half-mapped fallback (98.211%→98.470%, half-mapped 311→0). Phase E4 (2026-04-22): PE-CHECK2 unconditional (98.470%→98.800%, both-mapped 8636→8337). See [ROADMAP.md](ROADMAP.md) for detailed phase tracking and [docs/](docs/) for per-phase notes.
+**278 tests passing, 0 clippy warnings.** SE: 8796/8926 compare_sam.py (98.5%), 2.2% splice rate (STAR: 2.2%), 66 shared junctions, **100.0% MAPQ agreement, MAPQ inflation: 0, deflation: 0**. 127 position disagreements (ALL verified as genuine ties). 1 CIGAR-only disagree (ERR12389696.13573895, insertion placement, seed-level tie). **0 STAR-only / 0 ruSTAR-only SE reads**. PE: **8393 both-mapped** (STAR: 8390), **0 half-mapped**, 2 MAPQ inflations / 6 deflations, **99.0% per-mate position agreement**, **98.784% PE exact faithfulness** (pos+CIGAR+MAPQ+proper+NH), **0 proper-pair diffs**. Phase 17.A: `scoreSeedBest` pre-extension. Phase 17.B: per-mate seeding. Phase 17.C: STAR-faithful SCORE-GATE + mappedFilter. Phase 17.D: combined-span penalty fix + dedup ordering. Phase 17.8: `--quantMode GeneCounts`. Phase E fix (2026-04-21): mate_id-aware diagonal dedup. Phase E2 (2026-04-22): STAR-faithful combined-read seeding (97.370%→98.211%). Phase E3 (2026-04-22): combined-threshold for half-mapped fallback (98.211%→98.470%, half-mapped 311→0). Phase E4 (2026-04-22): PE-CHECK2 unconditional (98.470%→98.800%, both-mapped 8636→8337). Phase E5 (2026-04-23): split_combined_wt n_mismatch propagation (8337→8393, −53 STAR-only pairs). See [ROADMAP.md](ROADMAP.md) for detailed phase tracking and [docs/](docs/) for per-phase notes.
 
 ## Source Layout
 
@@ -161,17 +161,17 @@ Previously listed issues now resolved:
 
 See [ROADMAP.md](ROADMAP.md) and [docs/](docs/) for full issue tracking.
 
-## PE Status (Updated 2026-04-22 — Phase E4: PE-CHECK2 unconditional)
+## PE Status (Updated 2026-04-23 — Phase E5: split_combined_wt n_mismatch propagation)
 
-**Phase E4** (PE-CHECK2 unconditional): **PE both-mapped = 8337** (STAR: 8390), **half-mapped = 0**, **98.9% per-mate position agreement**, **98.800% PE exact faithfulness** (was 98.470%). MAPQ inflations: 2 (was 4), deflations: 4 (was 34). NH diffs: 12 (was 50).
+**Phase E5** (2026-04-23, n_mismatch propagation): **PE both-mapped = 8393** (STAR: 8390), **half-mapped = 0**, **99.0% per-mate position agreement**, **98.784% PE exact faithfulness**. MAPQ inflations: 2, deflations: 6. NH diffs: 14. Root cause fixed: `split_combined_wt` was setting `n_mismatch: 0` for both mate WTs, causing `finalize_transcript`'s outer extension to use too-lenient nMMprev → over-extension → nm2 inflated → 53 STAR-only pairs rejected. Fix: propagate `wt.n_mismatch` from the combined WT to both mate WTs. 55 previously STAR-only pairs now recovered. 7 new FPs (seeding-level issue, separate from nm fix).
+
+**Phase E4** (2026-04-22, PE-CHECK2 unconditional): both-mapped 8636→8337, faithfulness 98.470%→98.800%.
 
 **Phase E3** (2026-04-22, combined-threshold half-mapped): half-mapped 311→0, faithfulness 98.211%→98.470%.
 
 **Phase E fix** (2026-04-21, mate_id-aware diagonal dedup): raised faithfulness from 93.920%→97.370%. Still present.
 
-**Phase E4 implementation**: Removed `m1_exons.len() > 1` guard from PE-CHECK2 in `split_combined_wt`. STAR applies PE-CHECK2 unconditionally (verified via debug trace: `PE-CHECK2: m1_end=11656633 m2_estEnd=11656632 REJECT=1` for single-exon mate1). Previously ruSTAR only checked for spliced mate1, allowing overlapping/short-insert pairs through.
-
-**Current PE parity**: 8337 vs STAR 8390 (53 short). The 60 STAR-only pairs are heavily soft-clipped reads (50%+ clipped) where ruSTAR can't find sufficient seeds. Not a regression from PE-CHECK2.
+**Current PE parity**: 8393 vs STAR 8390 (+3 over). 5 STAR-only mates remain (2-3 pairs). 7 new FPs (heavily-clipped reads STAR doesn't seed). Residual gap is seeding-level, not nm-level.
 
 ## Remaining Limitations (Top 5)
 