Commit 549b5b9

update docs
1 parent 9462c11 commit 549b5b9

2 files changed

Lines changed: 40 additions & 4 deletions

CLAUDE.md

Lines changed: 4 additions & 2 deletions
@@ -32,7 +32,7 @@ Always run `cargo clippy`, `cargo fmt --check`, and `cargo test` before consider
3232

3333
## Current Status
3434

35-
**268 tests passing, 0 clippy warnings.** SE: 8796/8926 compare_sam.py (98.5%), 2.2% splice rate (STAR: 2.2%), 66 shared junctions, **100.0% MAPQ agreement, MAPQ inflation: 0, deflation: 0**. 127 position disagreements (ALL verified as genuine ties). 1 CIGAR-only disagree (ERR12389696.13573895, insertion placement, seed-level tie). **0 STAR-only / 0 ruSTAR-only SE reads**. PE: **8390/8390 both-mapped (0 gap, exact STAR match)**, 0 half-mapped, **0 MAPQ inflations** (fixed Phase 17.C), **98.915% PE faithfulness** (Phase 17.C). Phase 17.A complete: `scoreSeedBest` pre-extension stored as `pre_ext_score` on each `WindowAlignment`. Phase 17.C complete: STAR-faithful SCORE-GATE (relax per-WT threshold by `outFilterMultimapScoreRange`) and STAR-faithful `mappedFilter` (quality checks on trBest only, not per-pair). See [ROADMAP.md](ROADMAP.md) for detailed phase tracking and [docs/](docs/) for per-phase notes.
35+
**274 tests passing, 0 clippy warnings.** SE: 8796/8926 compare_sam.py (98.5%), 2.2% splice rate (STAR: 2.2%), 66 shared junctions, **100.0% MAPQ agreement, MAPQ inflation: 0, deflation: 0**. 127 position disagreements (ALL verified as genuine ties). 1 CIGAR-only disagree (ERR12389696.13573895, insertion placement, seed-level tie). **0 STAR-only / 0 ruSTAR-only SE reads**. PE: **8390/8390 both-mapped (0 gap, exact STAR match)**, 0 half-mapped, **0 MAPQ inflations** (fixed Phase 17.C), **98.915% PE faithfulness** (Phase 17.C). Phase 17.A complete: `scoreSeedBest` pre-extension stored as `pre_ext_score` on each `WindowAlignment`. Phase 17.C complete: STAR-faithful SCORE-GATE + STAR-faithful `mappedFilter`. Phase 17.8 complete: `--quantMode GeneCounts` outputs `ReadsPerGene.out.tab` with 3 independent counting passes; 0 col1 gene disagreements vs STAR on 10k SE yeast. See [ROADMAP.md](ROADMAP.md) for detailed phase tracking and [docs/](docs/) for per-phase notes.
3636

3737
## Source Layout
3838

@@ -69,6 +69,8 @@ src/
6969
mod.rs -- GTF parsing, junction database, motif detection, two-pass filtering
7070
sj_output.rs -- SJ.out.tab writer
7171
gtf.rs -- GTF parser (internal)
72+
quant/
73+
mod.rs -- Gene-level read counting (--quantMode GeneCounts, ReadsPerGene.out.tab)
7274
chimeric/
7375
mod.rs -- Module exports
7476
detect.rs -- Chimeric detection (Tier 1: soft-clip, Tier 2: multi-cluster)
@@ -171,8 +173,8 @@ See [ROADMAP.md](ROADMAP.md) and [docs/](docs/) for full issue tracking.
171173

172174
- No coordinate-sorted BAM output (use `samtools sort`) — Phase 17.2
173175
- No PE chimeric detection — Phase 17.3
174-
- No `--quantMode GeneCounts` — Phase 17.8
175176
- No `--outStd SAM/BAM` (stdout output) — Phase 17.6
177+
- No `--outReadsUnmapped Fastx` — Phase 17.4
176178
- No STARsolo single-cell features — Phase 14 (deferred)
177179

178180
See [docs/phase17_features.md](docs/phase17_features.md) for full feature status.

docs/phase17_features.md

Lines changed: 36 additions & 2 deletions
@@ -2,7 +2,7 @@
22

33
# Phase 17: Features + Polish
44

5-
**Status**: In Progress (17.1, 17.5, 17.A, 17.C complete)
5+
**Status**: In Progress (17.1, 17.5, 17.8, 17.A, 17.C complete)
66

77
**Goal**: Production-ready features and quality-of-life improvements.
88

@@ -20,7 +20,7 @@
2020
| 17.5 | Fix clippy warnings (0 warnings) | ✅ Complete |
2121
| 17.6 | `--outStd SAM/BAM` (stdout output for piping) | Planned |
2222
| 17.7 | GTF tag parameters (`sjdbGTFchrPrefix`, etc.) | Planned |
23-
| 17.8 | `--quantMode GeneCounts` | Planned |
23+
| 17.8 | `--quantMode GeneCounts` | ✅ Complete |
2424
| 17.9 | `--outBAMcompression` / `--limitBAMsortRAM` | Planned |
2525
| 17.10 | Chimeric Tier 3 (re-map soft-clipped regions) | Planned |
2626
| 17.11 | `--chimOutType WithinBAM` (supplementary FLAG 0x800) | Planned |
@@ -80,6 +80,40 @@
8080

8181
---
8282

83+
## Phase 17.8: `--quantMode GeneCounts` ✅ (2026-04-17)
84+
85+
**Goal**: Output `ReadsPerGene.out.tab` matching STAR's HTSeq-union gene-level counting.
86+
87+
**Implementation**: New `src/quant/mod.rs` with:
88+
- `GeneAnnotation`: per-chromosome sorted interval list (absolute genome coords) built from GTF exons
89+
- `GeneCounts`: atomic per-gene counters + 3 independent N_noFeature/N_ambiguous arrays
90+
- `QuantContext`: `Arc`-shared bundle for rayon parallel threads
91+
- `--quantMode GeneCounts` + `--sjdbGTFfile` validation in `params.rs`
92+
- SE and PE counting paths in `lib.rs`
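The counter layout in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/quant/mod.rs`: the names `GeneCountsSketch`, `Outcome`, and `count_in_parallel` are hypothetical, and plain `std::thread` workers stand in for the rayon threads so the example has no external dependencies.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `GeneCounts`: one atomic counter per gene for
/// each of the 3 count columns, plus per-column N_noFeature / N_ambiguous.
pub struct GeneCountsSketch {
    gene: [Vec<AtomicU64>; 3],
    no_feature: [AtomicU64; 3],
    ambiguous: [AtomicU64; 3],
}

/// Result of classifying one read in one counting pass (hypothetical type).
pub enum Outcome {
    Gene(usize),
    NoFeature,
    Ambiguous,
}

impl GeneCountsSketch {
    pub fn new(n_genes: usize) -> Self {
        Self {
            gene: std::array::from_fn(|_| (0..n_genes).map(|_| AtomicU64::new(0)).collect()),
            no_feature: std::array::from_fn(|_| AtomicU64::new(0)),
            ambiguous: std::array::from_fn(|_| AtomicU64::new(0)),
        }
    }

    /// Record one read for column `col` (0, 1, or 2). Takes `&self`, so many
    /// threads can call it through a shared `Arc` without a lock.
    pub fn record(&self, col: usize, outcome: Outcome) {
        let counter = match outcome {
            Outcome::Gene(g) => &self.gene[col][g],
            Outcome::NoFeature => &self.no_feature[col],
            Outcome::Ambiguous => &self.ambiguous[col],
        };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    pub fn gene_count(&self, col: usize, gene: usize) -> u64 {
        self.gene[col][gene].load(Ordering::Relaxed)
    }

    pub fn no_feature(&self, col: usize) -> u64 {
        self.no_feature[col].load(Ordering::Relaxed)
    }

    pub fn ambiguous(&self, col: usize) -> u64 {
        self.ambiguous[col].load(Ordering::Relaxed)
    }
}

/// Spawn `n_threads` workers that each record `reads_per_thread` synthetic
/// reads into one shared counter set, then return the totals.
pub fn count_in_parallel(n_threads: usize, reads_per_thread: usize) -> Arc<GeneCountsSketch> {
    let counts = Arc::new(GeneCountsSketch::new(2));
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let counts = Arc::clone(&counts);
            thread::spawn(move || {
                for _ in 0..reads_per_thread {
                    counts.record(0, Outcome::Gene(1));
                    counts.record(1, Outcome::NoFeature);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counts
}
```

Relaxed ordering is enough in this sketch because the totals are only read after every worker has been joined. Whether the real implementation uses the same ordering is not stated in the notes.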
93+
94+
**Three bugs fixed vs initial implementation**:
95+
1. **Coordinate mismatch**: GTF exon positions were stored chr-relative; `Transcript.exon.genome_start` uses absolute concatenated-genome coords. Fix: add `genome.chr_start[chr_idx]` offset when converting GTF positions.
96+
2. **Single counting pass**: All 3 columns were identical. STAR runs 3 INDEPENDENT passes — col1 (any strand), col2 (same strand as read), col3 (opposite strand) — each with separate N_noFeature and N_ambiguous.
97+
3. **Too-many-loci bucket**: Reads mapped to too many loci were counted under N_multimapping. STAR counts them under N_unmapped.
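The first two fixes can be illustrated with a small sketch. It makes several assumptions that are not taken from the source: all type and function names are hypothetical, a read is reduced to a single closed interval, gene lookup is a linear scan rather than a sorted-interval search, and GTF positions (1-based) are converted to 0-based absolute coordinates.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Strand {
    Plus,
    Minus,
}

/// Hypothetical gene interval in absolute (concatenated-genome) coordinates,
/// closed on both ends.
pub struct GeneInterval {
    pub gene: usize,
    pub start: u64,
    pub end: u64,
    pub strand: Strand,
}

/// Fix 1: shift a chromosome-relative GTF position by the chromosome's start
/// offset. The `- 1` (1-based GTF to 0-based genome) is an assumption.
pub fn gtf_to_absolute(chr_start: &[u64], chr_idx: usize, gtf_pos_1based: u64) -> u64 {
    chr_start[chr_idx] + gtf_pos_1based - 1
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Gene(usize),
    NoFeature,
    Ambiguous,
}

#[derive(Clone, Copy)]
pub enum Pass {
    AnyStrand,
    SameStrand,
    OppositeStrand,
}

/// Classify one read in one pass: zero eligible genes gives NoFeature, one
/// gives that gene, and two or more distinct genes give Ambiguous.
pub fn classify(
    genes: &[GeneInterval],
    start: u64,
    end: u64,
    read_strand: Strand,
    pass: Pass,
) -> Outcome {
    let mut hit: Option<usize> = None;
    for g in genes {
        let overlaps = start <= g.end && g.start <= end;
        let strand_ok = match pass {
            Pass::AnyStrand => true,
            Pass::SameStrand => g.strand == read_strand,
            Pass::OppositeStrand => g.strand != read_strand,
        };
        if !(overlaps && strand_ok) {
            continue;
        }
        match hit {
            None => hit = Some(g.gene),
            Some(prev) if prev != g.gene => return Outcome::Ambiguous,
            Some(_) => {}
        }
    }
    hit.map_or(Outcome::NoFeature, Outcome::Gene)
}

/// Fix 2: run the three passes independently, so each column gets its own
/// gene / NoFeature / Ambiguous decision for the same read.
pub fn classify_all(
    genes: &[GeneInterval],
    start: u64,
    end: u64,
    read_strand: Strand,
) -> [Outcome; 3] {
    [Pass::AnyStrand, Pass::SameStrand, Pass::OppositeStrand]
        .map(|pass| classify(genes, start, end, read_strand, pass))
}
```

With two overlapping genes on opposite strands, a read inside the overlap is ambiguous in the any-strand pass but resolves to a different single gene in each stranded pass, which is why the three columns cannot share one decision.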
98+
99+
**Results vs STAR (10k SE yeast)**:
100+
101+
| Metric | STAR | ruSTAR |
102+
|--------|------|--------|
103+
| N_unmapped | 1073 | 1074 (+1) |
104+
| N_multimapping | 661 | 661 |
105+
| N_noFeature col1/col2/col3 | 131/3653/4240 | 131/3653/4240 |
106+
| N_ambiguous col1 | 567 | 566 (-1) |
107+
| Gene total col1 | 7568 | 7568 |
108+
| Col1 gene disagreements || **0** |
109+
| Col2/col3 gene disagreements || 1 each (boundary edge case) |
110+
111+
The ±1 discrepancies in N_unmapped (+1) and N_ambiguous (-1) both trace to a single read at a gene overlap boundary, likely a minor coordinate boundary difference.
112+
113+
**Files**: `src/quant/mod.rs` (new), `src/params.rs`, `src/junction/mod.rs` (pub(crate) gtf), `src/lib.rs`
114+
115+
**Tests**: 274/274 (added 6 new quant unit tests), 0 clippy warnings.
116+
83117
---
84118

85119
## Phase 17.A: scoreSeedBest Pre-Extension ✅ (2026-04-16)
