| 1 | +# salmon (Rust) pre-release testing checklist |
| 2 | + |
| 3 | +A reusable checklist of the manual/semi-automated tests we run before tagging a |
| 4 | +release, in addition to the in-tree `cargo test` suite and the cross-platform |
| 5 | +CI. It exists so each release exercises the same surface area and so regressions |
| 6 | +in paths the unit tests do not cover (real data, decoy indices, alignment mode, |
| 7 | +messy inputs, parity vs C++) are caught before publishing. |
| 8 | + |
| 9 | +Treat every box as "must pass or have an understood, documented reason." The |
| 10 | +commands are templates — substitute your own index/reads paths. Where a test |
| 11 | +guards a specific historical bug, the issue/fix is noted so the guard is not |
| 12 | +silently dropped. |
| 13 | + |
| 14 | +## 0. Datasets used (assemble once per release) |
| 15 | + |
| 16 | +- **`sample_data/`** (in repo): 15 transcripts, 10k paired 50 bp reads with the |
| 17 | + true transcript in each read name. Fast smoke tests + determinism. |
| 18 | +- **Simulated human** (polyester over GENCODE/Ensembl cDNA, ~194k transcripts): |
| 19 | + `clean.fa` transcriptome + `sim_R{1,2}` (easy) / `sim_hard_R{1,2}` (hard) + |
| 20 | + `ground_truth_{easy,hard}.tsv` (`name<TAB>true_count<TAB>length`). Accuracy vs |
| 21 | + a known truth, including short (<k) transcripts. |
| 22 | +- **Real + decoy**: GRCh38 cDNA+ncRNA gentrome with the primary assembly as |
| 23 | + decoy (`salmon index … -d decoys.txt --keepDuplicates`), and a public |
| 24 | + paired-end RNA-seq run (we use SRR1039508, ~22.9M PE). Decoy handling, the |
| 25 | + bias/abundance path at scale, and real-data parity vs C++. |
| 26 | +- **Reference C++ salmon** (the matching `1.x` release / fixed build) for parity. |
| 27 | +- **bowtie2 + samtools** (module/`PATH`) for alignment-mode inputs. |
| 28 | + |
| 29 | +A correlation helper is handy: it reads two `quant.sf` files (or a `quant.sf` and a truth |
| 30 | +file) and reports Spearman, **log**-Pearson (raw Pearson is dominated by a few |
| 31 | +high-TPM transcripts), MARD, and nonzero-set Jaccard. Use log-scale metrics for |
| 32 | +TPM throughout. |
| 33 | + |
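A minimal sketch of such a correlation helper, using only the Python standard library. The function names (`read_quant`, `compare`, etc.) are hypothetical, not part of the repo. MARD is taken here as the mean of `|x - y| / (x + y)` with 0 when both values are 0, and "log" means `log(1 + x)`; both are assumptions about the intended definitions.

```python
import math
import sys


def read_quant(path, value_col="TPM"):
    """Read a quant.sf (tab-separated, with header) into {name: value}."""
    values = {}
    with open(path) as handle:
        header = handle.readline().rstrip("\n").split("\t")
        name_idx, val_idx = header.index("Name"), header.index(value_col)
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            values[fields[name_idx]] = float(fields[val_idx])
    return values


def pearson(xs, ys):
    """Plain Pearson correlation; NaN when either vector is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks, with ties assigned the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def compare(a, b):
    """Compare two {name: value} maps over the union of their names."""
    names = sorted(set(a) | set(b))
    xs = [a.get(n, 0.0) for n in names]
    ys = [b.get(n, 0.0) for n in names]
    rel = [abs(x - y) / (x + y) if x + y > 0 else 0.0 for x, y in zip(xs, ys)]
    nz_a = {n for n, x in zip(names, xs) if x > 0}
    nz_b = {n for n, y in zip(names, ys) if y > 0}
    union = nz_a | nz_b
    return {
        "spearman": pearson(average_ranks(xs), average_ranks(ys)),
        "log_pearson": pearson([math.log1p(x) for x in xs],
                               [math.log1p(y) for y in ys]),
        "mard": sum(rel) / len(rel),
        "jaccard": len(nz_a & nz_b) / len(union) if union else 1.0,
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    metrics = compare(read_quant(sys.argv[1]), read_quant(sys.argv[2]))
    for key, value in metrics.items():
        print(f"{key}\t{value:.6f}")
```

Run as `python compare_quant.py rust/quant.sf cpp/quant.sf` (script name hypothetical). Comparing against `ground_truth_{easy,hard}.tsv` needs a separate reader for that three-column layout, which is left out here.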
| 34 | +## 1. Build / static gates |
| 35 | + |
| 36 | +- [ ] `cargo build --release` clean |
| 37 | +- [ ] `cargo clippy --workspace --all-targets -- -D warnings` clean |
| 38 | +- [ ] `cargo test --workspace` green |
| 39 | +- [ ] `cargo fmt --all --check` clean |
| 40 | +- [ ] CI green on all target platforms (linux x86_64/aarch64, macOS arm64/x86_64) |
| 41 | + |
| 42 | +## 2. Index build + format |
| 43 | + |
| 44 | +- [ ] Build a plain transcriptome index; build a decoy-aware index |
| 45 | + (`-d decoys.txt --keepDuplicates`); both load. |
| 46 | +- [ ] `info.json` records `index_version`; loading an index built by a release |
| 47 | + older than the current `MIN_READABLE_INDEX_VERSION` is **rejected** with an |
| 48 | + actionable rebuild message (guard for the index-format-version bump). |
| 49 | +- [ ] Short (`< k`) transcripts are retained and appear in `quant.sf` with their |
| 50 | + true `Length`, a sane `EffectiveLength` (≥ 1), and 0 reads. No transcript |
| 51 | + from the input FASTA is missing from `quant.sf`. |
| 52 | +- [ ] Decoys of length ≤ k, decoys with N-runs, and short transcripts compose correctly |
| 53 | + (covered by the `build_composes_*` index unit test; re-confirm it runs). |
| 54 | + |
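The "no transcript missing" and "true `Length`, `EffectiveLength` ≥ 1" checks above can be scripted. The sketch below is one way to do it, with two assumptions: `quant.sf` names are the first whitespace-delimited token of each FASTA header, and the FASTA passed in is the transcriptome only (decoy sequences are not expected in `quant.sf`). The function names are hypothetical.

```python
import csv


def fasta_lengths(path):
    """Map each FASTA record name (first header token) to its sequence length."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                tokens = line[1:].split()
                name = tokens[0] if tokens else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def check_quant_against_fasta(fasta_path, quant_path):
    """Return a list of human-readable problems; an empty list means the check passed."""
    expected = fasta_lengths(fasta_path)
    problems = []
    seen = set()
    with open(quant_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            name = row["Name"]
            seen.add(name)
            if name in expected and int(row["Length"]) != expected[name]:
                problems.append(
                    f"{name}: Length {row['Length']} != FASTA length {expected[name]}"
                )
            if float(row["EffectiveLength"]) < 1.0:
                problems.append(f"{name}: EffectiveLength {row['EffectiveLength']} < 1")
    for name in expected:
        if name not in seen:
            problems.append(f"{name}: missing from quant.sf")
    return problems
```

If duplicate sequences are collapsed at index build (that is, without `--keepDuplicates`), missing names may be expected and should be cross-checked before treating them as failures.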
| 55 | +## 3. Quantification modes (reads) |
| 56 | + |
| 57 | +Each should complete, conserve reads, and produce a sane `quant.sf`. |
| 58 | + |
| 59 | +- [ ] Selective alignment, paired-end (`-l A`) |
| 60 | +- [ ] Sketch / pseudoalignment, paired-end (`--sketch`) |
| 61 | +- [ ] Single-end (`-r`), both SA and sketch |
| 62 | +- [ ] Multiple input files / lanes (`-1 a.fq b.fq -2 c.fq d.fq`), plain **and** |
| 63 | + gzipped — mapped count matches the concatenated single-file run |
| 64 | +- [ ] Explicit library types `IU` / `ISF` / `ISR` and auto `A` all run; the `ISF` and |
| 65 | + `ISR` mapped sets together partition the unstranded (`IU`) mapped set |
| 66 | +- [ ] `--useEM` (vs default VBEM): both run; near-identical ranking on clean data |
| 67 | +- [ ] `--numBootstraps N` and `--numGibbsSamples N`: inferential replicate files |
| 68 | + written under `aux_info/bootstrap/` |
| 69 | +- [ ] `--skipQuant`: equivalence classes emitted, no `quant.sf` |
| 70 | + |
| 71 | +## 4. Bias / fragment-length / length correction |
| 72 | + |
| 73 | +- [ ] `--seqBias`, `--gcBias`, `--posBias`, and all three composed — each |
| 74 | + completes and shifts quant without breaking the expressed set |
| 75 | +- [ ] **`--seqBias --gcBias` on a decoy-aware index completes without a |
| 76 | + single-core stall** (regression guard for the #1019 abundance-phase hang) |
| 77 | +- [ ] **Mass conservation**: `Σ NumReads` in `quant.sf` ≈ `num_mapped` |
| 78 | + (loss < ~1 fragment) with bias on a decoy index (guard for the log-space |
| 79 | + eq-class normalization fix) |
| 80 | +- [ ] Bias-corrected quant vs C++ is at the expected concordance level |
| 81 | +- [ ] `--noFragLengthDist` measurably changes quant (the FLD term is actually |
| 82 | + used) in **both** SA and sketch; `--noSingleFragProb`, `--noLengthCorrection` |
| 83 | + run |
| 84 | +- [ ] FLD is trained from data (observed `flenDist` mean differs from the prior) |
| 85 | + in both SA and sketch |
| 86 | + |
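The mass-conservation item above reduces to a small arithmetic check. The sketch below assumes the output directory holds `quant.sf` and `aux_info/meta_info.json`, and that the latter carries a `num_mapped` field as the checklist item implies. The function name is hypothetical.

```python
import csv
import json
import math
import os


def mass_conservation_gap(quant_dir):
    """Return num_mapped minus the sum of NumReads for one salmon output directory."""
    with open(os.path.join(quant_dir, "quant.sf"), newline="") as handle:
        rows = csv.DictReader(handle, delimiter="\t")
        # fsum keeps the summation error small across ~10^5 float rows.
        total_reads = math.fsum(float(row["NumReads"]) for row in rows)
    with open(os.path.join(quant_dir, "aux_info", "meta_info.json")) as handle:
        num_mapped = json.load(handle)["num_mapped"]
    return num_mapped - total_reads
```

A release check could then assert `abs(mass_conservation_gap(out_dir)) < 1.0`, matching the "loss < ~1 fragment" tolerance stated in the checklist.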
| 87 | +## 5. Output / auxiliary surfaces |
| 88 | + |
| 89 | +- [ ] `quant.sf`, `cmd_info.json`, `lib_format_counts.json`, |
| 90 | + `aux_info/meta_info.json` present and structurally complete |
| 91 | +- [ ] `-g t2g` gene aggregation (`quant.genes.sf`) and `salmon quantmerge` |
| 92 | +- [ ] `--dumpEq` / `--dumpEqWeights` (`eq_classes.txt.gz`), `--writeUnmappedNames` |
| 93 | +- [ ] `--writeMappings` SAM: realistic positions/flags; valid in SA **and** |
| 94 | + sketch (sketch must not emit `POS=1` for every record); does not crash with |
| 95 | + `--gcBias` (incl. on a decoy index — Rust has no analog of the C++ #1010 |
| 96 | + `--writeMappings`+`--gcBias` segfault) |
| 97 | +- [ ] bias model dumps (`obs/exp` seq/gc/pos) written under the bias flags; |
| 98 | + `fld.gz` always |
| 99 | + |
| 100 | +## 6. Alignment-based input (`-a`) |
| 101 | + |
| 102 | +Generate alignments with bowtie2 (transcriptome index), reporting many |
| 103 | +multimappers (`-k 100`). |
| 104 | + |
| 105 | +- [ ] `-k 100` multimapping BAM: completes, mapped count and quant match C++ `-a` |
| 106 | +- [ ] **Messy BAM without `--no-discordant`/`--no-mixed`** (genuine discordant + |
| 107 | + mixed/singleton + secondary records): does **not** panic; mapped count |
| 108 | + matches C++; discordant/mixed mates degrade to orphan placements, nothing |
| 109 | + dropped or assumed-paired |
| 110 | +- [ ] **Coordinate-sorted BAM is rejected up front** with an actionable |
| 111 | + "collate by read name" message (allowing `GO:query` / `SO:queryname`) |
| 112 | +- [ ] Messy-vs-clean (`--no-*` flags) differ only gracefully (no crash/garbage) — |
| 113 | + confirms those flags are advisory, not required |
| 114 | + |
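When assembling alignment-mode fixtures, it helps to know in advance which BAMs should trigger the coordinate-sort rejection. The sketch below classifies the header text (for example the output of `samtools view -H`) by its `@HD` tags. It only illustrates the expectation on the test side and is not a description of how salmon itself performs the check; the function names are hypothetical.

```python
def hd_tags(header_text):
    """Return the TAG:VALUE pairs of the first @HD line as a dict (empty if absent)."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            fields = line.split("\t")[1:]
            return dict(f.split(":", 1) for f in fields if ":" in f)
    return {}


def is_name_collated(header_text):
    """True when the header declares SO:queryname or GO:query."""
    tags = hd_tags(header_text)
    return tags.get("SO") == "queryname" or tags.get("GO") == "query"


def should_be_rejected(header_text):
    """True when the header declares coordinate sorting."""
    return hd_tags(header_text).get("SO") == "coordinate"
```

Headers with no `@HD` line, or with `SO:unsorted` / `SO:unknown`, fall into neither category here; how those are treated is a product decision this sketch does not make.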
| 115 | +## 7. Determinism / robustness / edge cases |
| 116 | + |
| 117 | +- [ ] `-p 1` run twice → byte-identical `quant.sf` |
| 118 | +- [ ] `-p N` (N>1) run twice → negligible run-to-run TPM drift |
| 119 | +- [ ] gzipped input == plain input (identical `quant.sf`) |
| 120 | +- [ ] empty input → graceful error (no panic) |
| 121 | +- [ ] malformed FASTQ (record boundary broken) → clear error (no panic) |
| 122 | +- [ ] mismatched mate counts → graceful error |
| 123 | +- [ ] all-unmapped input (foreign reads) → 0 mapped, valid all-zero `quant.sf`, |
| 124 | + no crash |
| 125 | + |
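The two determinism items can be checked mechanically: a digest comparison for the `-p 1` case and a largest-difference report for the `-p N` case. This is a sketch with hypothetical function names. What counts as "negligible" drift is not defined here and stays with whoever owns the release.

```python
import csv
import hashlib


def file_digest(path):
    """SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_tpm(path):
    """Read the TPM column of a quant.sf into {name: tpm}."""
    with open(path, newline="") as handle:
        return {row["Name"]: float(row["TPM"])
                for row in csv.DictReader(handle, delimiter="\t")}


def max_tpm_drift(quant_a, quant_b):
    """Largest absolute per-transcript TPM difference between two runs."""
    a, b = read_tpm(quant_a), read_tpm(quant_b)
    names = set(a) | set(b)
    return max((abs(a.get(n, 0.0) - b.get(n, 0.0)) for n in names), default=0.0)
```

For the `-p 1` item, `file_digest(run1) == file_digest(run2)` should hold exactly; the same comparison covers the gzipped-versus-plain input item.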
| 126 | +## 8. Accuracy & parity (the headline numbers) |
| 127 | + |
| 128 | +- [ ] **Sim vs ground truth** (easy + hard, SA + sketch): Spearman/MARD in the |
| 129 | + expected band; SA ≥ sketch; enabling the FLD improves both |
| 130 | +- [ ] **Real data vs C++** (SA): mapping rate matches C++ to ~0.01%; TPM |
| 131 | + log-Pearson ≈ 0.99 / Spearman ≈ 0.97 |
| 132 | +- [ ] **Real data sketch**: mapping rate sane; concordance with SA in the |
| 133 | + expected band |
| 134 | +- [ ] Runtime / peak RSS in line with the previous release (no large regression) |
| 135 | +- [ ] No errors/panics/unexpected warnings in any run log |
| 136 | + |
| 137 | +## 9. Publish gates |
| 138 | + |
| 139 | +- [ ] All release work merged to `master`; `master` CI green |
| 140 | +- [ ] Release notes finalized (`docs/release-notes-<version>.md`, no "draft" |
| 141 | + marker), covering every behavioral change |
| 142 | +- [ ] External deps (cf1-rs, piscem-rs, ksw2rs) published on crates.io at the |
| 143 | + versions the workspace requires |
| 144 | +- [ ] `scripts/bump_and_publish.sh <version> --dry-run` shows the correct version |
| 145 | + bump, tag, and dependency-ordered crate list; leaf-crate |
| 146 | + `cargo publish -p salmon-core --dry-run` packages + verifies cleanly |
| 147 | +- [ ] Publish: `scripts/bump_and_publish.sh <version> --publish` (bump → commit → |
| 148 | + tag → push → cargo-dist binaries → publish the 9 `salmon-*` crates) |
| 149 | + |
| 150 | +## Known gaps / candidates to add |
| 151 | + |
| 152 | +Not yet part of the routine sweep; add as bandwidth allows or if a release |
| 153 | +touches the relevant code: |
| 154 | + |
| 155 | +- Mate-pair / outward library types (`MSF`/`OSF`/…) end-to-end. |
| 156 | +- `--fullLengthAlignment`, `--mimicBT2` / `--mimicStrictBT2` mapping presets. |
| 157 | +- `--noEffectiveLengthCorrection` (distinct from `--noLengthCorrection`). |
| 158 | +- FLD-prior knobs (`--fldMean`/`--fldSD`/`--fldMax`) and multimapping caps |
| 159 | + (`--maxReadOcc`/`--maxOccsPerHit`). |
| 160 | +- Truncated/corrupt gzip and truncated BAM robustness. |
| 161 | +- Format-level diff of `eq_classes.txt.gz` and the binary bias dumps vs C++ |
| 162 | + (currently presence + shape are checked, not byte content). |
| 163 | +- `--gencode` reference-name munging at index build. |