changelog.md: 10 lines changed (10 additions & 0 deletions)
@@ -1,6 +1,16 @@
Changelog
=========

Version 0.5.0
------------
- Added an `n_genes` argument that returns only the top N genes per group when `return_as_scanpy=True`. This makes it possible to match `scanpy`'s sorting method (partial sort), which improves reproducibility of `scanpy` results.
- Fixed gene ordering in the scanpy formatter by removing the redundant sorting of perturbation names, since `encode_and_count_groups` already returns sorted unique perturbation names. This ensures that gene names are sorted the same way everywhere.
- Added explicit testing of gene ordering. In the PBMC dataset, many genes end up with identical z-scores but different `logfoldchanges`, a case that previous tests did not catch.
- Improved the CSR chunking mechanism for the OVO test, resulting in faster execution and a much smaller memory footprint. As a direct consequence, `batch_size` can now be set much larger.
- On TAHOE's `plate3` (in RAM) with `batch_size=1024`, this reduced the memory footprint from 35GB to 1.5GB and the runtime from 1:17 to 0:50 with 8 CPUs.
- The reduced footprint makes it possible to scale `n_threads` more aggressively. With 32 threads, TAHOE's `plate3` runs in 21 seconds while using only 2.5GB of RAM.
- Fold change is now computed as `(numerator + 1.e-9) / (denominator + 1.e-9)` to avoid division by zero and to be more consistent with scanpy's implementation. This has no effect on the ranking of genes, but yields finite fold change values for all genes.
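
The partial sort mentioned in the `n_genes` entry can be illustrated with a minimal NumPy sketch. This is not `illico`'s or `scanpy`'s actual code: the function name `top_n_indices` and the sample scores are hypothetical, and the sketch only shows the general technique of selecting the top N entries with `np.argpartition` before sorting that subset.

```python
import numpy as np


def top_n_indices(scores: np.ndarray, n_top: int) -> np.ndarray:
    """Return the indices of the n_top largest scores, highest score first.

    Hypothetical helper for illustration only.
    """
    if not 0 < n_top <= scores.shape[0]:
        raise ValueError("n_top must be between 1 and len(scores)")
    # Partial sort: the last n_top positions of the partition hold the
    # indices of the n_top largest scores, in no particular order.
    partition = np.argpartition(scores, -n_top)[-n_top:]
    # Fully sort only the selected subset, in descending order.
    order = np.argsort(scores[partition])[::-1]
    return partition[order]


# Made-up scores for six genes.
scores = np.array([0.5, 3.0, -1.0, 2.0, 7.5, 1.0])
top = top_n_indices(scores, 3)
print(top)
```

When several genes share the same score, a partial sort and a full sort may break ties differently, which is why the sorting method has to match for results to be reproducible.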

Version 0.4.0
------------
- Added an option to return scanpy-friendly output via the `return_as_scanpy` argument. `asymptotic_wilcoxon` returns either:
The fold-change computed by `illico` depends on the value of `is_log1p` and `exp_post_agg` as follows:

⚠️ Note that `1.e-9` is added to both the numerator and the denominator to avoid division by zero and to be more consistent with `scanpy`'s implementation. This has no effect on the ranking of genes, but yields finite fold change values for all genes.
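
The effect of the `1.e-9` offset can be seen in a small sketch. The helper `fold_change` and the arrays `group_mean` and `ref_mean` are hypothetical names with made-up values; what the numerator and denominator actually are depends on `is_log1p` and `exp_post_agg`, which this sketch does not model.

```python
import numpy as np

EPS = 1e-9  # offset quoted in the note above


def fold_change(numerator, denominator, eps: float = EPS) -> np.ndarray:
    """Ratio with a small offset on both sides (illustrative helper)."""
    num = np.asarray(numerator, dtype=float)
    den = np.asarray(denominator, dtype=float)
    return (num + eps) / (den + eps)


# Made-up aggregated expression values for four genes.
group_mean = np.array([2.0, 0.0, 3.0, 0.0])
ref_mean = np.array([1.0, 0.0, 0.0, 4.0])

fc = fold_change(group_mean, ref_mean)
print(fc)
```

Without the offset, the second gene would give `0 / 0` (NaN) and the third a division by zero (infinity). With it, every value is finite, and a gene that is zero in both groups gets a fold change of exactly 1.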
⚠️ Please note that by default, `scanpy.tl.rank_genes_groups` assumes that your data is log1p-transformed and exponentiates after aggregation. Consequently, if you are coming from `scanpy` and want a drop-in replacement for `scanpy.tl.rank_genes_groups`, you should set: