Releases: remydubois/illico
0.5.1
This release fixes a bug that occurred when sparse matrices contained both zeros and strictly negative values (cf. this discussion).
0.5.0rc2
For better compatibility with scanpy, lower bounds on dependencies have been relaxed.
0.5.0rc1
This version improves compatibility with scanpy:
- Added the n_genes argument, which returns only the top N genes per group when return_as_scanpy=True. This makes it possible to match scanpy's sorting method (partial sort), resulting in better reproducibility of scanpy results.
- Fixed gene ordering in the scanpy formatter by removing the redundant sorting of perturbation names, as encode_and_count_groups already returns sorted unique perturbation names.
- Added explicit testing of gene ordering. In the PBMC dataset, many genes end up with identical z-scores but different logfoldchanges. This was not caught by previous tests.
- Fold change is now computed as (numerator + 1.e-9) / (denominator + 1.e-9) to avoid division by zero and to be more consistent with scanpy's implementation. This has no effect on the ranking of genes, but yields finite fold change values for all genes.
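The fold-change formula above can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not illico's actual code: the function name `fold_change` and the `eps` parameter are hypothetical, and the inputs are assumed to be non-negative per-group mean expression values.

```python
import math

EPS = 1e-9  # same constant as in the release note


def fold_change(numerator, denominator, eps=EPS):
    """Ratio with a small constant added to both terms.

    Assumes both inputs are non-negative (e.g. mean expression values),
    so the shifted denominator is always strictly positive.
    """
    return (numerator + eps) / (denominator + eps)


# A gene that is absent from the reference group no longer yields inf or NaN.
absent_in_reference = fold_change(2.0, 0.0)
absent_everywhere = fold_change(0.0, 0.0)
assert math.isfinite(absent_in_reference)
assert absent_everywhere == 1.0
```

Because the same constant is added on both sides, a gene absent from both groups gets a fold change of exactly 1 rather than an undefined value.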
It also includes some performance improvements:
- Improved the CSR chunking mechanism for the OVO test, resulting in faster execution and a much smaller memory footprint. A direct implication is that batch_size can now grow much larger.
  - On TAHOE's plate3 (in RAM) with batch_size=1024, this reduced the memory footprint from 35GB to 1.5GB, and the runtime from 1:17 to 0:50 with 8 threads.
  - The reduced footprint allows scaling n_threads more aggressively. With 32 threads, TAHOE's plate3 runs in 21 seconds while using only 2.5GB of RAM.
It also adds support for the OVO test on lazy (h5-based) CSR datasets, through a specific parallelization scenario where groups are processed one by one.
0.4.0
Illico can now be used as a drop-in replacement for sc.tl.rank_genes_groups:
- Added an option to return scanpy-friendly output with the return_as_scanpy argument. asymptotic_wilcoxon returns either:
  - A pandas.DataFrame with columns feature, p_value, fold_change, and statistic, if return_as_scanpy=False (default)
  - A dictionary containing the same keys as scanpy.tl.rank_genes_groups, if return_as_scanpy=True. As in scanpy, genes are ordered by decreasing z-score.
On a different topic:
- Improved the batching mechanism and fixed the 'auto' mode, which excluded the very last gene in previous versions.
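The kind of off-by-one that drops the last gene can be guarded against by checking that the batch ranges cover every gene index exactly once. The sketch below is a generic illustration of that invariant, not illico's batching code: `batch_bounds` is a hypothetical helper, and it says nothing about how the 'auto' mode picks a batch size.

```python
def batch_bounds(n_genes, batch_size):
    """Return half-open (start, stop) ranges that cover 0..n_genes-1.

    The last batch is clipped to n_genes, so it may be shorter than the
    others but never skips the final gene.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        (start, min(start + batch_size, n_genes))
        for start in range(0, n_genes, batch_size)
    ]


bounds = batch_bounds(10, 4)
# Flatten the ranges to check that every gene index appears exactly once.
covered = [g for start, stop in bounds for g in range(start, stop)]
assert covered == list(range(10))
```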
0.3.0
This release adds a Rust backend to all test routines. The Numba backend remains usable by passing use_rust=False to asymptotic_wilcoxon. Future releases will progressively drop support for the Numba backend.
0.2.0
- This release adds support for h5-based, disk-backed dense and CSC datasets
- This release adds an option to run non-tie-corrected tests
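To illustrate what tie correction changes, here is a textbook sketch of the asymptotic (normal-approximation) Wilcoxon rank-sum z-score, with and without the tie term in the variance. It is not illico's implementation: the helper names `midranks` and `wilcoxon_z` and the `tie_correct` flag are hypothetical, and no continuity correction is applied.

```python
import math
from collections import Counter


def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_z(x, y, tie_correct=True):
    """z-score of the Mann-Whitney U statistic of x against y.

    Raises ZeroDivisionError if the variance is zero (all values identical
    with tie correction enabled).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 * (n + 1) / 12
    if tie_correct:
        # Each group of t tied values shrinks the variance.
        tie_term = sum(t**3 - t for t in Counter(pooled).values())
        var_u -= n1 * n2 * tie_term / (12 * n * (n - 1))
    return (u - mean_u) / math.sqrt(var_u)
```

Without ties the two variants coincide. With ties, the corrected variance is smaller, so the corrected z-score has a larger magnitude. This matters for count data, where many cells share the value zero.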
0.1.1
This release fixes the reference_group argument by renaming it to reference, just like scanpy.
0.1.0
First release of illico.