Releases: remydubois/illico

0.5.1

26 Apr 19:11
b76e910

This release fixes a bug that occurred when sparse matrices contain both zeros and strictly negative values (see this discussion).

0.5.0rc2

20 Apr 18:31
4f7c77a

For better compatibility with scanpy, lower bounds on dependencies have been relaxed.

0.5.0rc1

12 Apr 18:06
e7fc038

This version improves compatibility with scanpy:

  • Added an n_genes argument that returns only the top N genes per group when return_as_scanpy=True. This makes it possible to match scanpy's sorting method (partial sort), which improves reproducibility of scanpy results.
  • Fixed gene ordering in the scanpy formatter by removing the redundant sorting of perturbation names: encode_and_count_groups already returns sorted unique perturbation names.
  • Added explicit testing of gene ordering. In the PBMC dataset, many genes end up with identical z-scores but different logfoldchanges, which previous tests did not catch.
  • Fold change is now computed as (numerator + 1.e-9) / (denominator + 1.e-9) to avoid division by zero and to be more consistent with scanpy's implementation. This has no effect on the ranking of genes, but yields finite fold change values for all genes.
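
The fold change formula above can be illustrated with a minimal sketch. This is not illico's code: the function name fold_change and the example inputs are hypothetical, and what illico actually uses as numerator and denominator is not specified in these notes. Only the 1.e-9 pseudocount comes from the release note.

```python
import math

EPS = 1e-9  # pseudocount quoted in the release note


def fold_change(numerator, denominator, eps=EPS):
    """Hypothetical helper: ratio stabilized by a small pseudocount."""
    return (numerator + eps) / (denominator + eps)


# A zero denominator no longer raises ZeroDivisionError: the result is
# very large but finite.
print(math.isfinite(fold_change(2.0, 0.0)))

# When both terms are zero, the pseudocounts cancel and the ratio is 1.
print(fold_change(0.0, 0.0))
```

For values far above 1.e-9 the pseudocount changes the ratio only negligibly, so the added term mainly matters for genes with a zero numerator or denominator.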

It also includes some performance improvements:

  • Improved the CSR chunking mechanism for the OVO test, resulting in faster execution and a much smaller memory footprint. As a direct consequence, batch_size can now be much larger.
    • On TAHOE's plate3 (in RAM) with batch_size=1024, this reduced the memory footprint from 35GB to 1.5GB, and the runtime from 1:17 to 0:50 with 8 threads.
    • The reduced footprint allows n_threads to be scaled more aggressively. With 32 threads, TAHOE's plate3 runs in 21 seconds while using only 2.5GB of RAM.

Finally, it adds support for the OVO test on lazy CSR (h5-based) datasets, through a specific parallelization scenario where groups are processed one by one.

0.4.0

26 Mar 11:31
670ca85

Illico can now be used as a drop-in replacement for sc.tl.rank_genes_groups:

  • Added an option to return scanpy-friendly output through the return_as_scanpy argument. asymptotic_wilcoxon returns either:
    • A pandas.DataFrame with columns feature, p_value, fold_change, and statistic, if return_as_scanpy=False (default)
    • A dictionary containing the same keys as scanpy.tl.rank_genes_groups, if return_as_scanpy=True. As in scanpy, genes are ordered by decreasing z-score.

On a different topic:

  • Improved the batching mechanism and fixed the 'auto' mode, which excluded the very last gene in previous versions.
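
To illustrate the kind of off-by-one that can drop the last gene, here is a minimal sketch of a batching helper that covers every gene index. It is an assumption-laden illustration, not illico's implementation: batch_bounds is a hypothetical name, and the notes do not describe how the 'auto' mode picks its batch size.

```python
def batch_bounds(n_genes, batch_size):
    """Hypothetical helper: split range(n_genes) into half-open [start, stop) batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # min() clips the final batch so it stops exactly at n_genes,
    # which keeps the very last gene inside the last batch.
    return [
        (start, min(start + batch_size, n_genes))
        for start in range(0, n_genes, batch_size)
    ]


bounds = batch_bounds(n_genes=10, batch_size=4)
print(bounds)  # [(0, 4), (4, 8), (8, 10)]

# Every gene index appears exactly once across the batches.
covered = [g for start, stop in bounds for g in range(start, stop)]
print(covered == list(range(10)))  # True
```

A check like the last two lines is a cheap regression test for this class of bug.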

0.3.0

11 Mar 19:35
49720ba

This release adds a Rust backend to all test routines. The Numba backend remains usable by passing use_rust=False to asymptotic_wilcoxon. Future releases will progressively drop support for the Numba backend.

0.2.0

07 Jan 22:39
a5faa11

  1. This release adds support for h5-based, on-disk backed dense and CSC datasets.
  2. This release adds an option to run tests without tie correction.
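
For context on the second item, the sketch below shows what tie correction changes in the textbook normal approximation of the Wilcoxon rank-sum (Mann-Whitney U) test. It uses the standard formula for the standard deviation of U and is not taken from illico's source; the function name u_std and its signature are hypothetical.

```python
import math
from collections import Counter


def u_std(n1, n2, pooled_values=None):
    """Standard deviation of U under the null hypothesis.

    If pooled_values (the n1 + n2 observations of both groups) is given,
    the textbook tie correction is applied; otherwise it is skipped.
    """
    n = n1 + n2
    tie_term = 0.0
    if pooled_values is not None:
        if len(pooled_values) != n:
            raise ValueError("pooled_values must contain n1 + n2 observations")
        # Each set of t tied observations contributes t^3 - t.
        tie_sizes = Counter(pooled_values).values()
        tie_term = sum(t**3 - t for t in tie_sizes) / (n * (n - 1))
    return math.sqrt(n1 * n2 / 12.0 * ((n + 1) - tie_term))


pooled = [0, 0, 0, 1, 2, 3]  # three tied zeros
print(u_std(3, 3) ** 2)          # variance without tie correction
print(u_std(3, 3, pooled) ** 2)  # smaller variance with tie correction
```

Ties shrink the variance, so for the same U the tie-corrected z-score is larger in magnitude than the uncorrected one.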

0.1.1

23 Dec 14:38
07055a9

This release fixes the reference_group argument by renaming it to reference, as in scanpy.

0.1.0

22 Dec 20:23
d2456df

First release of illico.