Re-implementing co_occurrence() #975

wenjie1991 · 2025-03-19T09:46:18Z


Description

Together with @MDLDan, we reimplemented the squidpy.gr.co_occurrence() function using Numba.
The new algorithm removes the need for a pre-calculated pairwise distance matrix, so it can handle large datasets without splitting them. Parallel processing is enabled by default, giving a 40-fold speedup.
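The matrix-free idea can be illustrated with a minimal sketch. This is not the code from this PR: the function names `per_cell_counts` and `label_pair_counts`, the cumulative `<=` thresholds, and the 2-D coordinates are all assumptions made for illustration. Distances are computed on the fly inside a Numba `prange` loop, and each iteration writes only to its own output row, so no n × n distance matrix is stored and no two threads write to the same memory.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # fallback so the sketch still runs (slowly) without Numba
    prange = range

    def njit(*args, **kwargs):
        return lambda func: func


@njit(parallel=True)
def per_cell_counts(coords, labels, n_labels, thresholds):
    """For every cell, count neighbours of each label within each threshold."""
    n_cells = coords.shape[0]
    n_thr = thresholds.shape[0]
    out = np.zeros((n_cells, n_labels, n_thr), dtype=np.int64)
    for i in prange(n_cells):  # iteration i writes only to out[i]
        for j in range(n_cells):
            if i == j:
                continue
            dx = coords[i, 0] - coords[j, 0]
            dy = coords[i, 1] - coords[j, 1]
            dist = np.sqrt(dx * dx + dy * dy)  # computed on the fly, never stored
            for t in range(n_thr):
                if dist <= thresholds[t]:
                    out[i, labels[j], t] += 1
    return out


def label_pair_counts(coords, labels, n_labels, thresholds):
    """Sum the per-cell counts into a (label, label, threshold) array."""
    per_cell = per_cell_counts(coords, labels, n_labels, thresholds)
    counts = np.zeros((n_labels, n_labels, thresholds.shape[0]), dtype=np.int64)
    np.add.at(counts, labels, per_cell)  # counts[labels[i]] += per_cell[i]
    return counts


# Toy data (made up for this example): four cells, two labels, two thresholds.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
labels = np.array([0, 1, 0, 1], dtype=np.int64)
thresholds = np.array([1.0, 20.0])
counts = label_pair_counts(coords, labels, 2, thresholds)
```

In this sketch the intermediate array has shape (cells, labels, thresholds), so memory grows linearly with the number of cells rather than quadratically.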

We also implemented it in Rust using PyO3 and achieved similar performance. We chose to push the Numba implementation.

The following issues are related:
#229
#755
#223
#582

How has this been tested?

All squidpy.gr.co_occurrence()-related tests in the squidpy test suite pass.
We also compared the output of the new and old implementations:
- As long as the number of cells does not require squidpy to split the data, the differences are in the 1e-08 range.
- When the number of cells requires squidpy to split the data, the differences are on the order of 1e-02.
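A comparison of this kind could be sketched as follows. The helper `max_abs_diff` is hypothetical, and the arrays are synthetic stand-ins rather than outputs of either implementation.

```python
import numpy as np


def max_abs_diff(new, old):
    """Largest element-wise absolute difference between two result arrays."""
    new = np.asarray(new, dtype=np.float64)
    old = np.asarray(old, dtype=np.float64)
    if new.shape != old.shape:
        raise ValueError(f"shape mismatch: {new.shape} vs {old.shape}")
    return float(np.max(np.abs(new - old)))


# Synthetic stand-ins for the two co-occurrence outputs.
old_result = np.array([[1.0, 0.5], [0.25, 2.0]])
new_result = old_result + 1e-9

diff = max_abs_diff(new_result, old_result)
print(f"max abs difference: {diff:.2e}")
```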

Closes

closes #755


codecov-commenter · 2025-03-19T15:10:47Z

Codecov Report

Attention: Patch coverage is 53.75000% with 37 lines in your changes missing coverage. Please review.

Project coverage is 66.60%. Comparing base (4a632d6) to head (26200d3).
Report is 189 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/squidpy/gr/_ppatterns.py | 53.75% | 34 Missing and 3 partials ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #975      +/-   ##
==========================================
- Coverage   69.99%   66.60%   -3.39%     
==========================================
  Files          39       40       +1     
  Lines        5532     6079     +547     
  Branches     1037     1031       -6     
==========================================
+ Hits         3872     4049     +177     
- Misses       1367     1669     +302     
- Partials      293      361      +68

| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/squidpy/gr/_ppatterns.py | `79.85% <53.75%> (+0.88%)` | ⬆️ |

... and 12 files with indirect coverage changes


Intron7 · 2025-03-21T17:53:19Z

@wenjie1991 This already looks very good and promising, but I believe you can squeeze out even more performance. You can start by adjusting the memory access pattern to be more efficient. You can also numba_njit the outer function and parallelize it. I would also cache the kernel, which makes it even more efficient.
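One possible reading of these suggestions, as a rough sketch rather than the change actually requested: the kernel name `neighbour_counts`, the split into separate contiguous `x` and `y` arrays, and the squared-distance comparison are all assumptions for illustration. Numba's `cache=True` option is mentioned only in a comment, because on-disk caching needs the function to live in a source file.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # fallback so the sketch still runs (slowly) without Numba
    prange = range

    def njit(*args, **kwargs):
        return lambda func: func


# In a module file, cache=True could be added to the decorator so the compiled
# kernel is stored on disk and reused instead of being recompiled each session.
@njit(parallel=True)
def neighbour_counts(x, y, radius):
    """Count, for every cell, the other cells within `radius`."""
    n_cells = x.shape[0]
    r2 = radius * radius  # compare squared distances to skip the sqrt
    out = np.zeros(n_cells, dtype=np.int64)
    for i in prange(n_cells):  # outer loop is split across threads
        xi = x[i]
        yi = y[i]
        c = 0
        for j in range(n_cells):  # inner loop walks x and y sequentially
            dx = xi - x[j]
            dy = yi - y[j]
            if dx * dx + dy * dy <= r2:
                c += 1
        out[i] = c - 1  # exclude the cell itself
    return out


# Toy coordinates (made up); contiguous 1-D arrays give unit-stride access.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
x = np.ascontiguousarray(coords[:, 0])
y = np.ascontiguousarray(coords[:, 1])
result = neighbour_counts(x, y, 1.0)
```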

wenjie1991 · 2025-03-24T12:59:02Z

@MDLDan

wenjie1991 and others added 6 commits March 18, 2025 17:42

perf implement rust co-occurrence statistics

2023996

misc: change rust-py deps

e51f546

doc: improve the documentation

a5b8226

add python re-implementation

f7ff293

[pre-commit.ci] auto fixes from pre-commit.com hooks

9af4252

for more information, see https://pre-commit.ci

Clean the tests and dependencies

9457767

Intron7 requested review from timtreis and Intron7 March 19, 2025 10:45

wenjie1991 and others added 2 commits March 19, 2025 16:02

Merge branch 'numba-co-occurrence' of github.com:wenjie1991/squidpy i…

0ec6985

…nto numba-co-occurrence

[pre-commit.ci] auto fixes from pre-commit.com hooks

92f3da5

for more information, see https://pre-commit.ci

Merge branch 'main' into numba-co-occurrence

ad674ad

timtreis and others added 2 commits March 28, 2025 11:47

Merge branch 'main' into numba-co-occurrence

057decc

Merge branch 'main' into numba-co-occurrence

26200d3
