Releases: aimed-lab/WINNER
v0.1.1-py — sparse + batched-GPU spinner
Latest Python release. Performance + docs release on top of
v0.1.0-py.
No public-API breakage; no change in ranking p-values at any tolerance.
Maintained by Dr. Jake Y. Chen (AIMed Lab, UAB).
The repo is now reorganised (post-merge) into matlab/ and python/
top-level folders; see the root README for the
"which implementation should I use?" guide and CHANGELOG.md
for consolidated history.
Headline changes
Four batched-spinner implementations with auto-dispatch per device +
density:
| Device | Density < 5% (PPI default) | Density ≥ 5% |
|---|---|---|
| CPU | SciPy CSR + threaded joblib | np.matmul (BLAS) |
| CUDA | torch.sparse block-diag mm | torch.bmm float32 |
| MPS (Apple) | per-net torch.sparse | torch.bmm float32 |
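The dispatch table above can be read as a small decision function. The sketch below is illustrative only: `choose_spinner_path` and the returned labels are hypothetical names, not the package's API. Only the 5% threshold and the device/path pairings are taken from the table.

```python
# Hypothetical sketch of the auto-dispatch rule in the table above.
# Function name and return labels are illustrative, not WINNER's API.

SPARSE_DENSITY_THRESHOLD = 0.05  # 5%, per the table


def choose_spinner_path(device: str, density: float) -> str:
    """Map (device, density) to one of the batched-spinner paths."""
    sparse = density < SPARSE_DENSITY_THRESHOLD
    if device == "cpu":
        return "scipy-csr-joblib" if sparse else "numpy-matmul"
    if device == "cuda":
        return "torch-sparse-block-diag" if sparse else "torch-bmm-float32"
    if device == "mps":
        return "torch-sparse-per-net" if sparse else "torch-bmm-float32"
    raise ValueError(f"unknown device: {device!r}")


# Neonatal-Heart-like density (about 0.4%) on CPU selects the sparse path.
path = choose_spinner_path("cpu", 0.004)
```

Keeping the rule in one pure function makes it easy to unit-test the selection without any GPU present.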
Vectorisation cleanup across the whole package — every per-edge Python
loop in build_adjacency, expansion_pvalue, and pipeline.py was
replaced with NumPy/pandas array ops. The "pick best non-previous" scan
in iterative expansion is now a masked argmax.
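As a rough illustration of the masked-argmax idea, the sketch below picks the highest-scoring candidate that has not been picked before. The helper name and the sample arrays are assumptions made for this example; the package's actual code may differ.

```python
import numpy as np


def pick_best_non_previous(scores: np.ndarray, already_picked: np.ndarray) -> int:
    """Return the index of the highest score among not-yet-picked candidates.

    Hypothetical helper: replaces a Python scan with a masked argmax.
    """
    if already_picked.all():
        raise ValueError("no candidates left")
    # Excluded entries become -inf so argmax can never select them.
    masked = np.where(already_picked, -np.inf, scores)
    return int(np.argmax(masked))


scores = np.array([0.9, 0.4, 0.7, 0.8])
picked = np.array([True, False, False, False])
best = pick_best_non_previous(scores, picked)  # index 3 (score 0.8)
```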
README now includes a "How WINNER works" pipeline walkthrough and a
"Data input requirements" column-by-column spec for GeneList.txt,
Interaction.txt, and AllGeneGloDeg.txt.
Measured impact — Neonatal-Heart example
V=283, density≈0.4%, B=2000, 10-core Intel macOS.
| version | best wall |
|---|---|
| v0.1.0-py | 15.6 s |
| v0.1.1-py | 11.6 s (sparse auto-selected) |
Measured impact — bigger synthetic (V=600, density=1%, B=1000)
Spinner phase only (this is what 10 k null networks spend most time on):
| path | seconds |
|---|---|
| dense matmul | 166.0 |
| sparse CSR (10 threads) | 8.0 |
| speed-up | 20.7× |
Reference GPU numbers
The figures below were not measured in this release's dev environment
(no PyTorch wheel for Python 3.13 × x86_64 macOS). Re-run
python -m benchmarks.bench on your own hardware.
| hardware | path | B | CPU best | GPU | speed-up |
|---|---|---|---|---|---|
| NVIDIA A100 | CUDA sparse block-diag | 10 000 | ~4 min | ~6 s | ~40× |
| NVIDIA A100 | CUDA dense bmm | 10 000 | ~4 min | ~8 s | ~30× |
| Apple M2 Pro | MPS per-net sparse | 10 000 | ~6 min | ~45 s | ~8× |
Install
pip install "git+https://github.com/aimed-lab/WINNER.git@v0.1.1-py#subdirectory=python"
# or with extras:
pip install "git+https://github.com/aimed-lab/WINNER.git@v0.1.1-py#subdirectory=python[all]"
From a clone:
git clone https://github.com/aimed-lab/WINNER.git
cd WINNER/python
pip install ".[all]" # + Numba + PyTorch (GPU)
Tests
- New test_sparse_batch_matches_dense: numerical parity sparse ↔ dense.
- New test_dispatcher_autoselects_sparse_for_sparse_input.
- All v0.1.0-py tests continue to pass, including the MATLAB-reference
  parity test (rtol = 1e-8 on the Neonatal-Heart example).
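For readers who want to see what a sparse ↔ dense parity check of this kind can look like, here is a minimal sketch. It is not the package's test: the random adjacency generator, the matrix sizes, and the plain matrix product standing in for the spinner step are all assumptions made for illustration.

```python
import numpy as np
from scipy import sparse


def random_sparse_adjacency(num_nodes, density, rng):
    """Symmetric 0/1 adjacency with roughly the requested density (no self-loops)."""
    upper = np.triu(rng.random((num_nodes, num_nodes)) < density, k=1)
    return (upper | upper.T).astype(np.float64)


rng = np.random.default_rng(0)
dense_adj = random_sparse_adjacency(200, 0.01, rng)
csr_adj = sparse.csr_matrix(dense_adj)
scores = rng.random((200, 8))  # 8 score vectors stored as columns

dense_out = dense_adj @ scores   # BLAS-backed dense product
sparse_out = csr_adj @ scores    # CSR product, returns a dense ndarray

# Same tolerance as the parity test quoted above.
parity_ok = np.allclose(sparse_out, dense_out, rtol=1e-8, atol=0.0)
```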
Credits
- Algorithm and original MATLAB implementation: Thanh Nguyen,
  Zongliang Yue, Radomir Slominski, Robert Welner, Jianyi Zhang,
  Jake Y. Chen.
- Python port and ongoing maintenance: Dr. Jake Y. Chen
  (jakechen@uab.edu).
Please cite:
Nguyen T, Yue Z, Slominski R, Welner R, Zhang J, Chen JY.
WINNER: A network biology tool for biomolecular characterization and
prioritization. Front Big Data. 2022;5:1016606.
doi:10.3389/fdata.2022.1016606
License
MIT (see LICENSE at the repo root).
v0.1.0-py — Python port with CPU + GPU parallelism
First Python release of WINNER, a parallel / GPU-enabled port of the
MATLAB network-biology prioritization tool from Nguyen et al.
(Front. Big Data 2022).
Maintained by Dr. Jake Y. Chen (AIMed Lab, UAB).
Looking for the latest? This is the first Python release. Use v0.1.1-py
(sparse + batched-GPU spinner, ~20× faster on typical PPI density)
unless you specifically need v0.1.0-py's reference implementation.
The repo is now reorganised (post-merge) into matlab/ and python/
top-level folders; see the root README and CHANGELOG.md.
What's in v0.1.0-py
- Full port of RunWinner.m and RunWinner_withPValue.m to Python.
  Numerical parity with the MATLAB winnerResult.txt reference on the
  Neonatal-Heart example (rtol=1e-8).
- Multi-core CPU parallelism for the 10 000 random-network null
  distribution: thread-based via joblib; the inner edge-swap loop is
  Numba-JIT and GIL-free.
- GPU backend for the batched spinner via PyTorch: supports CUDA and
  Apple MPS; auto-selects with --device auto.
- CLI: winner (simple) and winner-pvalue (full with null) installed
  as console scripts. pip install produces a distributable wheel.
- Test suite (12 tests, one end-to-end MATLAB parity test).
- Benchmark harness: python -m benchmarks.bench sweeps
  (device, n_jobs) and reports wall-time + mean |Δp| vs CPU reference.
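The edge-swap rewiring mentioned in the parallelism bullet above can be pictured with a generic degree-preserving double-edge swap. This is a plain-Python sketch of that standard technique, assumed here for illustration; it is not the package's Numba kernel, and the function name and parameters are hypothetical.

```python
import random


def degree_preserving_rewire(edges, num_swaps, seed=None):
    """Randomise an undirected simple graph while keeping every node's degree.

    Generic double-edge swap; `edges` is a list of (u, v) pairs with u != v.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    if len(edges) < 2:
        return edges
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < num_swaps and attempts < 100 * num_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed swap: (a, b), (c, d) -> (a, d), (c, b)
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present or new1 == new2:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
rewired = degree_preserving_rewire(ring, num_swaps=20, seed=0)
```

Each accepted swap removes one edge and adds one edge at every one of the four endpoints, so node degrees are unchanged while the wiring is randomised.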
Install
pip install "git+https://github.com/aimed-lab/WINNER.git@v0.1.0-py#subdirectory=python"
Or from a clone:
git clone --branch v0.1.0-py https://github.com/aimed-lab/WINNER.git
cd WINNER/python # on v0.1.0-py, this folder was /python at repo root
pip install ".[all]" # + Numba + PyTorch
Measured performance
Neonatal-Heart example (V = 283, density ≈ 0.4%, num_random = 2000,
10-core Intel macOS):
| device | n_jobs | seconds | speed-up |
|---|---|---|---|
| cpu | 1 | 18.6 | 1.0× |
| cpu | 10 | 12.1 | 1.5× |
For larger networks (~800 nodes, 8 k edges) the threading backend shows
~1.7× on the rewire step alone. GPU wins grow rapidly with network size
and null population.
Fidelity note
The reference RunWinner_withPValue.m contains a subtle indexing bug in
the hypergeometric expansion test (K = totalDeg.data(i) uses the loop
index rather than the candidate's own global degree). The Python
implementation uses the statistically-correct formulation. Simple-mode
scores match MATLAB exactly.
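To make the indexing issue concrete, the sketch below computes a hypergeometric upper-tail p-value twice: once with K read at the loop position, and once with K read at the candidate's own entry. Only that choice of K reflects the note above. The roles given to N, n and k, the helper names, and all numbers are assumptions made up for this illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are successes."""
    total = comb(N, n)
    tail = sum(comb(K, x) * comb(N - K, n - x)
               for x in range(max(k, 0), min(K, n) + 1))
    return tail / total


def expansion_pvalues(candidate_ids, links_to_seed, global_degree, N, n, buggy=False):
    """Hypothetical expansion test; `buggy=True` mimics the loop-index lookup."""
    pvals = []
    for i, gene in enumerate(candidate_ids):
        # buggy: K taken at the loop position; fixed: the candidate's own degree
        K = global_degree[i] if buggy else global_degree[gene]
        pvals.append(hypergeom_sf(links_to_seed[i], N, K, n))
    return pvals


global_degree = [50, 3, 40, 5]  # made-up global degrees, indexed by gene id
candidate_ids = [2, 3]          # the candidates are genes 2 and 3
links_to_seed = [4, 2]          # made-up observed links into the seed network

fixed = expansion_pvalues(candidate_ids, links_to_seed, global_degree, N=200, n=20)
buggy = expansion_pvalues(candidate_ids, links_to_seed, global_degree, N=200, n=20,
                          buggy=True)
```

With the loop-index lookup the two candidates are scored against the degrees of genes 0 and 1 instead of their own, so the p-values differ even though the observed link counts are identical.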
Credits
- Algorithm and original MATLAB implementation: Thanh Nguyen,
  Zongliang Yue, Radomir Slominski, Robert Welner, Jianyi Zhang,
  Jake Y. Chen.
- Python port and ongoing maintenance: Dr. Jake Y. Chen
  (jakechen@uab.edu).
Please cite:
Nguyen T, Yue Z, Slominski R, Welner R, Zhang J, Chen JY.
WINNER: A network biology tool for biomolecular characterization and
prioritization. Front Big Data. 2022;5:1016606.
doi:10.3389/fdata.2022.1016606
License
MIT (see LICENSE at the repo root).