Fix/negative values#17

Merged
remydubois merged 3 commits into main from fix/negative-values on Apr 26, 2026

@remydubois (Owner) commented Apr 25, 2026

This PR fixes a bug in the sparse routines that occurs when the input data contains negative values. Previously, the rank sum of the zero values was added on the assumption that zeros sort before every other value. Now the position of the zeros is computed and the rank sums are adjusted accordingly, cf. this change.
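To illustrate the idea, here is a minimal sketch of why the old assumption breaks once negative values appear. It is not the code from this PR: `mean_zero_rank` and its `assume_zeros_first` flag are hypothetical names, and the sketch assumes the non-zero values are already sorted and that ties share their average (1-based) rank.

```python
from bisect import bisect_left


def mean_zero_rank(sorted_nonzeros, n_zeros, assume_zeros_first=False):
    """Average 1-based rank shared by the implicit zeros of a sparse column.

    Hypothetical helper for illustration only.
    """
    # Number of stored (non-zero) values that sort strictly before zero.
    zero_pos = 0 if assume_zeros_first else bisect_left(sorted_nonzeros, 0)
    # The zeros occupy ranks zero_pos + 1 .. zero_pos + n_zeros.
    return zero_pos + (n_zeros + 1) / 2


nonzeros = [-2.0, -1.0, 3.0]  # stored values of one sparse column, sorted
old = mean_zero_rank(nonzeros, n_zeros=2, assume_zeros_first=True)
new = mean_zero_rank(nonzeros, n_zeros=2)
print(old, new)  # 1.5 3.5
```

With the full column being `[-2, -1, 0, 0, 3]`, the two zeros really sit at ranks 3 and 4, so only the second value is right. The two values coincide whenever the column has no negative entries, which is presumably why the issue went unnoticed on non-negative data.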

Copilot AI left a comment

Pull request overview

Fixes incorrect rank-sum/tie handling in sparse MWU routines when non-zero values can be negative by computing the true insertion position of implicit zeros and adjusting rank sums accordingly.

Changes:

  • Extend ranking utilities (Rust + Numba) to account for implicit zeros via a zero_values_offset and return zero_pos.
  • Update sparse OVR/OVO kernels to use zero_pos when adding zero contributions to rank sums (instead of assuming zeros come first).
  • Expand test coverage/fixtures to include negative-value scenarios; bump version + changelog entry.
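The changes listed above can be sketched in plain Python as a dense-versus-sparse consistency check, in the spirit of the new tests. This is an assumption-laden illustration, not the illico kernels: `dense_midranks` and `sparse_group_ranksum` are hypothetical helpers, and the sparse input is modelled as the stored non-zero values, their group labels, and a per-group count of implicit zeros.

```python
from bisect import bisect_left


def dense_midranks(values):
    """1-based ranks where tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end < len(order) and values[order[end]] == values[order[start]]:
            end += 1
        # Sorted positions start..end-1 hold ranks start+1..end.
        mean_rank = (start + 1 + end) / 2
        for k in range(start, end):
            ranks[order[k]] = mean_rank
        start = end
    return ranks


def sparse_group_ranksum(nz_values, nz_groups, n_zeros_per_group, group):
    """Rank sum of one group, computed from non-zero values only."""
    n_zeros = sum(n_zeros_per_group.values())
    # zero_pos: how many stored values sort strictly before the zeros.
    zero_pos = bisect_left(sorted(nz_values), 0)
    total = 0.0
    for value, g, rank in zip(nz_values, nz_groups, dense_midranks(nz_values)):
        if g == group:
            # Only values above zero are pushed back by the implicit zeros.
            total += rank + (n_zeros if value > 0 else 0)
    # All zeros share the mean of ranks zero_pos + 1 .. zero_pos + n_zeros.
    total += n_zeros_per_group[group] * (zero_pos + (n_zeros + 1) / 2)
    return total


column = [-2.0, 0.0, 3.0, 0.0, -1.0, 0.0, 3.0]
groups = [0, 1, 0, 0, 1, 1, 1]

# Dense reference: rank everything, then sum the ranks of group 1.
dense_sum = sum(r for r, g in zip(dense_midranks(column), groups) if g == 1)

# Sparse view of the same column: drop the zeros, keep their counts.
nz_values = [v for v in column if v != 0]
nz_groups = [g for v, g in zip(column, groups) if v != 0]
n_zeros_per_group = {0: 1, 1: 2}
sparse_sum = sparse_group_ranksum(nz_values, nz_groups, n_zeros_per_group, 1)

print(dense_sum, sparse_sum)  # 16.5 16.5
```

The key design point is that the implicit zeros shift only the ranks of values greater than zero; negative values keep the rank they have among the stored values.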

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/ranking.rs | Adds zero-offset support and returns zero_pos from ranking helpers. |
| src/sparse_ovr.rs | Uses returned zero_pos to correctly add zero rank contributions for sparse OVR. |
| src/sparse_ovo.rs | Uses zero_pos to compute zero ranks correctly for sparse OVO. |
| src/dense_ovr.rs | Adapts to new return type from accumulate_rank_and_tie_sums_from_argsort (currently with a compile issue). |
| src/dense_ovo.rs | Updates call site for new rank_sum_and_ties signature. |
| illico/utils/ranking.py | Adds zero-offset handling and returns zero_pos from Numba ranking helpers. |
| illico/ovr/sparse_ovr.py | Adjusts sparse OVR zero rank contribution using zero_pos. |
| illico/ovr/dense_ovr.py | Updates unpacking for new _accumulate_group_ranksums_from_argsort return. |
| illico/ovo/sparse_ovo.py | Adjusts sparse OVO zero rank contribution using zero_pos. |
| illico/ovo/dense_ovo.py | Updates unpacking for new rank_sum_and_ties_from_sorted return. |
| tests/utils/test_ranking.py | Adds parametrized tests covering dense vs sparse (implicit zeros) with negative values. |
| tests/conftest.py | Modifies randomized fixture to include negative-value columns/group patterns. |
| pyproject.toml | Bumps project version to 0.5.1. |
| changelog.md | Documents the negative-values sparse bugfix in 0.5.1. |
Comments suppressed due to low confidence (1)

illico/utils/ranking.py:84

  • The type hint/docstring for this function still suggests it returns a 2-tuple (and even np.ndarray), but it now returns three scalars: (ranksum_B, tie_sum, zero_pos). Please update the return annotation and the "Returns" section of the docstring to reflect the new API (e.g., tuple[float, float, int]).
@njit(nogil=True)
def rank_sum_and_ties_from_sorted(A: np.ndarray, B: np.ndarray, zero_values_offset: int = 0) -> tuple[np.ndarray]:
    """Compute rank sums and tie sums from two 1-d sorted arrays.

    This routine is similar to the leetcode "merge two sorted arrays", except it
    never returns to sorted array, instead it accumulate rank sums of the second array
    and tie sums for the combined arrays.

    This routine sits at the core of the one-versus-one (or one-versus-control) asymptotic
    wilcoxon rank sum test as it allows to sort controls only once.
    Args:
        A (np.ndarray): The first sorted array (controls)
        B (np.ndarray): The second sorted array (perturbed)
        zero_values_offset (int): If > 0, it means that there are zeros not present in the input arrays but that
        they should be accounted for. This is only used when the input adata is sparse, and ranksum is computed
        on non zero values.

    Returns:
        tuple[np.ndarray]: Ranks sum from the second array, and tie sums for the combined
        arrays.



Comment thread src/dense_ovr.rs Outdated
Comment thread src/ranking.rs
Comment thread src/ranking.rs Outdated
Comment thread illico/utils/ranking.py
Comment thread illico/utils/ranking.py Outdated
Comment thread tests/utils/test_ranking.py Outdated
Comment thread tests/conftest.py Outdated
@remydubois remydubois merged commit b76e910 into main Apr 26, 2026
5 checks passed
@remydubois remydubois deleted the fix/negative-values branch April 26, 2026 19:09