Skip to content

Add new function merge_array_expressions#791

Merged
ch-kr merged 18 commits intomainfrom
kc/merge_an
Jul 23, 2025
Merged

Add new function merge_array_expressions#791
ch-kr merged 18 commits intomainfrom
kc/merge_an

Conversation

@ch-kr
Copy link
Copy Markdown
Contributor

@ch-kr ch-kr commented Jul 18, 2025

PR refactors merge_freq_arrays for flexibility:

  • Adds new function merge_array_expressions to flexibly merge frequency annotations
  • Updates merge_freq_arrays to call new merge_array_expressions function
  • Adds tests for both functions

@ch-kr ch-kr requested a review from Copilot July 21, 2025 14:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new function merge_array_expressions that provides a more flexible way to merge array expressions compared to the existing merge_freq_arrays function. The new function can handle both integer arrays and struct arrays with customizable field merging, while merge_freq_arrays is refactored to use the new function internally.

Key changes:

  • Adds merge_array_expressions function with support for both integer and struct arrays
  • Refactors merge_freq_arrays to use the new generalized function
  • Adds comprehensive test coverage for both functions

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
tests/utils/test_annotations.py Adds extensive test coverage for the new merge_array_expressions and existing merge_freq_arrays functions
gnomad/utils/annotations.py Implements merge_array_expressions and refactors merge_freq_arrays to use it

Comment thread gnomad/utils/annotations.py
Comment thread gnomad/utils/annotations.py Outdated
Comment thread tests/utils/test_annotations.py Outdated
ch-kr and others added 3 commits July 21, 2025 10:03
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@ch-kr ch-kr marked this pull request as ready for review July 21, 2025 14:35
@ch-kr ch-kr requested a review from a team as a code owner July 21, 2025 14:35
@mike-w-wilson mike-w-wilson self-requested a review July 22, 2025 12:45
@mike-w-wilson mike-w-wilson self-assigned this Jul 22, 2025
Copy link
Copy Markdown
Contributor

@mike-w-wilson mike-w-wilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this probably shouldve been a commit but I find Cursor adds too many basic comments to the code, even when things are obvious. I removed comments that I found uninformative. I think generally for inline comments, if the comment just explains what is happening and its obvious from the code, its unnecessary. If something is complex, a what comment makes sense but in these test cases, I don't think they are needed. Other than those, the logic looks good to me!

Comment thread gnomad/utils/annotations.py Outdated
Comment thread tests/utils/test_annotations.py Outdated
Comment thread tests/utils/test_annotations.py Outdated
Comment thread tests/utils/test_annotations.py Outdated
Comment thread tests/utils/test_annotations.py Outdated
Comment thread tests/utils/test_annotations.py Outdated
expected_meta = [{"group": "A"}, {"group": "B"}, {"group": "C"}]
assert result_meta == expected_meta

# Check that both frequency and count arrays are properly merged.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Check that both frequency and count arrays are properly merged.

Comment thread tests/utils/test_annotations.py Outdated
),
)

# Create metadata.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Create metadata.

Comment thread tests/utils/test_annotations.py Outdated
meta1 = [{"group": "A"}, {"group": "B"}]
meta2 = [{"group": "A"}, {"group": "B"}]

# Merge with set_negatives_to_zero=True.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Merge with set_negatives_to_zero=True.

Comment thread tests/utils/test_annotations.py Outdated
set_negatives_to_zero=True,
)[0]

# Collect results.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Collect results.

Comment thread tests/utils/test_annotations.py Outdated
# Collect results.
result = ht.select(result_freq=result_freq).collect()

# Check that negative values are set to zero.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Check that negative values are set to zero.

ch-kr and others added 2 commits July 23, 2025 15:00
Co-authored-by: Mike Wilson <mwilson@broadinstitute.org>
add missed suggestion

Co-authored-by: Mike Wilson <mwilson@broadinstitute.org>
@ch-kr ch-kr requested a review from mike-w-wilson July 23, 2025 19:01
Copy link
Copy Markdown
Contributor

@mike-w-wilson mike-w-wilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@ch-kr ch-kr merged commit daf45c0 into main Jul 23, 2025
6 checks passed
@ch-kr ch-kr deleted the kc/merge_an branch July 23, 2025 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants