Pull Request Overview
This PR adds a new function merge_array_expressions that provides a more flexible way to merge array expressions compared to the existing merge_freq_arrays function. The new function can handle both integer arrays and struct arrays with customizable field merging, while merge_freq_arrays is refactored to use the new function internally.
Key changes:
- Adds merge_array_expressions function with support for both integer and struct arrays
- Refactors merge_freq_arrays to use the new generalized function
- Adds comprehensive test coverage for both functions
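To make the idea of merging integer and struct arrays concrete, here is a minimal pure-Python sketch. It is not the gnomad or Hail implementation: the name merge_arrays_sketch, its parameters, and the choice to skip groups that are missing from an array are all assumptions made for illustration. Arrays are aligned by their metadata entries, integers are combined with one operator, and struct elements (modeled as dicts) are combined field by field.

```python
import operator
from typing import Callable, Dict, List, Optional, Tuple, Union

Element = Union[int, Dict[str, int]]
Meta = Dict[str, str]
BinaryOp = Callable[[int, int], int]


def merge_arrays_sketch(
    arrays: List[List[Element]],
    metas: List[List[Meta]],
    op: BinaryOp = operator.add,
    struct_fields: Optional[Dict[str, BinaryOp]] = None,
) -> Tuple[List[Element], List[Meta]]:
    """Hypothetical helper: merge arrays element-wise, aligned by metadata."""
    # Build the union of metadata entries, preserving first-seen order.
    merged_meta: List[Meta] = []
    for meta in metas:
        for entry in meta:
            if entry not in merged_meta:
                merged_meta.append(entry)

    merged: List[Element] = []
    for entry in merged_meta:
        # Assumption: arrays that lack this group are skipped.
        values = [a[m.index(entry)] for a, m in zip(arrays, metas) if entry in m]
        result = values[0]
        for value in values[1:]:
            if isinstance(result, dict):
                # Struct element: each listed field gets its own operator.
                # Fields not listed in struct_fields are dropped.
                fields = struct_fields or {k: op for k in result}
                result = {k: f(result[k], value[k]) for k, f in fields.items()}
            else:
                result = op(result, value)
        merged.append(result)
    return merged, merged_meta


# Integer arrays with partially overlapping groups.
ints, meta = merge_arrays_sketch(
    [[1, 2], [10, 20, 30]],
    [
        [{"group": "A"}, {"group": "B"}],
        [{"group": "A"}, {"group": "B"}, {"group": "C"}],
    ],
)

# Struct arrays with a per-field operator.
structs, _ = merge_arrays_sketch(
    [[{"AC": 1, "AN": 10}], [{"AC": 2, "AN": 20}]],
    [[{"group": "A"}], [{"group": "A"}]],
    struct_fields={"AC": operator.add, "AN": operator.add},
)
```

In this sketch the merged metadata is the ordered union of the inputs, which matches the A, B, C expectation that appears in the test excerpts; how the real function treats groups missing from one input may differ.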
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/utils/test_annotations.py | Adds extensive test coverage for the new merge_array_expressions and existing merge_freq_arrays functions |
| gnomad/utils/annotations.py | Implements merge_array_expressions and refactors merge_freq_arrays to use it |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This probably should've been a commit, but I find Cursor adds too many basic comments to the code, even when things are obvious. I removed comments that I found uninformative. Generally, for inline comments, if the comment just explains what is happening and it's obvious from the code, it's unnecessary. If something is complex, a "what" comment makes sense, but in these test cases I don't think they are needed. Other than those, the logic looks good to me!
    expected_meta = [{"group": "A"}, {"group": "B"}, {"group": "C"}]
    assert result_meta == expected_meta

    # Check that both frequency and count arrays are properly merged.

Suggested change: remove the line "# Check that both frequency and count arrays are properly merged."
    ),
    )

    # Create metadata.

Suggested change: remove the line "# Create metadata."
    meta1 = [{"group": "A"}, {"group": "B"}]
    meta2 = [{"group": "A"}, {"group": "B"}]

    # Merge with set_negatives_to_zero=True.

Suggested change: remove the line "# Merge with set_negatives_to_zero=True."
    set_negatives_to_zero=True,
    )[0]

    # Collect results.

Suggested change: remove the line "# Collect results."
    # Collect results.
    result = ht.select(result_freq=result_freq).collect()

    # Check that negative values are set to zero.

Suggested change: remove the line "# Check that negative values are set to zero."
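The set_negatives_to_zero option exercised in the excerpts above can be illustrated with a small standalone sketch. This is not the library code: subtract_counts_sketch is a hypothetical name, and returning negative values unchanged when the flag is False is an assumption (the real function may behave differently in that case). The test follows the style the review asks for, with no comments restating what the code already says.

```python
from typing import List


def subtract_counts_sketch(
    total: List[int],
    subset: List[int],
    set_negatives_to_zero: bool = True,
) -> List[int]:
    """Hypothetical helper: element-wise total - subset, optionally clamped at 0."""
    if len(total) != len(subset):
        raise ValueError("Arrays must have the same length.")
    diffs = [t - s for t, s in zip(total, subset)]
    if set_negatives_to_zero:
        return [max(d, 0) for d in diffs]
    # Assumption: without the flag, negative differences are passed through.
    return diffs


def test_subtract_counts_sets_negatives_to_zero() -> None:
    assert subtract_counts_sketch([5, 1], [2, 3]) == [3, 0]
    assert subtract_counts_sketch([5, 1], [2, 3], set_negatives_to_zero=False) == [3, -2]


test_subtract_counts_sets_negatives_to_zero()
```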
Co-authored-by: Mike Wilson <mwilson@broadinstitute.org>
add missed suggestion
Co-authored-by: Mike Wilson <mwilson@broadinstitute.org>
PR refactors merge_freq_arrays for flexibility:
- Adds merge_array_expressions to flexibly merge frequency annotations
- Refactors merge_freq_arrays to call the new merge_array_expressions function