Handle empty inputs in parsing and mapping #61

Merged
Ulthran merged 2 commits into master from codex/fix-summarize_all.py-to-handle-empty-inputs
Dec 12, 2025

Conversation

Ulthran (Contributor) commented Dec 12, 2025

Summary

  • ensure parsers gracefully handle empty or header-only inputs and preserve expected columns
  • reindex mapped outputs so final summaries include required fields even when inputs are empty
  • add regression tests covering empty inputs for parsers and map helpers
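As a rough illustration of the first bullet, a parser can catch the error pandas raises on a zero-byte input and then normalize the result to a fixed column list. This is only a sketch: `parse_tsv` and `EXPECTED_COLUMNS` are hypothetical names, and the real parsers in scripts/parse.py may be structured differently.

```python
import io

import pandas as pd

# Hypothetical column set; the PR does not list the real one.
EXPECTED_COLUMNS = ["sample", "reference", "identity"]


def parse_tsv(handle) -> pd.DataFrame:
    """Read a TSV and always return a frame carrying EXPECTED_COLUMNS."""
    try:
        df = pd.read_csv(handle, sep="\t")
    except pd.errors.EmptyDataError:
        # A zero-byte input has no header line for pandas to parse.
        df = pd.DataFrame()
    # reindex adds any missing expected column (filled with NaN), drops
    # columns that are not listed, and fixes the column order.
    return df.reindex(columns=EXPECTED_COLUMNS)


empty = parse_tsv(io.StringIO(""))
header_only = parse_tsv(io.StringIO("sample\treference\n"))
```

Both `empty` and `header_only` have zero rows but expose the same three columns, so downstream code can select them without checking for emptiness first.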

Testing

  • pytest (fails: missing pandas dependency in environment)

Codex Task

Copilot AI review requested due to automatic review settings December 12, 2025 16:51
Copilot AI (Contributor) left a comment

Pull request overview

This PR enhances the robustness of data parsing and mapping functions by ensuring they gracefully handle empty and header-only input files while preserving expected column structures. The changes prevent downstream failures when processing empty datasets by using DataFrame.reindex() to guarantee consistent schemas even when all inputs are empty.
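The difference between direct column selection and `DataFrame.reindex()` can be seen in a small, self-contained case. The column names below are made up for illustration and are not taken from scripts/map.py.

```python
import pandas as pd

# Hypothetical required schema for a mapped output.
REQUIRED = ["sample", "taxon", "score"]

# An empty parsed frame that lacks the "score" column.
parsed = pd.DataFrame(columns=["sample", "taxon"])

# Direct selection raises KeyError when any requested column is missing.
try:
    parsed[REQUIRED]
    selection_failed = False
except KeyError:
    selection_failed = True

# reindex creates the missing column instead of raising.
mapped = parsed.reindex(columns=REQUIRED)
```

Here `selection_failed` ends up `True`, while `mapped` is an empty frame with all three required columns. One trade-off of this approach is that `reindex` also hides a genuinely missing column in non-empty data by filling it with NaN.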

Key Changes

  • Added empty-case handling in parse_sylph() and parse_mash_winning_sorted_tab() to return DataFrames with complete column definitions after filtering operations
  • Updated all mapping functions to use DataFrame.reindex() instead of direct column selection, ensuring required columns exist even with empty inputs
  • Added comprehensive regression tests covering empty inputs for both parsers and mapping helpers
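The "empty after filtering" case in the first bullet might look something like the sketch below. `filter_hits`, `OUTPUT_COLUMNS`, the `identity` threshold, and the derived `genus` column are all assumptions made for the example; the actual logic in `parse_sylph()` and `parse_mash_winning_sorted_tab()` is not shown in this PR page.

```python
import pandas as pd

# Hypothetical output schema, including one derived column.
OUTPUT_COLUMNS = ["sample", "reference", "identity", "genus"]


def filter_hits(df: pd.DataFrame, min_identity: float = 0.9) -> pd.DataFrame:
    """Keep rows at or above min_identity and add a derived genus column."""
    kept = df[df["identity"] >= min_identity]
    if kept.empty:
        # Nothing survived the filter: return the full schema with no rows
        # instead of a frame that lacks the derived column.
        return pd.DataFrame(columns=OUTPUT_COLUMNS)
    # Derive genus from the text before the first underscore.
    kept = kept.assign(genus=kept["reference"].str.split("_").str[0])
    return kept.reindex(columns=OUTPUT_COLUMNS)


low = pd.DataFrame(
    {"sample": ["s1"], "reference": ["Escherichia_coli"], "identity": [0.5]}
)
high = pd.DataFrame(
    {"sample": ["s1"], "reference": ["Escherichia_coli"], "identity": [0.95]}
)
no_hits = filter_hits(low)
one_hit = filter_hits(high)
```

The early return keeps the output schema identical whether or not any rows pass the filter, which is the property the regression tests in this PR are described as checking.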

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| scripts/test_parse.py | Added tests for header-only TSV files and empty sylph files to verify column structure preservation |
| scripts/test_map.py | Added comprehensive test verifying all mapping functions handle empty parsed outputs correctly |
| scripts/parse.py | Enhanced parse_mash_winning_sorted_tab() and parse_sylph() to return DataFrames with expected columns when inputs become empty after filtering |
| scripts/map.py | Replaced direct column selection with DataFrame.reindex() in all mapping functions to guarantee column presence regardless of input emptiness |

Ulthran merged commit c16bbf9 into master on Dec 12, 2025
1 check passed
Ulthran deleted the codex/fix-summarize_all.py-to-handle-empty-inputs branch on December 12, 2025 16:59