@holmeso holmeso commented Nov 6, 2025

This pull request refactors and improves utility methods in the tiled aligner codebase, focusing on performance, readability, and logging. The most significant changes include optimizing overlap detection logic, modernizing Java idioms, and enhancing logging control for debugging and information output. The changes are grouped below by theme.

Overlap Detection and Logic Improvements

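  • Refactored the doRecordsOverlapReference method in BLATRecordUtil.java to use the standard interval overlap check: two intervals overlap when the larger of the two start positions is less than the smaller of the two end positions (max(start) < min(end)). This improves both correctness and clarity.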
  • Updated logic in removeOverlappingRecords to optimize list allocation and removed unnecessary code comments.
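
For reviewers unfamiliar with the max(start) < min(end) form, here is a minimal sketch of the check on plain integer coordinates. The class and method names are illustrative and are not taken from BLATRecordUtil.java, and the sketch assumes half-open intervals [start, end); the coordinate convention used by the real method is not stated in this description.

```java
/** Illustrative sketch only: not the code in BLATRecordUtil.java. */
final class IntervalOverlap {

    private IntervalOverlap() {}

    /**
     * Returns true when the half-open intervals [startA, endA) and
     * [startB, endB) share at least one position (assumed convention).
     */
    static boolean overlaps(int startA, int endA, int startB, int endB) {
        // The later of the two starts must fall strictly before the earlier of the two ends.
        return Math.max(startA, startB) < Math.min(endA, endB);
    }

    public static void main(String[] args) {
        System.out.println(overlaps(100, 200, 150, 250)); // partial overlap
        System.out.println(overlaps(100, 200, 200, 300)); // adjacent intervals
        System.out.println(overlaps(100, 300, 150, 200)); // one interval contains the other
    }
}
```

A single comparison covers partial overlap, containment, and identical intervals, so no case-by-case branching is needed. If the coordinates were closed (inclusive ends), the comparison would presumably need to be <= instead of <.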

Logging Enhancements

  • Introduced conditional logging based on log level (INFO/DEBUG) throughout TiledAlignerUtil.java, reducing unnecessary log output and improving performance. This includes checks before logging and more detailed debug information during Smith-Waterman processing. [1] [2] [3] [4] [5] [6] [7] [8] [9]
  • Added import for QLevel to support new logging level checks.
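
The guard pattern behind the conditional logging can be sketched as follows. The QLevel and logger API used in TiledAlignerUtil.java is not shown in this description, so this sketch uses java.util.logging as a stand-in; the class, the helper method, and the counter are hypothetical and exist only to show that the expensive message is never built when the level is disabled.

```java
import java.util.Arrays;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Stand-in sketch using java.util.logging, not the project's own logger. */
final class GuardedLogging {

    static final Logger LOG = Logger.getLogger(GuardedLogging.class.getName());

    // Counts how often the costly message is built (for demonstration only).
    static int expensiveCalls = 0;

    private GuardedLogging() {}

    /** Hypothetical helper standing in for a costly debug string. */
    static String describeScores(int[] scores) {
        expensiveCalls++;
        return Arrays.toString(scores);
    }

    static void process(int[] scores) {
        // Check the level first so the string is only built when it will be used.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Smith-Waterman scores: " + describeScores(scores));
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO);
        process(new int[] {12, 40, 7});
        System.out.println("message builds at INFO: " + expensiveCalls);

        LOG.setLevel(Level.FINE);
        process(new int[] {12, 40, 7});
        System.out.println("message builds at FINE: " + expensiveCalls);
    }
}
```

The saving comes from skipping string concatenation and any helper calls inside hot loops, which is where the Smith-Waterman debug output described above would otherwise add cost.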

Modern Java Idioms and Code Simplification

  • Replaced deprecated or verbose Java constructs with modern equivalents, such as using List.of() instead of Arrays.asList(), and method references in stream operations and Optional.ifPresent. [1] [2] [3]
  • Used list.getFirst() (available from Java 21) instead of list.get(0) to make first-element access more readable. Both calls throw on an empty list, so callers still need to guard against emptiness.
  • Simplified lambda expressions and method references for better readability.
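
As a rough illustration of these idioms, the sketch below uses List.of(), a method reference in a stream, and a method reference passed to Optional.ifPresent. None of the names come from the PR; they are hypothetical examples.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

/** Illustrative sketch only: names are not taken from the tiled aligner code. */
final class ModernIdioms {

    private ModernIdioms() {}

    /** Method reference (String::toUpperCase) instead of s -> s.toUpperCase(). */
    static List<String> upperCased(List<String> names) {
        return names.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    /** Optional.ifPresent with a method reference instead of an isPresent()/get() pair. */
    static List<String> collectIfPresent(Optional<String> maybeValue) {
        List<String> collected = new ArrayList<>();
        maybeValue.ifPresent(collected::add);
        return collected;
    }

    public static void main(String[] args) {
        // List.of() builds an unmodifiable list in one call.
        List<String> contigs = List.of("chr1", "chr2");
        System.out.println(upperCased(contigs));
        System.out.println(collectIfPresent(Optional.of("chrX")));
        System.out.println(collectIfPresent(Optional.empty()));
    }
}
```

One behavioural difference is worth keeping in mind when reviewing the replacements: List.of() returns an unmodifiable list and rejects null elements, whereas Arrays.asList() returns a fixed-size list that still allows set() and accepts nulls.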

Algorithmic Improvements

  • Rewrote the homopolymer repeat check in doesSequenceHaveMostlySingleBaseRepeats for efficiency and clarity, replacing a stream-based approach with a direct scan and threshold logic.
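
A direct scan with a threshold might look roughly like the sketch below. This is an assumption about the general shape of such a check, not the body of doesSequenceHaveMostlySingleBaseRepeats: the method name, the percentage parameter, and the definition of "mostly" (a single base making up at least a given share of the sequence) are all hypothetical.

```java
/** Hypothetical sketch of a single-pass homopolymer check; not the PR's code. */
final class HomopolymerCheck {

    private HomopolymerCheck() {}

    /**
     * Returns true when one base accounts for at least minPercent percent
     * of the sequence. Comparison is case-sensitive in this sketch.
     */
    static boolean isMostlySingleBase(String sequence, int minPercent) {
        if (sequence == null || sequence.isEmpty()) {
            return false;
        }
        int length = sequence.length();
        // Smallest count that satisfies the percentage, using integer ceiling division.
        int required = (length * minPercent + 99) / 100;
        int[] counts = new int[128]; // one slot per ASCII character
        for (int i = 0; i < length; i++) {
            char base = sequence.charAt(i);
            // Exit as soon as any base reaches the threshold; no second pass needed.
            if (base < 128 && ++counts[base] >= required) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isMostlySingleBase("AAAAAAAAAC", 90));
        System.out.println(isMostlySingleBase("ACGTACGTAC", 90));
    }
}
```

Compared with a stream-based count, the scan allocates one small array, touches each character at most once, and can stop early once the threshold is reached.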

Minor Cleanups

  • Removed unnecessary or commented-out code, such as redundant checks in loops and method bodies, to improve maintainability. [1] [2]
  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • Unit tests have been added.
  • qmule's ISizeClusters has been run against failing input (due to unpaired reads).
  • qsv has been tested against failing BAMs.
