You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Refactoring of OMR response processing to reduce cyclomatic complexity
and improve code maintainability without changing behavior.
Motivation
This pull request was created as part of an educational assignment
focused on code quality analysis and working with pull requests.
The repository was analyzed using Radon, which revealed a very high
cyclomatic complexity in the read_omr_response method.
Cyclomatic Complexity (Radon)
Before
ImageInstanceOps.read_omr_response: E (39)
ImageInstanceOps (class): B (9)
Average complexity: B (8.7)
Blocks analyzed: 10
After
ImageInstanceOps.read_omr_response: B (7)
ImageInstanceOps (class): A (4)
Average complexity: A (3.92)
Blocks analyzed: 25
Improvement
read_omr_response: −32 (−82%)
Average complexity: −4.78 (−55%)
What was done
Split a large, monolithic method into focused private helper methods
Reduced nesting and localized decision logic
Preserved all original comments for readability and future work
Fixed a potential UnboundLocalError caused by variable scope
Kept algorithmic behavior and output format unchanged
Notes
Refactoring only; no functional or behavioral changes intended
No performance claims
Changes were validated using Radon before and after refactoring
PR Type
Enhancement
Description
Split monolithic read_omr_response method into 11 focused helper methods
Reduced cyclomatic complexity from E(39) to B(7) in main method
Improved code maintainability and readability through method extraction
Fixed potential UnboundLocalError in _init_debug_boxplots and _plot_box_types_if_needed
Diagram Walkthrough
flowchart LR
A["read_omr_response<br/>monolithic method"] -->|extract| B["_prepare_response_images"]
A -->|extract| C["_prepare_morph_for_alignment"]
A -->|extract| D["_auto_align_field_blocks"]
A -->|extract| E["_collect_q_stats"]
A -->|extract| F["_compute_thresholds"]
A -->|extract| G["_detect_marked_bubbles"]
G -->|extract| H["_compute_qstrip_threshold"]
G -->|extract| I["_scan_and_draw_bubbles"]
G -->|extract| J["_apply_detected_bubbles_to_response"]
A -->|extract| K["_plot_box_types_if_needed"]
A -->|extract| L["_show_final_align_if_needed"]
A -->|result| M["Reduced complexity<br/>E39 → B7"]
Loading
File Walkthrough
Relevant files
Refactoring
core.py
Extract 11 helper methods from monolithic OMR processing
src/core.py
Refactored read_omr_response method by extracting 11 private helper methods
Created _prepare_response_images to handle image copying, resizing, and normalization
Created _prepare_morph_for_alignment to prepare morphological image for alignment
Created _init_debug_boxplots to initialize debug visualization structures with null-safety
Created _auto_align_field_blocks to handle field block alignment logic
Created _render_alignment_debug to render alignment visualization
Created _collect_q_stats to collect bubble value statistics
Created _compute_thresholds to calculate global thresholds
Created _detect_marked_bubbles as main bubble detection orchestrator
Created _compute_qstrip_threshold to compute per-strip thresholds
Created _scan_and_draw_bubbles to detect and draw marked bubbles
Created _apply_detected_bubbles_to_response to update response with detected values
Created _apply_empty_if_none_detected to handle empty field cases
Created _collect_c_box_debug to collect debug box plot data
Created _plot_box_types_if_needed to render box plots with null-safety check
Created _show_final_align_if_needed to display alignment visualization
Fixed potential UnboundLocalError by initializing all_c_box_vals and q_nums to None
Preserved all original comments and algorithmic behavior
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation
Status: Missing null validation: Method _scan_and_draw_bubbles doesn't validate if field_block_bubbles is empty before accessing bubble properties, risking IndexError
Referred Code
def_scan_and_draw_bubbles(
self,
*,
field_block,
field_block_bubbles,
per_q_strip_threshold,
all_q_vals,
total_q_box_no,
final_marked,
box_w,
box_h,
):
# TODO: get rid of total_q_box_nodetected_bubbles= []
forbubbleinfield_block_bubbles:
bubble_is_marked=per_q_strip_threshold>all_q_vals[total_q_box_no]
total_q_box_no+=1# Use these in both branchesx, y, field_value= (
bubble.x+field_block.shift,
... (clipped39lines)
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities
Status: Missing input validation: Methods like _prepare_response_images and _collect_q_stats don't validate input parameters (template, image) for null or invalid values before processing
Referred Code
def_prepare_response_images(self, template, image):
config=self.tuning_configimg=image.copy()
# origDim = img.shape[:2]img=ImageUtils.resize_util(img, template.page_dimensions[0], template.page_dimensions[1])
ifimg.max() >img.min():
img=ImageUtils.normalize_util(img)
# Processing copiestransp_layer=img.copy()
final_marked=img.copy()
returnimg, transp_layer, final_markeddef_prepare_morph_for_alignment(self, img, auto_align):
config=self.tuning_configmorph=img.copy()
self.append_save_img(3, morph)
ifauto_align:
# Note: clahe is good for morphology, bad for thresholdingmorph=CLAHE_HELPER.apply(morph)
self.append_save_img(3, morph)
... (clipped165lines)
Create a dedicated OMRResponseProcessor class to encapsulate the state and logic of OMR processing. This will replace long parameter lists in helper methods with instance attributes, improving state management and reducing coupling.
classOMRResponseProcessor:
def__init__(self, template, image, name, config, ...):
self.template=templateself.image=image# ... other attributes for state ...self.img=Noneself.final_marked=Noneself.all_q_vals=Noneself.global_thr=None# ...defprocess(self):
self._prepare_response_images()
self._collect_q_stats()
self._compute_thresholds()
self._detect_marked_bubbles()
returnself.omr_response, ...
def_detect_marked_bubbles(self):
# Accesses state via self.img, self.all_q_vals, etc.passclassImageInstanceOps:
defread_omr_response(self, template, image, name, ...):
processor=OMRResponseProcessor(template, image, name, self.tuning_config, self)
returnprocessor.process()
Suggestion importance[1-10]: 9
__
Why: This is an excellent architectural suggestion that addresses a significant design flaw (tight coupling via long parameter lists) introduced by the refactoring, proposing a solution that would greatly improve state management and code structure.
High
Possible issue
Fix uninitialized variable bug
Fix a bug in _detect_marked_bubbles where multi_roll is never updated. Update the function to correctly calculate and return the multi_roll status, restoring logic lost during refactoring.
Why: This suggestion correctly identifies a critical bug introduced by the refactoring where the multi_roll logic was lost, causing it to always be 0. Restoring this functionality is essential for correctness.
High
Restore missing multi-roll detection logic
Restore the missing multi_roll detection logic in _apply_detected_bubbles_to_response. The function should accept multi_roll, update it, and return it to fix a bug introduced during refactoring.
-def _apply_detected_bubbles_to_response(self, *, detected_bubbles, omr_response, multi_marked):+def _apply_detected_bubbles_to_response(self, *, detected_bubbles, omr_response, multi_marked, multi_roll):
for bubble in detected_bubbles:
field_label, field_value = (
bubble.field_label,
bubble.field_value,
)
# Only send rolls multi-marked in the directory
multi_marked_local = field_label in omr_response
omr_response[field_label] = (
(omr_response[field_label] + field_value)
if multi_marked_local
else field_value
)
# TODO: generalize this into identifier
- # multi_roll = multi_marked_local and "Roll" in str(q)+ multi_roll = multi_roll or (multi_marked_local and "Roll" in str(field_label))
multi_marked = multi_marked or multi_marked_local
- return multi_marked+ return multi_marked, multi_roll
Apply / Chat
Suggestion importance[1-10]: 9
__
Why: This suggestion correctly identifies that the multi_roll calculation logic was lost during refactoring. It provides an accurate fix to restore this critical functionality, which is necessary for the overall correctness of the read_omr_response method.
Hey @IlyNosov , thanks for the refactor PR and welcome to OMRChecker!
If possible can you please check the dev branch and see if this logic could be refactored (I have moved that logic into a two parts - detection/interpretation segregation)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
User description
Summary
Refactoring of OMR response processing to reduce cyclomatic complexity
and improve code maintainability without changing behavior.
Motivation
This pull request was created as part of an educational assignment
focused on code quality analysis and working with pull requests.
The repository was analyzed using Radon, which revealed a very high
cyclomatic complexity in the
read_omr_responsemethod.Cyclomatic Complexity (Radon)
Before
ImageInstanceOps.read_omr_response: E (39)ImageInstanceOps(class): B (9)After
ImageInstanceOps.read_omr_response: B (7)ImageInstanceOps(class): A (4)Improvement
read_omr_response: −32 (−82%)What was done
UnboundLocalErrorcaused by variable scopeNotes
PR Type
Enhancement
Description
Split monolithic
read_omr_responsemethod into 11 focused helper methodsReduced cyclomatic complexity from E(39) to B(7) in main method
Improved code maintainability and readability through method extraction
Fixed potential
UnboundLocalErrorin_init_debug_boxplotsand_plot_box_types_if_neededDiagram Walkthrough
File Walkthrough
core.py
Extract 11 helper methods from monolithic OMR processingsrc/core.py
read_omr_responsemethod by extracting 11 private helpermethods
_prepare_response_imagesto handle image copying, resizing,and normalization
_prepare_morph_for_alignmentto prepare morphological imagefor alignment
_init_debug_boxplotsto initialize debug visualizationstructures with null-safety
_auto_align_field_blocksto handle field block alignment logic_render_alignment_debugto render alignment visualization_collect_q_statsto collect bubble value statistics_compute_thresholdsto calculate global thresholds_detect_marked_bubblesas main bubble detection orchestrator_compute_qstrip_thresholdto compute per-strip thresholds_scan_and_draw_bubblesto detect and draw marked bubbles_apply_detected_bubbles_to_responseto update response withdetected values
_apply_empty_if_none_detectedto handle empty field cases_collect_c_box_debugto collect debug box plot data_plot_box_types_if_neededto render box plots with null-safetycheck
_show_final_align_if_neededto display alignment visualizationUnboundLocalErrorby initializingall_c_box_valsandq_numstoNone