fix: memory enhancement in import courses #380

reybahl · 2025-10-10T23:46:03Z

More efficient data processing and loading, specifically in the course-import step of sync.

sentry · 2025-10-10T23:46:12Z

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: ferry/transform/import_courses.py

| Function | Unhandled Issue |
| --- | --- |
| `import_courses` | KeyError: 11423 ferry.transform.import_courses in... `Event Count:` 3 |
| `resolve_cross_listings` | IndexError: single positional indexer is out-of-bounds ferry.transform.import_courses in resolve_cross_li... `Event Count:` 1 |

Copilot

Pull Request Overview

This PR reduces memory usage during data transformation. The changes cut down on intermediate data structures, add explicit garbage collection, and use more efficient data processing patterns.

Key changes:

- Implemented explicit garbage collection at strategic points in the transformation pipeline
- Optimized professor and course rating computations to avoid large intermediate DataFrames
- Added immediate cleanup of temporary data structures with explicit del statements
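
As a rough illustration of the cleanup pattern described above (not the PR's actual code), the sketch below concatenates several source tables, drops the references to the sources right away, and then calls `gc.collect()`. The `Table` class and `concat_and_release` function are hypothetical stand-ins; the real pipeline works on pandas DataFrames, which this example avoids so that it runs with the standard library only.

```python
import gc
import weakref


class Table:
    """Hypothetical stand-in for a per-season DataFrame."""

    def __init__(self, rows: list[dict]) -> None:
        self.rows = rows


def concat_and_release(tables: list[Table]) -> Table:
    """Concatenate the tables, then drop the sources right away."""
    merged = Table([row for table in tables for row in table.rows])
    tables.clear()  # drop the list's references to the source tables
    gc.collect()    # optional: also reclaim anything caught in reference cycles
    return merged


seasons = [Table([{"crn": 1}]), Table([{"crn": 2}, {"crn": 3}])]
probe = weakref.ref(seasons[0])  # lets us observe when the first source is freed
merged = concat_and_release(seasons)
```

In CPython, dropping the last reference frees an object immediately, so `gc.collect()` mainly helps when the temporaries take part in reference cycles. In this sketch the row dicts are shared with the merged table, whereas a real DataFrame concatenation copies the data, which is why releasing the sources early matters more there.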

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| ferry/transform/\_\_init\_\_.py | Added garbage collection calls after major transformation steps |
| ferry/transform/transform_compute.py | Refactored rating computations to use dictionaries and direct aggregation instead of lambda functions and intermediate lists |
| ferry/transform/import_courses.py | Added cleanup of source data immediately after concatenation |


Copilot · 2025-10-10T23:57:41Z

ferry/transform/transform_compute.py

+    # Pre-compute aggregated ratings for each same_course group to avoid repeated list creation
+    logging.debug("Pre-computing same-course rating aggregates")
+
+    def compute_aggregate_rating(course_ids: list[int], rating_dict: dict) -> tuple[float | None, int]:


The rating_dict parameter should have a more specific type annotation. Consider using dict[int, float] or dict[int, float | None] to better document the expected key-value types.

Suggested change

-    def compute_aggregate_rating(course_ids: list[int], rating_dict: dict) -> tuple[float | None, int]:
+    def compute_aggregate_rating(course_ids: list[int], rating_dict: dict[int, float | None]) -> tuple[float | None, int]:
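
To show what the more specific annotation documents, here is a minimal sketch of a function with the suggested signature. Only the signature comes from the diff; the body is an assumption (a mean over the non-`None` ratings plus a count of rated courses) and may not match what the PR actually computes.

```python
from __future__ import annotations


def compute_aggregate_rating(
    course_ids: list[int], rating_dict: dict[int, float | None]
) -> tuple[float | None, int]:
    """Return (mean rating, number of rated courses) for one same-course group.

    Assumed behavior for illustration only: unrated or unknown courses are skipped.
    """
    total = 0.0
    count = 0
    for course_id in course_ids:
        rating = rating_dict.get(course_id)  # missing ids are treated as unrated
        if rating is not None:
            total += rating
            count += 1
    return (total / count if count else None, count)


# Hypothetical sample data: course 102 has no rating, course 999 is unknown.
ratings: dict[int, float | None] = {101: 4.0, 102: None, 103: 3.0}
result = compute_aggregate_rating([101, 102, 103, 999], ratings)
```

With `dict[int, float | None]`, a type checker can flag callers that pass string keys or forget that a rating may be `None`, which the bare `dict` annotation would let through.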


bearsyankees · 2025-10-12T00:06:52Z

lfg

another memory enhancement attempt

873c108

reybahl added 2 commits October 10, 2025 23:50

rm extraneous changes

ec99393

more reverts

b109228

reybahl requested a review from Copilot October 10, 2025 23:57

Copilot AI reviewed Oct 10, 2025


reybahl added 2 commits October 11, 2025 00:05

quick improvement

365b1f3

crn fix

4111452

reybahl changed the title ~~another memory enhancement attempt~~ fix: memory enhancement in course import Oct 11, 2025

reybahl changed the title ~~fix: memory enhancement in course import~~ fix: memory enhancement in import courses Oct 11, 2025

reybahl merged commit a542f8e into master Oct 11, 2025
6 checks passed

reybahl deleted the memory-enhancements-v2 branch October 11, 2025 01:38
